Deep learning

Cross-modality sub-image retrieval using contrastive multimodal image representations

Tue, 2024-08-13 06:00

Sci Rep. 2024 Aug 13;14(1):18798. doi: 10.1038/s41598-024-68800-1.

ABSTRACT

In tissue characterization and cancer diagnostics, multimodal imaging has emerged as a powerful technique. Thanks to computational advances, large datasets can be exploited to discover patterns in pathologies and improve diagnosis. However, this requires efficient and scalable image retrieval methods. Cross-modality image retrieval is particularly challenging, since images of similar (or even the same) content captured by different modalities might share few common structures. We propose a new application-independent content-based image retrieval (CBIR) system for reverse (sub-)image search across modalities, which combines deep learning to generate representations (embedding the different modalities in a common space) with robust feature extraction and bag-of-words models for efficient and reliable retrieval. We illustrate its advantages through a replacement study, exploring a number of feature extractors and learned representations, as well as through comparison to recent (cross-modality) CBIR methods. For the task of (sub-)image retrieval on a (publicly available) dataset of brightfield and second harmonic generation microscopy images, the results show that our approach is superior to all tested alternatives. We discuss the shortcomings of the compared methods and observe the importance of equivariance and invariance properties of the learned representations and feature extractors in the CBIR pipeline. Code is available at: https://github.com/MIDA-group/CrossModal_ImgRetrieval .
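
To make the retrieval stage concrete, here is a minimal sketch of a bag-of-words pipeline of the kind described, assuming local descriptors have already been extracted from images embedded in the common space; all names and sizes are illustrative, not the authors' implementation.

```python
# Minimal bag-of-words retrieval sketch: local descriptors (e.g., from a
# contrastive embedding of each modality) are quantized into a visual
# vocabulary, images become word histograms, and queries are ranked by
# cosine similarity. Illustrative only; not the authors' code.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in for descriptors extracted from 20 database (sub-)images.
db_descriptors = [rng.normal(size=(200, 64)) for _ in range(20)]

# 1) Build the visual vocabulary on all database descriptors.
vocab = KMeans(n_clusters=50, n_init=4, random_state=0)
vocab.fit(np.vstack(db_descriptors))

def bow_histogram(desc, vocab):
    """Quantize descriptors to visual words and L2-normalize the counts."""
    words = vocab.predict(desc)
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / (np.linalg.norm(hist) + 1e-12)

db_hists = np.stack([bow_histogram(d, vocab) for d in db_descriptors])

# 2) Retrieval: rank database images by cosine similarity to the query.
query_hist = bow_histogram(rng.normal(size=(180, 64)), vocab)
ranking = np.argsort(db_hists @ query_hist)[::-1]
print("top-5 matches:", ranking[:5])
```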

PMID:39138271 | DOI:10.1038/s41598-024-68800-1

Categories: Literature Watch

ENTRANT: A Large Financial Dataset for Table Understanding

Tue, 2024-08-13 06:00

Sci Data. 2024 Aug 13;11(1):876. doi: 10.1038/s41597-024-03605-5.

ABSTRACT

Tabular data is a way to structure, organize, and present information conveniently and effectively. Real-world tables present data in two dimensions by arranging cells in matrices that summarize information and facilitate side-by-side comparisons. Recent research efforts aim to train large models to understand structured tables, a process that enables knowledge transfer in various downstream tasks. Model pre-training, though, requires large datasets, conveniently formatted to reflect cell and table characteristics. This paper presents ENTRANT, a financial dataset that comprises millions of tables, which are transformed to reflect cell attributes, as well as positional and hierarchical information. Hence, they facilitate, among other things, pre-training tasks for table understanding with deep learning methods. The dataset provides table and cell information along with the corresponding metadata in a machine-readable format. We have automated all data processing and curation and technically validated the dataset through unit testing with high code coverage. Finally, we demonstrate the use of the dataset in a pre-training task of a state-of-the-art model, which we use for downstream cell classification.
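
As an illustration of the kind of machine-readable cell record such a dataset can provide, here is a hypothetical sketch; the field names are assumptions made for illustration, not ENTRANT's actual schema.

```python
# Hypothetical sketch of a machine-readable cell record combining value,
# positional, and hierarchical metadata. Field names are illustrative
# assumptions, not ENTRANT's actual schema.
import json
from dataclasses import dataclass, asdict

@dataclass
class Cell:
    row: int                # 0-based row index in the table matrix
    col: int                # 0-based column index
    value: str              # raw cell content
    is_header: bool         # cell attribute (e.g., header vs. data)
    tree_path: list[int]    # position in the header hierarchy, root to leaf

table = [
    Cell(0, 0, "Revenue", True, [0]),
    Cell(1, 0, "1,204", False, [0, 0]),
]
print(json.dumps([asdict(c) for c in table], indent=2))
```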

PMID:39138234 | DOI:10.1038/s41597-024-03605-5

Categories: Literature Watch

End-to-end reproducible AI pipelines in radiology using the cloud

Tue, 2024-08-13 06:00

Nat Commun. 2024 Aug 13;15(1):6931. doi: 10.1038/s41467-024-51202-2.

ABSTRACT

Artificial intelligence (AI) algorithms hold the potential to revolutionize radiology. However, a significant portion of the published literature lacks transparency and reproducibility, which hampers sustained progress toward clinical translation. Although several reporting guidelines have been proposed, identifying practical means to address these issues remains challenging. Here, we show the potential of cloud-based infrastructure for implementing and sharing transparent and reproducible AI-based radiology pipelines. We demonstrate end-to-end reproducibility from retrieving cloud-hosted data, through data pre-processing, deep learning inference, and post-processing, to the analysis and reporting of the final results. We successfully implement two distinct use cases, starting from recent literature on AI-based biomarkers for cancer imaging. Using cloud-hosted data and computing, we confirm the findings of these studies and extend the validation to previously unseen data for one of the use cases. Furthermore, we provide the community with transparent and easy-to-extend examples of pipelines impactful for the broader oncology field. Our approach demonstrates the potential of cloud resources for implementing, sharing, and using reproducible and transparent AI pipelines, which can accelerate the translation into clinical solutions.
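
A minimal skeleton of such an end-to-end pipeline, with every stage an explicit, re-runnable function, might look as follows; the stage contents and bucket URL are placeholders, not the authors' pipeline.

```python
# Skeleton of an end-to-end, scripted (hence repeatable) radiology AI
# pipeline: fetch cloud-hosted data, preprocess, run inference, report.
# All stage bodies are placeholders for illustration.
def fetch_cohort(bucket_url: str) -> list[str]:
    # e.g., list and download cloud-hosted DICOM/NIfTI objects
    return [f"{bucket_url}/case_{i}.nii.gz" for i in range(3)]

def preprocess(path: str) -> str:
    return path + ".resampled"      # resampling, normalization, etc.

def infer(path: str) -> float:
    return 0.5                      # deep learning model inference stub

def report(scores: list[float]) -> None:
    print(f"n={len(scores)}, mean biomarker score={sum(scores)/len(scores):.3f}")

if __name__ == "__main__":
    cases = fetch_cohort("s3://example-bucket/cohort")  # hypothetical URL
    report([infer(preprocess(c)) for c in cases])
```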

PMID:39138215 | DOI:10.1038/s41467-024-51202-2

Categories: Literature Watch

Comparative Analysis of the Diagnostic Value of S-Detect Technology in Different Planes Versus the BI-RADS Classification for Breast Lesions

Tue, 2024-08-13 06:00

Acad Radiol. 2024 Aug 12:S1076-6332(24)00568-3. doi: 10.1016/j.acra.2024.08.005. Online ahead of print.

ABSTRACT

RATIONALE AND OBJECTIVES: S-Detect, a deep learning-based Computer-Aided Detection system, is recognized as an important tool for diagnosing breast lesions using ultrasound imaging. However, it may exhibit inconsistent findings across multiple imaging planes. This study aims to evaluate the diagnostic performance of S-Detect in different planes and identify factors contributing to these inconsistencies.

MATERIALS AND METHODS: A retrospective cohort study was conducted on 711 patients with 756 breast lesions between January 2019 and January 2022. S-Detect was utilized to assess lesions in radial and anti-radial planes. BI-RADS classifications were employed for comparative analysis. The diagnostic performance was compared within each group, and p-values were computed for intergroup comparisons. Univariable and multivariable analyses were conducted to identify factors contributing to diagnostic inconsistency in S-Detect across planes.

RESULTS: Among 756 breast lesions, 668 (88.4%) exhibited consistent S-Detect outcomes across planes, while 88 (11.6%) were inconsistent. In the consistent group, the diagnostic accuracy and area under the curve (AUC) of S-Detect were significantly higher than those of BI-RADS (accuracy: 91.2% vs. 84.9%, p = 0.045; AUC: 0.916 vs. 0.859, p = 0.036). In the inconsistent group, the diagnostic accuracy and AUC of S-Detect in radial and anti-radial planes were lower than those of BI-RADS (accuracy: 47.7% for radial, 52.2% for anti-radial vs. 69.3% for BI-RADS, p = 0.014, p-anti = 0.039; AUC: 0.503 for radial, 0.497 for anti-radial vs. 0.739 for BI-RADS, p = 0.042, p-anti < 0.001). Diagnostic inconsistency in S-Detect across planes was significantly associated with lesion size, indistinct or angular margins, and posterior acoustic enhancement (p < 0.05).

CONCLUSION: S-Detect has outperformed BI-RADS in diagnostic precision under conditions of inter-planar concordance. However, its diagnostic efficacy is compromised in scenarios of inter-planar discordance. Under these circumstances, the results of S-Detect should be carefully referenced.

PMID:39138111 | DOI:10.1016/j.acra.2024.08.005

Categories: Literature Watch

Reply-letter to the editor

Tue, 2024-08-13 06:00

Clin Nutr. 2024 Aug 7:S0261-5614(24)00275-9. doi: 10.1016/j.clnu.2024.07.046. Online ahead of print.

NO ABSTRACT

PMID:39138078 | DOI:10.1016/j.clnu.2024.07.046

Categories: Literature Watch

MGNDTI: A Drug-Target Interaction Prediction Framework Based on Multimodal Representation Learning and the Gating Mechanism

Tue, 2024-08-13 06:00

J Chem Inf Model. 2024 Aug 13. doi: 10.1021/acs.jcim.4c00957. Online ahead of print.

ABSTRACT

Drug-Target Interaction (DTI) prediction facilitates acceleration of drug discovery and promotes drug repositioning. Most existing deep learning-based DTI prediction methods extract discriminative features for drugs and proteins well, but they rarely consider multimodal features of drugs. Moreover, learning the interaction representations between drugs and targets needs further exploration. Here, we proposed a simple Multimodal Gating Network for DTI prediction, MGNDTI, based on multimodal representation learning and the gating mechanism. MGNDTI first learns the sequence representations of drugs and targets using different retentive networks. Next, it extracts molecular graph features of drugs through a graph convolutional network. Subsequently, it devises a multimodal gating network to obtain the joint representations of drugs and targets. Finally, it builds a fully connected network for computing the interaction probability. MGNDTI was benchmarked against seven state-of-the-art DTI prediction models (CPI-GNN, TransformerCPI, MolTrans, BACPI, CPGL, GIFDTI, and FOTF-CPI) using four data sets (i.e., Human, C. elegans, BioSNAP, and BindingDB) under four different experimental settings. Through evaluation with AUROC, AUPRC, accuracy, F1 score, and MCC, MGNDTI significantly outperformed the above seven methods. MGNDTI is a powerful tool for DTI prediction, showcasing its superior robustness and generalization ability on diverse data sets and different experimental settings. It is freely available at https://github.com/plhhnu/MGNDTI.
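
A minimal sketch of the gating idea, fusing sequence- and graph-derived drug features, assuming 128-dimensional embeddings, is shown below; this illustrates a generic gated fusion module, not MGNDTI's code.

```python
# Gated fusion of two modality embeddings: a learned sigmoid gate softly
# selects, per dimension, between sequence- and graph-derived features.
# Dimensions and wiring are assumptions for illustration.
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, seq_feat, graph_feat):
        g = self.gate(torch.cat([seq_feat, graph_feat], dim=-1))
        return g * seq_feat + (1 - g) * graph_feat  # per-dim soft selection

fuse = GatedFusion(dim=128)
drug = fuse(torch.randn(8, 128), torch.randn(8, 128))   # joint drug repr.
prob = torch.sigmoid(nn.Linear(128, 1)(drug))           # interaction prob.
print(prob.shape)  # torch.Size([8, 1])
```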

PMID:39137398 | DOI:10.1021/acs.jcim.4c00957

Categories: Literature Watch

Insertable Glucose Sensor Using a Compact and Cost-Effective Phosphorescence Lifetime Imager and Machine Learning

Tue, 2024-08-13 06:00

ACS Nano. 2024 Aug 13. doi: 10.1021/acsnano.4c06527. Online ahead of print.

ABSTRACT

Optical continuous glucose monitoring (CGM) systems are emerging for personalized glucose management owing to their lower cost and prolonged durability compared to conventional electrochemical CGMs. Here, we report a computational CGM system, which integrates a biocompatible phosphorescence-based insertable biosensor and a custom-designed phosphorescence lifetime imager (PLI). This compact and cost-effective PLI is designed to capture phosphorescence lifetime images of an insertable sensor through the skin, where the lifetime of the emitted phosphorescence signal is modulated by the local concentration of glucose. Because this phosphorescence signal has a very long lifetime compared to tissue autofluorescence or excitation leakage processes, it completely bypasses these noise sources by measuring the sensor emission over several tens of microseconds after the excitation light is turned off. The lifetime images acquired through the skin are processed by neural network-based models for misalignment-tolerant inference of glucose levels, accurately revealing normal, low (hypoglycemia) and high (hyperglycemia) concentration ranges. Using a 1 mm thick skin phantom mimicking the optical properties of human skin, we performed in vitro testing of the PLI using glucose-spiked samples, yielding 88.8% inference accuracy, also showing resilience to random and unknown misalignments within a lateral distance of ∼4.7 mm with respect to the position of the insertable sensor underneath the skin phantom. Furthermore, the PLI accurately identified larger lateral misalignments beyond 5 mm, prompting user intervention for realignment. The misalignment-resilient glucose concentration inference capability of this compact and cost-effective PLI makes it an appealing wearable diagnostics tool for real-time tracking of glucose and other biomarkers.
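
The core measurement principle, estimating a lifetime from emission sampled after the excitation is switched off, can be sketched as a mono-exponential fit; all numbers here are synthetic assumptions.

```python
# Sketch: sample the sensor emission tens of microseconds after the
# excitation turns off and estimate the phosphorescence lifetime by
# fitting a mono-exponential decay. Synthetic data, illustrative only.
import numpy as np
from scipy.optimize import curve_fit

def decay(t, a, tau):
    return a * np.exp(-t / tau)

t = np.linspace(0, 200e-6, 60)                 # 0-200 us after turn-off
true_tau = 45e-6                               # glucose-dependent lifetime
rng = np.random.default_rng(1)
signal = decay(t, 1.0, true_tau) + rng.normal(0, 0.01, t.size)

(a_hat, tau_hat), _ = curve_fit(decay, t, signal, p0=(1.0, 30e-6))
print(f"estimated lifetime: {tau_hat * 1e6:.1f} us")
# A calibration model (here, the paper's neural network on lifetime
# images) would then map lifetime to glucose concentration.
```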

PMID:39137319 | DOI:10.1021/acsnano.4c06527

Categories: Literature Watch

On-board synthetic 4D MRI generation from 4D CBCT for radiotherapy of abdominal tumors: A feasibility study

Tue, 2024-08-13 06:00

Med Phys. 2024 Aug 13. doi: 10.1002/mp.17347. Online ahead of print.

ABSTRACT

BACKGROUND: Magnetic resonance-guided radiotherapy with an MR-guided LINAC represents potential clinical benefits in abdominal treatments due to the superior soft-tissue contrast compared to kV-based images in conventional treatment units. However, due to the high cost associated with this technology, only a few centers have access to it. As an alternative, synthetic 4D MRI generation based on artificial intelligence methods could be implemented. Nevertheless, appropriate MRI texture generation from CT images might be challenging and prone to hallucinations, compromising motion accuracy.

PURPOSE: To evaluate the feasibility of on-board synthetic motion-resolved 4D MRI generation from prior 4D MRI, on-board 4D cone beam CT (CBCT) images, motion modeling information, and deep learning models using the digital anthropomorphic phantom XCAT.

METHODS: The synthetic 4D MRI corresponds to phases from on-board 4D CBCT. Each synthetic MRI volume in the 4D MRI was generated by warping a reference 3D MRI (MRIref, end of expiration phase from a prior 4D MRI) with a deformation field map (DFM) determined by (I) the eigenvectors from the principal component analysis (PCA) motion-modeling of the prior 4D MRI, and (II) the corresponding eigenvalues predicted by a convolutional neural network (CNN) model using the on-board 4D CBCT images as input. The CNN was trained with 1000 deformations of one reference CT (CTref, same conditions as MRIref) generated by applying 1000 DFMs computed by randomly sampling the original eigenvalues from the prior 4D MRI PCA model. The evaluation metrics for the CNN model were root-mean-square error (RMSE) and mean absolute error (MAE). Finally, different on-board 4D-MRI generation scenarios were assessed by changing the respiratory period, the amplitude of the diaphragm, and the chest wall motion of the 4D CBCT using normalized root-mean-square error (nRMSE) and structural similarity index measure (SSIM) for image-based evaluation, and volume dice coefficient (VDC), volume percent difference (VPD), and center-of-mass shift (COMS) for contour-based evaluation of liver and target volumes.
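
The PCA motion-model step can be sketched as follows: eigenvectors come from the prior 4D MRI's deformation fields, and a new DFM is synthesized from predicted eigenvalues (random stand-ins below for the CNN output); shapes and names are illustrative assumptions.

```python
# PCA motion-model sketch: deformation field maps (DFMs) from the prior
# 4D MRI define the motion modes; a new DFM is the mean field plus a
# weighted sum of modes, with weights predicted per on-board phase.
import numpy as np

rng = np.random.default_rng(0)
n_phases, n_voxels3 = 10, 3 * 1000   # 10 phases, 1000 voxels x 3 components
dfms = rng.normal(size=(n_phases, n_voxels3))          # prior-4D-MRI DFMs

mean_dfm = dfms.mean(axis=0)
u, s, vt = np.linalg.svd(dfms - mean_dfm, full_matrices=False)
eigvecs = vt[:2]                      # keep the first two motion modes

w = rng.normal(size=2)                # stand-in for CNN-predicted eigenvalues
new_dfm = mean_dfm + w @ eigvecs      # DFM for one synthetic MRI phase
print(new_dfm.shape)                  # warp MRIref with this field
```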

RESULTS: The RMSE and MAE values of the CNN model were 0.012 ± 0.001 and 0.010 ± 0.001, respectively, for the first eigenvalue predictions. SSIM and nRMSE were 0.96 ± 0.06 and 0.22 ± 0.08, respectively. VDC, VPD, and COMS were 0.92 ± 0.06, 3.08 ± 3.73%, and 2.3 ± 2.1 mm, respectively, for the target volume. The most challenging synthetic 4D-MRI generation scenario was the 4D-CBCT with increased chest wall motion amplitude, with SSIM and nRMSE of 0.82 and 0.51, respectively.

CONCLUSIONS: On-board synthetic 4D-MRI generation based on predicting actual treatment deformation from on-board 4D-CBCT represents a method that can potentially improve the treatment-setup localization in abdominal radiotherapy treatments with a conventional kV-based LINAC.

PMID:39137256 | DOI:10.1002/mp.17347

Categories: Literature Watch

A hybrid TCN-GRU model for classifying human activities using smartphone inertial signals

Tue, 2024-08-13 06:00

PLoS One. 2024 Aug 13;19(8):e0304655. doi: 10.1371/journal.pone.0304655. eCollection 2024.

ABSTRACT

Recognising human activities using smart devices has led to countless inventions in various domains like healthcare, security, sports, etc. Sensor-based human activity recognition (HAR), especially smartphone-based HAR, has become popular among the research community due to lightweight computation and user privacy protection. Deep learning models are the most preferred solutions in developing smartphone-based HAR as they can automatically capture salient and distinctive features from input signals and classify them into respective activity classes. However, in most cases, the architecture of these models needs to be deep and complex for better classification performance. Furthermore, training these models requires extensive computational resources. Hence, this research proposes a hybrid lightweight model that integrates an enhanced Temporal Convolutional Network (TCN) with Gated Recurrent Unit (GRU) layers for salient spatiotemporal feature extraction without tedious manual feature engineering. Essentially, dilations are incorporated into each convolutional kernel in the TCN-GRU model to extend the kernel's field of view without imposing additional model parameters. Moreover, fewer and shorter filters are applied in each convolutional layer to avoid excess parameters. Despite the reduced computational cost, the proposed model still captures longer-term time dependencies through its dilations, residual connections, and GRU layers, which retain longer implicit features of the input inertial sequences throughout training and provide sufficient information for future prediction. The performance of the TCN-GRU model is verified on two benchmark smartphone-based HAR databases, i.e., UCI HAR and UniMiB SHAR. The model attains promising accuracy in recognising human activities, with 97.25% on UCI HAR and 93.51% on UniMiB SHAR. Since the current study works exclusively on the inertial signals captured by smartphones, future studies will explore the generalisation of the proposed TCN-GRU across diverse datasets, including various sensor types, to ensure its adaptability across different applications.
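
A minimal sketch of the hybrid idea, dilated 1-D convolutions with a residual connection followed by a GRU, is shown below; the layer sizes are assumptions, not the paper's exact configuration.

```python
# Hybrid TCN-GRU sketch: dilated 1-D convolutions with a residual
# projection capture local patterns cheaply; a GRU then models
# longer-term dependencies. Sizes are illustrative assumptions.
import torch
import torch.nn as nn

class TCNGRU(nn.Module):
    def __init__(self, in_ch=9, ch=32, n_classes=6):
        super().__init__()
        # dilation widens the receptive field without extra parameters
        self.conv1 = nn.Conv1d(in_ch, ch, kernel_size=3, dilation=1, padding=1)
        self.conv2 = nn.Conv1d(ch, ch, kernel_size=3, dilation=2, padding=2)
        self.res = nn.Conv1d(in_ch, ch, kernel_size=1)   # residual projection
        self.gru = nn.GRU(ch, ch, batch_first=True)
        self.head = nn.Linear(ch, n_classes)

    def forward(self, x):                 # x: (batch, channels, time)
        h = torch.relu(self.conv2(torch.relu(self.conv1(x)))) + self.res(x)
        out, _ = self.gru(h.transpose(1, 2))
        return self.head(out[:, -1])      # class logits per window

logits = TCNGRU()(torch.randn(4, 9, 128))  # 4 windows, 9 inertial channels
print(logits.shape)  # torch.Size([4, 6])
```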

PMID:39137226 | DOI:10.1371/journal.pone.0304655

Categories: Literature Watch

Prompt-driven Latent Domain Generalization for Medical Image Classification

Tue, 2024-08-13 06:00

IEEE Trans Med Imaging. 2024 Aug 13;PP. doi: 10.1109/TMI.2024.3443119. Online ahead of print.

ABSTRACT

Deep learning models for medical image analysis easily suffer from distribution shifts caused by dataset artifact bias, camera variations, differences in the imaging station, etc., leading to unreliable diagnoses in real-world clinical settings. Domain generalization (DG) methods, which aim to train models on multiple domains to perform well on unseen domains, offer a promising direction to solve the problem. However, existing DG methods assume domain labels of each image are available and accurate, which is typically feasible for only a limited number of medical datasets. To address these challenges, we propose a unified DG framework for medical image classification without relying on domain labels, called Prompt-driven Latent Domain Generalization (PLDG). PLDG consists of unsupervised domain discovery and prompt learning. This framework first discovers pseudo domain labels by clustering the bias-associated style features, then leverages collaborative domain prompts to guide a Vision Transformer to learn knowledge from the discovered diverse domains. To facilitate cross-domain knowledge learning between different prompts, we introduce a domain prompt generator that enables knowledge sharing between domain prompts and a shared prompt. A domain mixup strategy is additionally employed for more flexible decision margins and to mitigate the risk of incorrect domain assignments. Extensive experiments on three medical image classification tasks and one debiasing task demonstrate that our method can achieve performance comparable or even superior to that of conventional DG algorithms without relying on domain labels. Our code is publicly available at https://github.com/SiyuanYan1/PLDG/tree/main.
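
The unsupervised domain-discovery step can be sketched as clustering style statistics of shallow feature maps into pseudo domain labels; the choice of channel-wise mean/std as the style features is an assumption of this illustration.

```python
# Pseudo-domain discovery sketch: describe each image by the channel-wise
# mean and std of its shallow feature maps (a common "style" statistic),
# then cluster these descriptors into pseudo domain labels.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 64, 16, 16))     # shallow CNN features, N=200

# Style descriptor per image: channel-wise mean and std (2 x 64 dims).
style = np.concatenate(
    [feats.mean(axis=(2, 3)), feats.std(axis=(2, 3))], axis=1
)

pseudo_domain = KMeans(n_clusters=4, n_init=4, random_state=0).fit_predict(style)
print(np.bincount(pseudo_domain))  # images per discovered pseudo domain
```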

PMID:39137089 | DOI:10.1109/TMI.2024.3443119

Categories: Literature Watch

Deep Learning in Gene Regulatory Network Inference: A Survey

Tue, 2024-08-13 06:00

IEEE/ACM Trans Comput Biol Bioinform. 2024 Aug 13;PP. doi: 10.1109/TCBB.2024.3442536. Online ahead of print.

ABSTRACT

Understanding the intricate regulatory relationships among genes is crucial for comprehending the development, differentiation, and cellular response in living systems. Consequently, inferring gene regulatory networks (GRNs) based on observed data has gained significant attention as a fundamental goal in biological applications. The proliferation and diversification of available data present both opportunities and challenges in accurately inferring GRNs. Deep learning, a highly successful technique in various domains, holds promise in aiding GRN inference. Several GRN inference methods employing deep learning models have been proposed; however, the selection of an appropriate method remains a challenge for life scientists. In this survey, we provide a comprehensive analysis of 12 GRN inference methods that leverage deep learning models. We trace the evolution of these major methods and categorize them based on the types of applicable data. We delve into the core concepts and specific steps of each method, offering a detailed evaluation of their effectiveness and scalability across different scenarios. These insights enable us to make informed recommendations. Moreover, we explore the challenges faced by GRN inference methods utilizing deep learning and discuss future directions, providing valuable suggestions for the advancement of data scientists in this field.

PMID:39137088 | DOI:10.1109/TCBB.2024.3442536

Categories: Literature Watch

GKE-TUNet: Geometry-Knowledge Embedded TransUNet Model for Retinal Vessel Segmentation Considering Anatomical Topology

Tue, 2024-08-13 06:00

IEEE J Biomed Health Inform. 2024 Aug 13;PP. doi: 10.1109/JBHI.2024.3442528. Online ahead of print.

ABSTRACT

Automated retinal vessel segmentation is crucial for computer-aided clinical diagnosis and retinopathy screening. However, deep learning faces challenges in extracting complex intertwined structures and subtle small vessels from densely vascularized regions. To address these issues, we propose a novel segmentation model, called Geometry-Knowledge Embedded TransUNet (GKE-TUNet), which incorporates explicit embedding of topological features of retinal vessel anatomy. In the proposed GKE-TUNet model, a skeleton extraction network is pre-trained to extract the anatomical topology of retinal vessels from refined segmentation labels. During vessel segmentation, the dense skeleton graph is sampled as a graph of key-points and connections and is incorporated into the skip connection layer of TransUNet. The graph vertices are used as node features and correspond to positions in the low-level feature maps. The graph attention network (GAT) is used as the graph convolution backbone network to capture the shape semantics of vessels and the interaction of key locations along the topological direction. Finally, the node features obtained by graph convolution are read out as a sparse feature map based on their corresponding spatial coordinates. To address the problem of sparse feature maps, we employ convolution operators to fuse sparse feature maps with low-level dense feature maps. This fusion is weighted and connected to deep feature maps. Experimental results on the DRIVE, CHASE-DB1, and STARE datasets demonstrate the competitiveness of our proposed method compared to existing ones.
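
The readout-and-fusion step can be sketched as scattering node features back to their spatial coordinates as a sparse map and fusing it with the dense features via a 1x1 convolution; the shapes below are assumptions, not the paper's exact dimensions.

```python
# Readout-and-fuse sketch: per-key-point node features from a graph branch
# are placed at their (y, x) positions to form a sparse map, then fused
# with the dense CNN features by a 1x1 convolution.
import torch
import torch.nn as nn

B, C, H, W, N = 1, 16, 64, 64, 40
dense = torch.randn(B, C, H, W)            # low-level dense feature map
node_feat = torch.randn(N, C)              # graph-branch output per key-point
ys, xs = torch.randint(0, H, (N,)), torch.randint(0, W, (N,))

sparse = torch.zeros(B, C, H, W)
sparse[0, :, ys, xs] = node_feat.t()       # place nodes at their coordinates

fused = nn.Conv2d(2 * C, C, kernel_size=1)(torch.cat([dense, sparse], dim=1))
print(fused.shape)  # torch.Size([1, 16, 64, 64])
```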

PMID:39137084 | DOI:10.1109/JBHI.2024.3442528

Categories: Literature Watch

RVDLAHA: An RISC-V DLA Hardware Architecture for On-Device Real-Time Seizure Detection and Personalization in Wearable Applications

Tue, 2024-08-13 06:00

IEEE Trans Biomed Circuits Syst. 2024 Aug 13;PP. doi: 10.1109/TBCAS.2024.3442250. Online ahead of print.

ABSTRACT

Epilepsy is a globally distributed chronic neurological disorder that may pose a threat to life without warning. Therefore, the use of wearable devices for real-time detection and treatment of epilepsy is crucial. Additionally, personalizing disease detection algorithms for individual users is also a challenge in clinical applications. Some studies have proposed seizure detection algorithms with convolutional neural networks (CNNs) and programmable hardware architectures for speeding up the process of CNN inference. However, personalizing seizure detection algorithms could still not be performed on these hardware architectures. Consequently, this study proposes three key contributions to address the challenges: a real-time seizure detection and personalization algorithm, a programmable reduced instruction set computer-V (RISC-V) deep learning accelerator (DLA) hardware architecture (RVDLAHA), and a dedicated RISC-V DLA (RVDLA) compiler. In animal experiments with lab rats, the proposed CNN-based seizure detection algorithm obtains an accuracy of 99.5% for a 32-bit floating point and an accuracy of 99.3% for a 16-bit fixed point. Additionally, the proposed personalization algorithm increases the testing accuracy across different databases from 85.0% to 92.9%. The RVDLAHA is implemented on Xilinx PYNQ-Z2, with a power consumption of only 0.107 W at an operating frequency of 1 MHz. Each step, including raw data input, preprocessing, detection, and personalization, requires only 17.8, 1.0, 1.1, and 1.3 ms, respectively. With the hardware architecture, the seizure detection and personalization algorithm can provide on-device real-time monitoring.
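
The float32-versus-fixed16 trade-off behind the reported accuracies can be illustrated with a simple fixed-point quantization sketch; the Q3.12 format chosen here is an assumption, not the paper's stated format.

```python
# 16-bit fixed-point quantization sketch: scale floats by 2**frac_bits,
# round, and clip to the int16 range; dequantize by dividing back.
import numpy as np

def to_fixed16(x, frac_bits=12):           # e.g., Q3.12: 1 sign, 3 int, 12 frac
    scaled = np.round(x * (1 << frac_bits))
    return np.clip(scaled, -32768, 32767).astype(np.int16)

def from_fixed16(q, frac_bits=12):
    return q.astype(np.float32) / (1 << frac_bits)

w = np.random.default_rng(0).normal(0, 0.5, size=1000).astype(np.float32)
w_hat = from_fixed16(to_fixed16(w))
print("max abs quantization error:", np.abs(w - w_hat).max())
```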

PMID:39137083 | DOI:10.1109/TBCAS.2024.3442250

Categories: Literature Watch

A New Brain Network Construction Paradigm for Brain Disorder Via Diffusion-Based Graph Contrastive Learning

Tue, 2024-08-13 06:00

IEEE Trans Pattern Anal Mach Intell. 2024 Aug 13;PP. doi: 10.1109/TPAMI.2024.3442811. Online ahead of print.

ABSTRACT

Brain network analysis plays an increasingly important role in studying brain function and exploring disease mechanisms. However, existing brain network construction tools have some limitations, including dependence on users' empirical choices, weak consistency in repeated experiments, and time-consuming processes. In this work, a diffusion-based brain network pipeline, DGCL, is designed for end-to-end construction of brain networks. Initially, the brain region-aware module (BRAM) precisely determines the spatial locations of brain regions by the diffusion process, avoiding subjective parameter selection. Subsequently, DGCL employs graph contrastive learning to optimize brain connections by eliminating individual differences in redundant connections unrelated to diseases, thereby enhancing the consistency of brain networks within the same group. Finally, the node-graph contrastive loss and classification loss jointly constrain the learning process of the model to obtain the reconstructed brain network, which is then used to analyze important brain connections. Validation on two datasets, ADNI and ABIDE, demonstrates that DGCL surpasses traditional methods and other deep learning models in predicting disease development stages. Significantly, the proposed model improves the efficiency and generalization of brain network construction. In summary, the proposed DGCL can serve as a universal brain network construction scheme, which can effectively identify important brain connections through generative paradigms and has the potential to provide disease interpretability support for neuroscience research.

PMID:39137077 | DOI:10.1109/TPAMI.2024.3442811

Categories: Literature Watch

Microwave detection technique combined with deep learning algorithm facilitates quantitative analysis of heavy metal Pb residues in edible oils

Tue, 2024-08-13 06:00

J Food Sci. 2024 Aug 13. doi: 10.1111/1750-3841.17259. Online ahead of print.

ABSTRACT

The heavy metal content in edible oils is intricately associated with their suitability for human consumption. In this study, standard soybean oil was used as a sample to quantify specified concentrations of heavy metals using a microwave sensing technique. In addition, an attention-based deep residual neural network model was developed as an alternative to traditional modeling methods for predicting heavy metals in edible oils. During microwave data processing, this work also examined the impact of network depth on convolutional neural networks. The results demonstrated that the proposed attention-based residual network model outperforms all other deep learning models in all metrics. The performance of this model was characterized by a coefficient of determination (R2) of 0.9605, a relative prediction deviation (RPD) of 5.0479, and a root mean square error (RMSE) of 3.1654 mg/kg. The research findings indicate that the combination of microwave detection technology and chemometrics holds significant potential for assessing heavy metal levels in edible oils.
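
For reference, the three reported metrics can be computed as follows; defining RPD as the standard deviation of the reference values divided by RMSE is a common chemometrics convention and an assumption of this sketch.

```python
# Worked sketch of the three regression metrics reported above (R2, RPD,
# RMSE) on synthetic Pb concentrations; illustrative only.
import numpy as np

def metrics(y_true, y_pred):
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    rpd = y_true.std(ddof=1) / rmse       # assumed definition: SD / RMSE
    return r2, rpd, rmse

rng = np.random.default_rng(0)
y = rng.uniform(0, 100, 50)                     # Pb concentration, mg/kg
y_hat = y + rng.normal(0, 3.2, 50)              # model predictions
print("R2=%.4f  RPD=%.4f  RMSE=%.4f mg/kg" % metrics(y, y_hat))
```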

PMID:39136980 | DOI:10.1111/1750-3841.17259

Categories: Literature Watch

The Usefulness of Low-Kiloelectron Volt Virtual Monochromatic Contrast-Enhanced Computed Tomography with Deep Learning Image Reconstruction Technique in Improving the Delineation of Pancreatic Ductal Adenocarcinoma

Tue, 2024-08-13 06:00

J Imaging Inform Med. 2024 Aug 13. doi: 10.1007/s10278-024-01214-7. Online ahead of print.

ABSTRACT

To evaluate the usefulness of low-keV multiphasic computed tomography (CT) with deep learning image reconstruction (DLIR) in improving the delineation of pancreatic ductal adenocarcinoma (PDAC) compared to conventional hybrid iterative reconstruction (HIR). Thirty-five patients with PDAC who underwent multiphasic CT were retrospectively evaluated. Raw data were reconstructed with two energy levels (40 keV and 70 keV) of virtual monochromatic imaging (VMI) using HIR (ASiR-V50%) and DLIR (TrueFidelity-H). Contrast-to-noise ratio (CNRtumor) was calculated from the CT values within regions of interest in the tumor and normal pancreas in the pancreatic parenchymal phase images. Lesion conspicuity of PDAC in the pancreatic parenchymal phase on 40-keV HIR, 40-keV DLIR, and 70-keV DLIR images was qualitatively rated on a 5-point scale, using 70-keV HIR images as reference (score 1 = poor; score 3 = equivalent to reference; score 5 = excellent), by two radiologists. CNRtumor of the 40-keV DLIR images (median 10.4, interquartile range (IQR) 7.8-14.9) was significantly higher than that of the other VMIs (40-keV HIR, median 6.2, IQR 4.4-8.5, P < 0.0001; 70-keV DLIR, median 6.3, IQR 5.1-9.9, P = 0.0002; 70-keV HIR, median 4.2, IQR 3.1-6.1, P < 0.0001), exceeding the 40-keV HIR and 70-keV HIR values by 72 ± 22% and 211 ± 340%, respectively. Lesion conspicuity scores on 40-keV DLIR images (observer 1, 4.5 ± 0.7; observer 2, 3.4 ± 0.5) were significantly higher than on 40-keV HIR (observer 1, 3.3 ± 0.9, P < 0.0001; observer 2, 3.1 ± 0.4, P = 0.013). DLIR is a promising reconstruction method to improve PDAC delineation in 40-keV VMI at the pancreatic parenchymal phase compared to conventional HIR.
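
A sketch of the contrast-to-noise computation used in such studies follows; taking the normal-parenchyma ROI's standard deviation as the noise term is an assumption of this illustration.

```python
# CNR sketch: difference between mean CT numbers of the tumor and normal
# pancreas ROIs, divided by image noise (here assumed to be the SD of the
# normal-parenchyma ROI). Synthetic HU values, illustrative only.
import numpy as np

rng = np.random.default_rng(0)
tumor_roi = rng.normal(55, 12, 400)       # HU samples inside the tumor ROI
pancreas_roi = rng.normal(110, 12, 400)   # HU samples in normal parenchyma

cnr = abs(tumor_roi.mean() - pancreas_roi.mean()) / pancreas_roi.std(ddof=1)
print(f"CNR_tumor = {cnr:.1f}")
```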

PMID:39136827 | DOI:10.1007/s10278-024-01214-7

Categories: Literature Watch

A Deep-Learning-Enabled Electrocardiogram and Chest X-Ray for Detecting Pulmonary Arterial Hypertension

Tue, 2024-08-13 06:00

J Imaging Inform Med. 2024 Aug 13. doi: 10.1007/s10278-024-01225-4. Online ahead of print.

ABSTRACT

The diagnosis and treatment of pulmonary hypertension have changed dramatically through the re-defined diagnostic criteria and advanced drug development in the past decade. The application of Artificial Intelligence (AI) for the detection of elevated pulmonary arterial pressure (ePAP) was reported recently. AI has demonstrated the capability to identify ePAP and its association with hospitalization due to heart failure when analyzing chest X-rays (CXR). An AI model based on electrocardiograms (ECG) has shown promise in not only detecting ePAP but also in predicting future risks related to cardiovascular mortality. We aimed to develop an AI model integrating ECG and CXR to detect ePAP and evaluate its performance. We developed a deep-learning model (DLM) using paired ECG and CXR to detect ePAP (systolic pulmonary artery pressure > 50 mmHg in transthoracic echocardiography). This model was further validated in a community hospital. Additionally, our DLM was evaluated for its ability to predict future occurrences of left ventricular dysfunction (LVD, ejection fraction < 35%) and cardiovascular mortality. The AUCs for detecting ePAP were as follows: 0.8261 with ECG (sensitivity 76.6%, specificity 74.5%), 0.8525 with CXR (sensitivity 82.8%, specificity 72.7%), and 0.8644 with a combination of both (sensitivity 78.6%, specificity 79.2%) in the internal dataset. In the external validation dataset, the AUCs for ePAP detection were 0.8348 with ECG, 0.8605 with CXR, and 0.8734 with the combination. Furthermore, using the combination of ECG and CXR, the negative predictive value (NPV) was 98% in the internal dataset and 98.1% in the external dataset. Patients with ePAP detected by the DLM using the combination had a higher risk of new-onset LVD, with a hazard ratio (HR) of 4.51 (95% CI: 3.54-5.76) in the internal dataset, and of cardiovascular mortality, with an HR of 6.08 (95% CI: 4.66-7.95). Similar results were seen in the external validation dataset. The DLM, integrating ECG and CXR, effectively detected ePAP with a strong NPV and forecasted future risks of developing LVD and cardiovascular mortality. This model has the potential to expedite the early identification of pulmonary hypertension in patients, prompting further evaluation through echocardiography and, when necessary, right heart catheterization (RHC), potentially resulting in enhanced cardiovascular outcomes.
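
Two of the reported quantities, a combined ECG+CXR score and the NPV, can be sketched as below; simple probability averaging as the fusion rule is an assumption, not the authors' architecture.

```python
# Sketch: late fusion of two modality probabilities and NPV from a
# thresholded confusion matrix. Synthetic labels/scores, illustrative only.
import numpy as np

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 1000)                       # 1 = ePAP on echo
p_ecg = np.clip(y * 0.6 + rng.normal(0.3, 0.2, 1000), 0, 1)
p_cxr = np.clip(y * 0.6 + rng.normal(0.3, 0.2, 1000), 0, 1)

p_comb = (p_ecg + p_cxr) / 2                       # assumed fusion: averaging
pred = p_comb >= 0.5
tn = np.sum((pred == 0) & (y == 0))
fn = np.sum((pred == 0) & (y == 1))
print(f"NPV = {tn / (tn + fn):.3f}")               # P(no ePAP | negative test)
```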

PMID:39136826 | DOI:10.1007/s10278-024-01225-4

Categories: Literature Watch

Deep learning-aided respiratory motion compensation in PET/CT: addressing motion induced resolution loss, attenuation correction artifacts and PET-CT misalignment

Tue, 2024-08-13 06:00

Eur J Nucl Med Mol Imaging. 2024 Aug 13. doi: 10.1007/s00259-024-06872-x. Online ahead of print.

ABSTRACT

PURPOSE: Respiratory motion (RM) significantly impacts image quality in thoracoabdominal PET/CT imaging. This study introduces a unified data-driven respiratory motion correction (uRMC) method, utilizing deep learning neural networks, to solve all the major issues caused by RM, i.e., PET resolution loss, attenuation correction artifacts, and PET-CT misalignment.

METHODS: In a retrospective study, 737 patients underwent [18F]FDG PET/CT scans using the uMI Panorama PET/CT scanner. Ninety-nine patients, who also had a respiration monitoring device (VSM), formed the validation set. The remaining data from 638 patients were used to train the neural networks used in the uRMC. The uRMC primarily consists of three key components: (1) data-driven respiratory signal extraction, (2) attenuation map generation, and (3) PET-CT alignment. SUV metrics were calculated within 906 lesions for three approaches, i.e., data-driven uRMC (proposed), VSM-based uRMC, and OSEM without motion correction (NMC). RM magnitudes of major organs were estimated.
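
As a generic illustration of data-driven respiratory signal extraction (component (1) above), one common approach runs PCA over short time frames and takes the dominant component's weights as the breathing surrogate; this is a sketch of that general idea, not necessarily uRMC's specific method.

```python
# Generic data-driven gating sketch: PCA over short dynamic frames; the
# first component's time weights serve as a respiratory surrogate trace.
import numpy as np

rng = np.random.default_rng(0)
T, V = 120, 5000                              # 120 frames, 5000 voxels
resp = np.sin(2 * np.pi * 0.25 * np.arange(T) / 2.0)   # ~0.25 Hz at 2 fps
frames = np.outer(resp, rng.normal(size=V)) + rng.normal(0, 0.5, (T, V))

x = frames - frames.mean(axis=0)
u, s, vt = np.linalg.svd(x, full_matrices=False)
signal = u[:, 0] * s[0]                       # respiratory surrogate trace
print("corr with ground truth: %.2f" % abs(np.corrcoef(signal, resp)[0, 1]))
```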

RESULTS: uRMC enhanced diagnostic capabilities by revealing previously undetected lesions, sharpening lesion contours, increasing SUV values, and improving PET-CT alignment. Compared to NMC, uRMC showed increases of 10% and 17% in SUVmax and SUVmean across 906 lesions. Sub-group analysis showed significant SUV increases in small and medium-sized lesions with uRMC. Minor differences were found between the VSM-based and data-driven uRMC methods, with SUVmax differences between the two methods found to be statistically marginal or insignificant. The study observed varied motion amplitudes in major organs, typically ranging from 10 to 20 mm.

CONCLUSION: A data-driven solution for respiratory motion in PET/CT has been developed, validated and evaluated. To the best of our knowledge, this is the first unified solution that compensates for the motion blur within PET, the attenuation mismatch artifacts caused by PET-CT misalignment, and the misalignment between PET and CT.

PMID:39136740 | DOI:10.1007/s00259-024-06872-x

Categories: Literature Watch

Transformers for Molecular Property Prediction: Lessons Learned from the Past Five Years

Tue, 2024-08-13 06:00

J Chem Inf Model. 2024 Aug 13. doi: 10.1021/acs.jcim.4c00747. Online ahead of print.

ABSTRACT

Molecular Property Prediction (MPP) is vital for drug discovery, crop protection, and environmental science. Over the last decades, diverse computational techniques have been developed, from using simple physical and chemical properties and molecular fingerprints in statistical models and classical machine learning to advanced deep learning approaches. In this review, we aim to distill insights from current research on employing transformer models for MPP. We analyze the currently available models and explore key questions that arise when training and fine-tuning a transformer model for MPP. These questions encompass the choice and scale of the pretraining data, optimal architecture selections, and promising pretraining objectives. Our analysis highlights areas not yet covered in current research, inviting further exploration to enhance the field's understanding. Additionally, we address the challenges in comparing different models, emphasizing the need for standardized data splitting and robust statistical analysis.

PMID:39136669 | DOI:10.1021/acs.jcim.4c00747

Categories: Literature Watch

Automatic deep learning segmentation of the hippocampus on high-resolution diffusion magnetic resonance imaging and its application to the healthy lifespan

Tue, 2024-08-13 06:00

NMR Biomed. 2024 Aug 13:e5227. doi: 10.1002/nbm.5227. Online ahead of print.

ABSTRACT

Diffusion tensor imaging (DTI) can provide unique contrast and insight into microstructural changes with age or disease of the hippocampus, although it is difficult to measure the hippocampus because of its comparatively small size, location, and shape. This has been markedly improved by the advent of a clinically feasible 1-mm isotropic resolution 6-min DTI protocol at 3 T of the hippocampus with limited brain coverage of 20 axial-oblique slices aligned along its long axis. However, manual segmentation is too laborious for large population studies, and it cannot be automatically segmented directly on the diffusion images using traditional T1 or T2 image-based methods because of the limited brain coverage and different contrast. An automatic method is proposed here that segments the hippocampus directly on high-resolution diffusion images based on an extension of well-known deep learning architectures like UNet and UNet++ by including additional dense residual connections. The method was trained on 100 healthy participants with previously performed manual segmentation on the 1-mm DTI, then evaluated on typical healthy participants (n = 53), yielding an excellent voxel overlap with a Dice score of ~ 0.90 with manual segmentation; notably, this was comparable with the inter-rater reliability of manually delineating the hippocampus on diffusion magnetic resonance imaging (MRI) (Dice score of 0.86). This method also generalized to a different DTI protocol with 36% fewer acquisitions. It was further validated by showing similar age trajectories of volumes, fractional anisotropy, and mean diffusivity from manual segmentations in one cohort (n = 153, age 5-74 years) with automatic segmentations from a second cohort without manual segmentations (n = 354, age 5-90 years). Automated high-resolution diffusion MRI segmentation of the hippocampus will facilitate large cohort analyses and, in future research, needs to be evaluated on patient groups.
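
For reference, the Dice overlap used for evaluation can be computed as twice the intersection over the sum of the two binary masks, as in this small sketch with synthetic masks.

```python
# Dice overlap sketch: 2 * |A ∩ B| / (|A| + |B|) on binary masks.
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

rng = np.random.default_rng(0)
auto = rng.random((40, 40, 20)) > 0.7          # automatic hippocampus mask
manual = auto.copy()
manual[0] = rng.random((40, 20)) > 0.7         # simulate rater disagreement
print(f"Dice = {dice(auto, manual):.3f}")
```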

PMID:39136393 | DOI:10.1002/nbm.5227

Categories: Literature Watch
