Deep learning
Uncertainty-aware deep-learning model for prediction of supratentorial hematoma expansion from admission non-contrast head computed tomography scan
NPJ Digit Med. 2024 Feb 6;7(1):26. doi: 10.1038/s41746-024-01007-w.
ABSTRACT
Hematoma expansion (HE) is a modifiable risk factor and a potential treatment target in patients with intracerebral hemorrhage (ICH). We aimed to train and validate deep-learning models for high-confidence prediction of supratentorial ICH expansion, based on admission non-contrast head computed tomography (CT). Applying Monte Carlo dropout and entropy of deep-learning model predictions, we estimated the model uncertainty and identified patients at high risk of HE with high confidence. Using the receiver operating characteristic area under the curve (AUC), we compared the deep-learning model prediction performance with multivariable models based on visual markers of HE determined by expert reviewers. We randomly split a multicentric dataset of patients (4-to-1) into training/cross-validation (n = 634) versus test (n = 159) cohorts. We trained and tested separate models for prediction of ≥6 mL and ≥3 mL ICH expansion. The deep-learning models achieved an AUC = 0.81 for high-confidence prediction of HE≥6 mL and AUC = 0.80 for prediction of HE≥3 mL, which were higher than the visual marker models' AUC = 0.69 for HE≥6 mL (p = 0.036) and AUC = 0.68 for HE≥3 mL (p = 0.043). Our results show that fully automated deep-learning models can identify patients at risk of supratentorial ICH expansion based on admission non-contrast head CT, with high confidence, and more accurately than benchmark visual markers.
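As a rough illustration of the uncertainty recipe named above (Monte Carlo dropout plus predictive entropy), here is a minimal PyTorch sketch; the network, sample count, and entropy cutoff are assumptions, not the authors' implementation:

```python
import torch

def enable_mc_dropout(model):
    """Keep dropout stochastic at inference while everything else stays in eval mode."""
    model.eval()
    for m in model.modules():
        if isinstance(m, torch.nn.Dropout):
            m.train()

def mc_dropout_predict(model, x, n_samples=50):
    enable_mc_dropout(model)
    with torch.no_grad():
        probs = torch.stack([torch.sigmoid(model(x)) for _ in range(n_samples)])
    p = probs.mean(dim=0)
    # Binary predictive entropy: high values flag low-confidence cases
    entropy = -(p * torch.log(p + 1e-12) + (1 - p) * torch.log(1 - p + 1e-12))
    return p, entropy

# High-confidence subset: keep only cases below an (illustrative) entropy cutoff,
# e.g. p, h = mc_dropout_predict(net, ct_batch); confident = h < 0.3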
PMID:38321131 | DOI:10.1038/s41746-024-01007-w
Joint optimization of degradation assessment and remaining useful life prediction for bearings with temporal convolutional auto-encoder
ISA Trans. 2023 Dec 27:S0019-0578(23)00590-6. doi: 10.1016/j.isatra.2023.12.031. Online ahead of print.
ABSTRACT
Remaining useful life (RUL) prediction and degradation assessment are pivotal components of prognostics and health management (PHM) and represent vital tasks in the implementation of predictive maintenance for bearings. In recent years, data-driven PHM techniques for bearings have made substantial progress through the integration of deep learning methods. However, modeling the temporal dependencies inherent in raw vibration signals for both degradation assessment and RUL prediction remains a significant challenge. Hence, we propose a joint optimization architecture that uses a temporal convolutional auto-encoder (TCAE) for the degradation assessment and RUL prediction of bearings. Specifically, the architecture includes a sequence-to-sequence model to extract degradation-sensitive features from the raw signal and utilizes temporal distribution characterization (TDC) and a nonlinear regressor to determine the degradation stages and predict RUL, respectively. Our framework integrates the tasks of degradation assessment and RUL prediction in a unified, end-to-end manner, using raw signals as input, and achieves high RUL prediction accuracy (RMSE = 0.0832) on publicly available and self-built datasets. Our approach outperforms state-of-the-art methods, indicating its potential to significantly advance the field of PHM for bearings.
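For readers unfamiliar with the building block, a minimal PyTorch sketch of a temporal convolutional auto-encoder in the spirit of the TCAE described here (layer sizes, kernel widths, and names are illustrative assumptions, not the paper's architecture):

```python
import torch.nn as nn

class TinyTCAE(nn.Module):
    """Sequence-to-sequence auto-encoder over raw vibration windows."""
    def __init__(self, channels=1, hidden=32, latent=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(channels, hidden, kernel_size=9, stride=2, padding=4), nn.ReLU(),
            nn.Conv1d(hidden, latent, kernel_size=9, stride=2, padding=4), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(latent, hidden, kernel_size=9, stride=2,
                               padding=4, output_padding=1), nn.ReLU(),
            nn.ConvTranspose1d(hidden, channels, kernel_size=9, stride=2,
                               padding=4, output_padding=1),
        )

    def forward(self, x):              # x: (batch, channels, time)
        z = self.encoder(x)            # degradation-sensitive latent features
        return self.decoder(z), z

# Usage: x = torch.randn(8, 1, 1024); recon, z = TinyTCAE()(x)
```

The reported RMSE corresponds to the square root of the mean squared error between predicted and true (normalized) RUL values.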
PMID:38320915 | DOI:10.1016/j.isatra.2023.12.031
Model-Agnostic Binary Patch Grouping for Bone Marrow WSI Representation
Am J Pathol. 2024 Feb 4:S0002-9440(24)00043-9. doi: 10.1016/j.ajpath.2024.01.012. Online ahead of print.
ABSTRACT
Histopathology is the reference standard for pathology diagnosis and has evolved with the digitization of glass slides into whole-slide images (WSIs). Trained histopathologists can diagnose disease by examining WSIs visually, but this process is time-consuming and prone to variability. To address these issues, AI models are being developed to create slide-level representations of WSIs, summarizing an entire slide as a single vector. This enables various computational pathology applications, including inter-slide search, multi-modal training, and slide-level classification. Achieving expressive and robust slide-level representations hinges on the patch feature extraction and aggregation steps. We propose an additional Binary Patch Grouping (BPG) step, a plugin that can be integrated into various slide-level representation pipelines to enhance the quality of slide-level representations in bone marrow histopathology. BPG excludes patches with little clinical relevance through minimal interaction with the pathologist: a one-time human intervention for the entire process. We further investigated domain-general versus domain-specific feature extraction models based on convolution and attention and examined two different feature aggregation methods, with and without BPG, showing BPG's generalizability. BPG boosts WSI retrieval performance (mAP@10) by 4% and WSI classification performance (weighted-F1) by 5% relative to pipelines without BPG. The best-quality slide-level representations came from the pipeline combining BPG, domain-general large models, and parameterized pooling.
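A schematic of the patch-filtering idea — BPG keeps only patches judged clinically relevant before aggregation — under the simplifying assumptions of precomputed relevance scores and mean-pooling (the paper also evaluates parameterized aggregators):

```python
import numpy as np

def bpg_slide_vector(patch_feats, patch_scores, threshold):
    """Binary patch grouping: keep clinically relevant patches, then mean-pool.

    patch_feats  : (n_patches, dim) embeddings from any feature extractor
    patch_scores : (n_patches,) relevance scores from a lightweight model
    threshold    : set once by a pathologist (the one-time human intervention)
    """
    keep = patch_scores >= threshold
    if not keep.any():               # degenerate slide: fall back to all patches
        keep = np.ones_like(keep, dtype=bool)
    return patch_feats[keep].mean(axis=0)   # slide-level representation
```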
PMID:38320631 | DOI:10.1016/j.ajpath.2024.01.012
DBNet-SI: Dual branch network of shift window attention and inception structure for skin lesion segmentation
Comput Biol Med. 2024 Feb 2;170:108090. doi: 10.1016/j.compbiomed.2024.108090. Online ahead of print.
ABSTRACT
The U-shaped convolutional neural network (CNN) has attained remarkable achievements in skin lesion segmentation. However, given the inherent locality of convolution, this architecture cannot effectively capture long-range pixel dependencies and multiscale global contextual information. Moreover, repeated convolution and downsampling operations can readily cause the omission of intricate local fine-grained details. In this paper, we propose a U-shaped network (DBNet-SI) equipped with a dual-branch module (MSI) that combines shift window attention and inception structures. First, MSI better captures multiscale global contextual information and long-range pixel dependencies; within it, a cross-branch bidirectional interaction module enables information complementarity between the two branches in the channel and spatial dimensions, so MSI can extract distinguishing and comprehensive features to accurately delineate skin lesion boundaries. Second, we devise a progressive feature enhancement and information compensation module (PFEIC), which progressively compensates for fine-grained features through reconstructed skip connections and integrated global context attention modules. Experimental results show the superior segmentation performance of DBNet-SI compared with other deep learning models for skin lesion segmentation on the ISIC2017 and ISIC2018 datasets. Ablation studies demonstrate that our model can effectively extract rich multiscale global contextual information and compensate for the loss of local details.
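A toy sketch of the dual-branch idea, with full multi-head attention standing in for shift window attention and a simple parallel-kernel inception branch; the paper's MSI module (with its bidirectional interaction) is considerably more elaborate:

```python
import torch
import torch.nn as nn

class DualBranch(nn.Module):
    """Toy dual-branch block: attention alongside an inception-style branch."""
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.inception = nn.ModuleList([
            nn.Conv2d(dim, dim // 4, k, padding=k // 2) for k in (1, 3, 5, 7)
        ])
        self.fuse = nn.Conv2d(2 * dim, dim, 1)   # merge the two branches

    def forward(self, x):                      # x: (B, dim, H, W)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, dim) for attention
        attn_out, _ = self.attn(tokens, tokens, tokens)
        attn_out = attn_out.transpose(1, 2).reshape(b, c, h, w)
        incep_out = torch.cat([conv(x) for conv in self.inception], dim=1)
        return self.fuse(torch.cat([attn_out, incep_out], dim=1))
```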
PMID:38320341 | DOI:10.1016/j.compbiomed.2024.108090
Comparison of methods for intravoxel incoherent motion parameter estimation in the brain from flow-compensated and non-flow-compensated diffusion-encoded data
Magn Reson Med. 2024 Feb 6. doi: 10.1002/mrm.30042. Online ahead of print.
ABSTRACT
PURPOSE: Joint analysis of flow-compensated (FC) and non-flow-compensated (NC) diffusion MRI (dMRI) data has been suggested for increased robustness of intravoxel incoherent motion (IVIM) parameter estimation. For this purpose, a set of methods commonly used or previously found useful for IVIM analysis of dMRI data obtained with conventional diffusion encoding were evaluated in healthy human brain.
METHODS: Five methods for joint IVIM analysis of FC and NC dMRI data were compared: (1) direct non-linear least squares fitting, (2) a segmented fitting algorithm with estimation of the diffusion coefficient from higher b-values of NC data, (3) a Bayesian algorithm with uniform prior distributions, (4) a Bayesian algorithm with spatial prior distributions, and (5) a deep learning-based algorithm. Methods were evaluated on brain dMRI data from healthy subjects and simulated data at multiple noise levels. Bipolar diffusion encoding gradients were used with b-values 0-200 s/mm2 and corresponding flow weighting factors 0-2.35 s/mm for NC data and by design 0 for FC data. Data were acquired twice for repeatability analysis.
RESULTS: Measurement repeatability as well as estimation bias and variability were at similar or better levels with the Bayesian algorithm with spatial prior distributions and the deep learning-based algorithm for the IVIM parameters D and f, and with the Bayesian algorithm only for v_d, relative to the other methods.
CONCLUSION: A Bayesian algorithm with spatial prior distributions is preferable for joint IVIM analysis of FC and NC dMRI data in the healthy human brain, but deep learning-based algorithms appear promising.
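As a concrete instance of method (1), direct non-linear least squares, here is a sketch of a joint FC/NC fit under the ballistic-regime IVIM model commonly used for such data; the b-values, flow weighting factors, and starting values below are illustrative, not the study's protocol:

```python
import numpy as np
from scipy.optimize import curve_fit

def joint_ivim(x, S0, D, f, vd):
    """Ballistic-regime IVIM signal for stacked (b, alpha) inputs.

    NC: S = S0 * exp(-b*D) * ((1 - f) + f * exp(-alpha**2 * vd**2))
    FC: same expression with alpha = 0 by design, reducing to S0 * exp(-b*D)
    """
    b, alpha = x
    return S0 * np.exp(-b * D) * ((1 - f) + f * np.exp(-alpha**2 * vd**2))

# b-values (s/mm2) and flow weighting factors alpha (s/mm); NC rows then FC rows
b = np.array([0, 20, 50, 100, 150, 200, 0, 20, 50, 100, 150, 200], float)
alpha = np.array([0, .4, .9, 1.5, 2.0, 2.35, 0, 0, 0, 0, 0, 0])
signal = joint_ivim((b, alpha), 1.0, 0.8e-3, 0.1, 2.0)   # noiseless demo data
popt, _ = curve_fit(joint_ivim, (b, alpha), signal,
                    p0=[1.0, 1e-3, 0.05, 1.5],
                    bounds=([0, 0, 0, 0], [2, 3e-3, 0.3, 10]))
S0, D, f, vd = popt   # recovers the simulated parameters
```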
PMID:38321596 | DOI:10.1002/mrm.30042
A reliable diabetic retinopathy grading via transfer learning and ensemble learning with quadratic weighted kappa metric
BMC Med Inform Decis Mak. 2024 Feb 6;24(1):37. doi: 10.1186/s12911-024-02446-x.
ABSTRACT
The most common eye disease in people with diabetes is diabetic retinopathy (DR). It can cause blurred vision or even total blindness, so early detection is essential to prevent or alleviate its impact. However, because symptoms may not be noticeable in the early stages of DR, it is difficult for doctors to identify them. Numerous predictive models based on machine learning (ML) and deep learning (DL) have therefore been developed to determine all stages of DR. However, existing DR classification models either cannot classify every DR stage or use computationally heavy approaches. Moreover, common metrics such as accuracy, F1 score, precision, recall, and AUC-ROC are not reliable for assessing DR grading, because they account for neither the severity of the discrepancy between the assigned and predicted grades nor the ordered nature of the DR grading scale. This research proposes computationally efficient ensemble methods for DR classification. These methods leverage pre-trained model weights, reducing training time and resource requirements, and use data augmentation to address data limitations, enhance features, and improve generalization. This combination offers a promising approach for accurate and robust DR grading. In particular, we take advantage of transfer learning using models trained on DR data and employ CLAHE for image enhancement and Gaussian blur for noise reduction. We propose a three-layer classifier incorporating dropout and ReLU activation, designed to minimize overfitting while effectively extracting features and assigning DR grades. We prioritize the quadratic weighted kappa (QWK) metric due to its sensitivity to label discrepancies, which is crucial for an accurate diagnosis of DR. This combined approach achieves state-of-the-art QWK scores (0.901, 0.967, and 0.944) on the EyePACS, APTOS, and Messidor datasets.
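The QWK metric the paper prioritizes is directly available in scikit-learn; a minimal sketch with made-up grades:

```python
from sklearn.metrics import cohen_kappa_score

y_true = [0, 1, 2, 3, 4, 2, 1, 0]   # assigned DR grades (0 = none ... 4 = proliferative)
y_pred = [0, 1, 1, 3, 4, 2, 2, 0]   # model-predicted grades
qwk = cohen_kappa_score(y_true, y_pred, weights="quadratic")
print(f"QWK = {qwk:.3f}")  # penalises a 2-grade error more than a 1-grade error
```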
PMID:38321416 | DOI:10.1186/s12911-024-02446-x
Artifact suppression for breast specimen imaging in micro CBCT using deep learning
BMC Med Imaging. 2024 Feb 6;24(1):34. doi: 10.1186/s12880-024-01216-5.
ABSTRACT
BACKGROUND: Cone-beam computed tomography (CBCT) has been introduced for breast-specimen imaging to identify a free resection margin of abnormal tissues in breast conservation. Typical micro-CT, however, requires long acquisition and computation times. One simple way to reduce the acquisition time is to decrease the number of projections, but this generates streak artifacts on breast specimen images. Furthermore, the presence of a metallic-needle marker on a breast specimen causes metal artifacts that are prominently visible in the images. In this work, we propose a deep learning-based approach for suppressing both streak and metal artifacts in CBCT.
METHODS: Sinogram datasets acquired from CBCT with a small number of projections and containing metal objects were used. The sinogram was first modified by removing the metal objects and upsampling in the angular direction. The modified sinogram was then initialized by linear interpolation and synthesized by a modified neural network model based on a U-Net structure. To obtain the reconstructed images, the synthesized sinogram was reconstructed using the traditional filtered backprojection (FBP) approach. Residual artifacts remaining in the images were further handled by another neural network model, ResU-Net. The corresponding denoised image was combined with the extracted metal objects in the same data positions to produce the final results.
RESULTS: The proposed method reconstructed images of higher quality than conventional FBP, iterative reconstruction (IR), sinogram linear interpolation, ResU-Net denoising, and U-Net sinogram synthesis. It yielded a 3.6 times higher contrast-to-noise ratio, a 1.3 times higher peak signal-to-noise ratio, and a 1.4 times higher structural similarity index (SSIM) than the traditional technique. Soft tissues around the marker were clearly improved, and the most severe artifacts were substantially reduced by the proposed method.
CONCLUSIONS: Our proposed method effectively reduces streak and metal artifacts in CBCT reconstructed images, thereby improving the overall quality of breast specimen images. This would be beneficial for clinical use.
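The angular upsampling and linear-interpolation initialization described in METHODS can be sketched as follows (a generic sparse-view routine, not the authors' code; the U-Net then refines this initialization):

```python
import numpy as np

def upsample_sinogram(sino, factor):
    """Linearly interpolate a sparse-view sinogram along the angular axis.

    sino : (n_angles, n_detectors) array from a reduced-projection scan
    """
    n_ang, n_det = sino.shape
    old = np.arange(n_ang)
    new = np.linspace(0, n_ang - 1, (n_ang - 1) * factor + 1)
    out = np.empty((new.size, n_det), dtype=sino.dtype)
    for d in range(n_det):
        out[:, d] = np.interp(new, old, sino[:, d])  # per-detector interpolation
    return out
```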
PMID:38321390 | DOI:10.1186/s12880-024-01216-5
A comparison of embedding aggregation strategies in drug-target interaction prediction
BMC Bioinformatics. 2024 Feb 6;25(1):59. doi: 10.1186/s12859-024-05684-y.
ABSTRACT
The prediction of interactions between novel drugs and biological targets is a vital step in the early stage of the drug discovery pipeline. Many deep learning approaches have been proposed over the last decade, with a substantial fraction of them sharing the same underlying two-branch architecture. Their distinction is limited to the use of different types of feature representations and branches (multi-layer perceptrons, convolutional neural networks, graph neural networks and transformers). In contrast, the strategy used to combine the outputs (embeddings) of the branches has remained mostly the same. The same general architecture has also been used extensively in the area of recommender systems, where the choice of an aggregation strategy is still an open question. In this work, we investigate the effectiveness of three different embedding aggregation strategies in the area of drug-target interaction (DTI) prediction. We formally define these strategies and prove their universal approximator capabilities. We then present experiments that compare the different strategies on benchmark datasets from the area of DTI prediction, showcasing conditions under which specific strategies could be the obvious choice.
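A minimal PyTorch sketch of a two-branch DTI model with three common aggregation strategies — concatenation, element-wise (Hadamard) product, and dot product; the paper's formal definitions may differ, and all dimensions here are illustrative:

```python
import torch
import torch.nn as nn

class TwoBranchDTI(nn.Module):
    """Two-branch DTI model with a selectable embedding aggregation strategy."""
    def __init__(self, drug_dim, target_dim, hidden=128, aggregate="concat"):
        super().__init__()
        self.drug_branch = nn.Sequential(nn.Linear(drug_dim, hidden), nn.ReLU())
        self.target_branch = nn.Sequential(nn.Linear(target_dim, hidden), nn.ReLU())
        self.aggregate = aggregate
        if aggregate == "concat":           # concatenation + MLP head
            self.head = nn.Sequential(nn.Linear(2 * hidden, hidden),
                                      nn.ReLU(), nn.Linear(hidden, 1))
        elif aggregate == "hadamard":       # element-wise product + linear head
            self.head = nn.Linear(hidden, 1)
        # "dot" needs no extra parameters

    def forward(self, drug, target):
        d, t = self.drug_branch(drug), self.target_branch(target)
        if self.aggregate == "dot":         # dot product of the two embeddings
            return (d * t).sum(dim=-1, keepdim=True)
        if self.aggregate == "hadamard":
            return self.head(d * t)
        return self.head(torch.cat([d, t], dim=-1))
```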
PMID:38321386 | DOI:10.1186/s12859-024-05684-y
Applied deep learning in neurosurgery: identifying cerebrospinal fluid (CSF) shunt systems in hydrocephalus patients
Acta Neurochir (Wien). 2024 Feb 7;166(1):69. doi: 10.1007/s00701-024-05940-3.
ABSTRACT
BACKGROUND: Over recent decades, the number of different manufacturers and models of cerebrospinal fluid shunt valves has constantly increased. Proper identification of shunt valves on X-ray images is crucial for neurosurgeons and radiologists to derive further details of a specific shunt valve, such as opening pressure settings and MR scanning conditions. The main aim of this study is to evaluate the feasibility of an AI-assisted shunt valve detection system.
METHODS: The dataset contains 2070 anonymized images of ten different, commonly used shunt valve types. All images were acquired from skull X-rays or scout CT images. The images were randomly split into an 80% training set and a 20% validation set. An implementation in Python with the fastai library was used to train a convolutional neural network (CNN) via transfer learning on a pre-trained model.
RESULTS: Overall, our model achieved an F1-score of 99% to predict the correct shunt valve model. F1-scores for individual shunt valves ranged from 92% for the Sophysa Sophy Mini SM8 to 100% for several other models.
CONCLUSION: This technology has the potential to automatically detect different shunt valve models in a fast and precise way and may facilitate the identification of an unknown shunt valve on X-ray or CT scout images. The deep learning model we developed could be integrated into PACS systems or standalone mobile applications to enhance clinical workflows.
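The training setup described (fastai, transfer learning on a pre-trained CNN) reduces to a few lines; the backbone, folder layout, and hyperparameters below are assumptions, since the abstract does not specify them:

```python
from fastai.vision.all import *

# Folder-per-class layout assumed: shunt_xrays/<valve_model>/<image>.png
dls = ImageDataLoaders.from_folder("shunt_xrays", valid_pct=0.2, seed=42,
                                   item_tfms=Resize(224))
learn = vision_learner(dls, resnet34, metrics=F1Score(average="macro"))
learn.fine_tune(5)   # transfer learning from ImageNet-pretrained weights
```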
PMID:38321344 | DOI:10.1007/s00701-024-05940-3
Empowering PET: harnessing deep learning for improved clinical insight
Eur Radiol Exp. 2024 Feb 7;8(1):17. doi: 10.1186/s41747-023-00413-1.
ABSTRACT
This review aims to take a journey into the transformative impact of artificial intelligence (AI) on positron emission tomography (PET) imaging. To this end, we present a broad overview of AI applications in nuclear medicine and a thorough exploration of deep learning (DL) implementations in cancer diagnosis and therapy through PET imaging. We first describe the behind-the-scenes use of AI for image generation, including acquisition (event positioning, noise reduction through time-of-flight estimation, and scatter correction), reconstruction (data-driven and model-driven approaches), restoration (supervised and unsupervised methods), and motion correction. Thereafter, we outline the integration of AI into clinical practice through applications to segmentation, detection and classification, quantification, treatment planning, dosimetry, and radiomics/radiogenomics combined with tumour biological characteristics. The review thus seeks to showcase the overarching transformation of the field, ultimately leading to tangible improvements in patient treatment and response assessment. Finally, limitations and ethical considerations of applying AI to PET imaging and future directions of multimodal data mining in this discipline are briefly discussed, including pressing challenges to the adoption of AI in molecular imaging such as access to, and interoperability of, huge amounts of data, as well as the "black-box" problem, contributing to the ongoing dialogue on the transformative potential of AI in nuclear medicine.
RELEVANCE STATEMENT: AI is rapidly revolutionising the world of medicine, including the fields of radiology and nuclear medicine. In the near future, AI will be used to support healthcare professionals. These advances will lead to improvements in diagnosis, in the assessment of response to treatment, in clinical decision making, and in patient management.
KEY POINTS: • Applying AI has the potential to enhance the entire PET imaging pipeline. • AI may support several clinical tasks in both PET diagnosis and prognosis. • Interpreting the relationships between imaging and multiomics data will heavily rely on AI.
PMID:38321340 | DOI:10.1186/s41747-023-00413-1
Deep learning for differentiation of osteolytic osteosarcoma and giant cell tumor around the knee joint on radiographs: a multicenter study
Insights Imaging. 2024 Feb 7;15(1):35. doi: 10.1186/s13244-024-01610-1.
ABSTRACT
OBJECTIVES: To develop a deep learning (DL) model for differentiating between osteolytic osteosarcoma (OS) and giant cell tumor (GCT) on radiographs.
METHODS: Patients with osteolytic OS and GCT proven by postoperative pathology were retrospectively recruited from four centers (center A, training and internal testing; centers B, C, and D, external testing). Sixteen radiologists with different levels of experience in musculoskeletal imaging diagnosis were divided into three groups and participated with or without the DL model's assistance. The DL model was built on the EfficientNet-B6 architecture, and the clinical model was trained using clinical variables. The performance of the various models was compared using McNemar's test.
RESULTS: Three hundred thirty-three patients were included (mean age, 27 years ± 12 [SD]; 186 men). Compared to the clinical model, the DL model achieved a higher area under the curve (AUC) in both the internal (0.97 vs. 0.77, p = 0.008) and external test sets (0.97 vs. 0.64, p < 0.001). In the total test set (internal plus external), the DL model achieved higher accuracy than the junior expert committee (93.1% vs. 72.4%; p < 0.001) and was comparable to the intermediate and senior expert committees (93.1% vs. 88.8%, p = 0.25, and vs. 87.1%, p = 0.35). With the DL model's assistance, the accuracy of the junior expert committee improved from 72.4% to 91.4% (p = 0.051).
CONCLUSION: The DL model accurately distinguished osteolytic OS and GCT with better performance than the junior radiologists, whose own diagnostic performances were significantly improved with the aid of the model, indicating the potential for the differential diagnosis of the two bone tumors on radiographs.
CRITICAL RELEVANCE STATEMENT: The deep learning model can accurately distinguish osteolytic osteosarcoma and giant cell tumor on radiographs, which may help radiologists improve the diagnostic accuracy of two types of tumors.
KEY POINTS: • The DL model shows robust performance in distinguishing osteolytic osteosarcoma and giant cell tumor. • The diagnosis performance of the DL model is better than junior radiologists'. • The DL model shows potential for differentiating osteolytic osteosarcoma and giant cell tumor.
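Building a transfer learning classifier on the EfficientNet-B6 architecture, as the METHODS describe, can be sketched with torchvision; swapping in a two-class head (OS vs. GCT) is the only change shown, and preprocessing and training details are omitted:

```python
import torch.nn as nn
from torchvision import models

# ImageNet-pretrained EfficientNet-B6 with a new binary head
model = models.efficientnet_b6(weights=models.EfficientNet_B6_Weights.IMAGENET1K_V1)
in_features = model.classifier[1].in_features   # classifier = [Dropout, Linear]
model.classifier[1] = nn.Linear(in_features, 2)
```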
PMID:38321327 | DOI:10.1186/s13244-024-01610-1
Deep Learning-Assisted Diffusion Tensor Imaging for Evaluation of the Physis and Metaphysis
J Imaging Inform Med. 2024 Feb 6. doi: 10.1007/s10278-024-00993-3. Online ahead of print.
ABSTRACT
Diffusion tensor imaging (DTI) of the physis and metaphysis can be used as a biomarker to predict height change in the pediatric population. Current application of this technique requires manual segmentation of the physis, which is time-consuming and introduces interobserver variability. U-Net Transformers (UNETR) can be used for automatic segmentation to optimize the workflow. Three hundred eighty-five DTI scans from 191 subjects (mean age 12.6 ± 2.01 years) were retrospectively used for training and validation. The mean Dice similarity coefficient was 0.81 for the UNETR model and 0.68 for the U-Net. Manual extraction and segmentation took 15 min per volume, whereas both deep learning segmentation techniques took < 1 s per volume and were deterministic, always producing the same result for a given input. The intraclass correlation coefficient (ICC) for ROI-derived femur diffusion metrics was excellent for tract count (0.95), volume (0.95), and FA (0.97), and good for tract length (0.87). These results support the hypothesis that a hybrid UNETR model can be trained to replace manual segmentation of physeal DTI images, thereby automating the process.
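A UNETR model of the general kind used here can be instantiated from MONAI; the input size, channel counts, and feature size below are assumptions, not the study's configuration:

```python
import torch
from monai.networks.nets import UNETR

model = UNETR(
    in_channels=1,          # single DTI-derived volume (assumed)
    out_channels=2,         # background vs. physis
    img_size=(96, 96, 96),  # assumed training patch size
    feature_size=16,
)

def dice(pred, target, eps=1e-6):
    """Dice similarity coefficient between binary masks (torch tensors)."""
    inter = (pred * target).sum()
    return (2 * inter + eps) / (pred.sum() + target.sum() + eps)
```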
PMID:38321313 | DOI:10.1007/s10278-024-00993-3
The communication of artificial intelligence and deep learning in computer tomography image recognition of epidemic pulmonary infectious diseases
PLoS One. 2024 Feb 6;19(2):e0297578. doi: 10.1371/journal.pone.0297578. eCollection 2024.
ABSTRACT
The objectives are to improve the diagnostic efficiency and accuracy for epidemic pulmonary infectious diseases and to study the application of artificial intelligence (AI) in pulmonary infectious disease diagnosis and public health management. Computed tomography (CT) images of 200 patients with pulmonary infectious disease were collected and input into the AI-assisted diagnosis software based on a deep learning (DL) model, "UAI, pulmonary infectious disease intelligent auxiliary analysis system", for lesion detection. After analyzing the principles of convolutional neural networks (CNNs) in DL, the study selects the AlexNet model for recognition and classification of pulmonary infection CT images. The software automatically detects pneumonia lesions, marks them in batches, and calculates lesion volume. The results show that the CT manifestations mainly involve multiple lobes and varying densities, with ground-glass opacity the most common finding. The detection rate of the manual method is 95.30%, with a misdetection rate of 0.20% and a missed diagnosis rate of 4.50%; the detection rate of the DL-based AI-assisted method is 99.76%, with a misdetection rate of 0.08% and a missed diagnosis rate of 0.08%. Therefore, the proposed model can effectively identify pulmonary infectious disease lesions and provide relevant data to objectively support pulmonary infectious disease diagnosis and public health management.
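Adapting AlexNet for CT image classification, as described, is a short torchvision exercise; the class count and pretrained weights here are assumptions:

```python
import torch.nn as nn
from torchvision import models

model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
# Replace the final ImageNet layer with a two-class head (lesion vs. normal, assumed)
model.classifier[6] = nn.Linear(model.classifier[6].in_features, 2)
```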
PMID:38319912 | DOI:10.1371/journal.pone.0297578
Predicting Protein Functions Based on Heterogeneous Graph Attention Technique
IEEE J Biomed Health Inform. 2024 Feb 6;PP. doi: 10.1109/JBHI.2024.3357834. Online ahead of print.
ABSTRACT
In bioinformatics, protein function prediction is a fundamental area of research and plays a crucial role in addressing various biological challenges, such as the identification of potential targets for drug discovery and the elucidation of disease mechanisms. However, existing functional annotation databases usually provide positive experimental annotations stating that proteins carry out a given function and rarely record negative experimental annotations stating that proteins do not. Existing computational methods based on deep learning models therefore focus on the positive annotations and ignore the scarce but informative negative annotations, leading to an underestimation of precision. To address this issue, we introduce a deep learning method that utilizes a heterogeneous graph attention technique. The method first constructs a heterogeneous graph covering the protein-protein interaction network, the ontology structure, and the positive and negative annotation information. It then learns embedding representations of proteins and ontology terms using the heterogeneous graph attention technique. Finally, it leverages the learned representations to reconstruct the positive protein-term associations and score unobserved functional annotations. By incorporating the limited known negative annotations into the constructed heterogeneous graph, the method enhances predictive performance. Experimental results on three species (Human, Mouse, and Arabidopsis) demonstrate that our method achieves better performance in predicting new protein annotations than state-of-the-art methods.
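A heterogeneous graph attention layer over a graph of the kind sketched in the abstract can be composed with PyTorch Geometric; the node types, edge types, and hidden size here are illustrative assumptions, not the paper's design:

```python
from torch_geometric.nn import GATConv, HeteroConv

# Protein-protein interactions plus positive/negative protein-term annotations
conv = HeteroConv({
    ('protein', 'interacts', 'protein'): GATConv((-1, -1), 64, add_self_loops=False),
    ('protein', 'annotated_pos', 'term'): GATConv((-1, -1), 64, add_self_loops=False),
    ('protein', 'annotated_neg', 'term'): GATConv((-1, -1), 64, add_self_loops=False),
    ('term', 'is_a', 'term'): GATConv((-1, -1), 64, add_self_loops=False),
}, aggr='sum')
# x_dict = conv(x_dict, edge_index_dict)  # updated embeddings per node type
```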
PMID:38319781 | DOI:10.1109/JBHI.2024.3357834
Uncertainty-aware Health Diagnostics via Class-balanced Evidential Deep Learning
IEEE J Biomed Health Inform. 2024 Feb 6;PP. doi: 10.1109/JBHI.2024.3360002. Online ahead of print.
ABSTRACT
Uncertainty quantification is critical for ensuring the safety of deep learning-enabled health diagnostics, as it helps the model account for unknown factors and reduces the risk of misdiagnosis. However, existing uncertainty quantification studies often overlook the significant issue of class imbalance, which is common in medical data. In this paper, we propose a class-balanced evidential deep learning framework to achieve fair and reliable uncertainty estimates for health diagnostic models. This framework advances the state-of-the-art uncertainty quantification method of evidential deep learning with two novel mechanisms that address the challenges posed by class imbalance. Specifically, we introduce a pooling loss that enables the model to learn less biased evidence among classes, and a learnable prior to regularize the posterior distribution, accounting for the quality of uncertainty estimates. Extensive experiments on benchmark data with varying degrees of imbalance and on various naturally imbalanced health datasets demonstrate the effectiveness and superiority of our method. Our work pushes uncertainty quantification from theoretical studies toward realistic healthcare application scenarios. By enhancing uncertainty estimation for class-imbalanced data, we contribute to the development of more reliable and practical deep learning-enabled health diagnostic systems.
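To make the ingredients concrete, here is a sketch combining standard evidential deep learning (Sensoy et al.) with effective-number class weighting (Cui et al.); it is a stand-in for, not a reproduction of, the paper's pooling loss and learnable prior:

```python
import torch
import torch.nn.functional as F

def class_balanced_edl_loss(logits, targets, class_counts, beta=0.999):
    """Evidential cross-entropy (Dirichlet) with effective-number class weights."""
    evidence = F.softplus(logits)                 # non-negative evidence per class
    alpha = evidence + 1.0                        # Dirichlet parameters
    strength = alpha.sum(dim=1, keepdim=True)
    # Bayes-risk cross-entropy under the Dirichlet posterior
    ce = torch.digamma(strength) - torch.digamma(alpha)
    per_sample = (F.one_hot(targets, logits.size(1)) * ce).sum(dim=1)
    # Effective-number class weights counteract imbalance
    counts = torch.as_tensor(class_counts, dtype=torch.float)
    weights = (1 - beta) / (1 - beta ** counts)
    weights = weights / weights.sum() * len(counts)   # normalize to mean 1
    return (weights[targets] * per_sample).mean()
```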
PMID:38319779 | DOI:10.1109/JBHI.2024.3360002
Flex-DLD: Deep Low-rank Decomposition Model with Flexible Priors for Hyperspectral Image Denoising and Restoration
IEEE Trans Image Process. 2024 Feb 6;PP. doi: 10.1109/TIP.2024.3360902. Online ahead of print.
ABSTRACT
Hyperspectral images (HSIs) are composed of hundreds of contiguous waveband images, offering a wealth of spatial and spectral information. However, the practical use of HSIs is often hindered by complicated noise caused by factors such as non-uniform sensor response and dark current. Traditional methods for denoising HSIs rely on constrained optimization, where selecting appropriate prior knowledge is critical for achieving satisfactory results; being limited to hand-crafted priors, these algorithms leave room for improvement in denoising performance. Recently, supervised deep learning has emerged as a promising approach for HSI denoising, but its requirement for paired training data and poor generalization to untrained noise distributions pose challenges in practice. In this paper, we design a novel algorithm that combines optimization-based methods with deep learning techniques. Specifically, we introduce a plug-and-play Deep Low-rank Decomposition (DLD) model into the optimization framework and propose an effective mechanism for incorporating traditional prior knowledge into the DLD model. Finally, we provide a detailed analysis of the optimization process and the convergence of the proposed method. Empirical evaluations on various tasks, including hyperspectral image denoising and spectral compressive imaging, demonstrate the superiority of our approach over state-of-the-art methods.
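As a point of reference for the low-rank prior, here is a classical truncated-SVD baseline over the spectral dimension — the hand-crafted counterpart of what the plug-and-play DLD model learns (rank and data layout are illustrative):

```python
import numpy as np

def lowrank_denoise(hsi, rank):
    """Truncated-SVD low-rank approximation of a hyperspectral cube.

    hsi : (height, width, bands) array; bands are highly correlated,
    so a small `rank` preserves signal while suppressing noise.
    """
    h, w, bands = hsi.shape
    X = hsi.reshape(-1, bands)                    # pixels x bands matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s[rank:] = 0                                  # keep only top-`rank` components
    return ((U * s) @ Vt).reshape(h, w, bands)
```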
PMID:38319770 | DOI:10.1109/TIP.2024.3360902
ECGVEDNET: A Variational Encoder-Decoder Network for ECG Delineation in Morphology Variant ECGs
IEEE Trans Biomed Eng. 2024 Feb 6;PP. doi: 10.1109/TBME.2024.3363077. Online ahead of print.
ABSTRACT
Electrocardiogram (ECG) delineation, which identifies the fiducial points of ECG segments, plays an important role in cardiovascular diagnosis and care. Whilst deep delineation frameworks have been deployed within the literature, several factors still hinder their development: (a) data availability: the capacity of deep learning models to generalise is limited by the amount of available data; (b) morphology variations: ECG complexes vary, even within the same person, which degrades the performance of conventional deep learning models. To address these concerns, we present a large-scale 12-lead ECG dataset, ICDIRS, to train and evaluate a novel deep delineation model, ECGVEDNET. ICDIRS is a large-scale ECG dataset with 156,145 QRS onset annotations and 156,145 T peak annotations. ECGVEDNET is a novel variational encoder-decoder network designed to address morphology variations. In ECGVEDNET, we construct a well-regularized latent space in which the latent features of ECG follow a regular distribution and present smaller morphology variations than in the raw data space. Finally, a transfer learning framework is proposed to transfer the knowledge learned on ICDIRS to smaller datasets. On ICDIRS, ECGVEDNET achieves accuracy of 86.28%/88.31% within 5/10 ms tolerance for QRS onset and accuracy of 89.94%/91.16% within 5/10 ms tolerance for T peak. On QTDB, the average time errors computed for QRS onset and T peak are -1.86 ± 8.02 ms and -0.50 ± 12.96 ms, respectively, achieving state-of-the-art performance on both large- and small-scale datasets. We will release the source code and the pre-trained model on ICDIRS once accepted.
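The "well-regularized latent space" rests on the standard variational bottleneck; a minimal sketch of that mechanism (dimensions are illustrative, not ECGVEDNET's):

```python
import torch
import torch.nn as nn

class VEDBlock(nn.Module):
    """Variational bottleneck: the KL term pulls latent ECG features toward
    N(0, I), yielding a regular distribution with smaller morphology variations."""
    def __init__(self, feat=256, latent=64):
        super().__init__()
        self.mu = nn.Linear(feat, latent)
        self.logvar = nn.Linear(feat, latent)

    def forward(self, h):
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return z, kl
```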
PMID:38319768 | DOI:10.1109/TBME.2024.3363077
Multi-instance Multi-task Learning for Joint Clinical Outcome and Genomic Profile Predictions from the Histopathological Images
IEEE Trans Med Imaging. 2024 Feb 6;PP. doi: 10.1109/TMI.2024.3362852. Online ahead of print.
ABSTRACT
With the remarkable success of digital histopathology and deep learning technology, many whole-slide pathological image (WSI)-based deep learning models have been designed to help pathologists diagnose human cancers. Recently, rather than predicting categorical variables as in cancer diagnosis, several deep learning studies have also been proposed to estimate continuous variables such as patients' survival or their transcriptional profiles. However, most existing studies conduct these prediction tasks separately, overlooking the useful intrinsic correlation among them that can boost the prediction performance of each individual task. In addition, it remains challenging to design WSI-based deep learning models, since a WSI is huge in size yet annotated only with coarse labels. In this study, we propose a general multi-instance multi-task learning framework (HistMIMT) for multi-purpose prediction from WSIs. Specifically, we first propose a novel multi-instance learning module (TMICS) that considers both common and task-specific information across different tasks to generate a bag representation for each individual task. Then, a soft-mask based fusion module with channel attention (SFCA) is developed to leverage useful information from related tasks to improve prediction performance on the target task. We evaluate our method on three cancer cohorts derived from The Cancer Genome Atlas (TCGA). For each cohort, our multi-purpose prediction tasks include cancer diagnosis, survival prediction, and estimation of the transcriptional profile of the gene TP53. Experimental results demonstrate that HistMIMT yields better outcomes on all clinical prediction tasks than its competitors.
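A common starting point for the bag-representation step is attention-based MIL pooling (Ilse et al., 2018); the sketch below stands in for the paper's TMICS module, which additionally shares information across tasks, and all dimensions are illustrative:

```python
import torch
import torch.nn as nn

class AttentionMILPool(nn.Module):
    """Simple (non-gated) attention pooling over WSI patch embeddings."""
    def __init__(self, dim=512, hidden=128):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                                  nn.Linear(hidden, 1))

    def forward(self, patches):                 # (n_patches, dim) for one WSI
        weights = torch.softmax(self.attn(patches), dim=0)   # (n_patches, 1)
        return (weights * patches).sum(dim=0)   # bag (slide-level) representation
```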
PMID:38319755 | DOI:10.1109/TMI.2024.3362852
Technical note: Generalizable and promptable artificial intelligence model to augment clinical delineation in radiation oncology
Med Phys. 2024 Feb 6. doi: 10.1002/mp.16965. Online ahead of print.
ABSTRACT
BACKGROUND: Efficient and accurate delineation of organs at risk (OARs) is a critical procedure for treatment planning and dose evaluation. Deep learning-based auto-segmentation of OARs has shown promising results and is increasingly being used in radiation therapy. However, existing deep learning-based auto-segmentation approaches face two challenges in clinical practice: generalizability and human-AI interaction. A generalizable and promptable auto-segmentation model, which segments OARs of multiple disease sites simultaneously and supports on-the-fly human-AI interaction, can significantly enhance the efficiency of radiation therapy treatment planning.
PURPOSE: Meta's segment anything model (SAM) was proposed as a generalizable and promptable model for next-generation natural image segmentation. We further evaluated the performance of SAM in radiotherapy segmentation.
METHODS: Computed tomography (CT) images of clinical cases from four disease sites at our institute were collected: prostate, lung, gastrointestinal, and head & neck. For each case, we selected the OARs important in radiotherapy treatment planning. We then compared both the Dice coefficients and Jaccard indices derived from three distinct methods: manual delineation (ground truth), automatic segmentation using SAM's 'segment anything' mode, and automatic segmentation using SAM's 'box prompt' mode that implements manual interaction via live prompts during segmentation.
RESULTS: Our results indicate that SAM's segment anything mode can achieve clinically acceptable segmentation results in most OARs with Dice scores higher than 0.7. SAM's box prompt mode further improves Dice scores by 0.1∼0.5. Similar results were observed for Jaccard indices. The results show that SAM performs better for prostate and lung, but worse for gastrointestinal and head & neck. When considering the size of organs and the distinctiveness of their boundaries, SAM shows better performance for large organs with distinct boundaries, such as lung and liver, and worse for smaller organs with less distinct boundaries, like parotid and cochlea.
CONCLUSIONS: Our results demonstrate SAM's robust generalizability with consistent accuracy in automatic segmentation for radiotherapy. Furthermore, the advanced box-prompt method enables the users to augment auto-segmentation interactively and dynamically, leading to patient-specific auto-segmentation in radiation therapy. SAM's generalizability across different disease sites and different modalities makes it feasible to develop a generic auto-segmentation model in radiotherapy.
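Reproducing the 'box prompt' workflow with the released segment-anything package takes only a few calls; the checkpoint file, image array, and box coordinates below are placeholders, not the study's data:

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

ct_slice_rgb = np.zeros((512, 512, 3), dtype=np.uint8)  # stand-in for a windowed CT slice
predictor.set_image(ct_slice_rgb)                       # expects HxWx3 uint8 RGB
box = np.array([120, 80, 260, 210])                     # illustrative [x0, y0, x1, y1] OAR box
masks, scores, _ = predictor.predict(box=box, multimask_output=False)
```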
PMID:38319676 | DOI:10.1002/mp.16965
Deep learning-based PET image denoising and reconstruction: a review
Radiol Phys Technol. 2024 Feb 6. doi: 10.1007/s12194-024-00780-3. Online ahead of print.
ABSTRACT
This review focuses on positron emission tomography (PET) imaging algorithms and traces the evolution of PET image reconstruction methods. First, we provide an overview of conventional PET image reconstruction methods from filtered backprojection through to recent iterative PET image reconstruction algorithms, and then review deep learning methods for PET data up to the latest innovations within three main categories. The first category involves post-processing methods for PET image denoising. The second category comprises direct image reconstruction methods that learn mappings from sinograms to the reconstructed images in an end-to-end manner. The third category comprises iterative reconstruction methods that combine conventional iterative image reconstruction with neural-network enhancement. We discuss future perspectives on PET imaging and deep learning technology.
PMID:38319563 | DOI:10.1007/s12194-024-00780-3