Deep learning

Deep Learning Reconstruction of Prospectively Accelerated MRI of the Pancreas: Clinical Evaluation of Shortened Breath-Hold Examinations With Dixon Fat Suppression

Tue, 2024-07-23 06:00

Invest Radiol. 2024 Jul 23. doi: 10.1097/RLI.0000000000001110. Online ahead of print.

ABSTRACT

OBJECTIVE: Deep learning (DL)-based magnetic resonance imaging (MRI) reconstruction can shorten breath-hold examinations and improve image quality by reducing motion artifacts. Prospective studies with DL reconstructions of accelerated MRI of the upper abdomen in the context of pancreatic pathologies are lacking. The purpose of this study was to investigate, in a clinical setting, the performance of a novel DL-based reconstruction algorithm in T1-weighted volumetric interpolated breath-hold examinations with partial Fourier sampling and Dixon fat suppression (hereafter, VIBE-DixonDL), analyzing its impact on acquisition time, image sharpness and quality, diagnostic confidence, pancreatic lesion conspicuity, signal-to-noise ratio (SNR), and contrast-to-noise ratio (CNR).

METHODS: This prospective single-center study included participants with various pancreatic pathologies who gave written consent from January 2023 to September 2023. During the same session, each participant underwent 2 MRI acquisitions using a 1.5 T scanner: conventional precontrast and postcontrast T1-weighted VIBE acquisitions with Dixon fat suppression (VIBE-Dixon, reference standard) using 4-fold parallel imaging acceleration and 6-fold accelerated VIBE-Dixon acquisitions with partial Fourier sampling utilizing a novel DL reconstruction tailored to the acquisition. A qualitative image analysis was performed by 4 readers. Acquisition time, image sharpness, overall image quality, image noise and artifacts, diagnostic confidence, as well as pancreatic lesion conspicuity and size were compared. Furthermore, a quantitative analysis of SNR and CNR was performed.
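
The abstract does not specify how SNR and CNR were computed; a common ROI-based definition, shown here as a minimal Python sketch on synthetic region values, divides the mean tissue signal (or the tissue signal difference) by the standard deviation of background noise:

```python
# Hedged sketch of ROI-based SNR/CNR as commonly used in MRI quality
# studies; the authors' exact ROI placement and noise definition are
# not given in the abstract. ROI values below are hypothetical.
import numpy as np

def snr(signal_roi: np.ndarray, noise_roi: np.ndarray) -> float:
    """SNR = mean tissue signal / standard deviation of background noise."""
    return float(signal_roi.mean() / noise_roi.std())

def cnr(tissue_a: np.ndarray, tissue_b: np.ndarray, noise_roi: np.ndarray) -> float:
    """CNR = absolute mean signal difference between two tissues / noise SD."""
    return float(abs(tissue_a.mean() - tissue_b.mean()) / noise_roi.std())

rng = np.random.default_rng(0)
pancreas = rng.normal(220.0, 12.0, size=(20, 20))   # pancreatic parenchyma ROI
lesion = rng.normal(140.0, 12.0, size=(20, 20))     # lesion ROI
background = rng.normal(0.0, 8.0, size=(20, 20))    # air/background ROI
print(f"SNR={snr(pancreas, background):.1f}, CNR={cnr(pancreas, lesion, background):.1f}")
```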

RESULTS: Thirty-two participants were evaluated (mean age ± SD, 62 ± 19 years; 20 men). The VIBE-DixonDL method enabled up to 52% reduction in average breath-hold time (7 seconds for VIBE-DixonDL vs 15 seconds for VIBE-Dixon, P < 0.001). A significant improvement of image sharpness, overall image quality, diagnostic confidence, and pancreatic lesion conspicuity was observed in the images recorded using VIBE-DixonDL (P < 0.001), as was a significant reduction of image noise and motion artifacts (P < 0.001). In addition, for all readers, there was no evidence of a difference in lesion size measurement between VIBE-Dixon and VIBE-DixonDL, and interreader agreement regarding lesion size was excellent (intraclass correlation coefficient > 0.90). Finally, a statistically significant increase of pancreatic SNR with VIBE-DixonDL was observed in both the precontrast (P = 0.025) and postcontrast images (P < 0.001). Splenic SNR also increased with VIBE-DixonDL in both the precontrast and postcontrast images, but reached statistical significance only in the postcontrast images (P = 0.34 and P = 0.003, respectively). Similarly, pancreas CNR increased with VIBE-DixonDL in both the precontrast and postcontrast images, but reached statistical significance only in the postcontrast images (P = 0.557 and P = 0.026, respectively).

CONCLUSIONS: The prospectively accelerated, DL-enhanced VIBE with Dixon fat suppression was clinically feasible. It enabled a 52% reduction in breath-hold time and provided superior image quality, diagnostic confidence, and pancreatic lesion conspicuity. This technique might be especially useful for patients with limited breath-hold capacity.

PMID:39043213 | DOI:10.1097/RLI.0000000000001110

Categories: Literature Watch

Development of a deep learning-based fully automated segmentation of rotator cuff muscles from clinical MR scans

Tue, 2024-07-23 06:00

Acta Radiol. 2024 Jul 23:2841851241262325. doi: 10.1177/02841851241262325. Online ahead of print.

ABSTRACT

BACKGROUND: Fatty infiltration and atrophy of the muscle after a rotator cuff (RC) tear are important in surgical decision-making and are linked to poor clinical outcomes after RC repair. An accurate and reliable quantitative method is needed to assess all of the RC muscles.

PURPOSE: To develop a fully automated approach based on a deep neural network to segment RC muscles from clinical magnetic resonance imaging (MRI) scans.

MATERIAL AND METHODS: In total, 94 shoulder MRI scans (mean age = 62.3 years) were utilized for the training and internal validation datasets, while an additional 20 MRI scans (mean age = 62.6 years) were collected from another institution for external validation. An orthopedic surgeon and a radiologist manually segmented muscles and bones as reference masks. Segmentation performance was evaluated using the Dice score, sensitivity, precision, and percentage difference in muscle volume. In addition, segmentation performance was assessed by sex, age, and the presence of an RC tendon tear.
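
For reference, the reported overlap metrics can be computed from binary masks as in the following hedged sketch; the authors' exact implementation is not given in the abstract:

```python
# Hedged sketch of Dice, sensitivity, precision, and percentage volume
# difference between a predicted and a reference binary mask.
import numpy as np

def overlap_metrics(pred: np.ndarray, ref: np.ndarray) -> dict:
    pred, ref = pred.astype(bool), ref.astype(bool)
    tp = np.logical_and(pred, ref).sum()          # true-positive voxels
    dice = 2.0 * tp / (pred.sum() + ref.sum())
    sensitivity = tp / ref.sum()                  # fraction of reference recovered
    precision = tp / pred.sum()                   # fraction of prediction correct
    vol_diff = 100.0 * abs(int(pred.sum()) - int(ref.sum())) / ref.sum()
    return {"dice": dice, "sensitivity": sensitivity,
            "precision": precision, "volume_diff_pct": vol_diff}

# Toy 3D example:
ref = np.zeros((8, 8, 8), dtype=bool); ref[2:6, 2:6, 2:6] = True
pred = np.zeros_like(ref); pred[2:6, 2:6, 1:6] = True
print(overlap_metrics(pred, ref))
```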

RESULTS: In external validation, the average Dice score, sensitivity, precision, and percentage difference in muscle volume of the developed algorithm were 0.920, 0.933, 0.912, and 4.58%, respectively. Prediction performance did not differ across the shoulder muscles, with the exception of the teres minor, for which significant prediction errors were observed (0.831, 0.854, 0.835, and 10.88%, respectively). The segmentation performance of the algorithm was generally unaffected by age, sex, and the presence of RC tears.

CONCLUSION: We developed a fully automated deep neural network for RC muscle and bone segmentation from clinical MRI scans that achieves excellent performance.

PMID:39043149 | DOI:10.1177/02841851241262325

Categories: Literature Watch

DMSPS: Dynamically mixed soft pseudo-label supervision for scribble-supervised medical image segmentation

Tue, 2024-07-23 06:00

Med Image Anal. 2024 Jul 15;97:103274. doi: 10.1016/j.media.2024.103274. Online ahead of print.

ABSTRACT

High performance of deep learning on medical image segmentation relies on large-scale pixel-level dense annotations, which pose a substantial burden on medical experts due to the laborious and time-consuming annotation process, particularly for 3D images. To reduce the labeling cost while maintaining relatively satisfactory segmentation performance, weakly supervised learning with sparse labels has attracted increasing attention. In this work, we present a scribble-based framework for medical image segmentation, called Dynamically Mixed Soft Pseudo-label Supervision (DMSPS). Concretely, we extend a backbone with an auxiliary decoder to form a dual-branch network that enhances the feature-capture capability of the shared encoder. Considering that most pixels have no labels and that hard pseudo-labels tend to be over-confident, resulting in poor segmentation, we propose to use soft pseudo-labels generated by dynamically mixing the decoders' predictions as auxiliary supervision. To further enhance the model's performance, we adopt a two-stage approach in which the sparse scribbles are expanded based on low-uncertainty predictions from the first-stage model, yielding more annotated pixels to train the second-stage model. Experiments on the ACDC dataset for cardiac structure segmentation, the WORD dataset for 3D abdominal organ segmentation, and the BraTS2020 dataset for 3D brain tumor segmentation showed that: (1) compared with the baseline, our method improved the average DSC from 50.46% to 89.51%, from 75.46% to 87.56%, and from 52.61% to 76.53% on the three datasets, respectively; (2) DMSPS achieved better performance than five state-of-the-art scribble-supervised segmentation methods and generalizes to different segmentation backbones. The code is available online at: https://github.com/HiLab-git/DMSPS.
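
As a rough illustration of the core idea, the sketch below mixes the two decoders' softmax outputs with a random weight each step and uses the detached mixture as a soft target. It assumes a dual-branch network that returns two logit maps; the loss type and weighting are assumptions rather than the paper's exact recipe (see the authors' repository for the reference code):

```python
# Minimal sketch of dynamically mixed soft pseudo-label supervision.
import torch
import torch.nn.functional as F

def dmsps_unsupervised_loss(logits_main: torch.Tensor, logits_aux: torch.Tensor):
    """Mix the two decoders' softmax outputs with a random weight and use
    the detached mixture as a soft target for both branches."""
    p_main = torch.softmax(logits_main, dim=1)
    p_aux = torch.softmax(logits_aux, dim=1)
    beta = torch.rand(1, device=logits_main.device)     # dynamic mixing weight
    soft_label = (beta * p_main + (1.0 - beta) * p_aux).detach()
    return F.mse_loss(p_main, soft_label) + F.mse_loss(p_aux, soft_label)

# Toy usage: batch of 2, 4 classes, 64x64 maps from the two decoders.
l_main, l_aux = torch.randn(2, 4, 64, 64), torch.randn(2, 4, 64, 64)
print(dmsps_unsupervised_loss(l_main, l_aux).item())
```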

PMID:39043109 | DOI:10.1016/j.media.2024.103274

Categories: Literature Watch

SYSTCM: A systemic web platform for objective identification of pharmacological effects based on interplay of "traditional Chinese Medicine-components-targets"

Tue, 2024-07-23 06:00

Comput Biol Med. 2024 Jul 22;179:108878. doi: 10.1016/j.compbiomed.2024.108878. Online ahead of print.

ABSTRACT

Mechanism analysis is essential for the use and promotion of Traditional Chinese Medicine (TCM). Traditional network-analysis methods rely on expert experience and lack an explanatory framework, prompting the application of deep learning and machine learning for objective identification of TCM pharmacological effects. A dataset was used to construct an interaction network graph between 424 molecular descriptors and 465 pharmacological targets to represent the relationship between components and pharmacological effects. The optimal identification model of pharmacological effects (IPE) was then established using convolutional neural networks with the GoogLeNet structure; AUC values exceed 0.8, MCC values exceed 0.7, and ACC values exceed 0.85 across various test datasets. Next, 18 recognition models of TCM efficacy (RTE) were created using support vector machines (SVMs). Integration of pharmacological effects and efficacies led to the development of the systemic web platform for identification of pharmacological effects (SYSTCM). The platform comprises 70,961 terms, including 636 Traditional Chinese Medicines (TCMs), 8190 components, 40 pharmacological effects, and 18 efficacies. Through the SYSTCM platform, (1) 100 components were predicted from TCMs with anti-inflammatory pharmacological effects, (2) the pharmacological effects of all constituents of Coptidis Rhizoma (Huang Lian) were predicted, and (3) the principal components, pharmacological effects, and efficacies of Salviae Miltiorrhizae Radix et Rhizoma (Dan Shen) were elucidated. SYSTCM addresses subjectivity in pharmacological effect determination, offering a potential avenue for advancing TCM drug development and clinical applications. Access SYSTCM at http://systcm.cn.
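
As an illustration of how one of the SVM-based efficacy recognizers (RTEs) might be trained and evaluated with the metrics the abstract reports, here is a hedged scikit-learn sketch on synthetic descriptor data; the real models are trained on curated TCM component data:

```python
# Hedged sketch: a binary SVM over 424-dimensional descriptor vectors,
# scored with AUC, MCC, and ACC. All data here are synthetic.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, matthews_corrcoef, accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 424))              # 424 molecular descriptors per component
y = (X[:, :10].sum(axis=1) > 0).astype(int)  # synthetic efficacy label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = SVC(kernel="rbf", probability=True).fit(X_tr, y_tr)

prob = clf.predict_proba(X_te)[:, 1]
pred = clf.predict(X_te)
print(f"AUC={roc_auc_score(y_te, prob):.3f}, "
      f"MCC={matthews_corrcoef(y_te, pred):.3f}, "
      f"ACC={accuracy_score(y_te, pred):.3f}")
```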

PMID:39043107 | DOI:10.1016/j.compbiomed.2024.108878

Categories: Literature Watch

An ensemble model for accurate prediction of key water quality parameters in river based on deep learning methods

Tue, 2024-07-23 06:00

J Environ Manage. 2024 Jul 22;366:121932. doi: 10.1016/j.jenvman.2024.121932. Online ahead of print.

ABSTRACT

Deep learning models provide a powerful method for accurate and stable prediction of water quality in rivers, which is crucial for intelligent management and control of the water environment. To increase the accuracy of water quality prediction and better understand the impact of complex spatial information in deep learning models, this study proposes two ensemble models, TNX (with temporal attention) and STNX (with spatio-temporal attention), based on seasonal-trend decomposition (STL) to predict water quality from geo-sensory time series data. Dissolved oxygen, total phosphorus, and ammonia nitrogen were predicted at short steps (1 h and 2 h) and long steps (12 h and 24 h) for seven water quality monitoring sites along a river. The ensemble model TNX improved performance by 2.1%-6.1% and 4.3%-22.0% relative to the best baseline deep learning model for short-step and long-step prediction, respectively, and it can capture the variation pattern of water quality parameters by predicting only the trend component of the raw data after STL decomposition. The STNX model, with spatio-temporal attention, achieved 0.5%-2.4% and 2.3%-5.7% higher performance than TNX for short-step and long-step prediction, respectively, and this improvement was more effective in mitigating the prediction-shift pattern of long-step prediction. Moreover, the model interpretation results consistently demonstrated positive relationship patterns across all monitoring sites, although the influence of specific monitoring sites diminished as the distance between the predicted and input sites increased. This study provides an ensemble modeling approach based on STL decomposition for improving short-step and long-step prediction of river water quality parameters and clarifies the impact of complex spatial information on deep learning models.
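
The STL preprocessing step can be sketched as follows. This hedged example uses statsmodels on a synthetic hourly dissolved-oxygen series with daily seasonality; a deep model would then be trained on the extracted trend component, with the seasonal component added back deterministically at forecast time:

```python
# Hedged sketch of STL decomposition of an hourly geo-sensory series.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL

hours = pd.date_range("2023-01-01", periods=24 * 60, freq="h")
rng = np.random.default_rng(0)
do = (8.0 + 0.5 * np.sin(2 * np.pi * np.arange(len(hours)) / 24)
      + rng.normal(0, 0.1, len(hours)))          # synthetic dissolved oxygen
series = pd.Series(do, index=hours)

result = STL(series, period=24).fit()            # daily seasonality, hourly data
trend, seasonal, resid = result.trend, result.seasonal, result.resid
# A deep model would be trained to predict `trend` only.
print(trend.tail(3))
```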

PMID:39043087 | DOI:10.1016/j.jenvman.2024.121932

Categories: Literature Watch

Estimating the synthetic accessibility of molecules with building block and reaction-aware SAScore

Tue, 2024-07-23 06:00

J Cheminform. 2024 Jul 23;16(1):83. doi: 10.1186/s13321-024-00879-0.

ABSTRACT

Synthetic accessibility prediction estimates how easily a given molecule can be synthesized in the laboratory and plays a crucial role in computer-aided molecular design. Although synthesis planning programs can determine synthesis routes, their slow processing times make them impractical for large-scale molecule screening. On the other hand, existing rapid synthetic accessibility estimation methods offer speed but typically lack integration with actual synthesis routes and building-block information. In this work, we introduce BR-SAScore, an enhanced version of SAScore that integrates the available building-block information (B) and reaction knowledge (R) from synthesis planning programs into the scoring process. In particular, we differentiate fragments inherent in building blocks from fragments that must be derived through synthesis (reactions) when scoring synthetic accessibility. Compared with existing methods, our experimental findings demonstrate that BR-SAScore offers more accurate and precise identification of a molecule's synthetic accessibility by the synthesis planning program, with a fast calculation time. Moreover, we illustrate how BR-SAScore provides chemically interpretable results, aligning with the capability of a synthesis planning program embedded with the same reaction knowledge and available building blocks.

Scientific contribution: We introduce BR-SAScore, an extension of SAScore, to estimate the synthetic accessibility of molecules by leveraging known building-block and reactivity information. In our experiments, BR-SAScore shows superior performance in predicting molecular synthetic accessibility compared with previous methods, including SAScore and deep learning models, while requiring significantly less computation time. In addition, we show that BR-SAScore can precisely identify the chemical fragments contributing to synthetic infeasibility, holding great potential for future molecule-synthesizability optimization.
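
As a loose illustration of the building-block-aware idea (not the actual BR-SAScore implementation, which uses SAScore's fragment-contribution table and a synthesis planner's catalog), the RDKit sketch below separates a molecule's Morgan fragments into those covered by a building block and those that must be formed by reactions:

```python
# Illustrative sketch only: both the fragment scoring and the
# building-block catalog of the real BR-SAScore are mocked here.
from rdkit import Chem
from rdkit.Chem import AllChem

def morgan_fragments(smiles: str, radius: int = 2) -> set:
    """Return the set of Morgan environment IDs for a molecule."""
    mol = Chem.MolFromSmiles(smiles)
    fp = AllChem.GetMorganFingerprint(mol, radius)
    return set(fp.GetNonzeroElements())

target = morgan_fragments("CC(=O)Oc1ccccc1C(=O)O")    # aspirin
building_block = morgan_fragments("Oc1ccccc1C(=O)O")  # salicylic acid

covered = target & building_block        # inherited from the building block
to_synthesize = target - building_block  # must be formed by reactions
print(f"{len(covered)} fragments covered, {len(to_synthesize)} to be derived")
```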

PMID:39044299 | DOI:10.1186/s13321-024-00879-0

Categories: Literature Watch

Improving lung nodule segmentation in thoracic CT scans through the ensemble of 3D U-Net models

Tue, 2024-07-23 06:00

Int J Comput Assist Radiol Surg. 2024 Jul 23. doi: 10.1007/s11548-024-03222-y. Online ahead of print.

ABSTRACT

PURPOSE: The current study explores the application of 3D U-Net architectures combined with Inception and ResNet modules for precise lung nodule detection through deep learning-based segmentation techniques. This investigation is motivated by the objective of developing a computer-aided diagnosis (CAD) system for effective diagnosis and prognostication of lung nodules in clinical settings.

METHODS: The proposed method trained four different 3D U-Net models on a retrospective dataset obtained from AIIMS Delhi. To augment the training dataset, affine transformations and intensity transforms were utilized. Preprocessing steps included CT scan voxel resampling, intensity normalization, and lung parenchyma segmentation. Model optimization utilized a hybrid loss function that combined Dice loss and focal loss. The performance of all four 3D U-Nets was evaluated patient-wise using the Dice coefficient and Jaccard coefficient, then averaged to obtain the average volumetric Dice coefficient (DSCavg) and average Jaccard coefficient (IoUavg) on a test dataset comprising 53 CT scans. Additionally, an ensemble approach (Model-V) featuring the 3D U-Net (Model-I), ResNet (Model-II), and Inception (Model-III) 3D U-Net architectures, combined with two distinct patch sizes, was investigated.
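
The hybrid loss named in the methods can be sketched as follows; the weighting and reduction choices below are assumptions, since the abstract does not specify them:

```python
# Hedged sketch of a combined Dice + focal loss for binary 3D segmentation.
import torch
import torch.nn.functional as F

def dice_focal_loss(logits, target, alpha=0.5, gamma=2.0, eps=1e-6):
    """logits, target: (B, 1, D, H, W); target is a binary mask."""
    prob = torch.sigmoid(logits)
    # Soft Dice term
    inter = (prob * target).sum(dim=(2, 3, 4))
    denom = prob.sum(dim=(2, 3, 4)) + target.sum(dim=(2, 3, 4))
    dice = 1.0 - (2.0 * inter + eps) / (denom + eps)
    # Focal term (down-weights easy background voxels)
    bce = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    p_t = prob * target + (1.0 - prob) * (1.0 - target)
    focal = ((1.0 - p_t) ** gamma * bce).mean(dim=(2, 3, 4))
    return (alpha * dice + (1.0 - alpha) * focal).mean()

logits = torch.randn(2, 1, 32, 64, 64)
target = (torch.rand(2, 1, 32, 64, 64) > 0.98).float()  # sparse nodule voxels
print(dice_focal_loss(logits, target).item())
```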

RESULTS: The ensemble of models obtained the highest DSCavg of 0.84 ± 0.05 and IoUavg of 0.74 ± 0.06 on the test dataset, compared against individual models. It mitigated false positives, overestimations, and underestimations observed in individual U-Net models. Moreover, the ensemble of models reduced average false positives per scan in the test dataset (1.57 nodules/scan) compared to individual models (2.69-3.39 nodules/scan).

CONCLUSIONS: The suggested ensemble approach presents a strong and effective strategy for automatically detecting and delineating lung nodules, potentially aiding CAD systems in clinical settings. This approach could assist radiologists in laborious and meticulous lung nodule detection tasks in CT scans, improving lung cancer diagnosis and treatment planning.

PMID:39044036 | DOI:10.1007/s11548-024-03222-y

Categories: Literature Watch

A scheme combining feature fusion and hybrid deep learning models for epileptic seizure detection and prediction

Tue, 2024-07-23 06:00

Sci Rep. 2024 Jul 23;14(1):16916. doi: 10.1038/s41598-024-67855-4.

ABSTRACT

Epilepsy is one of the most well-known neurological disorders globally, leading to sudden seizures that significantly impact patients' quality of life. Hence, there is an urgent need for an efficient method to detect and predict seizures in order to mitigate the risks faced by epilepsy patients. In this paper, a new method for seizure detection and prediction is proposed, based on multi-class feature fusion and a convolutional neural network-gated recurrent unit-attention mechanism (CNN-GRU-AM) model. Initially, the electroencephalography (EEG) signal undergoes wavelet decomposition through the discrete wavelet transform (DWT), resulting in six subbands. Subsequently, time-frequency domain and nonlinear features are extracted from each subband. Finally, the CNN-GRU-AM further extracts features and performs classification. The CHB-MIT dataset is used to validate the proposed approach. The results of tenfold cross-validation show that our method achieved a sensitivity of 99.24% and 95.47%, specificity of 99.51% and 94.93%, accuracy of 99.35% and 95.16%, and an AUC of 99.34% and 95.15% in seizure detection and prediction tasks, respectively. These results show that the proposed method can achieve high-precision detection and prediction of seizures, helping to alert patients and doctors to take timely protective measures.
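
The wavelet front end can be illustrated with PyWavelets: a 5-level DWT yields six subbands (cA5, cD5-cD1), from which per-subband features are computed. The specific features below are illustrative assumptions, as the abstract does not list the exact feature set:

```python
# Hedged sketch of DWT subband feature extraction from an EEG segment.
import numpy as np
import pywt

def subband_features(eeg_segment: np.ndarray, wavelet: str = "db4") -> np.ndarray:
    coeffs = pywt.wavedec(eeg_segment, wavelet, level=5)  # six subbands
    feats = []
    for c in coeffs:
        energy = np.sum(c ** 2)
        p = c ** 2 / energy                               # normalized energy
        entropy = -np.sum(p * np.log(p + 1e-12))          # a nonlinear feature
        feats += [np.mean(np.abs(c)), np.std(c), energy, entropy]
    return np.asarray(feats)                              # 6 subbands x 4 features

segment = np.random.default_rng(0).normal(size=1024)      # e.g., 4 s at 256 Hz
print(subband_features(segment).shape)                    # (24,)
```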

PMID:39043914 | DOI:10.1038/s41598-024-67855-4

Categories: Literature Watch

Exceptional performance with minimal data using a generative adversarial network for Alzheimer's disease classification

Tue, 2024-07-23 06:00

Sci Rep. 2024 Jul 24;14(1):17037. doi: 10.1038/s41598-024-66874-5.

ABSTRACT

The classification of Alzheimer's disease (AD) using deep learning models is hindered by the limited availability of data. Medical image datasets are scarce due to stringent regulations on patient privacy, preventing their widespread use in research. Moreover, although open-access databases such as the Open Access Series of Imaging Studies (OASIS) are available publicly for providing medical image data for research, they often suffer from imbalanced classes. Thus, to address the issue of insufficient data, this study proposes the integration of a generative adversarial network (GAN) that can achieve comparable accuracy with a reduced data requirement. GANs are unsupervised deep learning networks commonly used for data augmentation that generate high-quality synthetic data to overcome data scarcity. Experimental data from the OASIS database are used in this research to train the GAN model in generating synthetic MRI data before being included in a pretrained convolutional neural network (CNN) model for multistage AD classification. As a result, this study has demonstrated that a multistage AD classification accuracy above 80% can be achieved even with a reduced dataset. The exceptional performance of GANs positions them as a solution for overcoming the challenge of insufficient data in AD classification.
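
A minimal GAN training loop for slice augmentation might look like the following hedged PyTorch sketch; the architectures, image size (64x64 grayscale), and hyperparameters are illustrative assumptions, not the paper's configuration:

```python
# Hedged sketch of a GAN for synthetic MRI-slice augmentation.
import torch
import torch.nn as nn

G = nn.Sequential(                          # z (100-d) -> 64x64 image
    nn.Linear(100, 128 * 8 * 8), nn.ReLU(), nn.Unflatten(1, (128, 8, 8)),
    nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),   # -> 16x16
    nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),    # -> 32x32
    nn.ConvTranspose2d(32, 1, 4, 2, 1), nn.Tanh())     # -> 64x64

D = nn.Sequential(                          # image -> real/fake logit
    nn.Conv2d(1, 32, 4, 2, 1), nn.LeakyReLU(0.2),
    nn.Conv2d(32, 64, 4, 2, 1), nn.LeakyReLU(0.2),
    nn.Flatten(), nn.Linear(64 * 16 * 16, 1))

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(real: torch.Tensor):
    z = torch.randn(real.size(0), 100)
    fake = G(z)
    # Discriminator: real -> 1, fake -> 0
    opt_d.zero_grad()
    loss_d = (bce(D(real), torch.ones(real.size(0), 1)) +
              bce(D(fake.detach()), torch.zeros(real.size(0), 1)))
    loss_d.backward(); opt_d.step()
    # Generator: try to fool the discriminator
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(real.size(0), 1))
    loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()

print(train_step(torch.randn(8, 1, 64, 64)))  # stand-in for real MRI slices
```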

PMID:39043757 | DOI:10.1038/s41598-024-66874-5

Categories: Literature Watch

Human gender estimation from CT images of skull using deep feature selection and feature fusion

Tue, 2024-07-23 06:00

Sci Rep. 2024 Jul 23;14(1):16879. doi: 10.1038/s41598-024-65521-3.

ABSTRACT

This study estimates gender from skull computed tomography (CT) images, given the central role of gender determination in identification. The study encompasses cranial CT images from 218 male and 203 female subjects, a total cohort of 421 individuals aged 25 to 65 years. Employing deep learning, a prominent subset of machine learning algorithms, convolutional neural network (CNN) models are used to extract deep features from the skull CT images. Applying deep learning algorithms directly to the image datasets yields an accuracy of 96.4%, with a gender-estimation precision of 96.1% for male and 96.8% for female individuals. Precision varies with the number of selected features: 95.0%, 95.5%, and 96.2% for 100, 300, and 500 features, respectively, and 96.4% for all 1000 features without feature selection. Notably, gender estimation from radiographs reduces measurement discrepancies between experts while providing faster estimates. These findings indicate that the choice of CNN model, the configuration of the classifier, and the selection of features are pivotal determinants of the proposed method's performance.
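
The feature-selection comparison can be sketched as follows, assuming the 1000 deep features per subject have already been extracted from a CNN; the data here are synthetic and the classifier choice is an assumption:

```python
# Hedged sketch: select k of 1000 deep features, then classify sex.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
deep_feats = rng.normal(size=(421, 1000))      # 421 subjects x 1000 CNN features
sex = rng.integers(0, 2, size=421)             # 0 = female, 1 = male (synthetic)
deep_feats[:, :50] += sex[:, None] * 0.8       # inject a synthetic signal

for k in (100, 300, 500, 1000):                # mirrors the abstract's comparison
    pipe = make_pipeline(SelectKBest(f_classif, k=k), SVC(kernel="rbf"))
    acc = cross_val_score(pipe, deep_feats, sex, cv=5).mean()
    print(f"k={k:4d}  accuracy={acc:.3f}")
```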

PMID:39043755 | DOI:10.1038/s41598-024-65521-3

Categories: Literature Watch

Multimodal deep learning using on-chip diffractive optics with in situ training capability

Tue, 2024-07-23 06:00

Nat Commun. 2024 Jul 23;15(1):6189. doi: 10.1038/s41467-024-50677-3.

ABSTRACT

Multimodal deep learning plays a pivotal role in supporting the processing and learning of diverse data types within the realm of artificial intelligence generated content (AIGC). However, most photonic neuromorphic processors for deep learning can handle only a single data modality (either vision or audio) because of the difficulty of training large numbers of parameters in the optical domain. Here, we propose and demonstrate a trainable diffractive optical neural network (TDONN) chip based on on-chip diffractive optics with massive tunable elements to address these constraints. The TDONN chip includes one input layer, five hidden layers, and one output layer, and only one forward propagation is required to obtain the inference results without frequent optical-electrical conversion. A customized stochastic gradient descent algorithm and a drop-out mechanism are developed for photonic neurons to realize in situ training and fast convergence in the optical domain. The TDONN chip achieves a potential throughput of 217.6 tera-operations per second (TOPS) with high computing density (447.7 TOPS/mm2), high system-level energy efficiency (7.28 TOPS/W), and low optical latency (30.2 ps). The TDONN chip has successfully implemented four-class classification in different modalities (vision, audio, and touch) and achieves 85.7% accuracy on multimodal test sets. Our work opens up a new avenue for multimodal deep learning with integrated photonic processors, providing a potential solution for low-power AI large models using photonic technology.

PMID:39043669 | DOI:10.1038/s41467-024-50677-3

Categories: Literature Watch

A multi-classifier system integrated by clinico-histology-genomic analysis for predicting recurrence of papillary renal cell carcinoma

Tue, 2024-07-23 06:00

Nat Commun. 2024 Jul 23;15(1):6215. doi: 10.1038/s41467-024-50369-y.

ABSTRACT

Integrating genomics and histology for cancer prognosis demonstrates promise. Here, we develop a multi-classifier system integrating a lncRNA-based classifier, a deep learning whole-slide-image-based classifier, and a clinicopathological classifier to accurately predict post-surgery localized (stage I-III) papillary renal cell carcinoma (pRCC) recurrence. The multi-classifier system demonstrates significantly higher predictive accuracy for recurrence-free survival (RFS) compared to the three single classifiers alone in the training set and in both validation sets (C-index 0.831-0.858 vs. 0.642-0.777, p < 0.05). The RFS in our multi-classifier-defined high-risk stage I/II and grade 1/2 groups is significantly worse than in the low-risk stage III and grade 3/4 groups (p < 0.05). Our multi-classifier system is a practical and reliable predictor for recurrence of localized pRCC after surgery that can be used with the current staging system to more accurately predict disease course and inform strategies for individualized adjuvant therapy.
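
For reference, the C-index used to compare the classifiers can be computed with lifelines, as in this hedged sketch on synthetic data; risk scores are negated so that higher risk pairs with shorter recurrence-free survival:

```python
# Hedged sketch of Harrell's C-index for recurrence-free survival.
import numpy as np
from lifelines.utils import concordance_index

rng = np.random.default_rng(0)
risk = rng.normal(size=200)                                   # classifier risk score
months = np.exp(2.5 - 0.5 * risk + rng.normal(0, 0.3, 200))   # synthetic RFS times
observed = rng.random(200) < 0.6                              # 1 = recurrence observed

print(f"C-index = {concordance_index(months, -risk, observed):.3f}")
```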

PMID:39043664 | DOI:10.1038/s41467-024-50369-y

Categories: Literature Watch

Deep Learning for Predicting the Difficulty Level of Removing the Impacted Mandibular Third Molar

Tue, 2024-07-23 06:00

Int Dent J. 2024 Jul 22:S0020-6539(24)00193-X. doi: 10.1016/j.identj.2024.06.021. Online ahead of print.

ABSTRACT

BACKGROUND: Preoperative assessment of the impacted mandibular third molar (LM3) in a panoramic radiograph is important in surgical planning. The aim of this study was to develop and evaluate a computer-aided visualisation-based deep learning (DL) system using a panoramic radiograph to predict the difficulty level of surgical removal of an impacted LM3.

METHODS: The study included 1367 LM3 images from 784 patients who presented to the University Dental Hospital from 2021 to 2023; images were collected retrospectively. The difficulty level of surgically removing impacted LM3s was assessed via our newly developed DL system, which seamlessly integrates 3 distinct DL models: ResNet101V2 handles binary classification for identifying impacted LM3s in panoramic radiographs, RetinaNet detects the precise location of the impacted LM3, and a Vision Transformer performs multiclass image classification to evaluate the difficulty level of removing the detected impacted LM3.
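
The control flow of the 3-stage system can be sketched as below; the three model objects are hypothetical stubs standing in for ResNet101V2, RetinaNet, and the Vision Transformer, not the authors' code or any library API:

```python
# Hedged sketch of the 3-stage classify -> detect -> grade pipeline.
from dataclasses import dataclass
import numpy as np

@dataclass
class StubModel:
    """Stands in for a trained network; returns a canned output."""
    output: object
    def predict(self, image):
        return self.output

stage1_presence = StubModel(True)                    # impacted LM3 present?
stage2_detector = StubModel((120, 340, 260, 480))    # (x1, y1, x2, y2) box
stage3_difficulty = StubModel("moderately difficult")

def predict_difficulty(panoramic_image):
    if not stage1_presence.predict(panoramic_image):      # binary classification
        return None                                       # no impacted LM3 found
    x1, y1, x2, y2 = stage2_detector.predict(panoramic_image)  # localize LM3
    crop = panoramic_image[y1:y2, x1:x2]                  # crop to the detection
    return stage3_difficulty.predict(crop)                # grade removal difficulty

print(predict_difficulty(np.zeros((1024, 2048), dtype=np.uint8)))
```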

RESULTS: The ResNet101V2 model achieved a classification accuracy of 0.8671. The RetinaNet model demonstrated exceptional detection performance, with a mean average precision of 0.9928. Additionally, the Vision Transformer model delivered an average accuracy of 0.7899 in predicting removal difficulty levels.

CONCLUSIONS: The 3-phase computer-aided visualisation-based DL system yielded very good performance in using panoramic radiographs to predict the difficulty level of surgically removing an impacted LM3.

PMID:39043529 | DOI:10.1016/j.identj.2024.06.021

Categories: Literature Watch

Development and Validation of a Biparametric MRI Deep Learning Radiomics Model with Clinical Characteristics for Predicting Perineural Invasion in Patients with Prostate Cancer

Tue, 2024-07-23 06:00

Acad Radiol. 2024 Jul 22:S1076-6332(24)00447-1. doi: 10.1016/j.acra.2024.07.013. Online ahead of print.

ABSTRACT

RATIONALE AND OBJECTIVES: Perineural invasion (PNI) is an important prognostic biomarker for prostate cancer (PCa). This study aimed to develop and validate a predictive model integrating biparametric MRI-based deep learning radiomics and clinical characteristics for the non-invasive prediction of PNI in patients with PCa.

MATERIALS AND METHODS: In this prospective study, 557 PCa patients who underwent preoperative MRI and radical prostatectomy were recruited and randomly divided into training and validation cohorts at a ratio of 7:3. A clinical model for predicting PNI was constructed via univariate and multivariate regression analyses of various clinical indicators, followed by logistic regression. Radiomics and deep learning methods were used to develop different MRI-based radiomics and deep learning models. Subsequently, the clinical, radiomics, and deep learning signatures were combined into an integrated deep learning-radiomics-clinical model (DLRC). Model performance was assessed by plotting receiver operating characteristic (ROC) and precision-recall (PR) curves and calculating the areas under them (ROC-AUC and PR-AUC). Calibration and decision curves were used to evaluate the model's goodness of fit and clinical benefit.
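
The reported ROC-AUC and PR-AUC can be computed with scikit-learn, as in this hedged sketch; the labels and scores here are synthetic placeholders for the DLRC model's outputs:

```python
# Hedged sketch of ROC-AUC and PR-AUC (average precision) for a binary
# PNI predictor.
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=167)                     # synthetic PNI labels
y_score = np.clip(0.6 * y_true + rng.normal(0.2, 0.25, 167), 0, 1)

print(f"ROC-AUC = {roc_auc_score(y_true, y_score):.3f}")
print(f"PR-AUC  = {average_precision_score(y_true, y_score):.3f}")
```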

RESULTS: The DLRC model demonstrated the highest performance in both the training and the validation cohorts, with ROC-AUCs of 0.914 and 0.848, respectively, and PR-AUCs of 0.948 and 0.926, respectively. The DLRC model showed good calibration and clinical benefit in both cohorts.

CONCLUSION: The DLRC model, which integrated clinical, radiomics, and deep learning signatures, can serve as a robust tool for predicting PNI in patients with PCa, thus aiding in developing effective treatment strategies.

PMID:39043515 | DOI:10.1016/j.acra.2024.07.013

Categories: Literature Watch

A deep learning approach to detection of oral cancer lesions from intra-oral patient images: a preliminary retrospective study

Tue, 2024-07-23 06:00

J Stomatol Oral Maxillofac Surg. 2024 Jul 21:101975. doi: 10.1016/j.jormas.2024.101975. Online ahead of print.

ABSTRACT

INTRODUCTION: Oral squamous cell carcinomas (OSCC) of the oral cavity are a category of disease that dentists can diagnose and even cure. This study evaluated the performance of diagnostic computer software developed to detect oral cancer lesions in retrospective intra-oral patient images.

MATERIALS AND METHODS: Oral cancer lesions were labeled using the CranioCatch labeling program (CranioCatch, Eskişehir, Turkey) with polygonal labeling on a total of 65 anonymized retrospective intraoral images of oral mucosa from patients in our clinic whose oral cancer had been diagnosed histopathologically by incisional biopsy. All images were rechecked and verified by experienced experts. This dataset was divided into training (n = 53), validation (n = 6), and test (n = 6) sets. An artificial intelligence model was developed using the YOLOv5 architecture, a deep learning approach. Model success was evaluated with a confusion matrix.

RESULTS: On the held-out test images, which were not used in training, the F1 score, sensitivity, and precision of the YOLOv5-based artificial intelligence model were 0.667, 0.667, and 0.667, respectively.
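
For context, F1, sensitivity, and precision follow directly from confusion-matrix counts; the counts in this sketch are hypothetical but chosen to reproduce the reported 0.667 values:

```python
# Hedged sketch: metrics from hypothetical confusion-matrix counts.
tp, fp, fn = 4, 2, 2                        # hypothetical test-set counts

precision = tp / (tp + fp)                  # 0.667
sensitivity = tp / (tp + fn)                # 0.667 (recall)
f1 = 2 * precision * sensitivity / (precision + sensitivity)  # 0.667
print(f"precision={precision:.3f}, sensitivity={sensitivity:.3f}, F1={f1:.3f}")
```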

CONCLUSIONS: Our study reveals that OSCC lesions carry discriminative visual appearances that can be identified by a deep learning algorithm. Artificial intelligence shows promise in the pre-diagnosis of oral cancer lesions. Success rates are expected to increase as the dataset is expanded with more images for training.

PMID:39043293 | DOI:10.1016/j.jormas.2024.101975

Categories: Literature Watch

X-ray lens figure errors retrieved by deep learning from several beam intensity images

Tue, 2024-07-23 06:00

J Synchrotron Radiat. 2024 Sep 1. doi: 10.1107/S1600577524004958. Online ahead of print.

ABSTRACT

The phase problem in the context of focusing synchrotron beams with X-ray lenses is addressed. The feasibility of retrieving the surface error of a lens system using only the intensity of the propagated beam at several distances is demonstrated. A neural network, trained with a few thousand simulations using random errors, can accurately predict the lens error profile that accounts for all aberrations. This demonstrates the feasibility of routinely measuring the aberrations induced by an X-ray lens, or another optical system, using only a few intensity images.
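
A minimal version of such a learning setup might look like the following PyTorch sketch, in which a small CNN regresses a discretized error profile from intensity images at several distances stacked as channels; the architecture and sizes are assumptions, and real training data would come from wave-optics simulations with randomized error profiles:

```python
# Hedged sketch: CNN regression from multi-distance intensity images to
# a lens surface-error profile (here discretized to 32 samples).
import torch
import torch.nn as nn

n_distances, profile_points = 4, 32

model = nn.Sequential(
    nn.Conv2d(n_distances, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, 128), nn.ReLU(),
    nn.Linear(128, profile_points))          # predicted error profile

intensities = torch.randn(8, n_distances, 64, 64)   # stand-in simulated images
true_profiles = torch.randn(8, profile_points)      # random surface errors
loss = nn.functional.mse_loss(model(intensities), true_profiles)
loss.backward()                                     # one training step's gradient
print(loss.item())
```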

PMID:39042577 | DOI:10.1107/S1600577524004958

Categories: Literature Watch

Comprehensive Production Index Prediction Using Dual-Scale Deep Learning in Mineral Processing

Tue, 2024-07-23 06:00

IEEE Trans Neural Netw Learn Syst. 2024 Jul 23;PP. doi: 10.1109/TNNLS.2024.3421570. Online ahead of print.

ABSTRACT

In mineral processing, the dynamic nature of industrial data poses challenges for decision-makers in accurately assessing current production status. To enhance decision-making, it is crucial to predict comprehensive production indices (CPIs), which are influenced by both human operators and industrial processes and demonstrate a strong dual-scale property. To improve the accuracy of CPI prediction, we introduce a high-frequency (HF) unit and a low-frequency (LF) unit within our proposed dual-scale deep learning (DL) network. This architecture enables the exploration of nonlinear dynamic mappings in dual-scale industrial data. By integrating a Cloud-Edge collaboration mechanism with DL, our training strategy mitigates the dominance of HF data and guides the networks to prioritize different frequency information. Through self-tuning training via Cloud-Edge collaboration, the optimal model structure and parameters on the cloud server are adjusted, with the edge model self-updating accordingly. Validated through online industrial experiments, our method significantly enhances CPI prediction accuracy compared with baseline approaches.

PMID:39042548 | DOI:10.1109/TNNLS.2024.3421570

Categories: Literature Watch

SaccpaNet: A Separable Atrous Convolution-based Cascade Pyramid Attention Network to Estimate Body Landmarks Using Cross-modal Knowledge Transfer for Under-blanket Sleep Posture Classification

Tue, 2024-07-23 06:00

IEEE J Biomed Health Inform. 2024 Jul 23;PP. doi: 10.1109/JBHI.2024.3432195. Online ahead of print.

ABSTRACT

The accuracy of sleep posture assessment in standard polysomnography can be compromised by the unfamiliar sleep-lab environment. In this work, we aim to develop a depth camera-based sleep posture monitoring and classification system for home or community use and to tailor a deep learning model that can account for blanket interference. Our model comprises a joint coordinate estimation network (JCE) and a sleep posture classification network (SPC). SaccpaNet (Separable Atrous Convolution-based Cascade Pyramid Attention Network) was developed using a pyramidal structure of residual separable atrous convolution units to reduce computational cost and enlarge the receptive field. The Saccpa attention unit serves as the core of the JCE and SPC, and different backbones for the SPC were also evaluated. The model was cross-modally pretrained on RGB images from the COCO whole-body dataset and then trained and tested on depth image data collected from 150 participants performing seven sleep postures under four blanket conditions. In addition, we applied a data augmentation technique that used intra-class mix-up to synthesize blanket conditions and an overlaid flip-cut to synthesize partially covered blanket conditions, for a robustness evaluation that we refer to as the Post-hoc Data Augmentation Robustness Test (PhD-ART). Our model achieved an average joint-coordinate estimation precision (PCK@0.1) of 0.652 and demonstrated adequate robustness. The overall classification accuracy of sleep postures (F1-score) was 0.885 and 0.940 for 7- and 6-class classification, respectively. Our system was resistant to blanket interference, with a spread difference of 2.5%.
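
The intra-class mix-up augmentation mentioned above can be sketched as follows: blending two depth images of the same posture class synthesizes new blanket appearances while keeping the label unchanged. The Beta-distribution parameters are assumptions:

```python
# Hedged sketch of intra-class mix-up for depth-image augmentation.
import numpy as np

def intra_class_mixup(img_a: np.ndarray, img_b: np.ndarray, alpha: float = 0.4):
    """img_a, img_b: same-class depth images; returns a blended sample
    whose label is simply the shared class label."""
    lam = np.random.default_rng().beta(alpha, alpha)
    return lam * img_a + (1.0 - lam) * img_b

covered = np.random.rand(192, 256)     # same posture, under a blanket
uncovered = np.random.rand(192, 256)   # same posture, no blanket
mixed = intra_class_mixup(covered, uncovered)
print(mixed.shape, mixed.dtype)
```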

PMID:39042546 | DOI:10.1109/JBHI.2024.3432195

Categories: Literature Watch

Learning a Hand Model from Dynamic Movements Using High-Density EMG and Convolutional Neural Networks

Tue, 2024-07-23 06:00

IEEE Trans Biomed Eng. 2024 Jul 23;PP. doi: 10.1109/TBME.2024.3432800. Online ahead of print.

ABSTRACT

OBJECTIVE: Surface electromyography (sEMG) can sense the motor commands transmitted to the muscles. This work presents a deep learning method that can decode the electrophysiological activity of the forearm muscles into the movements of the human hand.

METHODS: We have recorded the kinematics and kinetics of the hand during a wide range of grasps and individual digit movements that cover 22 degrees of freedom of the hand at slow (0.5 Hz) and comfortable (1.5 Hz) movement speeds in 13 healthy participants. The input of the model consists of 320 non-invasive EMG sensors placed on the extrinsic hand muscles.

RESULTS: Our network achieves accurate continuous estimation of both kinematics and kinetics, surpassing the performance of comparable networks reported in the literature. By examining the latent space of the network, we find evidence that it maps EMG activity onto the anatomy of the hand at the individual digit level. In contrast to what is observed with low-pass filtered EMG and linear decoding approaches, we found that the full-bandwidth (monopolar, unfiltered) EMG signals during synergistic and individual digit movements contain distinct neural embeddings that encode each movement of the human hand. These manifolds consistently represent the anatomy of the hand and generalize across participants. Moreover, we found a task-specific distribution of the embeddings without any correlated activations during multi- and individual-digit tasks.

CONCLUSION/SIGNIFICANCE: The proposed method could advance the control of assistive hand devices by providing a robust and intuitive interface between muscle signals and hand movements.

PMID:39042539 | DOI:10.1109/TBME.2024.3432800

Categories: Literature Watch

DifFace: Blind Face Restoration with Diffused Error Contraction

Tue, 2024-07-23 06:00

IEEE Trans Pattern Anal Mach Intell. 2024 Jul 23;PP. doi: 10.1109/TPAMI.2024.3432651. Online ahead of print.

ABSTRACT

While deep learning-based methods for blind face restoration have achieved unprecedented success, they still suffer from two major limitations. First, most of them deteriorate when facing complex degradations outside their training data. Second, these methods require multiple constraints, e.g., fidelity, perceptual, and adversarial losses, which require laborious hyper-parameter tuning to stabilize and balance their influences. In this work, we propose a novel method named DifFace that copes with unseen and complex degradations more gracefully, without complicated loss designs. The key to our method is to establish a posterior distribution from the observed low-quality (LQ) image to its high-quality (HQ) counterpart. In particular, we design a transition distribution from the LQ image to the intermediate state of a pre-trained diffusion model and then gradually transmit from this intermediate state to the HQ target by recursively applying the pre-trained diffusion model. The transition distribution relies only on a restoration backbone trained with an L1 loss on some synthetic data, which favorably avoids the cumbersome training process of existing methods. Moreover, the transition distribution can contract the error of the restoration backbone, making our method more robust to unknown degradations. Comprehensive experiments show that DifFace is superior to current state-of-the-art methods, especially in cases with severe degradations. Code and model are available at https://github.com/zsyOAOA/DifFace.
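
Conceptually, the diffuse-then-denoise procedure can be sketched as below. The `ToyDiffusion` class and the lambda backbone are hypothetical stand-ins (a real diffusion model would predict and remove noise in `p_sample`); the authors' repository holds the reference implementation:

```python
# Hedged conceptual sketch: restore roughly with a backbone, diffuse the
# estimate to an intermediate timestep via q(x_t | x_0), then denoise.
import torch

class ToyDiffusion:
    """Hypothetical stand-in for a pre-trained DDPM (identity denoiser)."""
    def __init__(self, steps: int = 1000):
        betas = torch.linspace(1e-4, 0.02, steps)
        self.alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)
    def p_sample(self, x_t, t):
        return x_t  # a real model would remove predicted noise here

def difface_restore(lq, backbone, diffusion, t_start: int = 400):
    x0_hat = backbone(lq)                             # rough HQ estimate
    a_bar = diffusion.alphas_cumprod[t_start]         # forward-process scale
    x_t = a_bar.sqrt() * x0_hat + (1 - a_bar).sqrt() * torch.randn_like(x0_hat)
    for t in range(t_start, 0, -1):                   # reverse diffusion
        x_t = diffusion.p_sample(x_t, t)
    return x_t

restored = difface_restore(torch.randn(1, 3, 64, 64),
                           backbone=lambda x: x,      # stub restoration net
                           diffusion=ToyDiffusion())
print(restored.shape)
```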

PMID:39042531 | DOI:10.1109/TPAMI.2024.3432651

Categories: Literature Watch
