Deep learning
DRA-UNet for Coal Mining Ground Surface Crack Delineation with UAV High-Resolution Images
Sensors (Basel). 2024 Sep 4;24(17):5760. doi: 10.3390/s24175760.
ABSTRACT
Coal mining in the Loess Plateau readily generates ground cracks, which can quickly lead to ventilation problems in the mine shaft, runoff disturbance, and vegetation destruction. UAV (Unmanned Aerial Vehicle) high-resolution mapping and DL (Deep Learning) are introduced as the key methods for rapidly delineating coal mining ground surface cracks for disaster prevention. Firstly, a dataset named Ground Cracks of Coal Mining Area Unmanned Aerial Vehicle (GCCMA-UAV) is built, with a ground resolution of 3 cm, suitable for producing a 1:500 thematic map of ground cracks. The GCCMA-UAV dataset includes 6280 images of ground cracks, each 256 × 256 pixels. Secondly, the DRA-UNet model is built for coal mining ground surface crack delineation. DRA-UNet is an improved UNet DL model that mainly comprises a DAM (Dual Attention Mechanism) module, an RN (Residual Network) module, and an ASPP (Atrous Spatial Pyramid Pooling) module. DRA-UNet shows the highest recall rate, 77.29%, when compared with similar DL models such as DeepLabV3+, SegNet, and PSPNet; its other indicators are also reliable, with a precision rate of 84.92% and an F1 score of 78.87%. Finally, DRA-UNet is applied to delineate cracks on a DOM (Digital Orthophoto Map) covering 3 km2 of the mining workface area, with a ground resolution of 3 cm; 4903 cracks were delineated from the DOM of the Huojitu Coal Mine Shaft. The DRA-UNet model effectively improves the efficiency of crack delineation.
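The precision, recall, and F1 figures quoted above are standard pixelwise segmentation metrics; a minimal sketch of how they are computed (the toy labels below are illustrative, not taken from the GCCMA-UAV dataset):

```python
def precision_recall_f1(pred, truth):
    """Pixelwise precision, recall, and F1 for binary masks.

    pred, truth: equal-length sequences of 0/1 labels (1 = crack pixel).
    """
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Note that published scores are often averaged per class or per image, so a reported F1 need not equal the harmonic mean of the reported overall precision and recall.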
PMID:39275672 | DOI:10.3390/s24175760
Moving sampling physics-informed neural networks induced by moving mesh PDE
Neural Netw. 2024 Sep 10;180:106706. doi: 10.1016/j.neunet.2024.106706. Online ahead of print.
ABSTRACT
In this work, we propose an end-to-end adaptive sampling framework based on deep neural networks and the moving mesh method (MMPDE-Net), which can adaptively generate new sampling points by solving the moving mesh PDE. This model focuses on improving the quality of sampling point generation. Moreover, we develop an iterative algorithm based on MMPDE-Net, which makes the distribution of sampling points more precise and controllable. Since MMPDE-Net is independent of the deep learning solver, we combine it with physics-informed neural networks (PINN) to propose moving sampling PINN (MS-PINN) and show an error estimate for our method under some assumptions. Finally, we demonstrate the performance improvement of MS-PINN over PINN through numerical experiments on four typical examples, which numerically verify the effectiveness of our method.
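In one dimension, the equidistribution principle behind moving mesh methods can be illustrated directly: place points so that each cell carries an equal share of a monitor function's integral, which concentrates samples where the monitor (e.g., a solution-gradient estimate) is large. A minimal NumPy sketch (the monitor function and point counts are illustrative assumptions, not the paper's MMPDE-Net):

```python
import numpy as np

def equidistribute(monitor, x, n_points):
    """Place n_points in [x[0], x[-1]] so each interval holds an equal
    share of the integral of the monitor function (1-D equidistribution)."""
    m = monitor(x)
    # Trapezoidal cumulative integral of the monitor, normalized to [0, 1].
    cdf = np.concatenate([[0.0], np.cumsum(0.5 * (m[1:] + m[:-1]) * np.diff(x))])
    cdf /= cdf[-1]
    # Invert the normalized integral at equally spaced targets.
    return np.interp(np.linspace(0.0, 1.0, n_points), cdf, x)

# Monitor peaked at x = 0.5 -> sampling points cluster near 0.5.
x = np.linspace(0.0, 1.0, 2001)
monitor = lambda x: 1.0 + 50.0 * np.exp(-200.0 * (x - 0.5) ** 2)
pts = equidistribute(monitor, x, 21)
```

With this peaked monitor, most of the 21 points land near x = 0.5, mimicking the denser sampling an adaptive PINN solver wants near sharp solution features.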
PMID:39270348 | DOI:10.1016/j.neunet.2024.106706
Trem2-expressing multinucleated giant macrophages are a biomarker of good prognosis in head and neck squamous cell carcinoma
Cancer Discov. 2024 Sep 16. doi: 10.1158/2159-8290.CD-24-0018. Online ahead of print.
ABSTRACT
Patients with head and neck squamous cell carcinomas (HNSCC) often have poor outcomes due to suboptimal risk-management and treatment strategies, yet integrating novel prognostic biomarkers into clinical practice is challenging. Here, we report the presence of multinucleated giant cells (MGC) - a type of macrophage - in tumors from patients with HNSCC, which are associated with a favorable prognosis in treatment-naive and preoperative-chemotherapy-treated patients. Importantly, MGC density increased in tumors following preoperative therapy, suggesting a role for these cells in the anti-tumoral response. To enable clinical translation of MGC density as a prognostic marker, we developed a deep-learning model to automate its quantification on routinely stained pathological whole-slide images. Finally, we used spatial transcriptomic and proteomic approaches to describe the MGC-related tumor microenvironment and observed an increase in central memory CD4 T cells. We defined an MGC-specific signature resembling that of TREM2-expressing mononuclear tumor-associated macrophages, which co-localized in keratin tumor niches.
PMID:39270324 | DOI:10.1158/2159-8290.CD-24-0018
Wavefront sensing with optical differentiation powered by deep learning
Opt Lett. 2024 Sep 15;49(18):5216-5219. doi: 10.1364/OL.530559.
ABSTRACT
We report the experimental demonstration of an optical differentiation wavefront sensor (ODWS) based on binary pixelated linear and nonlinear amplitude filtering in the far-field. We trained and tested a convolutional neural network that reconstructs the spatial phase map from nonlinear-filter-based ODWS data for which an analytic reconstruction algorithm is not available. It shows accurate zonal retrieval over different magnitudes of wavefronts and on randomly shaped wavefronts. This work paves the way for the implementation of simultaneously sensitive, high dynamic range, and high-resolution wavefront sensing.
PMID:39270269 | DOI:10.1364/OL.530559
4 × 4 differential index modulation for optical orthogonal frequency division multiplexing
Opt Lett. 2024 Sep 15;49(18):5155-5158. doi: 10.1364/OL.530280.
ABSTRACT
In this Letter, we propose and demonstrate a 4 × 4 differential index modulation (DIM) scheme for optical orthogonal frequency division multiplexing (OOFDM) systems. The 4 × 4 DIM scheme avoids complex channel estimation by performing differential index modulation in the time-frequency domain, with the key aspect lying in the design of the time-frequency dispersion matrix that integrates the indices and constellation symbols. Moreover, we design a deep learning-based DIMFormer detector for the high decoding complexity problem of maximum likelihood (ML) detection. Experimental results show that the 4 × 4 DIM in OOFDM systems eliminates the need for complex channel estimation, and the loss of signal-to-noise ratio (SNR) is no more than 1 dB compared to conventional index modulation. The designed DIMFormer detector reduces the computational complexity by 38.98% as well as the time complexity by 99% compared to ML.
PMID:39270253 | DOI:10.1364/OL.530280
Accelerated CEST imaging through deep learning quantification from reduced frequency offsets
Magn Reson Med. 2024 Sep 13. doi: 10.1002/mrm.30269. Online ahead of print.
ABSTRACT
PURPOSE: To shorten CEST acquisition time by leveraging Z-spectrum undersampling combined with deep learning for CEST map construction from undersampled Z-spectra.
METHODS: Fisher information gain analysis identified optimal frequency offsets (termed "Fisher offsets") for the multi-pool fitting model, maximizing information gain for the amplitude and the FWHM parameters. These offsets guided initial subsampling levels. A U-NET, trained on undersampled brain CEST images from 18 volunteers, produced CEST maps at 3 T with varied undersampling levels. Feasibility was first tested using retrospective undersampling at three levels, followed by prospective in vivo undersampling (15 of 53 offsets), reducing scan time significantly. Additionally, glioblastoma grade IV pathology was simulated to evaluate network performance in patient-like cases.
RESULTS: Traditional multi-pool models failed to quantify CEST maps from undersampled images (structural similarity index [SSIM] <0.2, peak SNR <20, Pearson r <0.1). Conversely, U-NET fitting successfully addressed undersampled data challenges. The study suggests CEST scan time reduction is feasible by undersampling 15, 25, or 35 of 53 Z-spectrum offsets. Prospective undersampling cut scan time by 3.5 times, with a maximum mean squared error of 4.4e-4, r = 0.82, and SSIM = 0.84, compared to the ground truth. The network also reliably predicted CEST values for simulated glioblastoma pathology.
CONCLUSION: The U-NET architecture effectively quantifies CEST maps from undersampled Z-spectra at various undersampling levels.
PMID:39270056 | DOI:10.1002/mrm.30269
CL-Informer: Long time series prediction model based on continuous wavelet transform
PLoS One. 2024 Sep 13;19(9):e0303990. doi: 10.1371/journal.pone.0303990. eCollection 2024.
ABSTRACT
Time series data, which record how quantities change over time, remain challenging to predict. To improve the accuracy of time series prediction, a deep learning model, CL-Informer, is proposed. CL-Informer adds to the Informer model an embedding layer based on the continuous wavelet transform, so that the model can capture multi-scale characteristics of the data, and uses an LSTM layer to further capture data dependencies and process the redundant information in the continuous wavelet transform. To demonstrate its reliability, CL-Informer is compared with mainstream forecasting models such as Informer, Informer+, and Reformer on five datasets. Experimental results demonstrate that CL-Informer achieves an average reduction of 30.64% in MSE across various univariate prediction horizons and of 10.70% in MSE across different multivariate prediction horizons, thereby improving Informer's accuracy in long-sequence prediction.
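The continuous wavelet transform used in such an embedding layer decomposes a series across scales. A minimal NumPy sketch with a Ricker (Mexican hat) wavelet (the wavelet choice and sizes are illustrative assumptions; the abstract does not specify CL-Informer's mother wavelet):

```python
import numpy as np

def ricker(points, a):
    """Ricker (Mexican hat) wavelet sampled at `points` positions,
    width parameter a."""
    t = np.arange(points) - (points - 1) / 2.0
    norm = 2.0 / (np.sqrt(3.0 * a) * np.pi ** 0.25)
    return norm * (1.0 - (t / a) ** 2) * np.exp(-0.5 * (t / a) ** 2)

def cwt(signal, widths):
    """Discretized continuous wavelet transform via convolution:
    one output row per width (scale)."""
    return np.array([
        np.convolve(signal, ricker(min(10 * w, len(signal)), w), mode="same")
        for w in widths
    ])
```

Each row of the output is the series filtered at one scale; stacking the rows gives exactly the kind of multi-scale representation an embedding layer can consume.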
PMID:39269969 | DOI:10.1371/journal.pone.0303990
SwinDFU-Net: Deep learning transformer network for infection identification in diabetic foot ulcer
Technol Health Care. 2024 Aug 29. doi: 10.3233/THC-241444. Online ahead of print.
ABSTRACT
BACKGROUND: The identification of infection in diabetic foot ulcers (DFUs) is challenging due to variability within classes, visual similarity between classes, reduced contrast with healthy skin, and the presence of artifacts. Existing studies focus on visual characteristics and tissue classification rather than infection detection, which is critical for assessing DFUs and predicting amputation risk.
OBJECTIVE: To address these challenges, this study proposes a deep learning model using a hybrid CNN and Swin Transformer architecture for infection classification in DFU images. The aim is to leverage end-to-end mapping without prior knowledge, integrating local and global feature extraction to improve detection accuracy.
METHODS: The proposed model utilizes a hybrid CNN and Swin Transformer architecture. It employs the Grad CAM technique to visualize the decision-making process of the CNN and Transformer blocks. The DFUC Challenge dataset is used for training and evaluation, emphasizing the model's ability to accurately classify DFU images into infected and non-infected categories.
RESULTS: The model achieves high performance metrics: sensitivity (95.98%), specificity (97.08%), accuracy (96.52%), and Matthews Correlation Coefficient (0.93). These results indicate the model's effectiveness in quickly diagnosing DFU infections, highlighting its potential as a valuable tool for medical professionals.
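For reference, the Matthews Correlation Coefficient reported above is computed from the full binary confusion matrix; a minimal sketch (the counts below are illustrative, not the DFUC data):

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Unlike plain accuracy, MCC stays near zero for a classifier that ignores a minority class, which is why it usefully complements the sensitivity/specificity pair.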
CONCLUSION: The hybrid CNN and Swin Transformer architecture effectively combines strengths from both models, enabling accurate classification of DFU images as infected or non-infected, even in complex scenarios. The use of Grad CAM provides insights into the model's decision process, aiding in identifying infected regions within DFU images. This approach shows promise for enhancing clinical assessment and management of DFU infections.
PMID:39269872 | DOI:10.3233/THC-241444
Revolutionizing health monitoring: Integrating transformer models with multi-head attention for precise human activity recognition using wearable devices
Technol Health Care. 2024 Aug 29. doi: 10.3233/THC-241064. Online ahead of print.
ABSTRACT
BACKGROUND: A daily activity routine is vital for overall health and well-being, supporting physical and mental fitness. Consistent physical activity is linked to numerous benefits for the body, mind, and emotions, playing a key role in promoting a healthy lifestyle. Wearable devices have become essential in the realm of health and fitness, facilitating the monitoring of daily activities. While convolutional neural networks (CNN) have proven effective, challenges remain in quickly adapting to a variety of activities.
OBJECTIVE: This study aimed to develop a model for precise recognition of human activities to revolutionize health monitoring by integrating transformer models with multi-head attention for precise human activity recognition using wearable devices.
METHODS: The Human Activity Recognition (HAR) algorithm uses deep learning to classify human activities from spectrogram data. It uses a pretrained convolutional neural network (CNN) with a MobileNetV2 model to extract features, a dense residual transformer network (DRTN), and a multi-head multi-level attention architecture (MH-MLA) to capture time-related patterns. The model then blends information from both components through an adaptive attention mechanism and uses a softmax function to produce classification probabilities for the various human activities.
RESULTS: The integrated approach, combining a pretrained CNN with transformer models into a thorough and effective system for recognizing human activities from spectrogram data, outperformed existing methods on several datasets: HARTH, KU-HAR, and HuGaDB, with accuracies of 92.81%, 97.98%, and 95.32%, respectively. This suggests that integrating diverse methodologies captures nuanced human activities well across different activity types. The comparative analysis showed that the integrated system consistently performs better on dynamic human activity recognition datasets.
CONCLUSION: In conclusion, maintaining a routine of daily activities is crucial for overall health and well-being. Regular physical activity contributes substantially to a healthy lifestyle, benefiting both the body and the mind. The integration of wearable devices has simplified the monitoring of daily routines. This research introduces an innovative approach to human activity recognition, combining the CNN model with a dense residual transformer network (DRTN) with multi-head multi-level attention (MH-MLA) within the transformer architecture to enhance its capability.
PMID:39269866 | DOI:10.3233/THC-241064
Change Representation and Extraction in Stripes: Rethinking Unsupervised Hyperspectral Image Change Detection With an Untrained Network
IEEE Trans Image Process. 2024 Sep 13;PP. doi: 10.1109/TIP.2024.3438100. Online ahead of print.
ABSTRACT
Deep learning-based hyperspectral image (HSI) change detection (CD) approaches have a strong ability to leverage spectral-spatial-temporal information through automatic feature extraction, and currently dominate in the research field. However, their efficiency and universality are limited by the dependency on labeled data. Although the newly applied untrained networks can avoid the need for labeled data, their feature volatility from the simple difference space easily leads to inaccurate CD results. Inspired by the interesting finding that salient changes appear as bright "stripes" in a new feature space, we propose a novel unsupervised CD method that represents and models changes in stripes for HSIs (named StripeCD), which integrates optimization modeling into an untrained network. The StripeCD method constructs a new feature space that represents change features in stripes and models them in a novel optimization manner. It consists of three main parts: 1) a dual-branch untrained convolutional network, which is utilized to extract deep difference features from bitemporal HSIs and combined with a two-stage channel selection strategy to emphasize the important channels that contribute to CD; 2) a multiscale forward-backward segmentation framework, which is proposed for salient change representation. It transforms deep difference features into a new feature space by exploiting the structure information of ground objects and associates salient changes with the stripe-shaped change component; 3) a stripe-shaped change extraction model, which characterizes the global sparsity and local discontinuity of salient changes. It explores the intrinsic properties of deep difference features and constructs model-based constraints to better identify changed regions in a controllable manner. The proposed StripeCD method outperformed the state-of-the-art unsupervised CD approaches on three widely used datasets. In addition, the proposed StripeCD method indicates the potential for further investigation of untrained networks in facilitating reliable CD.
PMID:39269800 | DOI:10.1109/TIP.2024.3438100
Using AI to Differentiate Mpox From Common Skin Lesions in a Sexual Health Clinic: Algorithm Development and Validation Study
J Med Internet Res. 2024 Sep 13;26:e52490. doi: 10.2196/52490.
ABSTRACT
BACKGROUND: The 2022 global outbreak of mpox has significantly impacted health facilities and necessitated additional infection prevention and control measures as well as alterations to clinic processes. Early identification of suspected mpox cases will assist in mitigating these impacts.
OBJECTIVE: We aimed to develop and evaluate an artificial intelligence (AI)-based tool to differentiate mpox lesion images from other skin lesions seen in a sexual health clinic.
METHODS: We used a data set of 2200 images, including mpox and non-mpox lesion images, collected from the Melbourne Sexual Health Centre and web resources. We adopted deep learning approaches involving 6 different deep learning architectures to train our AI models. We subsequently evaluated the performance of each model using a hold-out data set and an external validation data set to determine the optimal model for differentiating between mpox and non-mpox lesions.
RESULTS: The DenseNet-121 model outperformed other models with an overall area under the receiver operating characteristic curve (AUC) of 0.928, an accuracy of 0.848, a precision of 0.942, a recall of 0.742, and an F1-score of 0.834. Implementation of a region of interest approach significantly improved the performance of all models, with the AUC for the DenseNet-121 model increasing to 0.982. This approach resulted in an increase in the correct classification of mpox images from 79% (55/70) to 94% (66/70). The effectiveness of this approach was further validated by a visual analysis with gradient-weighted class activation mapping, demonstrating a reduction in false detection within the background of lesion images. On the external validation data set, ResNet-18 and DenseNet-121 achieved the highest performance. ResNet-18 achieved an AUC of 0.990 and an accuracy of 0.947, and DenseNet-121 achieved an AUC of 0.982 and an accuracy of 0.926.
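The AUC values above admit a simple rank interpretation: the probability that a randomly chosen mpox image receives a higher score than a randomly chosen non-mpox image. A minimal Mann-Whitney sketch (the scores below are illustrative, not model outputs):

```python
def auc(pos_scores, neg_scores):
    """Mann-Whitney estimate of ROC AUC: the fraction of
    (positive, negative) pairs where the positive score is
    higher, counting ties as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

An AUC of 0.99 therefore means the model ranks nearly every mpox lesion above nearly every non-mpox lesion, regardless of any single decision threshold.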
CONCLUSIONS: Our study demonstrated it was possible to use an AI-based image recognition algorithm to accurately differentiate between mpox and common skin lesions. Our findings provide a foundation for future investigations aimed at refining the algorithm and establishing the place of such technology in a sexual health clinic.
PMID:39269753 | DOI:10.2196/52490
MMFA-DTA: Multimodal Feature Attention Fusion Network for Drug-Target Affinity Prediction for Drug Repurposing Against SARS-CoV-2
J Chem Theory Comput. 2024 Sep 13. doi: 10.1021/acs.jctc.4c00663. Online ahead of print.
ABSTRACT
The continuous emergence of novel infectious diseases poses a significant threat to global public health security, necessitating the development of small-molecule inhibitors that directly target pathogens. The RNA-dependent RNA polymerase (RdRp) and main protease (Mpro) of SARS-CoV-2 have been validated as potential key antiviral drug targets for the treatment of COVID-19. However, the conventional new drug R&D cycle takes 10-15 years, failing to meet the urgent needs during epidemics. Here, we propose a general multimodal deep learning framework for drug repurposing, MMFA-DTA, to enable rapid virtual screening of known drugs and significantly improve discovery efficiency. By extracting graph topological and sequence features from both small molecules and proteins, we design attention mechanisms to achieve dynamic fusion across modalities. Results demonstrate the superior performance of MMFA-DTA in drug-target affinity prediction over several state-of-the-art baseline methods on Davis and KIBA data sets, validating the benefits of heterogeneous information integration for representation learning and interaction modeling. Further fine-tuning on COVID-19-relevant bioactivity data enhances model predictions for critical SARS-CoV-2 enzymes. Case studies screening the FDA-approved drug library successfully identify etacrynic acid as the potential lead compound against both RdRp and Mpro. Molecular dynamics simulations further confirm the stability and binding affinity of etacrynic acid to these targets. This study proves the great potential and advantages of deep learning and drug repurposing strategies in supporting antiviral drug discovery. The proposed general and rapid response computational framework holds significance for preparedness against future public health events.
PMID:39269697 | DOI:10.1021/acs.jctc.4c00663
A zero precision loss framework for EEG channel selection: enhancing efficiency and maintaining interpretability
Comput Methods Biomech Biomed Engin. 2024 Sep 13:1-16. doi: 10.1080/10255842.2024.2401918. Online ahead of print.
ABSTRACT
Brain-computer interface (BCI) systems based on motor imagery typically rely on a large number of electrode channels to acquire information. The rational selection of electroencephalography (EEG) channel combinations is crucial for optimizing computational efficiency and enhancing practical applicability. However, evaluating all potential channel combinations individually is impractical. This study aims to explore a strategy for quickly achieving a balance between maximizing channel reduction and minimizing precision loss. To this end, we developed a spatio-temporal attention perception network named STAPNet. Based on the channel contributions adaptively generated by its subnetwork, we propose an extended-step bi-directional search strategy comprising variable ratio channel selection (VRCS) and strided greedy channel selection (SGCS), designed to enhance global search capabilities and accelerate the optimization process. Experimental results show that the framework achieved average maximum accuracies of 91.47% and 84.17% on the High Gamma and BCI Competition IV 2a public datasets, respectively. Under conditions of zero precision loss, the average number of channels was reduced by up to 87.5%. Additionally, to investigate the impact of neural information loss due to channel reduction on the interpretation of complex brain functions, we employed a heatmap visualization algorithm to verify the universal importance and complete symmetry of the selected optimal channel combination across multiple datasets. This is consistent with the brain's cooperative mechanism when processing tasks involving both the left and right hands.
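While the abstract does not spell out VRCS/SGCS, the contribution-guided greedy half of such a search can be sketched as follows (the channel scores, the stand-in accuracy function, and the tolerance are all illustrative assumptions, not the paper's method):

```python
def greedy_channel_reduction(channels, contribution, evaluate, tolerance):
    """Drop the lowest-contribution channel while the evaluated accuracy
    stays within `tolerance` of the full-channel baseline."""
    baseline = evaluate(channels)
    selected = sorted(channels, key=lambda c: contribution[c], reverse=True)
    while len(selected) > 1:
        candidate = selected[:-1]  # drop the current lowest contributor
        if baseline - evaluate(candidate) <= tolerance:
            selected = candidate
        else:
            break
    return selected

# Toy example: five EEG channels with made-up contribution scores and a
# stand-in "accuracy" that just sums the contributions of kept channels.
scores = {"C3": 0.9, "C4": 0.8, "Cz": 0.7, "F1": 0.1, "O2": 0.05}
kept = greedy_channel_reduction(list(scores), scores,
                                lambda chs: sum(scores[c] for c in chs), 0.2)
```

The toy run keeps only the three high-contribution channels, the same kind of trade-off the paper quantifies as channel reduction under zero precision loss.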
PMID:39269692 | DOI:10.1080/10255842.2024.2401918
Identification of Chemical Scaffolds That Inhibit the <em>Mycobacterium tuberculosis</em> Respiratory Complex Succinate Dehydrogenase
ACS Infect Dis. 2024 Sep 13. doi: 10.1021/acsinfecdis.3c00655. Online ahead of print.
ABSTRACT
Drug-resistant Mycobacterium tuberculosis is a significant cause of infectious disease morbidity and mortality for which new antimicrobials are urgently needed. Inhibitors of mycobacterial respiratory energy metabolism have emerged as promising next-generation antimicrobials, but a number of targets remain unexplored. Succinate dehydrogenase (SDH), a focal point in mycobacterial central carbon metabolism and respiratory energy production, is required for growth and survival in M. tuberculosis under a number of conditions, highlighting the potential of inhibitors targeting mycobacterial SDH enzymes. To advance SDH as a novel drug target in M. tuberculosis, we utilized a combination of biochemical screening and in-silico deep learning technologies to identify multiple chemical scaffolds capable of inhibiting mycobacterial SDH activity. Antimicrobial susceptibility assays show that lead inhibitors are bacteriostatic agents with activity against wild-type and drug-resistant strains of M. tuberculosis. Mode of action studies on lead compounds demonstrate that the specific inhibition of SDH activity dysregulates mycobacterial metabolism and respiration and results in the secretion of intracellular succinate. Interaction assays demonstrate that the chemical inhibition of SDH activity potentiates the activity of other bioenergetic inhibitors and prevents the emergence of resistance to a variety of drugs. Overall, this study shows that SDH inhibitors are promising next-generation antimicrobials against M. tuberculosis.
PMID:39268963 | DOI:10.1021/acsinfecdis.3c00655
GRABSEEDS: extraction of plant organ traits through image analysis
Plant Methods. 2024 Sep 12;20(1):140. doi: 10.1186/s13007-024-01268-2.
ABSTRACT
BACKGROUND: Phenotyping of plant traits presents a significant bottleneck in Quantitative Trait Loci (QTL) mapping and genome-wide association studies (GWAS). Computerized phenotyping using digital images promises rapid, robust, and reproducible measurements of dimension, shape, and color traits of plant organs, including grain, leaf, and floral traits.
RESULTS: We introduce GRABSEEDS, which is specifically tailored to extract a comprehensive set of features from plant images based on state-of-the-art computer vision and deep learning methods. This command-line tool, adept at managing varying light conditions, background disturbances, and overlapping objects, uses digital images to measure plant organ characteristics accurately and efficiently. GRABSEEDS has advanced features, including label recognition and color correction in a batch setting.
CONCLUSION: GRABSEEDS streamlines the plant phenotyping process and is effective in a variety of seed, floral, and leaf trait studies for association with agronomic traits and stress conditions. Source code and documentation for GRABSEEDS are available at: https://github.com/tanghaibao/jcvi/wiki/GRABSEEDS .
PMID:39267072 | DOI:10.1186/s13007-024-01268-2
Mild cognitive impairment prediction based on multi-stream convolutional neural networks
BMC Bioinformatics. 2024 Sep 12;22(Suppl 5):638. doi: 10.1186/s12859-024-05911-6.
ABSTRACT
BACKGROUND: Mild cognitive impairment (MCI) is the transition stage between the cognitive decline expected in normal aging and more severe cognitive decline such as dementia. The early diagnosis of MCI plays an important role in human healthcare. Current methods of MCI detection include cognitive tests to screen for executive function impairments, possibly followed by neuroimaging tests. However, these methods are expensive and time-consuming. Several studies have demonstrated that MCI and dementia can be detected by machine learning technologies from different modality data. This study proposes a multi-stream convolutional neural network (MCNN) model to predict MCI from face videos.
RESULTS: The total effective data comprise 48 facial videos from 45 participants, including 35 videos from participants with normal cognition and 13 videos from participants with MCI. The videos are divided into several segments. The MCNN then captures the latent facial spatial features and facial dynamic features of each segment and classifies the segment as MCI or normal. Finally, an aggregation stage produces the final detection result for the input video. We evaluated 27 MCNN model combinations of three ResNet architectures, three optimizers, and three activation functions. The experimental results show that the ResNet-50 backbone with the Swish activation function and Ranger optimizer produces the best results, with an F1-score of 89% at the segment level, while the ResNet-18 backbone with Swish and Ranger achieves an F1-score of 100% at the participant level.
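The aggregation stage described above can be sketched as a majority vote over segment predictions (the exact voting rule is an assumption; the abstract does not specify it):

```python
def aggregate_segments(segment_labels):
    """Aggregate per-segment MCI/normal predictions into one video-level
    label by majority vote; ties resolve to 'MCI' to favor sensitivity."""
    mci = sum(1 for s in segment_labels if s == "MCI")
    normal = len(segment_labels) - mci
    return "MCI" if mci >= normal else "normal"
```

Aggregation of this kind explains how participant-level scores can exceed segment-level ones: occasional misclassified segments are outvoted within each video.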
CONCLUSIONS: This study presents an efficient new method for predicting MCI from facial videos. Studies have shown that MCI can be detected from facial videos, and facial data can be used as a biomarker for MCI. This approach is very promising for developing accurate models for screening MCI through facial data. It demonstrates that automated, non-invasive, and inexpensive MCI screening methods are feasible and do not require highly subjective paper-and-pencil questionnaires. Evaluation of 27 model combinations also found that ResNet-50 with Swish is more stable for different optimizers. Such results provide directions for hyperparameter tuning to further improve MCI predictions.
PMID:39266977 | DOI:10.1186/s12859-024-05911-6
Deep Learning for Automated Classification of Hip Hardware on Radiographs
J Imaging Inform Med. 2024 Sep 12. doi: 10.1007/s10278-024-01263-y. Online ahead of print.
ABSTRACT
PURPOSE: To develop a deep learning model for automated classification of orthopedic hardware on pelvic and hip radiographs, which can be clinically implemented to decrease radiologist workload and improve consistency among radiology reports.
MATERIALS AND METHODS: Pelvic and hip radiographs from 4279 studies in 1073 patients were retrospectively obtained and reviewed by musculoskeletal radiologists. Two convolutional neural networks, EfficientNet-B4 and NFNet-F3, were trained to perform the image classification task into the following most represented categories: no hardware, total hip arthroplasty (THA), hemiarthroplasty, intramedullary nail, femoral neck cannulated screws, dynamic hip screw, lateral blade/plate, THA with additional femoral fixation, and post-infectious hip. Model performance was assessed on an independent test set of 851 studies from 262 patients and compared to individual performance of five subspecialty-trained radiologists using leave-one-out analysis against an aggregate gold standard label.
RESULTS: For multiclass classification, the area under the receiver operating characteristic curve (AUC) was 0.99 or greater for all classes with NFNet-F3, and 0.99 or greater for all classes except post-infectious hip (AUC 0.97) with EfficientNet-B4. When compared with human observers, the models achieved an accuracy of 97%, which is non-inferior to four of the five radiologists and outperformed one radiologist. Cohen's kappa coefficient for both models ranged from 0.96 to 0.97, indicating excellent inter-reader agreement.
CONCLUSION: A deep learning model can be used to classify a range of orthopedic hip hardware with high accuracy and comparable performance to subspecialty-trained radiologists.
PMID:39266912 | DOI:10.1007/s10278-024-01263-y
Convolutional Neural Networks for Segmentation of Pleural Mesothelioma: Analysis of Probability Map Thresholds (CALGB 30901, Alliance)
J Imaging Inform Med. 2024 Sep 12. doi: 10.1007/s10278-024-01092-z. Online ahead of print.
ABSTRACT
The purpose of this study was to evaluate the impact of probability map threshold on pleural mesothelioma (PM) tumor delineations generated using a convolutional neural network (CNN). One hundred eighty-six CT scans from 48 PM patients were segmented by a VGG16/U-Net CNN. A radiologist modified the contours generated at a 0.5 probability threshold. Percent difference of tumor volume and overlap using the Dice Similarity Coefficient (DSC) were compared between the radiologist-provided reference standard and CNN outputs for thresholds ranging from 0.001 to 0.9. CNN-derived contours consistently yielded smaller tumor volumes than radiologist contours. Reducing the probability threshold from 0.5 to 0.01 decreased the absolute percent volume difference, on average, from 42.93% to 26.60%. Median and mean DSC ranged from 0.57 to 0.59, with a peak at a threshold of 0.2; no distinct threshold was found for percent volume difference. The CNN exhibited deficiencies with specific disease presentations, such as severe pleural effusion or disease in the pleural fissure. No single output threshold in the CNN probability maps was optimal for both tumor volume and DSC. This work underscores the need to assess tumor volume and spatial overlap simultaneously when evaluating deep learning-based tumor segmentations across probability thresholds: while automated segmentations may yield tumor volumes comparable to the reference standard, the spatial region delineated by the CNN at a given threshold is equally important.
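The two figures of merit discussed here can be evaluated together for each candidate threshold; a minimal NumPy sketch (the probability map and reference mask below are toy examples, not patient data):

```python
import numpy as np

def dice(pred, truth):
    """Dice Similarity Coefficient between two binary masks."""
    inter = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    return 2.0 * inter / total if total else 1.0

def sweep_thresholds(prob_map, truth, thresholds):
    """For each probability threshold, report (threshold, DSC, percent
    volume difference) of the thresholded mask versus the reference."""
    out = []
    for t in thresholds:
        pred = prob_map >= t
        vol_diff = 100.0 * abs(int(pred.sum()) - int(truth.sum())) / truth.sum()
        out.append((t, dice(pred, truth), vol_diff))
    return out
```

Scanning `thresholds` this way makes the study's point concrete: the threshold minimizing volume difference need not be the one maximizing DSC.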
PMID:39266911 | DOI:10.1007/s10278-024-01092-z
Training and validation of a deep learning U-net architecture general model for automated segmentation of inner ear from CT
Eur Radiol Exp. 2024 Sep 12;8(1):104. doi: 10.1186/s41747-024-00508-3.
ABSTRACT
BACKGROUND: The intricate three-dimensional anatomy of the inner ear presents significant challenges in diagnostic procedures and critical surgical interventions. Recent advancements in deep learning (DL), particularly convolutional neural networks (CNN), have shown promise for segmenting specific structures in medical imaging. This study aimed to train and externally validate an open-source U-net DL general model for automated segmentation of the inner ear from computed tomography (CT) scans, using quantitative and qualitative assessments.
METHODS: In this multicenter study, we retrospectively collected a dataset of 271 CT scans to train an open-source U-net CNN model. An external set of 70 CT scans was used to evaluate the performance of the trained model. The model's efficacy was quantitatively assessed using the Dice similarity coefficient (DSC) and qualitatively assessed using a 4-level Likert score. For comparative analysis, manual segmentation served as the reference standard, with assessments made on both training and validation datasets, as well as stratified analysis of normal and pathological subgroups.
RESULTS: The optimized model yielded a mean DSC of 0.83 and achieved a Likert score of 1 in 42% of the cases, in conjunction with a significantly reduced processing time. Nevertheless, 27% of the patients received an indeterminate Likert score of 4. Overall, the mean DSCs were notably higher in the validation dataset than in the training dataset.
CONCLUSION: This study supports the external validation of an open-source U-net model for the automated segmentation of the inner ear from CT scans.
RELEVANCE STATEMENT: This study optimized and assessed an open-source general deep learning model for automated segmentation of the inner ear using temporal CT scans, offering perspectives for application in clinical routine. The model weights, study datasets, and baseline model are freely accessible worldwide.
KEY POINTS: A general open-source deep learning model was trained for automated CT inner ear segmentation. The Dice similarity coefficient was 0.83, and a Likert score of 1 was attributed to 42% of automated segmentations. The influence of scanning protocols on model performance remains to be assessed.
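The stratified analysis described above (comparing DSC between normal and pathological subgroups) can be sketched as below. The per-scan DSC values and subgroup labels are hypothetical placeholders, not figures from the study.

```python
from statistics import mean

# Hypothetical per-scan results: (subgroup label, DSC against the
# manual reference standard). The study stratified its validation
# analysis into normal and pathological subgroups in this manner.
results = [
    ("normal", 0.88), ("normal", 0.85),
    ("pathological", 0.79), ("pathological", 0.80),
]

def mean_dsc_by_group(rows):
    """Group per-scan DSC values by subgroup label and average each group."""
    groups = {}
    for label, dsc in rows:
        groups.setdefault(label, []).append(dsc)
    return {label: mean(vals) for label, vals in groups.items()}

print(mean_dsc_by_group(results))
```

Reporting a single pooled DSC can mask exactly this kind of subgroup gap, which is why the study pairs the quantitative score with a qualitative Likert assessment.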
PMID:39266784 | DOI:10.1186/s41747-024-00508-3
Predicting multiple sclerosis disease progression and outcomes with machine learning and MRI-based biomarkers: a review
J Neurol. 2024 Sep 12. doi: 10.1007/s00415-024-12651-3. Online ahead of print.
ABSTRACT
Multiple sclerosis (MS) is a demyelinating neurological disorder with a highly heterogeneous clinical presentation and course of progression. Disease-modifying therapies are the only available treatment, as there is no known cure for the disease. Careful selection of suitable therapies is necessary, as they can be accompanied by serious risks and adverse effects such as infection. Magnetic resonance imaging (MRI) plays a central role in the diagnosis and management of MS, though MRI lesions have displayed only moderate associations with MS clinical outcomes, a phenomenon known as the clinico-radiological paradox. With the advent of machine learning (ML) in healthcare, the predictive power of MRI can be improved by leveraging both traditional and advanced ML algorithms capable of analyzing increasingly complex patterns within neuroimaging data. The purpose of this review was to examine the application of MRI-based ML for prediction of MS disease progression. Studies were divided into five main categories: predicting the conversion of clinically isolated syndrome to MS, cognitive outcome, EDSS-related disability, motor disability, and disease activity. The performance of ML models is discussed, along with the most influential MRI-derived biomarkers. Overall, MRI-based ML presents a promising avenue for MS prognosis; in particular, integration of imaging biomarkers with other multimodal patient data shows great potential for advancing personalized healthcare approaches in MS.
PMID:39266777 | DOI:10.1007/s00415-024-12651-3