Deep learning
A generative adversarial network-based accurate masked face recognition model using dual scale adaptive efficient attention network
Sci Rep. 2025 May 21;15(1):17594. doi: 10.1038/s41598-025-02144-2.
ABSTRACT
Masked face identification is necessary for authentication purposes. Face masks are frequently worn across a wide range of professions and sectors, including public safety, health care, schooling, catering services, production, sales, and shipping. To provide precise identification and verification in masked settings, masked facial recognition has emerged as a key innovation. Although facial recognition is a popular and affordable biometric security solution, it has difficulty correctly identifying people who are wearing masks, so a reliable method for identifying masked faces is required. In this work, a deep learning-assisted masked face identification framework is developed to accurately recognize a person's identity for security purposes. First, input images are aggregated from standard datasets; both the masked face images and the mask-free images are used to train a Generative Adversarial Network (GAN). The collected input images are then given to the GAN: if the input is a masked face image, the GAN generates a mask-free face image, which is treated as feature set 1; if the input is a mask-free image, the GAN generates a masked face image, which is treated as feature set 2. If the input contains both masked and mask-free images, it is passed directly to the Dual Scale Adaptive Efficient Attention Network (DS-AEAN); otherwise, the generated feature sets 1 and 2 are given to the DS-AEAN to recognize faces and verify the person's identity. The model's effectiveness is further maximized using the Enhanced Addax Optimization Algorithm (EAOA), making it useful for precise biometric verification. The designed masked face recognition model is evaluated against existing models to check its capability.
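The conditional routing in the abstract above can be sketched as follows. This shows only the control flow for single images; `gan_unmask`, `gan_mask`, and `ds_aean` are hypothetical stand-ins for the paper's trained networks, and the mixed masked/mask-free batch case (which bypasses the GAN) is omitted.

```python
def route_to_recognizer(image, is_masked, gan_unmask, gan_mask, ds_aean):
    """Route an input image through the GAN stage described in the abstract.

    Masked inputs are converted to mask-free images (feature set 1);
    mask-free inputs are converted to masked images (feature set 2);
    the chosen feature set is then passed to the DS-AEAN recognizer.
    """
    if is_masked:
        feature_set = gan_unmask(image)   # feature set 1: generated mask-free face
    else:
        feature_set = gan_mask(image)     # feature set 2: generated masked face
    return ds_aean(feature_set)

# Toy stand-ins that exercise the routing logic only.
identity = route_to_recognizer(
    image="masked_face.png",
    is_masked=True,
    gan_unmask=lambda img: ("unmasked", img),
    gan_mask=lambda img: ("masked", img),
    ds_aean=lambda feats: feats[0],
)
```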
PMID:40399389 | DOI:10.1038/s41598-025-02144-2
An automated deep learning framework for brain tumor classification using MRI imagery
Sci Rep. 2025 May 21;15(1):17593. doi: 10.1038/s41598-025-02209-2.
ABSTRACT
The precise and timely diagnosis of brain tumors is essential for accelerating patient recovery and preserving lives. Brain tumors exhibit a variety of sizes, shapes, and visual characteristics, requiring individualized treatment strategies for each patient. Radiologists require considerable proficiency to manually detect brain malignancies. However, tumor recognition remains inefficient, imprecise, and labor-intensive in manual procedures, underscoring the need for automated methods. This study introduces an effective approach for identifying brain lesions in magnetic resonance imaging (MRI) images, minimizing dependence on manual intervention. The proposed method improves image clarity by combining guided filtering techniques with anisotropic Gaussian side windows (AGSW). A morphological analysis is conducted prior to segmentation to exclude non-tumor regions from the enhanced MRI images. Deep neural networks segment the images, extracting high-quality regions of interest (ROIs) and multiscale features. Identifying salient elements is essential and is accomplished through an attention module that isolates distinctive features while eliminating irrelevant information. An ensemble model is employed to classify brain tumors into different categories. The proposed technique achieves an overall accuracy of 99.94% and 99.67% on the publicly available brain tumor datasets BraTS2020 and Figshare, respectively. Furthermore, it surpasses existing technologies in terms of automation and robustness, thereby enhancing the entire diagnostic process.
PMID:40399378 | DOI:10.1038/s41598-025-02209-2
Phase recognition in manual Small-Incision cataract surgery with MS-TCN++ on the novel SICS-105 dataset
Sci Rep. 2025 May 21;15(1):16886. doi: 10.1038/s41598-025-00303-z.
ABSTRACT
Manual Small-Incision Cataract Surgery (SICS) is a prevalent technique in low- and middle-income countries (LMICs) but is understudied with respect to computer-assisted surgery. This prospective cross-sectional study introduces the first SICS video dataset, evaluates the effectiveness of phase recognition through deep learning (DL) using the MS-TCN++ architecture, and compares its results with the well-studied phacoemulsification procedure using the public Cataract-101 dataset. Our novel SICS-105 dataset involved 105 patients recruited at Sankara Eye Hospital in India. Performance is evaluated with frame-wise accuracy, edit distance, F1-score, precision-recall AUC, sensitivity, and specificity. The MS-TCN++ architecture performs better on the Cataract-101 dataset, with an accuracy of 89.97% [CI 86.69-93.46%] compared to 85.56% [80.63-92.09%] on the SICS-105 dataset (ROC AUC 99.10% [98.34-99.51%] vs. 98.22% [97.16-99.26%]). The accuracy distributions and confidence intervals overlap, and the ROC AUC values range from 46.20% to 94.18%. Even though DL is found to be effective for phase recognition in SICS, the larger number of phases and longer duration make it more challenging than phacoemulsification. To support further developments, we make our dataset open access. This research marks a crucial step towards improving postoperative analysis and training for SICS.
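The frame-wise accuracy and edit-distance metrics named above can be sketched as follows. This is a minimal illustration using the common segmental formulation (collapse frame labels into phase segments, then apply Levenshtein distance), not the authors' evaluation code.

```python
def frame_accuracy(pred, gt):
    """Fraction of frames whose predicted phase matches the ground truth."""
    assert len(pred) == len(gt)
    return sum(p == g for p, g in zip(pred, gt)) / len(gt)

def segments(labels):
    """Collapse a frame-wise label sequence into its ordered phase segments."""
    segs = []
    for lab in labels:
        if not segs or segs[-1] != lab:
            segs.append(lab)
    return segs

def levenshtein(a, b):
    """Standard edit distance between two segment sequences (one-row DP)."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (ca != cb))
    return dp[len(b)]
```

For example, a prediction that mislabels two frames but recovers the same phase order gets an edit distance of 0 despite imperfect frame accuracy.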
PMID:40399321 | DOI:10.1038/s41598-025-00303-z
Improving image quality and diagnostic performance using deep learning image reconstruction in 100-kVp CT enterography for patients with wide-range body mass index
Eur J Radiol. 2025 May 14;189:112167. doi: 10.1016/j.ejrad.2025.112167. Online ahead of print.
ABSTRACT
OBJECTIVE: To assess the clinical value of the deep learning image reconstruction (DLIR) algorithm compared with conventional adaptive statistical iterative reconstruction-Veo (ASiR-V) in image quality, diagnostic confidence, and intestinal lesion detection in 100-kVp CT enterography (CTE) for patients with wide-range body mass index (BMI).
METHODS: A total of 84 patients who underwent 100-kVp dual-phase CTE were included. Images were reconstructed using filtered back projection (FBP), ASiR-V 30%, ASiR-V 60%, and DLIR at low, medium, and high levels (DLIR-L, DLIR-M, and DLIR-H). The CT value, standard deviation (SD), signal-to-noise ratio (SNR), and contrast-to-noise ratio (CNR) of the small and large intestines were compared using repeated-measures analysis of variance with the Bonferroni correction or the Friedman test. The correlation between the relative CNR increment and BMI was analyzed using Pearson's correlation coefficient. Overall image quality and diagnostic confidence scores were evaluated. Additionally, detection of intestinal lesions was conducted by three readers with different experience levels and compared between DLIR-M and ASiR-V 60% images using McNemar's test.
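The SNR and CNR figures of merit used above are typically computed from ROI statistics; a minimal sketch under the standard definitions (the exact ROI placement and noise definition are the authors'):

```python
import statistics

def roi_stats(pixels):
    """Mean and population standard deviation of attenuation values in a ROI."""
    return statistics.mean(pixels), statistics.pstdev(pixels)

def snr(roi_pixels):
    """Signal-to-noise ratio: ROI mean over ROI noise (SD)."""
    mean, sd = roi_stats(roi_pixels)
    return mean / sd

def cnr(roi_pixels, background_pixels):
    """Contrast-to-noise ratio: ROI/background contrast over background noise."""
    roi_mean, _ = roi_stats(roi_pixels)
    bg_mean, bg_sd = roi_stats(background_pixels)
    return abs(roi_mean - bg_mean) / bg_sd
```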
RESULTS: SD decreased sequentially from FBP, ASiR-V 30%, DLIR-L, ASiR-V 60%, DLIR-M, to DLIR-H, which corresponded with improvements in CNR and SNR (all p < 0.001). The relative CNR increment of DLIR exhibited a significantly positive linear correlation with BMI (r = 0.307-0.506, all p ≤ 0.005). For overall image quality scores, the ranking was: FBP < ASiR-V 30% < ASiR-V 60% ≈ DLIR-L < DLIR-M ≈ DLIR-H. DLIR-M outperformed ASiR-V 60% in diagnostic confidence (p ≤ 0.018 for all three readers). In lesion detection, for the two junior readers, DLIR-M exhibited higher sensitivity for inflammatory lesions compared to ASiR-V 60% (0.700 (95% confidence interval [95% CI]: 0.354-0.919) vs. 0.300 (95% CI: 0.081-0.646) for reader 1 and 0.700 (95% CI: 0.354-0.919) vs. 0.500 (95% CI: 0.201-0.799) for reader 2), though no statistical significance was reached.
CONCLUSION: DLIR effectively reduces noise and improves image quality in 100-kVp dual-phase CTE across a wide range of BMIs. DLIR-M exhibits superior performance in image quality and diagnostic confidence, and also shows potential value in improving intestinal inflammatory lesion detection by junior readers and in supporting clinical decision-making, which needs further investigation.
PMID:40398003 | DOI:10.1016/j.ejrad.2025.112167
Prediction of Spontaneous Breathing Trial Outcome in Critically Ill-Ventilated Patients Using Deep Learning: Development and Verification Study
JMIR Med Inform. 2025 May 21;13:e64592. doi: 10.2196/64592.
ABSTRACT
BACKGROUND: Long-term ventilator-dependent patients often face problems such as decreased quality of life, increased mortality, and increased medical costs. Respiratory therapists must perform complex and time-consuming ventilator weaning assessments, which typically take 48-72 hours. Traditional weaning assessment methods rely on manual evaluation and are susceptible to subjectivity, human error, and low efficiency.
OBJECTIVE: This study aims to develop an artificial intelligence-based prediction model to predict whether a patient can successfully pass a spontaneous breathing trial (SBT) using the patient's clinical data collected before SBT initiation. Instead of comparing different SBT strategies or analyzing their impact on extubation success, this study focused on establishing a data-driven approach under a fixed SBT strategy to provide an objective and efficient assessment tool. Through this model, we aim to enhance the accuracy and efficiency of ventilator weaning assessments, reduce unnecessary SBT attempts, optimize intensive care unit resource usage, and ultimately improve the quality of care for ventilator-dependent patients.
METHODS: This retrospective cohort study developed a novel deep learning architecture, the hybrid CNN-MLP (convolutional neural network-multilayer perceptron). Unlike the traditional CNN-MLP classification method, the hybrid CNN-MLP performs feature learning and fusion by interleaving CNN and MLP layers so that data features can be extracted and integrated at different levels, thereby improving the flexibility and prediction accuracy of the model. The study participants were patients aged 20 years or older hospitalized in the intensive care unit of a medical center in central Taiwan between January 1, 2016, and December 31, 2022. A total of 3686 patients were included in the study, and 6536 pre-SBT clinical records were collected before each SBT, of which 3268 passed the SBT and 3268 failed.
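The interleaving idea can be sketched as follows: rather than stacking all convolutional layers before all dense layers, conv and dense stages alternate. The toy layer sizes and weights below are illustrative, not the paper's architecture.

```python
def conv1d(x, kernel):
    """Valid-mode 1-D convolution (a stand-in for a CNN layer)."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) for i in range(len(x) - k + 1)]

def dense(x, weights, bias):
    """Fully connected layer (a stand-in for an MLP layer) with ReLU."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [max(0.0, v) for v in out]

def hybrid_forward(x):
    """Alternate conv (feature extraction) and dense (feature fusion) stages."""
    h = conv1d(x, kernel=[0.5, 0.5])                     # conv stage 1
    h = dense(h, weights=[[1.0] * len(h)], bias=[0.0])   # MLP stage 1: fuse
    h = conv1d(h, kernel=[1.0])                          # conv stage 2 (trivial here)
    return dense(h, weights=[[0.5]], bias=[0.0])         # MLP head
```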
RESULTS: The model performed well in predicting SBT outcomes. The training dataset's precision is 99.3% (2443/2460 records), recall is 93.5% (2443/2614 records), specificity is 99.3% (2597/2614 records), and F1-score is 0.963. In the test dataset, the model maintains accuracy with a precision of 89.2% (561/629 records), a recall of 85.8% (561/654 records), a specificity of 89.6% (586/654 records), and an F1-score of 0.875. These results confirm the reliability of the model and its potential for clinical application.
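The reported training-set metrics follow from the stated record counts under the standard confusion-matrix definitions; a small consistency check:

```python
def metrics(tp, fp, fn, tn):
    """Precision, recall, specificity, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, specificity, f1

# Training set: 2443 true positives out of 2460 positive calls and
# 2614 actual positives; 2597 true negatives out of 2614 actual negatives.
p, r, s, f1 = metrics(tp=2443, fp=2460 - 2443, fn=2614 - 2443, tn=2597)
```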
CONCLUSIONS: This study successfully developed a deep learning-based SBT prediction model that can be used as an objective and efficient ventilator weaning assessment tool. The model's performance shows that it can be integrated into clinical workflow, improve the quality of patient care, and reduce ventilator dependence, which is an important step in improving the effectiveness of respiratory therapy.
PMID:40397953 | DOI:10.2196/64592
Identifying Disinformation on the Extended Impacts of COVID-19: Methodological Investigation Using a Fuzzy Ranking Ensemble of Natural Language Processing Models
J Med Internet Res. 2025 May 21;27:e73601. doi: 10.2196/73601.
ABSTRACT
BACKGROUND: During the COVID-19 pandemic, the continuous spread of misinformation on the internet posed an ongoing threat to public trust and understanding of epidemic prevention policies. Although the pandemic is now under control, information regarding the risks of long-term COVID-19 effects and reinfection still needs to be integrated into COVID-19 policies.
OBJECTIVE: This study aims to develop a robust and generalizable deep learning framework for detecting misinformation related to the prolonged impacts of COVID-19 by integrating pretrained language models (PLMs) with an innovative fuzzy rank-based ensemble approach.
METHODS: A comprehensive dataset comprising 566 genuine and 2361 fake samples was curated from reliable open sources and processed using advanced techniques. The dataset was randomly split using the scikit-learn package to facilitate both training and evaluation. Deep learning models were trained for 20 epochs on a Tesla T4 (for hierarchical attention networks, HANs) and an RTX A5000 (for the other models). To enhance performance, we implemented an ensemble learning strategy that incorporated a reparameterized Gompertz function, which assigned fuzzy ranks based on each model's prediction confidence for each test case. This method effectively fused outputs from state-of-the-art PLMs such as robustly optimized bidirectional encoder representations from transformers pretraining approach (RoBERTa), decoding-enhanced bidirectional encoder representations from transformers with disentangled attention (DeBERTa), and XLNet.
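The fuzzy rank-based fusion can be sketched as follows: each model's per-class confidence is mapped through a Gompertz curve to a fuzzy rank (lower is better), ranks are summed across models, and the class with the smallest fused rank wins. The constants below are illustrative; the paper's exact reparameterization is not reproduced here.

```python
import math

def gompertz_rank(confidence, a=1.0, b=1.0, c=2.0):
    """Map a softmax confidence in [0, 1] to a fuzzy rank via a Gompertz curve.

    Higher confidence yields a rank closer to 0 (better). The constants
    a, b, c are illustrative, not the paper's reparameterization.
    """
    return 1.0 - a * math.exp(-b * math.exp(-c * confidence))

def fuse(model_confidences):
    """Fuse per-class confidences from several models by summing fuzzy ranks;
    the class with the smallest fused rank is predicted."""
    n_classes = len(model_confidences[0])
    fused = [sum(gompertz_rank(conf[k]) for conf in model_confidences)
             for k in range(n_classes)]
    return min(range(n_classes), key=fused.__getitem__)
```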
RESULTS: After training on the dataset, various classification methods were evaluated on the test set, including the fuzzy rank-based method and state-of-the-art large language models. Experimental results reveal that language models, particularly XLNet, outperform traditional approaches that combine term frequency-inverse document frequency features with a support vector machine or utilize deep models like HAN. The evaluation metrics, including accuracy, precision, recall, F1-score, and area under the curve (AUC), indicated a clear performance advantage for models with a larger number of parameters. However, this study also highlights that model architecture, training procedures, and optimization techniques are critical determinants of classification effectiveness. XLNet's permutation language modeling approach enhances bidirectional context understanding, allowing it to surpass even larger models in the bidirectional encoder representations from transformers (BERT) series despite having relatively fewer parameters. Notably, the fuzzy rank-based ensemble method, which combines multiple language models, achieved impressive results on the test set, with an accuracy of 93.52%, a precision of 94.65%, an F1-score of 96.03%, and an AUC of 97.15%.
CONCLUSIONS: The fusion of ensemble learning with PLMs and the Gompertz function, employing fuzzy rank-based methodology, introduces a novel prediction approach with prospects for enhancing accuracy and reliability. Additionally, the experimental results imply that training solely on textual content can yield high prediction accuracy, thereby providing valuable insights into the optimization of fake news detection systems. These findings not only aid in detecting misinformation but also have broader implications for the application of advanced deep learning techniques in public health policy and communication.
PMID:40397945 | DOI:10.2196/73601
Effects of neighborhood streetscape on the single-family housing price: Focusing on nonlinear and interaction effects using interpretable machine learning
PLoS One. 2025 May 21;20(5):e0323495. doi: 10.1371/journal.pone.0323495. eCollection 2025.
ABSTRACT
Previous studies using the conventional hedonic price model to predict housing prices may be limited in addressing the relationship between house prices and streetscapes as visually perceived at the human eye level, owing to the constraints of streetscape estimation. In this study, we therefore analyzed the relationship between eye-level streetscapes and single-family home prices in Seoul, Korea, using computer vision technology and machine learning algorithms. We used transaction data for 13,776 single-family housing sales between 2017 and 2019. To measure visually perceived streetscapes, this study applied the DeepLab V3+ deep-learning model to 233,106 Google Street View panoramic images. The best machine-learning model was then selected by comparing the explanatory power of the hedonic price model and all alternative machine-learning models. According to the results, the Gradient Boosting model, a representative ensemble machine learning model, performed better than the XGBoost, Random Forest, and Linear Regression models in predicting single-family house prices. In addition, this study used the SHAP method for interpretable machine learning to identify the key features that affect single-family home price prediction, addressing the "black box" problem of machine learning models. Finally, by analyzing the nonlinear relationships and interaction effects between perceived streetscape characteristics and house prices, we identified relationships between variables that the hedonic price model only partially considers.
PMID:40397916 | DOI:10.1371/journal.pone.0323495
Enhanced intelligent train operation algorithms for metro train based on expert system and deep reinforcement learning
PLoS One. 2025 May 21;20(5):e0323478. doi: 10.1371/journal.pone.0323478. eCollection 2025.
ABSTRACT
In recent decades, automatic train operation (ATO) systems have been gradually adopted by many metro systems, primarily due to their cost-effectiveness and practicality. However, a critical examination reveals limitations in computational efficiency, adaptability to unforeseen conditions, and multi-objective balancing that our research aims to address. In this paper, expert knowledge is combined with a deep reinforcement learning algorithm (Proximal Policy Optimization, PPO), and two enhanced intelligent train operation (EITO) algorithms are proposed. The first, EITOE, is based on an expert system containing expert rules and a heuristic expert inference method. Building on EITOE, we propose the EITOP algorithm, which uses PPO to optimize multiple objectives by designing reinforcement learning strategies, rewards, and value functions. We also develop the double minimal-time distribution (DMTD) calculation method in the EITO implementation to achieve longer coasting distances and further reduce energy consumption. Compared with previous works, EITO enables continuous train operation control without reference to offline speed profiles and optimizes several key performance indicators online. Finally, we conducted comparative tests of manual driving, existing intelligent driving algorithms (ITOR, STON), and the proposed EITO algorithms using real line data from the Yizhuang Line of the Beijing Metro (YLBS). The test results show that the EITO algorithms outperform both the existing intelligent driving algorithms and manual driving in terms of energy consumption and passenger comfort. We further validated the robustness of EITO on complex YLBS sections with speed limits, gradients, and different running times. Overall, the EITOP algorithm performs best.
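The PPO component in EITOP optimizes the standard clipped surrogate objective; a minimal single-sample sketch of that objective (standard PPO, not the authors' full reward and value-function design):

```python
import math

def ppo_clip_objective(logp_new, logp_old, advantage, eps=0.2):
    """Clipped PPO surrogate for one (state, action) sample.

    ratio = pi_new(a|s) / pi_old(a|s); the clip keeps policy updates small.
    """
    ratio = math.exp(logp_new - logp_old)
    clipped = max(min(ratio, 1 + eps), 1 - eps)
    return min(ratio * advantage, clipped * advantage)
```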
PMID:40397887 | DOI:10.1371/journal.pone.0323478
Emotional engagement and perceived empathy in live vs. automated psychological interviews
PLoS One. 2025 May 21;20(5):e0323490. doi: 10.1371/journal.pone.0323490. eCollection 2025.
ABSTRACT
In clinical in-person conditions, social presence, perceived empathy, and emotional engagement are related to positive outcomes. In online settings, it is unclear how these factors affect outcomes. Here, in 10-15-minute interviews, we investigated the influence of automation. Participants (N = 75) engaged in one of three possible interviews: live semi-scripted, live scripted, or video scripted. In the first two, participants communicated with a live interviewer; in the third, with pre-recorded interviewer questions and answers. Emotion recognition software revealed that expressed joy differed between conditions (χ2(2) = 18.08, p < .001); both live conditions had higher scores than the video-scripted condition. Self-rated perceived interviewer empathy also differed between conditions in the same way (F[2, 72] = 9.445, p < .001). We found a positive correlation between perceived empathy and expressed joy (r = .35; p < .01). In sum, automated interviews differed from live interviews in perceived empathy and expressed emotion.
PMID:40397863 | DOI:10.1371/journal.pone.0323490
Bone Age Estimation of Chinese Han Adolescents' and Children's Elbow Joint X-rays Based on Multiple Deep Convolutional Neural Network Models
Fa Yi Xue Za Zhi. 2025 Feb 25;41(1):48-58. doi: 10.12116/j.issn.1004-5619.2024.241202.
ABSTRACT
OBJECTIVES: To explore a deep learning-based automatic bone age estimation model for elbow joint X-ray images of Chinese Han adolescents and children and evaluate its performance.
METHODS: A total of 943 (517 males and 426 females) elbow joint frontal view X-ray images of Chinese Han adolescents and children aged 6.00 to <16.00 years were collected from East, South, Central and Northwest China. Three experimental schemes were adopted for bone age estimation. Scheme 1: Directly input preprocessed images into the regression model; Scheme 2: Train a segmentation network using "key elbow joint bone annotations" as labels, then input segmented images into the regression model; Scheme 3: Train a segmentation network using "full elbow joint bone annotations" as labels, then input segmented images into the regression model. For segmentation, the optimal model was selected from U-Net, UNet++ and TransUNet. For regression, VGG16, VGG19, InceptionV2, InceptionV3, ResNet34, ResNet50, ResNet101 and DenseNet121 models were selected for bone age estimation. The dataset was randomly split into 80% (754 samples) for training and validation for model fitting and hyperparameter tuning, and 20% (189 samples) as an internal test set to test the performance of the trained model. An additional 104 elbow joint X-ray images from the same demographic and age group were collected and used as an external test set. Model performance was evaluated by comparing the mean absolute error (MAE), root mean square error (RMSE), accuracies within ±0.7 years (P±0.7 years) and ±1.0 years (P±1.0 years) between the estimated age and the actual age, and by drawing radar charts, scatter plots, and heatmaps.
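The evaluation metrics named above (MAE, RMSE, and accuracy within ±0.7 and ±1.0 years) can be sketched under their standard definitions; this is an illustration, not the study's code.

```python
import math

def bone_age_metrics(estimated, actual, bands=(0.7, 1.0)):
    """MAE, RMSE, and the fraction of estimates within each +/- band (years)."""
    errors = [e - a for e, a in zip(estimated, actual)]
    n = len(errors)
    mae = sum(abs(x) for x in errors) / n
    rmse = math.sqrt(sum(x * x for x in errors) / n)
    within = {b: sum(abs(x) <= b for x in errors) / n for b in bands}
    return mae, rmse, within
```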
RESULTS: When segmentation was performed with Scheme 3, the UNet++ model achieved good segmentation performance, with a segmentation loss of 0.0004 and an accuracy of 93.8% at a learning rate of 0.0001. In the internal test set, the DenseNet121 model with Scheme 3 yielded the best results, with MAE, P±0.7 years, and P±1.0 years of 0.83 years, 70.03%, and 84.30%, respectively. In the external test set, the DenseNet121 model with Scheme 3 also performed best, with an average MAE of 0.89 years and an average RMSE of 1.00 years.
CONCLUSIONS: When performing automatic bone age estimation using elbow joint X-ray images in Chinese Han adolescents and children, it is recommended to use the UNet++ model for segmentation. The DenseNet121 model with Scheme 3 achieves optimal performance. Using segmentation networks, especially one trained with annotation areas encompassing the full elbow joint (distal humerus, proximal radius, and proximal ulna), can improve the accuracy of bone age estimation based on elbow joint X-ray images.
PMID:40397588 | DOI:10.12116/j.issn.1004-5619.2024.241202
Lung MRI: Indications, Capabilities, and Techniques - AJR Expert Panel Narrative Review
AJR Am J Roentgenol. 2025 May 21. doi: 10.2214/AJR.25.32637. Online ahead of print.
ABSTRACT
Lung MRI provides both structural and functional information across a spectrum of parenchymal and airway pathologies. MRI, using current widely available conventional sequences, provides high-quality diagnostic images that allow tissue characterization and delineation of lung lesions; dynamic evaluation of expiratory central airway collapse, diaphragmatic or chest wall motion, and the relations of lung masses to the chest wall; oncologic staging; surveillance of chronic lung pathologies; and differentiation of inflammation and fibrosis in interstitial lung disease. Ongoing technologic advances, including deep-learning acceleration methods, may enable future applications in longitudinal lung cancer screening without ionizing radiation exposure and in the regional quantification of ventilation and perfusion without hyperpolarized gas or IV contrast media. Although society statements highlight appropriate indications for lung MRI, and the modality has performed favorably relative to CT or FDG PET/CT for various indications, the examination's clinical utilization remains extremely low. Ongoing barriers to adoption include limited awareness by referring physicians, as well as insufficient proficiency and experience among radiologists and technologists. In this AJR Expert Panel Narrative Review, we review clinical indications for lung MRI, describe the examination's current capabilities, provide guidance on protocols comprising widely available pulse sequences, introduce emerging techniques, and issue consensus recommendations.
PMID:40397559 | DOI:10.2214/AJR.25.32637
DeepCCDS: Interpretable Deep Learning Framework for Predicting Cancer Cell Drug Sensitivity through Characterizing Cancer Driver Signals
Adv Sci (Weinh). 2025 May 21:e2416958. doi: 10.1002/advs.202416958. Online ahead of print.
ABSTRACT
Accurate characterization of cellular states is the foundation for precise prediction of drug sensitivity in cancer cell lines, which in turn is fundamental to realizing precision oncology. However, current deep learning approaches have limitations in characterizing cellular states: they rely solely on isolated genetic markers, overlooking the complex regulatory networks and cellular mechanisms that underlie drug responses. To address this limitation, this work proposes DeepCCDS, a Deep learning framework for Cancer Cell Drug Sensitivity prediction through Characterizing Cancer Driver Signals. DeepCCDS incorporates a prior knowledge network to characterize cancer driver signals, building upon a self-supervised neural network framework. These signals reflect key mechanisms influencing cancer cell development and drug response, enhancing the model's predictive performance and interpretability. DeepCCDS has demonstrated superior performance in predicting drug sensitivity compared to previous state-of-the-art approaches across multiple datasets. Benefiting from the integration of prior knowledge, DeepCCDS exhibits powerful feature representation capabilities and interpretability. Based on these representations, we have identified embedding features that could potentially be used for drug screening in new indications. Further, this work demonstrates the applicability of DeepCCDS on solid tumor samples from The Cancer Genome Atlas. We believe integrating DeepCCDS into clinical decision-making processes can potentially improve the selection of personalized treatment strategies for cancer patients.
PMID:40397390 | DOI:10.1002/advs.202416958
Discovery of novel potential 11β-HSD1 inhibitors through combining deep learning, molecular modeling, and bio-evaluation
Mol Divers. 2025 May 21. doi: 10.1007/s11030-025-11171-0. Online ahead of print.
ABSTRACT
11β-Hydroxysteroid dehydrogenase type 1 (11β-HSD1) has been shown to play an important role in the treatment of impaired glucose tolerance, insulin resistance, dyslipidemia, and obesity, and is a promising drug target. In this study, we built a gated recurrent unit (GRU)-based recurrent neural network using 1,854,484 processed drug-like molecules from ChEMBL and the US patent database. By applying transfer learning with known 11β-HSD1 inhibitors, we built a molecular generative model of 11β-HSD1 inhibitors: the constructed GRU model accurately captured drug-like molecular syntax when evaluated with traditional machine-learning metrics, and transfer learning readily generated potential 11β-HSD1 inhibitors. By combining Lipinski's rule and absorption, distribution, metabolism, excretion, and toxicity (ADME/T) analyses to filter nonconforming molecules, followed by stepwise screening through molecular docking and molecular dynamics simulation, we finally obtained five potential compounds. We found that compound 02 is identical to a previously published inhibitor of 11β-HSD1. We selected compounds 02 and 05, which had the lowest binding free energies, for in vitro activity validation and found that compound 02 possessed inhibitory activity, though it was not as potent as the control. In conclusion, our study provides new ideas and methods for the development of new drugs and the discovery of new 11β-HSD1 inhibitors.
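The GRU recurrence at the heart of such a generative model can be sketched for the scalar case using the standard GRU gate equations; the actual model operates on learned token embeddings of molecular strings, and the weight names below are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, w):
    """One scalar GRU step. w holds input/recurrent weights and biases for
    the update (z), reset (r), and candidate (n) gates."""
    z = sigmoid(w["wz"] * x + w["uz"] * h + w["bz"])            # update gate
    r = sigmoid(w["wr"] * x + w["ur"] * h + w["br"])            # reset gate
    n = math.tanh(w["wn"] * x + w["un"] * (r * h) + w["bn"])    # candidate state
    return (1.0 - z) * n + z * h                                 # new hidden state
```

With all weights zero, z = 0.5 and n = 0, so the hidden state simply halves at each step; this makes the gating behavior easy to verify.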
PMID:40397334 | DOI:10.1007/s11030-025-11171-0
Mammography-based artificial intelligence for breast cancer detection, diagnosis, and BI-RADS categorization using multi-view and multi-level convolutional neural networks
Insights Imaging. 2025 May 21;16(1):109. doi: 10.1186/s13244-025-01983-x.
ABSTRACT
PURPOSE: We developed an artificial intelligence system (AIS) using multi-view multi-level convolutional neural networks for breast cancer detection, diagnosis, and BI-RADS categorization support in mammography.
METHODS: A total of 24,866 breasts from 12,433 Asian women examined between August 2012 and December 2018 were enrolled. The study consisted of three parts: (1) evaluation of AIS performance in malignancy diagnosis; (2) stratified analysis of BI-RADS 3-4 subgroups with AIS; and (3) reassessment of BI-RADS 0 breasts with AIS assistance. We further evaluated the AIS in a counterbalance-designed AI-assisted study, in which ten radiologists read 1302 cases with and without AIS assistance. The area under the receiver operating characteristic curve (AUC), sensitivity, specificity, accuracy, and F1 score were measured.
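The AUC reported here corresponds to the Mann-Whitney statistic: the probability that a randomly chosen malignant case scores above a randomly chosen benign one. A minimal sketch of that computation (illustrative, not the study's evaluation code):

```python
def auc(scores_pos, scores_neg):
    """ROC AUC via the Mann-Whitney formulation, counting ties as half."""
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))
```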
RESULTS: The AIS yielded AUC values of 0.995, 0.933, and 0.947 for malignancy diagnosis in the validation set, testing set 1, and testing set 2, respectively. Within BI-RADS 3-4 subgroups with pathological results, AIS downgraded 83.1% of false-positives into benign groups, and upgraded 54.1% of false-negatives into malignant groups. AIS also successfully assisted radiologists in identifying 7 out of 43 malignancies initially diagnosed with BI-RADS 0, with a specificity of 96.7%. In the counterbalance-designed AI-assisted study, the average AUC across ten readers significantly improved with AIS assistance (p = 0.001).
CONCLUSION: AIS can accurately detect and diagnose breast cancer on mammography and further serve as a supportive tool for BI-RADS categorization.
CRITICAL RELEVANCE STATEMENT: An AI risk assessment tool employing deep learning algorithms was developed and validated for enhancing breast cancer diagnosis from mammograms, to improve risk stratification accuracy, particularly in patients with dense breasts, and serve as a decision support aid for radiologists.
KEY POINTS: The false positive and negative rates of mammography diagnosis remain high. The AIS can yield a high AUC for malignancy diagnosis. The AIS is important in stratifying BI-RADS categorization.
PMID:40397242 | DOI:10.1186/s13244-025-01983-x
Can machine learning be a reliable tool for predicting hematoma progression following traumatic brain injury? A systematic review and meta-analysis
Neuroradiology. 2025 May 21. doi: 10.1007/s00234-025-03657-3. Online ahead of print.
ABSTRACT
BACKGROUND: Predicting hematoma progression in traumatic brain injury (TBI) is crucial for timely interventions and effective clinical management, as unchecked hematoma growth can lead to rapid neurological deterioration, increased intracranial pressure, and poor patient outcomes. Accurate risk assessment enables proactive therapeutic strategies, minimizing secondary brain damage and improving survival rates.
METHODS: This study aimed to assess the performance of artificial intelligence (AI) algorithms, including machine learning (ML) and deep learning (DL), in forecasting the risk of hematoma progression. Comprehensive searches across Embase, Scopus, Web of Science, and PubMed identified relevant studies, with data extracted on algorithm metrics such as sensitivity, specificity, and area under the curve (AUC).
RESULTS: Of 1,240 studies screened, five met the inclusion criteria, evaluating various AI models. The meta-analysis revealed a pooled sensitivity of 0.76 [95% CI: 0.67-0.83] and specificity of 0.84 [95% CI: 0.78-0.89], positive and negative likelihood ratios of 4.82 [95% CI: 3.51-6.61] and 0.29 [95% CI: 0.21-0.39], a diagnostic score of 2.82 [95% CI: 2.33-3.32], a diagnostic odds ratio of 16.85 [95% CI: 10.29-27.59], and an AUC of 0.88 [95% CI: 0.85-0.90]. Among the evaluated algorithms, XGBoost had the best predictive performance with an accuracy of 91%. Integrating radiomics and clinical features in these models considerably improved the predictive outcomes.
CONCLUSION: The current results demonstrated the potential of AI-based models to improve hematoma progression prediction for TBI patients, thereby supporting more effective clinical decision-making. Further research should aim to standardize datasets and diversify patient populations to improve model applicability and reliability.
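The likelihood ratios, diagnostic odds ratio, and diagnostic score reported in the meta-analysis all derive from the pooled 2x2 confusion table; a sketch of those relationships (the cell counts below are illustrative, chosen to roughly match the pooled sensitivity and specificity, and the diagnostic score is taken as ln(DOR), consistent with the abstract's figures since ln 16.85 ≈ 2.82):

```python
import math

def diagnostic_summary(tp, fp, fn, tn):
    """Diagnostic-accuracy metrics from a 2x2 confusion table."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    lr_pos = sens / (1 - spec)   # positive likelihood ratio
    lr_neg = (1 - sens) / spec   # negative likelihood ratio
    dor = lr_pos / lr_neg        # diagnostic odds ratio, equals (tp*tn)/(fp*fn)
    score = math.log(dor)        # diagnostic score = ln(DOR)
    return sens, spec, lr_pos, lr_neg, dor, score
```

With `diagnostic_summary(76, 16, 24, 84)` (sensitivity 0.76, specificity 0.84), the DOR is (76 * 84) / (16 * 24) = 16.625 and the diagnostic score ln 16.625 ≈ 2.81, close to the pooled values above.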
PMID:40397134 | DOI:10.1007/s00234-025-03657-3
Comparison of Deep Learning-Based Auto-Segmentation Results on Daily Kilovoltage, Megavoltage, and Cone Beam CT Images in Image-Guided Radiotherapy
Technol Cancer Res Treat. 2025 Jan-Dec;24:15330338251344198. doi: 10.1177/15330338251344198. Epub 2025 May 21.
ABSTRACT
Introduction: This study aims to evaluate auto-segmentation results using deep learning-based auto-segmentation models on different online CT imaging modalities in image-guided radiotherapy.
Methods: Phantom studies were first performed to benchmark image quality. Daily CT images for sixty patients were retrospectively retrieved from fan-beam kilovoltage CT (kVCT), kV cone-beam CT (kV-CBCT), and megavoltage CT (MVCT) scans. For each imaging modality, half of the patients received CT scans in the pelvic region, while the other half received scans in the thoracic region. Deep learning auto-segmentation models using a convolutional neural network algorithm were used to generate organs-at-risk contours. Quantitative metrics were calculated to compare auto-segmentation results with manual contours.
Results: The auto-segmentation contours on kVCT images showed statistically significant differences in Dice similarity coefficient (DSC), Jaccard similarity coefficient, sensitivity index, inclusiveness index, and the 95th percentile Hausdorff distance, compared to those on kV-CBCT and MVCT images for most major organs. In the pelvic region, the largest difference in DSC was observed for the bowel volume, with an average DSC of 0.84 ± 0.05, 0.35 ± 0.23, and 0.48 ± 0.27 for kVCT, kV-CBCT, and MVCT images, respectively (p-value < 0.05); in the thoracic region, the largest difference in DSC was found for the esophagus, with an average DSC of 0.63 ± 0.16, 0.18 ± 0.13, and 0.22 ± 0.08 for kVCT, kV-CBCT, and MVCT images, respectively (p-value < 0.05).
Conclusion: Deep learning-based auto-segmentation models showed better agreement with manual contouring when using kVCT images compared to kV-CBCT or MVCT images. However, manual correction remains necessary after auto-segmentation with all imaging modalities, particularly for organs with limited contrast from surrounding tissues. These findings underscore the potential and limits of applying deep learning-based auto-segmentation models for adaptive radiotherapy.
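The Dice and Jaccard similarity coefficients used to compare auto-segmented and manual contours are simple overlap ratios between binary masks; a minimal sketch over flattened 0/1 masks (names and toy data are illustrative):

```python
def dice_jaccard(mask_a, mask_b):
    """Overlap metrics between two binary masks given as flat 0/1 sequences.
    Dice = 2|A∩B| / (|A| + |B|); Jaccard = |A∩B| / |A∪B|."""
    inter = sum(a and b for a, b in zip(mask_a, mask_b))
    size_a = sum(mask_a)
    size_b = sum(mask_b)
    union = size_a + size_b - inter
    return 2 * inter / (size_a + size_b), inter / union
```

For example, masks `[1, 1, 1, 0]` and `[1, 1, 0, 0]` overlap in 2 voxels, giving Dice 0.8 and Jaccard 2/3; the two metrics are related by Dice = 2J / (1 + J).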
PMID:40397131 | DOI:10.1177/15330338251344198
Systematic review on the impact of deep learning-driven worklist triage on radiology workflow and clinical outcomes
Eur Radiol. 2025 May 21. doi: 10.1007/s00330-025-11674-2. Online ahead of print.
ABSTRACT
OBJECTIVES: To perform a systematic review on the impact of deep learning (DL)-based triage for reducing diagnostic delays and improving patient outcomes in peer-reviewed and pre-print publications.
MATERIALS AND METHODS: A search was conducted of primary research studies focused on DL-based worklist optimization for diagnostic imaging triage published on multiple databases from January 2018 until July 2024. Extracted data included study design, dataset characteristics, workflow metrics including report turnaround time and time-to-treatment, and patient outcome differences. Further analysis between clinical settings and integration modality was investigated using nonparametric statistics. Risk of bias was assessed with the risk of bias in non-randomized studies-of interventions (ROBINS-I) checklist.
RESULTS: A total of 38 studies from 20 publications, involving 138,423 images, were analyzed. Workflow interventions concerned pulmonary embolism (n = 8), stroke (n = 3), intracranial hemorrhage (n = 12), and chest conditions (n = 15). Patients in the post DL-triage group had shorter median report turnaround times: a mean difference of 12.3 min (IQR: -25.7, -7.6) for pulmonary embolism, 20.5 min (IQR: -32.1, -9.3) for stroke, 4.3 min (IQR: -8.6, 1.3) for intracranial hemorrhage and 29.7 min (IQR: -2947.7, -18.3) for chest diseases. Sub-group analysis revealed that reductions varied per clinical environment and relative prevalence rates but were the highest when algorithms actively stratified and reordered the radiological worklist, with reductions of -43.7% in report turnaround time compared to -7.6% from widget-based systems (p < 0.01).
CONCLUSION: DL-based triage systems had comparable report turnaround time improvements, especially in outpatient and high-prevalence settings, suggesting that AI-based triage holds promise in alleviating radiology workloads.
KEY POINTS: Question Can DL-based triage address lengthening imaging report turnaround times and improve patient outcomes across distinct clinical environments? Findings DL-based triage improved report turnaround time across disease groups, with higher reductions reported in high-prevalence or lower acuity settings. Clinical relevance DL-based workflow prioritization is a reliable tool for reducing diagnostic imaging delay for time-sensitive disease across clinical settings. However, further research and reliable metrics are needed to provide specific recommendations with regards to false-negative examinations and multi-condition prioritization.
PMID:40397031 | DOI:10.1007/s00330-025-11674-2
Deep Learning with Domain Randomization in Image and Feature Spaces for Abdominal Multiorgan Segmentation on CT and MRI Scans
Radiol Artif Intell. 2025 May 21:e240586. doi: 10.1148/ryai.240586. Online ahead of print.
ABSTRACT
"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. Purpose To develop a deep learning segmentation model that can segment abdominal organs on CT and MR images with high accuracy and generalization ability. Materials and Methods In this study, an extended nnU-Net model was trained for abdominal organ segmentation. A domain randomization method in both the image and feature space was developed to improve the generalization ability under cross-site and cross-modality settings on public prostate MRI and abdominal CT and MRI datasets. The prostate MRI dataset contains data from multiple health care institutions with domain shifts. The abdominal CT and MRI dataset is structured for cross-modality evaluation, training on one modality (eg, MRI) and testing on the other (eg, CT). This domain randomization method was then used to train a segmentation model with enhanced generalization ability on the abdominal multiorgan segmentation challenge (AMOS) dataset to improve abdominal CT and MR multiorgan segmentation, and the model was compared with two commonly used segmentation algorithms (TotalSegmentator and MRSegmentator). Model performance was evaluated using the Dice similarity coefficient (DSC). Results The proposed domain randomization method showed improved generalization ability on the cross-site and cross-modality datasets compared with the state-of-the-art methods. The segmentation model using this method outperformed two other publicly available segmentation models on data from unseen test domains (Average DSC: 0.88 versus 0.79; P < .001 and 0.88 versus 0.76; P < .001). 
Conclusion The combination of image and feature domain randomizations improved the accuracy and generalization ability of deep learning-based abdominal segmentation on CT and MR images. © RSNA, 2025.
PMID:40396895 | DOI:10.1148/ryai.240586
Quantitative tooth crowding analysis in occlusal intra-oral photographs using a convolutional neural network
Eur J Orthod. 2025 Apr 8;47(3):cjaf025. doi: 10.1093/ejo/cjaf025.
ABSTRACT
BACKGROUND: Dental crowding is a primary concern in orthodontic treatment and significantly impacts therapy choices. Accurate quantification of crowding requires time-intensive cast- or scan-based measurements. The aim was to develop an automated deep-learning model capable of assessing anterior crowding and calculating the Little Irregularity Index using single occlusal intra-oral photographs.
METHODS: A dataset of 125 untreated individuals (100 from Zurich, Switzerland, and 25 from Nijmegen, the Netherlands), comprising annotated intra-oral scans and corresponding intra-oral photographs, was used to train a dedicated convolutional neural network (CNN). The CNN was designed to detect tooth boundaries, contact points, and contact point displacements on photographs. The model's performance in determining anterior crowding and the Little Irregularity Index score was compared to consensus measurements based on intra-oral scans in terms of intra-class correlation (ICC) and mean absolute difference (MAD).
RESULTS: The model correlated well with the consensus measurements and proved to be reliable (ICC = 0.900) and accurate (MAD = 0.36 mm) for anterior crowding assessment and the Little Irregularity Index alike (ICC = 0.930; MAD = 0.74 mm).
LIMITATION: The model was not trained on cases with interdental spacing, and its reliability for cases with crowding severity outside the tested sample has not been established.
CONCLUSION: The presented CNN-based model was able to quantify crowding in the anterior segment of the lower dental arch and score the Little Irregularity Index from a single intra-oral photograph with satisfactory reliability and accuracy. Application of this model may lead to more efficient and convenient orthodontic diagnostics.
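The agreement statistics in this study (MAD and ICC) compare model outputs against consensus scan measurements; a pure-Python sketch for two raters (the abstract does not state which ICC variant was used, so the one-way random single-measures form ICC(1,1) is shown as one common choice, and all names and data are illustrative):

```python
def mean_absolute_difference(x, y):
    """MAD between paired measurements (e.g., model vs consensus, in mm)."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

def icc_oneway(x, y):
    """One-way random, single-measures ICC(1,1) for two raters:
    (MSB - MSW) / (MSB + (k-1) * MSW) with k = 2 raters."""
    n, k = len(x), 2
    subj_means = [(a + b) / 2 for a, b in zip(x, y)]
    grand = sum(subj_means) / n
    msb = k * sum((m - grand) ** 2 for m in subj_means) / (n - 1)
    msw = sum((a - m) ** 2 + (b - m) ** 2
              for a, b, m in zip(x, y, subj_means)) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)
```

Identical measurement series give MAD 0 and ICC 1.0; disagreement inflates the within-subject mean square and pulls the ICC toward 0.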
PMID:40396639 | DOI:10.1093/ejo/cjaf025
Brain age prediction from MRI scans in neurodegenerative diseases
Curr Opin Neurol. 2025 May 22. doi: 10.1097/WCO.0000000000001383. Online ahead of print.
ABSTRACT
PURPOSE OF REVIEW: This review explores the use of brain age estimation from MRI scans as a biomarker of brain health. With disorders like Alzheimer's and Parkinson's increasing globally, there is an urgent need for early detection tools that can identify at-risk individuals before cognitive symptoms emerge. Brain age offers a noninvasive, quantitative measure of neurobiological ageing, with applications in early diagnosis, disease monitoring, and personalized medicine.
RECENT FINDINGS: Studies show that individuals with Alzheimer's, mild cognitive impairment (MCI), and Parkinson's have older brain ages than their chronological age. Longitudinal research indicates that brain-predicted age difference (brain-PAD) rises with disease progression and often precedes cognitive decline. Advances in deep learning and multimodal imaging have improved the accuracy and interpretability of brain age predictions. Moreover, socioeconomic disparities and environmental factors significantly affect brain aging, highlighting the need for inclusive models.
SUMMARY: Brain age estimation is a promising biomarker for identifying future risk of neurodegenerative disease, monitoring progression, and informing prognosis. Challenges remain, including standardization, demographic biases, and interpretability. Future research should integrate brain age with other biomarkers and multimodal imaging to enhance early diagnosis and intervention strategies.
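The brain-predicted age difference (brain-PAD) discussed in this review is simply the gap between model-predicted and chronological age, with positive values indicating an "older-looking" brain; a minimal sketch (names and data are illustrative):

```python
def cohort_brain_pad(predicted_ages, chronological_ages):
    """Per-subject brain-PAD (predicted minus chronological age, in years)
    and the cohort mean brain-PAD."""
    pads = [p - c for p, c in zip(predicted_ages, chronological_ages)]
    return pads, sum(pads) / len(pads)
```

For instance, predicted ages [72, 68] against chronological ages [65, 70] yield brain-PADs of +7 and -2 years: the first subject's brain appears older than expected, a pattern the studies above associate with Alzheimer's, MCI, and Parkinson's.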
PMID:40396549 | DOI:10.1097/WCO.0000000000001383