Deep learning
Deep-Learning-Based Analysis of Electronic Skin Sensing Data
Sensors (Basel). 2025 Mar 6;25(5):1615. doi: 10.3390/s25051615.
ABSTRACT
E-skin is an integrated electronic system that can mimic the perceptual ability of human skin. Traditional analysis methods struggle to handle complex e-skin data, which include time series and multiple patterns, especially when dealing with intricate signals and real-time responses. Recently, deep learning techniques such as convolutional neural networks, recurrent neural networks, and transformers have provided effective solutions that automatically extract data features and recognize patterns, significantly improving the analysis of e-skin data. Deep learning is not only capable of handling multimodal data but can also provide real-time responses and personalized predictions in dynamic environments. Nevertheless, problems such as insufficient data annotation and the high demand for computational resources still limit the application of e-skin. Optimizing deep learning algorithms, improving computational efficiency, and exploring hardware-algorithm co-design will be key to future development. This review aims to present the deep learning techniques applied to e-skin and to provide inspiration for subsequent researchers. We first summarize the sources and characteristics of e-skin data and review the deep learning models applicable to e-skin data and their applications in data analysis. Additionally, we discuss the use of deep learning in e-skin, particularly in health monitoring and human-machine interaction, and we explore the current challenges and future development directions.
PMID:40096464 | DOI:10.3390/s25051615
Research on Network Intrusion Detection Model Based on Hybrid Sampling and Deep Learning
Sensors (Basel). 2025 Mar 4;25(5):1578. doi: 10.3390/s25051578.
ABSTRACT
This study proposes an enhanced network intrusion detection model, 1D-TCN-ResNet-BiGRU-Multi-Head Attention (TRBMA), aimed at addressing the issues of incomplete learning of temporal features and low accuracy in the classification of malicious traffic found in existing models. The TRBMA model utilizes Temporal Convolutional Networks (TCNs) to improve the ResNet18 architecture and incorporates Bidirectional Gated Recurrent Units (BiGRUs) and Multi-Head Self-Attention mechanisms to enhance the comprehensive learning of temporal features. Additionally, the ResNet network is adapted into a one-dimensional version that is more suitable for processing time-series data, while the AdamW optimizer is employed to improve the convergence speed and generalization ability during model training. Experimental results on the CIC-IDS-2017 dataset indicate that the TRBMA model achieves an accuracy of 98.66% in predicting malicious traffic types, with improvements in precision, recall, and F1-score compared to the baseline model. Furthermore, to address the challenge of low identification rates for malicious traffic types with small sample sizes in unbalanced datasets, this paper introduces TRBMA (BS-OSS), a variant of the TRBMA model that integrates Borderline SMOTE-OSS hybrid sampling. Experimental results demonstrate that this model effectively identifies malicious traffic types with small sample sizes, achieving an overall prediction accuracy of 99.88%, thereby significantly enhancing the performance of the network intrusion detection model.
PMID:40096461 | DOI:10.3390/s25051578
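As a rough illustration of the hybrid stack this abstract describes, the PyTorch sketch below chains a dilated-causal TCN residual block, a BiGRU, and multi-head self-attention, and trains with AdamW. Layer widths, the feature count (78, roughly matching CIC-IDS-2017 flow features), and the class count are assumptions for illustration, not the published TRBMA configuration.

```python
# Hypothetical TRBMA-style sketch: TCN residual blocks -> BiGRU ->
# multi-head self-attention -> classifier. Sizes are illustrative only.
import torch
import torch.nn as nn

class TCNBlock(nn.Module):
    """Dilated causal 1D convolutions with a residual connection."""
    def __init__(self, channels, dilation):
        super().__init__()
        pad = 2 * dilation  # causal padding for kernel_size=3
        self.conv1 = nn.Conv1d(channels, channels, 3, padding=pad, dilation=dilation)
        self.conv2 = nn.Conv1d(channels, channels, 3, padding=pad, dilation=dilation)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.conv1(x))[..., :x.size(-1)]  # trim to causal length
        out = self.relu(self.conv2(out))[..., :x.size(-1)]
        return self.relu(out + x)

class TRBMASketch(nn.Module):
    def __init__(self, n_features=78, n_classes=15, hidden=64):
        super().__init__()
        self.proj = nn.Conv1d(n_features, hidden, 1)
        self.tcn = nn.Sequential(TCNBlock(hidden, 1), TCNBlock(hidden, 2), TCNBlock(hidden, 4))
        self.bigru = nn.GRU(hidden, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hidden, num_heads=4, batch_first=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                  # x: (batch, time, features)
        h = self.tcn(self.proj(x.transpose(1, 2))).transpose(1, 2)
        h, _ = self.bigru(h)
        h, _ = self.attn(h, h, h)          # multi-head self-attention
        return self.head(h.mean(dim=1))    # pool over time, then classify

model = TRBMASketch()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)  # AdamW, as in the paper
logits = model(torch.randn(8, 100, 78))                     # toy batch of flows
```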
AD-VAE: Adversarial Disentangling Variational Autoencoder
Sensors (Basel). 2025 Mar 4;25(5):1574. doi: 10.3390/s25051574.
ABSTRACT
Face recognition (FR) is a less intrusive biometric technology with various applications, such as security, surveillance, and access control systems. FR remains challenging, especially when there is only a single image per person in the gallery dataset and when dealing with variations such as pose, illumination, and occlusion. In recent years, deep learning techniques based on VAEs and GANs have shown promising results, with approaches such as patch-VAE, VAE-GAN for 3D indoor scene synthesis, and hybrid VAE-GAN models. However, in Single Sample Per Person Face Recognition (SSPP FR), the challenge of learning robust and discriminative features that preserve the subject's identity persists. To address these issues, we propose a novel framework called AD-VAE, designed specifically for SSPP FR, which combines variational autoencoder (VAE) and generative adversarial network (GAN) techniques. The proposed AD-VAE framework learns to build representative, identity-preserving prototypes from both controlled and wild datasets, effectively handling variations such as pose, illumination, and occlusion. The method uses four networks: an encoder and a decoder similar to a VAE, a generator that receives the encoder output plus noise to generate an identity-preserving prototype, and a discriminator that operates as a multi-task network. AD-VAE outperforms all tested state-of-the-art face recognition techniques, demonstrating its robustness. The proposed framework achieves superior results on four controlled benchmark datasets (AR, E-YaleB, CAS-PEAL, and FERET), with recognition rates of 84.9%, 94.6%, 94.5%, and 96.0%, respectively, and achieves a remarkable recognition rate of 99.6% on the uncontrolled LFW dataset. The AD-VAE framework shows promising potential for future research and real-world applications.
PMID:40096455 | DOI:10.3390/s25051574
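A minimal, hypothetical sketch of the four-network layout named in the abstract (VAE encoder/decoder, a prototype generator fed the encoder output plus noise, and a multi-task discriminator) is given below; all dimensions, the MLP backbones, and the identity count are placeholders, not the authors' published design.

```python
# Toy AD-VAE-style wiring with four networks; sizes are illustrative only.
import torch
import torch.nn as nn

LATENT, N_IDS = 128, 100  # assumed latent size and identity count

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 512), nn.ReLU())
        self.mu, self.logvar = nn.Linear(512, LATENT), nn.Linear(512, LATENT)

    def forward(self, x):
        h = self.backbone(x)
        return self.mu(h), self.logvar(h)

decoder = nn.Sequential(nn.Linear(LATENT, 512), nn.ReLU(), nn.Linear(512, 64 * 64))
generator = nn.Sequential(nn.Linear(2 * LATENT, 512), nn.ReLU(), nn.Linear(512, 64 * 64))

class Discriminator(nn.Module):
    """Multi-task head: real/fake score plus identity logits."""
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(64 * 64, 512), nn.ReLU())
        self.real_fake = nn.Linear(512, 1)
        self.identity = nn.Linear(512, N_IDS)

    def forward(self, img):
        h = self.trunk(img.flatten(1))
        return self.real_fake(h), self.identity(h)

enc, disc = Encoder(), Discriminator()
x = torch.randn(4, 1, 64, 64)                            # toy grayscale faces
mu, logvar = enc(x)
z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()     # reparameterization trick
recon = decoder(z)                                       # VAE reconstruction path
prototype = generator(torch.cat([z, torch.randn_like(z)], dim=1))  # encoder output + noise
rf_score, id_logits = disc(prototype)                    # adversarial + identity tasks
```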
A Multimodal Data Fusion and Embedding Attention Mechanism-Based Method for Eggplant Disease Detection
Plants (Basel). 2025 Mar 4;14(5):786. doi: 10.3390/plants14050786.
ABSTRACT
A novel eggplant disease detection method based on multimodal data fusion and attention mechanisms is proposed in this study, aimed at improving both the accuracy and robustness of disease detection. The method integrates image and sensor data, optimizing the fusion of multimodal features through an embedded attention mechanism, which enhances the model's ability to focus on disease-related features. Experimental results demonstrate that the proposed method excels across various evaluation metrics, achieving a precision of 0.94, recall of 0.90, accuracy of 0.92, and mAP@75 of 0.91, indicating excellent classification accuracy and object localization capability. Further experiments, through ablation studies, evaluated the impact of different attention mechanisms and loss functions on model performance, all of which showed superior performance for the proposed approach. The multimodal data fusion combined with the embedded attention mechanism effectively enhances the accuracy and robustness of the eggplant disease detection model, making it highly suitable for complex disease identification tasks and demonstrating significant potential for widespread application.
PMID:40094753 | DOI:10.3390/plants14050786
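To make the fusion idea concrete, the sketch below shows one common way to embed an attention mechanism over multimodal features: image and sensor features are projected into a shared space and fused by cross-modality attention. The backbone choice, layer sizes, and sensor dimensionality are assumptions, not the paper's specification.

```python
# Hypothetical attention-based fusion of image and sensor features.
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, img_dim=512, sensor_dim=8, shared=128, n_classes=5):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, shared)
        self.sensor_proj = nn.Linear(sensor_dim, shared)
        self.attn = nn.MultiheadAttention(shared, num_heads=4, batch_first=True)
        self.head = nn.Linear(shared, n_classes)

    def forward(self, img_feat, sensor_feat):
        # Treat each modality as one token; attention learns how much
        # disease-relevant weight each modality receives.
        tokens = torch.stack([self.img_proj(img_feat),
                              self.sensor_proj(sensor_feat)], dim=1)
        fused, weights = self.attn(tokens, tokens, tokens)
        return self.head(fused.mean(dim=1)), weights

model = AttentionFusion()
logits, attn_weights = model(torch.randn(2, 512), torch.randn(2, 8))
```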
Integrative Approaches to Soybean Resilience, Productivity, and Utility: A Review of Genomics, Computational Modeling, and Economic Viability
Plants (Basel). 2025 Feb 21;14(5):671. doi: 10.3390/plants14050671.
ABSTRACT
Soybean is a vital crop globally and a key source of food, feed, and biofuel. With advancements in high-throughput technologies, soybean has become a key target for genetic improvement. This comprehensive review explores advances in multi-omics, artificial intelligence, and economic sustainability to enhance soybean resilience and productivity. Advances in genomics, including marker-assisted selection (MAS), genomic selection (GS), genome-wide association studies (GWAS), QTL mapping, genotyping-by-sequencing (GBS), CRISPR-Cas9, metagenomics, and metabolomics, have accelerated the development of stress-resilient soybean varieties. Artificial intelligence (AI) and machine learning approaches are improving the discovery of genetic traits associated with nutritional quality, stress tolerance, and adaptation in soybean. AI-driven technologies such as IoT-based disease detection and deep learning are revolutionizing soybean monitoring, early disease identification, yield prediction, disease prevention, and precision farming. Additionally, the economic viability and environmental sustainability of soybean-derived biofuels are critically evaluated, focusing on trade-offs and policy implications. Finally, the potential impact of climate change on soybean growth and productivity is explored through predictive modeling and adaptive strategies. Thus, this study highlights the transformative potential of multidisciplinary approaches in advancing soybean resilience and global utility.
PMID:40094561 | DOI:10.3390/plants14050671
A Diffusion-Based Detection Model for Accurate Soybean Disease Identification in Smart Agricultural Environments
Plants (Basel). 2025 Feb 22;14(5):675. doi: 10.3390/plants14050675.
ABSTRACT
Accurate detection of soybean diseases is a critical component in achieving intelligent agricultural management. However, traditional methods often underperform in complex field scenarios. This paper proposes a diffusion-based object detection model that integrates the endogenous diffusion sub-network and the endogenous diffusion loss function to progressively optimize feature distributions, significantly enhancing detection performance for complex backgrounds and diverse disease regions. Experimental results demonstrate that the proposed method outperforms multiple baseline models, achieving a precision of 94%, recall of 90%, accuracy of 92%, and mAP@50 and mAP@75 of 92% and 91%, respectively, surpassing RetinaNet, DETR, YOLOv10, and DETR v2. In fine-grained disease detection, the model performs best on rust detection, with a precision of 96% and a recall of 93%. For more complex diseases such as bacterial blight and Fusarium head blight, precision and mAP exceed 90%. Compared to self-attention and CBAM, the proposed endogenous diffusion attention mechanism further improves feature extraction accuracy and robustness. This method demonstrates significant advantages in both theoretical innovation and practical application, providing critical technological support for intelligent soybean disease detection.
PMID:40094551 | DOI:10.3390/plants14050675
Exon-intron boundary detection made easy by physicochemical properties of DNA
Mol Omics. 2025 Mar 17. doi: 10.1039/d4mo00241e. Online ahead of print.
ABSTRACT
Genome architecture in eukaryotes exhibits a high degree of complexity. Amidst the numerous intricacies, the existence of genes as non-continuous stretches composed of exons and introns has garnered significant attention and curiosity among researchers. Accurate identification of exon-intron (EI) boundaries is crucial to decipher the molecular biology governing gene expression and regulation. This includes understanding both normal and aberrant splicing, with aberrant splicing referring to the abnormal processing of pre-mRNA that leads to improper inclusion or exclusion of exons or introns. Such splicing events can result in dysfunctional or non-functional proteins, which are often associated with various diseases. The currently employed frameworks for genomic signals, which aim to identify exons and introns within a genomic segment, need to be revised, primarily due to the lack of a robust consensus sequence and the limitations posed by training on the available experimental datasets. To tackle these challenges and capitalize on the understanding that DNA exhibits function-dependent local physicochemical variations, we present ChemEXIN, a novel method for predicting EI boundaries. The method utilizes a deep-learning (DL) architecture alongside tri- and tetra-nucleotide-based structural and energy features. ChemEXIN outperforms existing methods with notable accuracy and precision: it achieves accuracies of 92.5% for humans, 79.9% for mice, and 92.0% for worms, with precision values of 92.0%, 79.6%, and 91.8%, respectively. These results represent a significant advancement in EI boundary annotation, with potential implications for understanding gene expression, regulation, and cellular functions.
PMID:40094442 | DOI:10.1039/d4mo00241e
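The core feature idea, mapping sliding tri-nucleotide windows to physicochemical values and stacking them into a profile for a deep model, can be sketched as below. The property table holds made-up placeholder numbers, not the published structural and energy parameters.

```python
# Hypothetical tri-nucleotide physicochemical profile for an EI-boundary window.
import numpy as np
from itertools import product

BASES = "ACGT"
# Placeholder: one synthetic "stacking energy" value per trinucleotide.
TRI_PROPS = {"".join(k): i * 0.01 for i, k in enumerate(product(BASES, repeat=3))}

def tri_profile(seq: str) -> np.ndarray:
    """Encode a DNA window as its sequence of trinucleotide property values."""
    return np.array([TRI_PROPS[seq[i:i + 3]] for i in range(len(seq) - 2)])

window = "ACGTGGCTAAGCTTACGGA"   # toy window around a putative EI boundary
features = tri_profile(window)   # shape: (len(window) - 2,), fed to the DL model
print(features.shape, features[:5])
```

Tetra-nucleotide features would follow the same pattern with `repeat=4` windows of length four.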
Automatic bone age assessment: a Turkish population study
Diagn Interv Radiol. 2025 Mar 17. doi: 10.4274/dir.2025.242999. Online ahead of print.
ABSTRACT
PURPOSE: Established methods for bone age assessment (BAA), such as the Greulich and Pyle atlas, suffer from variability due to population differences and observer discrepancies. Although automated BAA offers speed and consistency, limited research exists on its performance across different populations using deep learning. This study examines deep learning algorithms on the Turkish population to enhance bone age models by understanding demographic influences.
METHODS: We analyzed reports from Bağcılar Hospital's Health Information Management System between April 2012 and September 2023 using "bone age" as a keyword. Patient images were re-evaluated by an experienced radiologist and anonymized. A total of 2,730 hand radiographs from Bağcılar Hospital (Turkish population), 12,572 from the Radiological Society of North America (RSNA), and 6,185 from the Radiological Hand Pose Estimation (RHPE) public datasets were collected, along with corresponding bone ages and gender information. A set of 546 radiographs (273 from Bağcılar, 273 from the public datasets) was first split off at random, stratified by bone age, to form an internal test set; the remaining data were used for training and validation. BAAs were generated using a modified InceptionV3 model on 500 × 500-pixel images, selecting the model with the lowest mean absolute error (MAE) on the validation set.
RESULTS: Three models were trained and tested based on dataset origin: Bağcılar (Turkish), public (RSNA-RHPE), and a Combined model. On the internal test set, the Combined model estimated bone age to within 6, 12, 18, and 24 months at rates of 44%, 73%, 87%, and 94%, respectively. The MAE was 9.2 months in the overall internal test set, 7 months on the public test set, and 11.5 months on the Bağcılar internal test data. The Bağcılar-only model had an MAE of 12.7 months on the Bağcılar internal test data. Despite the smaller amount of training data, there was no significant difference between the Combined and Bağcılar models on the Bağcılar dataset (P > 0.05). The public model showed an MAE of 16.5 months on the Bağcılar dataset, significantly worse than the other models (P < 0.05).
CONCLUSION: We developed an automatic BAA model including the Turkish population, one of the few such studies using deep learning. Despite challenges from population differences and data heterogeneity, these models can be effectively used in various clinical settings. Model accuracy can improve over time with cumulative data, and publicly available datasets may further refine them. Our approach enables more accurate and efficient BAAs, supporting healthcare professionals where traditional methods are time-consuming and variable.
CLINICAL SIGNIFICANCE: The developed automated BAA model for the Turkish population offers a reliable and efficient alternative to traditional methods. By utilizing deep learning with diverse datasets from Bağcılar Hospital and publicly available sources, the model minimizes assessment time and reduces variability. This advancement enhances clinical decision-making, supports standardized BAA practices, and improves patient care in various healthcare settings.
PMID:40094318 | DOI:10.4274/dir.2025.242999
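A minimal sketch of the training setup this abstract describes is given below: a torchvision InceptionV3 backbone with its classifier swapped for a one-unit regression head (bone age in months), keeping the checkpoint with the lowest validation MAE. The optimizer, loss, and the paper's exact InceptionV3 modification are not specified in the abstract, so those choices, and the toy stand-in loaders, are assumptions.

```python
# Hypothetical bone-age regression with InceptionV3; selection by lowest val MAE.
import torch
import torch.nn as nn
from torchvision.models import inception_v3

model = inception_v3(weights=None, aux_logits=False, init_weights=True)
model.fc = nn.Linear(model.fc.in_features, 1)   # single regression output
criterion = nn.L1Loss()                         # L1 loss == mean absolute error
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Toy stand-ins for real DataLoaders: (image batch, bone age in months).
toy = [(torch.randn(2, 3, 500, 500), torch.rand(2) * 216) for _ in range(2)]
train_loader, val_loader = toy, toy

def validate(loader):
    model.eval()
    err, n = 0.0, 0
    with torch.no_grad():
        for images, ages in loader:
            err += (model(images).squeeze(1) - ages).abs().sum().item()
            n += len(ages)
    return err / n                               # MAE in months

best_mae = float("inf")
for epoch in range(2):                           # many more epochs in practice
    model.train()
    for images, ages in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images).squeeze(1), ages)
        loss.backward()
        optimizer.step()
    mae = validate(val_loader)
    if mae < best_mae:                           # keep the lowest-MAE checkpoint
        best_mae = mae
        torch.save(model.state_dict(), "best_baa.pt")
```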
Explainable deep learning algorithm for identifying cerebral venous sinus thrombosis-related hemorrhage (CVST-ICH) from spontaneous intracerebral hemorrhage using computed tomography
EClinicalMedicine. 2025 Feb 26;81:103128. doi: 10.1016/j.eclinm.2025.103128. eCollection 2025 Mar.
ABSTRACT
BACKGROUND: Misdiagnosis of hemorrhage secondary to cerebral venous sinus thrombosis (CVST-ICH) as arterial-origin spontaneous intracerebral hemorrhage (sICH) can lead to inappropriate treatment and the potential for severe adverse outcomes. The current practice for identifying CVST-ICH involves venography, which, despite being increasingly utilized in many centers, is not typically used as the initial imaging modality for ICH patients. The study aimed to develop an explainable deep learning model to quickly identify ICH caused by CVST based on non-contrast computed tomography (NCCT).
METHODS: The study population included patients diagnosed with CVST-ICH and other spontaneous ICH from January 2016 to March 2023 at the Second Affiliated Hospital of Zhejiang University, Taizhou First People's Hospital, Taizhou Hospital, Quzhou Second People's Hospital, and Longyan First People's Hospital. A transfer learning-based 3D U-Net with segmentation and classification was proposed and developed only on admission plain CT. Model performance was assessed using the area under the curve (AUC), sensitivity, and specificity metrics. For further evaluation, the average diagnostic performance of nine doctors on plain CT was compared with model assistance. Interpretability methods, including Grad-CAM++, SHAP, IG, and occlusion, were employed to understand the model's attention.
FINDINGS: An internal dataset was constructed using propensity score matching based on age, initially including 102 CVST-ICH patients (median age: 44 [29, 61] years) and 683 sICH patients (median age: 65 [52, 73] years). After matching, 102 CVST-ICH patients and 306 sICH patients (median age: 50 [40, 62] years) were selected. An external dataset consisted of 38 CVST-ICH and 119 sICH patients from four other hospitals. Validation showed an AUC of 0·94, sensitivity of 0·96, and specificity of 0·80 for the internal testing subset, and an AUC of 0·85, sensitivity of 0·87, and specificity of 0·82 for the external dataset. The discrimination performance of nine doctors interpreting CT images improved significantly with the assistance of the proposed model (accuracy 0·79 vs 0·71, sensitivity 0·88 vs 0·81, specificity 0·75 vs 0·68, p < 0·05). Interpretability methods highlighted the model's attention to features of the hemorrhage edge appearance.
INTERPRETATION: The present model demonstrated high-performing and robust results on discrimination between CVST-ICH and spontaneous ICH, and aided doctors' diagnosis in clinical practice as well. Prospective validation with larger-sample size is required.
FUNDING: The work was funded by the National Key R&D Program of China (2023YFE0118900), National Natural Science Foundation of China (No.81971155 and No.81471168), the Science and Technology Department of Zhejiang Province (LGJ22H180004), Medical and Health Science and Technology Project of Zhejiang Province (No.2022KY174), the 'Pioneer' R&D Program of Zhejiang (No. 2024C03006 and No. 2023C03026) and the MOE Frontier Science Center for Brain Science & Brain-Machine Integration, Zhejiang University.
PMID:40093990 | PMC:PMC11909457 | DOI:10.1016/j.eclinm.2025.103128
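Of the interpretability methods named above, occlusion is the simplest to illustrate: patches of the CT volume are masked in turn and the drop in the predicted CVST-ICH probability becomes a heatmap. The tiny 3D CNN below is a stand-in for demonstration, not the authors' transfer-learning 3D U-Net.

```python
# Hypothetical occlusion-sensitivity map for a 3D CT classifier.
import torch
import torch.nn as nn

model = nn.Sequential(                    # toy 3D classifier stand-in
    nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool3d(1),
    nn.Flatten(), nn.Linear(8, 2))
model.eval()

def occlusion_map(volume, patch=8):
    """Probability drop when each patch is zeroed (higher = more important)."""
    with torch.no_grad():
        base = torch.softmax(model(volume), dim=1)[0, 1]   # CVST-ICH probability
        heat = torch.zeros_like(volume[0, 0])
        _, _, D, H, W = volume.shape
        for z in range(0, D, patch):
            for y in range(0, H, patch):
                for x in range(0, W, patch):
                    occluded = volume.clone()
                    occluded[..., z:z+patch, y:y+patch, x:x+patch] = 0
                    p = torch.softmax(model(occluded), dim=1)[0, 1]
                    heat[z:z+patch, y:y+patch, x:x+patch] = base - p
    return heat

ct = torch.randn(1, 1, 32, 64, 64)        # toy NCCT volume
heat = occlusion_map(ct)                  # same spatial shape as the volume
```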
Deep learning-based model for prediction of early recurrence and therapy response on whole slide images in non-muscle-invasive bladder cancer: a retrospective, multicentre study
EClinicalMedicine. 2025 Feb 26;81:103125. doi: 10.1016/j.eclinm.2025.103125. eCollection 2025 Mar.
ABSTRACT
BACKGROUND: Accurate prediction of early recurrence is essential for disease management of patients with non-muscle-invasive bladder cancer (NMIBC). We aimed to develop and validate a deep learning-based early recurrence predictive model (ERPM) and a treatment response predictive model (TRPM) on whole slide images to assist clinical decision making.
METHODS: In this retrospective, multicentre study, we included consecutive patients with pathology-confirmed NMIBC who underwent transurethral resection of bladder tumour from five centres. Patients from one hospital (Sun Yat-sen Memorial Hospital of Sun Yat-sen University, Guangzhou, China) were assigned to training and internal validation cohorts, and patients from four other hospitals (the Third Affiliated Hospital of Sun Yat-sen University, and Zhujiang Hospital of Southern Medical University, Guangzhou, China; the Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China; Shenshan Medical Centre, Shanwei, China) were assigned to four independent external validation cohorts. Based on multi-instance and ensemble learning, the ERPM was developed to make predictions on haematoxylin and eosin (H&E)-stained and immunohistochemistry-stained slides. Sharing the same architecture as the ERPM, the TRPM was trained and evaluated by cross validation on patients who received Bacillus Calmette-Guérin (BCG). The performance of the ERPM was mainly evaluated and compared with the clinical model, H&E-based model, and integrated model through the area under the curve. Survival analysis was performed to assess the prognostic capability of the ERPM.
FINDINGS: Between January 1, 2017, and September 30, 2023, 4395 whole slide images of 1275 patients were included to train and validate the models. The ERPM was superior to the clinical and H&E-based model in predicting early recurrence in both internal validation cohort (area under the curve: 0.837 vs 0.645 vs 0.737) and external validation cohorts (area under the curve: 0.761-0.802 vs 0.626-0.682 vs 0.694-0.723) and was on par with the integrated model. It also stratified recurrence-free survival significantly (p < 0.0001) with a hazard ratio of 4.50 (95% CI 3.10-6.53). The TRPM performed well in predicting BCG-unresponsive NMIBC (accuracy 84.1%).
INTERPRETATION: The ERPM showed promising performance in predicting early recurrence and recurrence-free survival of patients with NMIBC after surgery and with further validation and in combination with TRPM could be used to guide the management of NMIBC.
FUNDING: National Natural Science Foundation of China, the Science and Technology Planning Project of Guangdong Province, the National Key Research and Development Programme of China, the Guangdong Provincial Clinical Research Centre for Urological Diseases, and the Science and Technology Projects in Guangzhou.
PMID:40093987 | PMC:PMC11909458 | DOI:10.1016/j.eclinm.2025.103125
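Attention-based multi-instance learning, one common realization of the multi-instance approach named in the methods, treats a slide's patch embeddings as a "bag" and learns attention weights that pool them into a slide-level prediction. The sketch below follows that generic recipe; the embedding size, layer widths, and per-stain ensembling comment are assumptions rather than the published ERPM.

```python
# Hypothetical attention-MIL head for whole-slide early-recurrence prediction.
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, feat_dim=512, hidden=128):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(feat_dim, hidden), nn.Tanh(),
                                  nn.Linear(hidden, 1))
        self.classifier = nn.Linear(feat_dim, 1)

    def forward(self, bag):                         # bag: (n_patches, feat_dim)
        w = torch.softmax(self.attn(bag), dim=0)    # attention weight per patch
        slide_feat = (w * bag).sum(dim=0)           # weighted slide embedding
        return self.classifier(slide_feat), w

mil = AttentionMIL()
patches = torch.randn(1000, 512)                   # toy patch features, one slide
logit, weights = mil(patches)
prob_recurrence = torch.sigmoid(logit)
# An ensemble over H&E and IHC slides could average such per-stain probabilities.
```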
ViE-Take: A Vision-Driven Multi-Modal Dataset for Exploring the Emotional Landscape in Takeover Safety of Autonomous Driving
Research (Wash D C). 2025 Mar 14;8:0603. doi: 10.34133/research.0603. eCollection 2025.
ABSTRACT
Takeover safety is drawing increasing attention in intelligent transportation as new energy vehicles with cutting-edge autopilot capabilities proliferate on the road. Despite recent studies highlighting the importance of drivers' emotions in takeover safety, the lack of emotion-aware takeover datasets hinders further investigation, thereby constraining potential applications in this field. To this end, we introduce ViE-Take, the first Vision-driven dataset for exploring the Emotional landscape in Takeovers of autonomous driving (vision is used because it is the most cost-effective and user-friendly modality for commercial driver monitoring systems). ViE-Take enables a comprehensive exploration of the impact of emotions on drivers' takeover performance through 3 key attributes: multi-source emotion elicitation, multi-modal driver data collection, and multi-dimensional emotion annotations. To aid the use of ViE-Take, we provide 4 deep models (corresponding to 4 prevalent learning strategies) for predicting 3 different aspects of drivers' takeover performance (readiness, reaction time, and quality). These models offer benefits for various downstream tasks, such as driver emotion recognition and regulation for automobile manufacturers. Initial analysis and experiments conducted on ViE-Take indicate that (a) emotions have diverse impacts on takeover performance, some of which are counterintuitive; (b) highly expressive social media clips, despite their brevity, prove effective in eliciting emotions (a foundation for emotion regulation); and (c) predicting takeover performance solely through deep learning on vision data is not only feasible but also holds great potential.
PMID:40093973 | PMC:PMC11908832 | DOI:10.34133/research.0603
Artificial intelligence-enhanced retinal imaging as a biomarker for systemic diseases
Theranostics. 2025 Feb 18;15(8):3223-3233. doi: 10.7150/thno.100786. eCollection 2025.
ABSTRACT
Retinal images provide a non-invasive and accessible means to directly visualize human blood vessels and nerve fibers. A growing number of studies have investigated the intricate microvascular and neural circuitry within the retina, its interactions with other systemic vascular and nervous systems, and the link between retinal biomarkers and various systemic diseases. Using the eye to study systemic health on the basis of these connections has been termed oculomics. Advancements in artificial intelligence (AI) technologies, particularly deep learning, have further increased the potential impact of this field. Leveraging these technologies, retinal analysis has demonstrated potential in detecting numerous diseases, including cardiovascular diseases, central nervous system diseases, chronic kidney diseases, metabolic diseases, endocrine disorders, and hepatobiliary diseases. AI-based retinal imaging, which incorporates established modalities such as digital color fundus photographs, optical coherence tomography (OCT), and OCT angiography, as well as emerging technologies like ultra-wide-field imaging, shows great promise in predicting systemic diseases. This provides a valuable opportunity for systemic disease screening, early detection, prediction, risk stratification, and personalized prognostication. As the AI and big data research fields grow, with the mission of transforming healthcare, they also face numerous challenges and limitations in both data and technology. The application of natural language processing frameworks, large language models, and other generative AI techniques presents both opportunities and concerns that require careful consideration. In this review, we not only summarize key studies on AI-enhanced retinal imaging for predicting systemic diseases but also underscore the significance of these advancements in transforming healthcare. By highlighting the remarkable progress made thus far, we provide a comprehensive overview of state-of-the-art techniques and explore the opportunities and challenges in this rapidly evolving field. This review aims to serve as a valuable resource for researchers and clinicians, guiding future studies and fostering the integration of AI in clinical practice.
PMID:40093903 | PMC:PMC11905132 | DOI:10.7150/thno.100786
Prediction of lymph node metastasis in papillary thyroid carcinoma using non-contrast CT-based radiomics and deep learning with thyroid lobe segmentation: A dual-center study
Eur J Radiol Open. 2025 Feb 24;14:100639. doi: 10.1016/j.ejro.2025.100639. eCollection 2025 Jun.
ABSTRACT
OBJECTIVES: This study aimed to develop a predictive model for lymph node metastasis (LNM) in papillary thyroid carcinoma (PTC) patients by deep learning radiomic (DLRad) and clinical features.
METHODS: This study included 271 thyroid lobes from 228 PTC patients who underwent preoperative neck non-contrast CT at Center 1 (May 2021-April 2024). LNM status was confirmed via postoperative pathology, with each thyroid lobe labeled accordingly. The cohort was divided into training (n = 189) and validation (n = 82) cohorts, with additional temporal (n = 59 lobes, Center 1, May-August 2024) and external (n = 66 lobes, Center 2) test cohorts. Thyroid lobes were manually segmented from the isthmus midline, ensuring interobserver consistency (ICC ≥ 0.8). Deep learning and radiomics features were selected using LASSO algorithms to compute DLRad scores. Logistic regression identified independent predictors, forming DLRad, clinical, and combined models. Model performance was evaluated using AUC, calibration, decision curves, and the DeLong test, compared against radiologists' assessments.
RESULTS: Independent predictors of LNM included age, gender, multiple nodules, tumor size group, and DLRad. The combined model demonstrated superior diagnostic performance with AUCs of 0.830 (training), 0.799 (validation), 0.819 (temporal test), and 0.756 (external test), outperforming the DLRad model (AUCs: 0.786, 0.730, 0.753, 0.642), clinical model (AUCs: 0.723, 0.745, 0.671, 0.660), and radiologist evaluations (AUCs: 0.529, 0.606, 0.620, 0.503). It also achieved the lowest Brier scores (0.167, 0.184, 0.175, 0.201) and the highest net benefit in decision-curve analysis at threshold probabilities above 20%.
CONCLUSIONS: The combined model integrating DLRad and clinical features exhibits good performance in predicting LNM in PTC patients.
PMID:40093877 | PMC:PMC11908562 | DOI:10.1016/j.ejro.2025.100639
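A minimal sketch of the modeling pipeline described in the methods follows: LASSO selects deep-learning and radiomics features into a DLRad score, which is then combined with clinical predictors in a logistic regression. The data, feature counts, and the exact score construction are synthetic placeholders under stated assumptions.

```python
# Hypothetical DLRad-score pipeline: LASSO selection -> logistic combined model.
import numpy as np
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X_rad = rng.normal(size=(189, 200))      # deep learning + radiomics features
X_clin = rng.normal(size=(189, 4))       # e.g. age, gender, multiple nodules, size group
y = rng.integers(0, 2, size=189)         # lymph node metastasis label (toy)

lasso = LassoCV(cv=5).fit(X_rad, y)      # sparse selection of informative features
selected = np.flatnonzero(lasso.coef_)
dlrad_score = X_rad[:, selected] @ lasso.coef_[selected]   # DLRad score per lobe

X_combined = np.column_stack([X_clin, dlrad_score])
combined_model = LogisticRegression(max_iter=1000).fit(X_combined, y)
auc = roc_auc_score(y, combined_model.predict_proba(X_combined)[:, 1])
print(f"training AUC (toy data): {auc:.3f}")
```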
Voxel-level radiomics and deep learning for predicting pathologic complete response in esophageal squamous cell carcinoma after neoadjuvant immunotherapy and chemotherapy
J Immunother Cancer. 2025 Mar 15;13(3):e011149. doi: 10.1136/jitc-2024-011149.
ABSTRACT
BACKGROUND: Accurate prediction of pathologic complete response (pCR) following neoadjuvant immunotherapy combined with chemotherapy (nICT) is crucial for tailoring patient care in esophageal squamous cell carcinoma (ESCC). This study aimed to develop and validate a deep learning model using a novel voxel-level radiomics approach to predict pCR based on preoperative CT images.
METHODS: In this multicenter, retrospective study, 741 patients with ESCC who underwent nICT followed by radical esophagectomy were enrolled from three institutions. Patients from one center were divided into a training set (469 patients) and an internal validation set (118 patients) while the data from the other two centers was used as external validation sets (120 and 34 patients, respectively). The deep learning model, Vision-Mamba, integrated voxel-level radiomics feature maps and CT images for pCR prediction. Additionally, other commonly used deep learning models, including 3D-ResNet and Vision Transformer, as well as traditional radiomics methods, were developed for comparison. Model performance was evaluated using accuracy, area under the curve (AUC), sensitivity, specificity, and prognostic stratification capabilities. The SHapley Additive exPlanations analysis was employed to interpret the model's predictions.
RESULTS: The Vision-Mamba model demonstrated robust predictive performance in the training set (accuracy: 0.89, AUC: 0.91, sensitivity: 0.82, specificity: 0.92) and validation sets (accuracy: 0.83-0.91, AUC: 0.83-0.92, sensitivity: 0.73-0.94, specificity: 0.84-1.0). The model outperformed other deep learning models and traditional radiomics methods. The model's ability to stratify patients into high- and low-risk groups was validated, showing superior prognostic stratification compared with traditional methods. SHAP provided quantitative and visual model interpretation.
CONCLUSIONS: We present a voxel-level radiomics-based deep learning model to predict pCR to neoadjuvant immunotherapy combined with chemotherapy based on pretreatment diagnostic CT images with high accuracy and robustness. This model could provide a promising tool for individualized management of patients with ESCC.
PMID:40090670 | DOI:10.1136/jitc-2024-011149
Automatic pre-screening of outdoor airborne microplastics in micrographs using deep learning
Environ Pollut. 2025 Mar 14:125993. doi: 10.1016/j.envpol.2025.125993. Online ahead of print.
ABSTRACT
Airborne microplastics (AMPs) are prevalent in both indoor and outdoor environments, posing potential health risks to humans. Automating the process of spotting them in micrographs can significantly enhance research and monitoring. Although deep learning has shown substantial promise in microplastic analysis, existing studies have primarily focused on high-resolution images of samples collected from marine and freshwater environments. In contrast, this work introduces a novel approach by employing enhanced U-Net models (Attention U-Net and Dynamic RU-NEXT) along with the Mask Region Convolutional Neural Network (Mask R-CNN) to identify and classify AMPs in lower-resolution micrographs (256 × 256 pixels) obtained from outdoor environments. A key innovation involves integrating classification directly within the U-Net-based segmentation frameworks, thereby streamlining the workflow and improving computational efficiency, an advance over previous work in which segmentation and classification were performed separately. The enhanced U-Net models attained average classification F1-scores exceeding 85% and segmentation scores above 77%. Additionally, the Mask R-CNN model achieved an average bounding box precision of 73.32% on the test set, a classification F1-score of 84.29%, and a mask precision of 71.31%, demonstrating robust performance. The proposed method provides a faster and more accurate means of identifying AMPs compared to thresholding techniques. It also functions effectively as a pre-screening tool, substantially reducing the number of particles requiring labour-intensive chemical analysis. By integrating advanced deep learning strategies into AMPs research, this study paves the way for more efficient monitoring and characterisation of microplastics.
PMID:40090454 | DOI:10.1016/j.envpol.2025.125993
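The key design point, classification integrated directly into the segmentation network, can be sketched as a U-Net whose bottleneck also feeds a classification head, so the mask and the particle class come from one forward pass. Depth, widths, and class count below are illustrative; this is not the enhanced Attention U-Net or Dynamic RU-NEXT model.

```python
# Hypothetical U-Net with a shared-bottleneck classification head.
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU())

class SegClassUNet(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        self.enc1, self.enc2 = block(1, 16), block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = block(32, 64)
        self.up1, self.dec1 = nn.ConvTranspose2d(64, 32, 2, stride=2), block(64, 32)
        self.up2, self.dec2 = nn.ConvTranspose2d(32, 16, 2, stride=2), block(32, 16)
        self.seg_head = nn.Conv2d(16, 1, 1)          # microplastic mask logits
        self.cls_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                      nn.Linear(64, n_classes))  # shares bottleneck

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d = self.dec1(torch.cat([self.up1(b), e2], dim=1))   # skip connection
        d = self.dec2(torch.cat([self.up2(d), e1], dim=1))
        return self.seg_head(d), self.cls_head(b)

net = SegClassUNet()
mask_logits, class_logits = net(torch.randn(2, 1, 256, 256))  # 256 x 256 micrographs
```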
Explainable Artificial Intelligence to Quantify Adenoid Hypertrophy-related Upper Airway Obstruction using 3D Shape Analysis
J Dent. 2025 Mar 14:105689. doi: 10.1016/j.jdent.2025.105689. Online ahead of print.
ABSTRACT
OBJECTIVES: To develop and validate an explainable Artificial Intelligence (AI) model for classifying and quantifying upper airway obstruction related to adenoid hypertrophy using three-dimensional (3D) shape analysis of cone-beam computed tomography (CBCT) scans.
METHODS: 400 CBCT scans of patients aged 5-18 years were analyzed. Nasopharyngeal airway obstruction (NAO) ratio was calculated to label scans into four grades of obstruction severity, used as the ground truth. Upper airway surface meshes were used to train a deep learning model combining multiview and point-cloud approaches for 3D shape analysis and obstruction severity classification and quantification. Surface Gradient-weighted Class Activation Mapping (SurfGradCAM) generated explainability heatmaps. Performance was evaluated using area under the curve (AUC), precision, recall, F1-score, mean absolute error, root mean squared error, and correlation coefficients.
RESULTS: The explainable AI model demonstrated strong performance in both classification and quantification tasks. The AUC values for the classification task ranged from 0.77 to 0.94, with the highest values of 0.88 and 0.94 for Grades 3 and 4, respectively, indicating excellent discriminative ability for identifying more severe cases of obstruction. The SurfGradCAM-generated heatmaps consistently highlighted the most relevant regions of the upper airway influencing the AI's decision-making process. In the quantification task, the regression model successfully predicted the NAO ratio, with a strong correlation coefficient of 0.854 (p < 0.001) and R² = 0.728, explaining a substantial proportion of the variance in NAO ratios.
CONCLUSIONS: The proposed explainable AI model, using 3D shape analysis, demonstrated strong performance in classifying and quantifying adenoid hypertrophy-related upper airway obstruction in CBCT scans.
CLINICAL SIGNIFICANCE: This AI model provides clinicians with a reliable, automated tool for standardized adenoid hypertrophy assessment. The model's explainable nature enhances clinical confidence and patient communication, potentially improving diagnostic workflow and treatment planning.
PMID:40090403 | DOI:10.1016/j.jdent.2025.105689
Correlation of point-wise retinal sensitivity with localized features of diabetic macular edema using deep learning
Can J Ophthalmol. 2025 Mar 13:S0008-4182(25)00070-5. doi: 10.1016/j.jcjo.2025.02.013. Online ahead of print.
ABSTRACT
PURPOSE: To evaluate the association between localized features of diabetic macular edema (DME) and point-wise retinal sensitivity (RS) assessed with microperimetry (MP) using deep learning (DL)-based automated quantification on optical coherence tomography (OCT) scans.
DESIGN: Cross-sectional study.
PARTICIPANTS: Twenty eyes of 20 subjects with clinically significant DME were included in this study.
METHODS: Patients with DME visible on OCT scans (Spectralis HRA+OCT) completed 2 MP examinations using a custom 45-stimulus grid on MAIA (CenterVue). MP stimuli were coregistered with the corresponding OCT location using image registration algorithms. DL-based algorithms were used to quantify intraretinal fluid (IRF) and ellipsoid zone (EZ) thickness. Hard exudates (HEs) were quantified semiautomatically. Multivariable mixed-effect models were calculated to investigate the association between DME-specific OCT features and point-wise RS. As EZ thickness values below HEs were excluded, the models included either EZ thickness or HEs.
RESULTS: A total of 1800 MP stimuli from 20 eyes of 20 patients were analyzed. Stimuli with IRF (n = 568) showed significantly decreased RS compared to areas without (estimate [95% CI]: -1.11 dB [-1.69, -0.52]; p = 0.0002). IRF volume was significantly negatively (-0.45 dB/nL [-0.71; -0.18]; p = 0.001) and EZ thickness positively (0.14 dB/µm [0.1; 0.19]; p < 0.0001) associated with localized point-wise RS. In the multivariable mixed model, including HE volume instead of EZ thickness, a negative impact on RS was observed (-0.43/0.1 nL [-0.81; -0.05]; p = 0.027).
CONCLUSIONS: DME-specific features, as analyzed on OCT, have a significant impact on point-wise RS. IRF and HE volumes were negatively, and EZ thickness positively, associated with localized RS.
PMID:40090368 | DOI:10.1016/j.jcjo.2025.02.013
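A minimal sketch of the statistical model described in the methods, assuming the statsmodels API: point-wise retinal sensitivity is regressed on localized OCT features with a random intercept per eye to account for repeated point-wise measurements. The data frame and effect sizes below are synthetic placeholders loosely echoing the reported estimates.

```python
# Hypothetical mixed-effects model: RS ~ IRF volume + EZ thickness, grouped by eye.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_eyes, n_stimuli = 20, 45
df = pd.DataFrame({
    "eye": np.repeat(np.arange(n_eyes), n_stimuli),
    "irf_volume_nl": rng.exponential(0.5, n_eyes * n_stimuli),
    "ez_thickness_um": rng.normal(30, 5, n_eyes * n_stimuli),
})
df["rs_db"] = (27 - 0.45 * df.irf_volume_nl                 # toy negative IRF effect
               + 0.14 * (df.ez_thickness_um - 30)           # toy positive EZ effect
               + rng.normal(0, 1.5, len(df)))

# Random intercept per eye models the clustering of stimuli within eyes.
model = smf.mixedlm("rs_db ~ irf_volume_nl + ez_thickness_um", df, groups=df["eye"])
print(model.fit().summary())
```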
Automated detection of arrhythmias using a novel interpretable feature set extracted from 12-lead electrocardiogram
Comput Biol Med. 2025 Mar 15;189:109957. doi: 10.1016/j.compbiomed.2025.109957. Online ahead of print.
ABSTRACT
The availability of large-scale electrocardiogram (ECG) databases and advancements in machine learning have facilitated the development of automated diagnostic systems for cardiac arrhythmias. Deep learning models, despite their potential for high accuracy, have had limited clinical adoption due to their inherent lack of interpretability. This study bridges this gap by proposing a feature-based approach that maintains comparable performance to deep learning while providing enhanced interpretability for clinical utility. The method extracts a total of 654 individual features, classified into 60 feature types, from each ECG. The features are computed using mathematical techniques such as the Fourier transform, wavelet transform, and cross-correlation, allowing rigorous evaluation of ECG characteristics. The eXtreme Gradient Boosting framework was employed to classify each ECG into one of nine diagnostic classes. Shapley Additive Explanations (SHAP) value analysis was used to downselect the feature set to the minimal set without incurring a performance reduction (159 features). Overall, the proposed method demonstrated performance comparable to state-of-the-art deep learning classifiers, achieving a weighted F1 score of 81% during cross-validation and 68% on the external test dataset, while offering greater ease of implementation and adaptability to diverse clinical applications. Notably, the proposed method demonstrated superior accuracy in identifying atrial fibrillation and block-related abnormalities, achieving overall F1 scores of 89% and 87% during cross-validation and 79% and 75% on the external test dataset, respectively. SHAP value analysis of the testing results revealed that the top-performing features for each diagnostic class aligned with standard clinical diagnostic processes, highlighting the clinical interpretability of our approach.
PMID:40090185 | DOI:10.1016/j.compbiomed.2025.109957
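The XGBoost-plus-SHAP pipeline the abstract describes can be sketched as follows: train a gradient-boosted classifier on the engineered feature matrix, rank features by mean absolute SHAP value, and retrain on the reduced set. The feature matrix and labels here are synthetic, and the 159-feature cutoff is taken from the abstract; in practice the cutoff would be tuned so performance does not drop.

```python
# Hypothetical SHAP-based feature downselection for a 9-class ECG classifier.
import numpy as np
import shap
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 654))             # 654 engineered ECG features (toy)
y = rng.integers(0, 9, size=1000)            # nine diagnostic classes (toy)

clf = xgb.XGBClassifier(n_estimators=100, max_depth=4, objective="multi:softprob")
clf.fit(X, y)

explainer = shap.TreeExplainer(clf)
shap_values = explainer.shap_values(X)       # per-class arrays (list or 3D, by shap version)
arr = (np.stack(shap_values, axis=0) if isinstance(shap_values, list)
       else np.moveaxis(np.asarray(shap_values), -1, 0))
importance = np.abs(arr).mean(axis=(0, 1))   # mean |SHAP| per feature

keep = np.argsort(importance)[::-1][:159]    # downselect to a 159-feature set
clf_small = xgb.XGBClassifier(n_estimators=100, max_depth=4,
                              objective="multi:softprob").fit(X[:, keep], y)
```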
An MR-only deep learning inference model-based dose estimation algorithm for MR-guided adaptive radiation therapy
Med Phys. 2025 Mar 16. doi: 10.1002/mp.17759. Online ahead of print.
ABSTRACT
BACKGROUND: Magnetic resonance-guided adaptive radiation therapy (MRgART) systems combine magnetic resonance imaging (MRI) technology with linear accelerators (LINACs) to enhance the precision and efficacy of cancer treatment. These systems enable real-time adjustments of treatment plans based on the latest patient anatomy, creating an urgent need for accurate and rapid dose calculation algorithms. Traditional CT-based dose calculations and ray-tracing (RT) processes are time-consuming and may not be feasible for the online adaptive workflow required in MRgART. Recent advancements in deep learning (DL) offer promising solutions to overcome these limitations.
PURPOSE: This study aims to develop a DL-based dose calculation engine for MRgART that relies solely on MR images. This approach addresses the critical need for accurate and rapid dose calculations in the MRgART workflow without relying on CT images or time-consuming RT processes.
METHODS: We used a deep residual network inspired by U-Net to establish a direct connection between distance-corrected conical (DCC) fluence maps and dose distributions in the image domain. The study utilized data from 30 prostate cancer patients treated with fixed-beam Intensity-Modulated Radiation Therapy (IMRT) on an MR-guided LINAC system. We trained, validated, and tested the model using a total of 120 online treatment plans, which encompassed 1080 individual beams. We extensively evaluated the network's performance by comparing its dose calculation accuracy against Monte Carlo (MC)-based methods using metrics such as mean absolute error (MAE) of pixel-wise dose differences, 3D gamma analysis, dose-volume histograms (DVHs), dosimetric indices, and isodose line similarity.
RESULTS: The proposed DL model demonstrated high accuracy in dose calculations. The median MAE of pixel-wise dose differences was 1.2% for the whole body, 1.9% for targets, and 1.1% for organs at risk (OARs). The median 3D gamma passing rates for the 3%/3 mm criterion were 94.8% for the whole body, 95.7% for targets, and 98.7% for OARs. Additionally, the Dice similarity coefficient (DSC) of isodose lines between the DL-based and MC-based dose calculations averaged 0.94 ± 0.01. The DVH curves and clinical dosimetric indices showed no appreciable differences between the DL-based and MC-based calculations, indicating that the two methods are clinically equivalent.
CONCLUSION: This study presents a novel MR-only dose calculation engine that eliminates the need for CT images and complex RT processes. By leveraging DL, the proposed method significantly enhances the efficiency and accuracy of the MRgART workflow, particularly for prostate cancer treatment. This approach holds potential for broader applications across different cancer types and MR-linac systems, paving the way for more streamlined and precise radiation therapy planning.
PMID:40089982 | DOI:10.1002/mp.17759
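The 3%/3 mm gamma evaluation reported above can be sketched with the pymedphys package (an assumption; the authors' evaluation tooling is not named): a predicted dose grid is compared against a Monte Carlo reference and the pass rate is the fraction of evaluated points with gamma ≤ 1. The grids and doses below are synthetic placeholders.

```python
# Hypothetical 3%/3 mm gamma analysis of a predicted dose vs an MC reference.
import numpy as np
import pymedphys

grid = (np.arange(20.0), np.arange(20.0), np.arange(20.0))   # 1 mm spacing, toy volume
reference = np.random.default_rng(0).uniform(0.5, 2.0, (20, 20, 20))   # "MC" dose, Gy
evaluation = reference * (1 + np.random.default_rng(1).normal(0, 0.01, reference.shape))

gamma = pymedphys.gamma(grid, reference, grid, evaluation,
                        dose_percent_threshold=3, distance_mm_threshold=3,
                        lower_percent_dose_cutoff=10)
valid = gamma[~np.isnan(gamma)]              # NaN marks points below the dose cutoff
print(f"3%/3 mm gamma pass rate: {np.mean(valid <= 1):.1%}")
```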
Quantitative susceptibility mapping via deep neural networks with iterative reverse concatenations and recurrent modules
Med Phys. 2025 Mar 16. doi: 10.1002/mp.17747. Online ahead of print.
ABSTRACT
BACKGROUND: Quantitative susceptibility mapping (QSM) is a post-processing magnetic resonance imaging (MRI) technique that extracts the distribution of tissue susceptibilities and holds significant promise in the study of neurological diseases. However, the ill-conditioned nature of dipole inversion often results in noise and artifacts during QSM reconstruction from the tissue field. Deep learning methods have shown great potential in addressing these issues; however, most existing approaches rely on basic U-net structures, leading to limited performance and occasional reconstruction artifacts.
PURPOSE: This study aims to develop a novel deep learning-based method, IR2QSM, for improving QSM reconstruction accuracy while mitigating noise and artifacts by leveraging a unique network architecture that enhances latent feature utilization.
METHODS: IR2QSM, an advanced U-net architecture featuring four iterations of reverse concatenations and middle recurrent modules, was proposed to optimize feature fusion and improve QSM accuracy. Comparative experiments based on both simulated and in vivo datasets were carried out to compare IR2QSM with two traditional iterative methods (iLSQR, MEDI) and four recently proposed deep learning methods (U-net, xQSM, LPCNN, and MoDL-QSM).
RESULTS: In this work, IR2QSM outperformed all other methods in reducing artifacts and noise in QSM images. It achieved the best average XSIM (84.81%) in simulations, with improvements of 12.80%, 12.68%, 18.66%, 10.49%, 25.57%, and 19.78% over iLSQR, MEDI, U-net, xQSM, LPCNN, and MoDL-QSM, respectively, and yielded the fewest artifacts and the most visually appealing results on the in vivo data. In addition, it successfully alleviated the over-smoothing and susceptibility underestimation seen in LPCNN results.
CONCLUSION: Overall, the proposed IR2QSM showed superior QSM results compared to iterative and deep learning-based methods, offering a more accurate QSM solution for clinical applications.
PMID:40089979 | DOI:10.1002/mp.17747