Deep learning
Predicting the structures of cyclic peptides containing unnatural amino acids by HighFold2
Brief Bioinform. 2025 May 1;26(3):bbaf202. doi: 10.1093/bib/bbaf202.
ABSTRACT
Cyclic peptides containing unnatural amino acids possess many excellent properties and have become promising candidates in drug discovery. Therefore, accurately predicting the 3D structures of cyclic peptides containing unnatural residues will significantly advance the development of cyclic peptide-based therapeutics. Although deep learning-based structural prediction models have made tremendous progress, these models still cannot predict the structures of cyclic peptides containing unnatural amino acids. To address this gap, we introduce a novel model, HighFold2, built upon the AlphaFold-Multimer framework. HighFold2 first extends the pre-defined rigid groups and their initial atomic coordinates from natural amino acids to unnatural amino acids, thus enabling structural prediction for these residues. It then incorporates an additional neural network to characterize the atom-level features of peptides, allowing for multi-scale modeling of peptide molecules while enabling the distinction between various unnatural amino acids. In addition, HighFold2 constructs a relative position encoding matrix for cyclic peptides based on different cyclization constraints. Beyond training on spatial structures containing unnatural amino acids, HighFold2 also parameterizes the unnatural amino acids so that predicted structures can be relaxed by energy minimization to eliminate clashes. Extensive empirical experiments demonstrate that HighFold2 can accurately predict the 3D structures of cyclic peptide monomers containing unnatural amino acids and their complexes with proteins, with the median Cα RMSD reaching 1.891 Å. All these results indicate the effectiveness of HighFold2, representing a significant advancement in cyclic peptide-based drug discovery.
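For intuition about the cyclization-aware relative position encoding mentioned above, here is a minimal NumPy sketch of one plausible construction for head-to-tail cyclization; the function name, clipping value, and wrapping convention are illustrative assumptions, not HighFold2's published implementation, which also handles other cyclization constraints.

```python
import numpy as np

def cyclic_relative_positions(seq_len: int, clip: int = 32) -> np.ndarray:
    """Signed relative-offset matrix for a head-to-tail cyclic peptide.

    On a ring of length L, the offset between residues i and j is taken
    along the shorter arc, so the first and last residues are neighbours;
    offsets are clipped as in AlphaFold-style relative position features.
    """
    idx = np.arange(seq_len)
    diff = idx[None, :] - idx[:, None]                        # linear offset j - i
    wrapped = (diff + seq_len // 2) % seq_len - seq_len // 2  # shorter arc
    return np.clip(wrapped, -clip, clip)

rel = cyclic_relative_positions(12)
assert rel[0, 11] == -1   # last residue sits next to the first on the ring
```

A linear peptide would instead use the unwrapped `diff`, which is why cyclic topologies need their own encoding matrix.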
PMID:40350698 | DOI:10.1093/bib/bbaf202
Automatic construction of risk transmission network about subway construction based on deep learning models
Sci Rep. 2025 May 11;15(1):16383. doi: 10.1038/s41598-025-99561-0.
ABSTRACT
Safety risk management is a critical part of subway construction. However, conventional methods for risk identification rely heavily on expert experience and fail to effectively identify the relationships between risk factors and events embedded in accident texts, so they provide little substantial guidance for subway safety risk management. Using a dataset of 562 subway construction accidents, this study devised a domain-specific entity recognition model for identifying safety hazards during subway construction. The model was built on a Bidirectional Long Short-Term Memory network with Conditional Random Fields (BiLSTM-CRF). Additionally, a domain-specific entity causal relation extraction model employing Convolutional Neural Networks (CNN) was developed. The constructed models automatically extract safety risk factors, safety events, and their causal relationships from texts about subway accidents. The precision, recall, and F1 scores of the Metro Construction Safety Risk Named Entity Recognition Model (MCSR-NER-Model) all exceeded 77%, a satisfactory performance for specialized-domain named entity recognition (NER) with a limited volume of textual data. The Metro Construction Safety Risk Domain Entity Causal Relationship Extraction Model (MCSR-CE-Model) achieved accuracy, recall, and F1 scores of 98.96%, exhibiting excellent performance. Moreover, the extracted entities were normalized and a domain dictionary was developed. Based on the entities and relationships processed with the domain dictionary, 533 domain entity causal relation triplets were obtained, enabling the establishment of a directed, unweighted complex network and a case database of subway construction risks. This research converted accident texts into a causal chain structure of "safety risk factors to risk events," providing detailed categorization of safety risks and events. Concurrently, it revealed the interrelationships and historical statistical patterns among safety risk factors and categories of risk events through the complex safety risk network. The resulting database supports project managers in making safety risk management decisions.
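As a concrete illustration of the NER component, the following is a minimal PyTorch sketch of a BiLSTM-CRF tagger in the spirit of the MCSR-NER-Model; the layer sizes and the use of the third-party pytorch-crf package are assumptions for illustration, not details from the paper.

```python
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf

class BiLSTMCRF(nn.Module):
    """Skeleton BiLSTM-CRF tagger for risk-entity recognition."""
    def __init__(self, vocab_size: int, num_tags: int, emb=128, hidden=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb, padding_idx=0)
        self.lstm = nn.LSTM(emb, hidden // 2, bidirectional=True,
                            batch_first=True)
        self.fc = nn.Linear(hidden, num_tags)        # per-token tag scores
        self.crf = CRF(num_tags, batch_first=True)   # learns tag transitions

    def loss(self, tokens, tags, mask):
        emissions = self.fc(self.lstm(self.emb(tokens))[0])
        return -self.crf(emissions, tags, mask=mask)   # negative log-likelihood

    def decode(self, tokens, mask):
        emissions = self.fc(self.lstm(self.emb(tokens))[0])
        return self.crf.decode(emissions, mask=mask)   # Viterbi tag paths
```

The CRF layer is what lets the model respect label-sequence constraints (e.g., an I- tag only following a matching B- tag), which a plain token-wise softmax cannot guarantee.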
PMID:40350479 | DOI:10.1038/s41598-025-99561-0
Domain-specific AI segmentation of IMPDH2 rod/ring structures in mouse embryonic stem cells
BMC Biol. 2025 May 12;23(1):126. doi: 10.1186/s12915-025-02226-7.
ABSTRACT
BACKGROUND: Inosine monophosphate dehydrogenase 2 (IMPDH2) is an enzyme that catalyses the rate-limiting step of guanine nucleotide biosynthesis. In mouse embryonic stem cells (ESCs), IMPDH2 forms large multi-protein complexes known as rod-ring (RR) structures that dissociate when ESCs differentiate. Manual analysis of RR structures from confocal microscopy images, although possible, is not feasible on a large scale due to the quantity of RR structures present in each field of view. To address this analysis bottleneck, we have created a fully automatic RR image classification pipeline to segment, characterise and measure feature distributions of these structures in ESCs.
RESULTS: We find that this model can automatically segment images with a Dice score of over 80% for both rods and rings on in-domain images compared to expert annotation, with a slight drop to 70% on out-of-domain datasets. Important feature measurements derived from these segmentations show high agreement with those derived from expert annotation, achieving an R² score of over 90% for counting the number of RRs over the dataset.
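The Dice score reported here has a simple closed form; a minimal NumPy sketch (mask names assumed) for binary rod or ring masks:

```python
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-7) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) between two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return float((2.0 * inter + eps) / (pred.sum() + truth.sum() + eps))
```

A score over 0.80 therefore means that, pixel for pixel, the automatic masks overlap expert annotation far more than they diverge.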
CONCLUSIONS: We have established for the first time a quantitative baseline for RR distribution in pluripotent ESCs and have made the pipeline available for retraining, so it can be applied to other model systems in which RR structures remain an open topic of study.
PMID:40350411 | DOI:10.1186/s12915-025-02226-7
Cine Cardiac Magnetic Resonance Segmentation using Temporal-spatial Adaptation of Prompt-enabled Segment-Anything-Model: A Feasibility Study
J Cardiovasc Magn Reson. 2025 May 9:101909. doi: 10.1016/j.jocmr.2025.101909. Online ahead of print.
ABSTRACT
BACKGROUND: We propose an approach to adapt a segmentation foundation model, the Segment Anything Model (SAM), for cine cardiac magnetic resonance (CMR) segmentation and evaluate its generalization performance on unseen datasets.
METHODS: We present our model, cineCMR-SAM, which introduces a temporal-spatial attention mechanism to produce segmentation across one cardiac cycle. We freeze the pre-trained SAM's weights to leverage SAM's generalizability while fine-tuning the rest of the model on two public cine CMR datasets. Our model also enables text prompts to specify the view type (short-axis or long-axis) of the input slices and box prompts to guide the segmentation region. We evaluated our model's generalization performance on three external testing datasets including a public multi-center, multi-vendor testing dataset of 136 cases and two retrospectively collected in-house datasets from two different centers with specific pathologies: aortic stenosis (40 cases) and heart failure with preserved ejection fraction (HFpEF) (53 cases).
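cineCMR-SAM itself is not reproduced here, but the box-prompt mechanism it inherits can be illustrated with Meta's public segment_anything package; the checkpoint path, box coordinates, and the input frame below are placeholder assumptions.

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")  # placeholder path
predictor = SamPredictor(sam)

cine_frame_rgb = np.zeros((256, 256, 3), dtype=np.uint8)  # stand-in cine frame
predictor.set_image(cine_frame_rgb)
box = np.array([64, 48, 192, 176])           # [x0, y0, x1, y1] around the LV
masks, scores, _ = predictor.predict(box=box, multimask_output=False)
lv_mask = masks[0]                           # binary segmentation proposal
```

The paper's contribution is to wrap this prompting interface with temporal-spatial attention and text prompts so that one pass covers a full cardiac cycle.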
RESULTS: Our approach achieved superior generalization in both the public testing dataset (Dice for LV = 0.94 and for myocardium = 0.86) and two in-house datasets (Dice ≥ 0.90 for LV and ≥ 0.82 for myocardium) compared to existing CMR deep learning segmentation methods. Clinical parameters derived from automatic and manual segmentations showed a strong correlation (r ≥ 0.90). The use of both text prompts and box prompts enhanced the segmentation accuracy.
CONCLUSION: cineCMR-SAM effectively adapts SAM for cine CMR segmentation, achieving high generalizability and superior accuracy on unseen datasets.
PMID:40350082 | DOI:10.1016/j.jocmr.2025.101909
External Validation of an AI Ensemble for Skin Cancer Detection: Enhancing Diagnostic Performance on Dermoscopic Images
J Invest Dermatol. 2025 May 9:S0022-202X(25)00469-5. doi: 10.1016/j.jid.2025.04.021. Online ahead of print.
NO ABSTRACT
PMID:40350056 | DOI:10.1016/j.jid.2025.04.021
A subject transfer neural network fuses Generator and Euclidean Alignment for EEG-based motor imagery classification
J Neurosci Methods. 2025 May 9:110483. doi: 10.1016/j.jneumeth.2025.110483. Online ahead of print.
ABSTRACT
BACKGROUND: Brain-computer interface (BCI) facilitates the connection between human brain and computer, enabling individuals to control external devices indirectly through cognitive processes. Although it has great development prospects, the significant difference in EEG signals among individuals hinders users from further utilizing the BCI system.
NEW METHOD: Addressing this difference and improving BCI classification accuracy remain key challenges. In this paper, we propose a transfer learning model based on deep learning to transfer the data distribution from the source domain to the target domain, named a subject transfer neural network combining the Generator with Euclidean alignment (ST-GENN). It consists of three parts: 1) Align the original EEG signals in the Euclidean space; 2) Send the aligned data to the Generator to obtain the transferred features; 3) Utilize the Convolution-attention-temporal (CAT) classifier to classify the transferred features.
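Step 1, Euclidean alignment, is a published EEG technique with a compact form: whiten each subject's trials by the inverse square root of their mean spatial covariance. A minimal NumPy/SciPy sketch (array shapes assumed):

```python
import numpy as np
from scipy.linalg import fractional_matrix_power

def euclidean_align(trials: np.ndarray) -> np.ndarray:
    """Align EEG trials of shape (n_trials, n_channels, n_samples).

    After alignment, the mean spatial covariance of a subject's trials
    is the identity, shrinking the inter-subject distribution shift.
    """
    ref = np.mean([x @ x.T for x in trials], axis=0)        # mean covariance
    ref_inv_sqrt = np.real(fractional_matrix_power(ref, -0.5))
    return np.stack([ref_inv_sqrt @ x for x in trials])
```

The aligned trials then feed the Generator (step 2) and the CAT classifier (step 3).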
RESULTS: The model is validated on the BCI Competition IV 2a, BCI Competition IV 2b, and SHU datasets, achieving average classification accuracies of 82.85%, 86.28%, and 67.2%, respectively.
COMPARISON WITH EXISTING METHODS: The results are robust to subject variability, with the average accuracy of the proposed method outperforming baseline algorithms by margins ranging from 2.03% to 15.43% on the 2a dataset, 0.86% to 10.16% on the 2b dataset, and 3.3% to 17.9% on the SHU dataset.
CONCLUSIONS: The advantage of our model lies in its ability to effectively transfer the experience and knowledge of the source-domain data to the target domain, thus bridging the gap between them. Our method can improve the practicality of MI-BCI systems.
PMID:40350042 | DOI:10.1016/j.jneumeth.2025.110483
CirnetamorNet: An ultrasonic temperature measurement network for microwave hyperthermia based on deep learning
SLAS Technol. 2025 May 9:100297. doi: 10.1016/j.slast.2025.100297. Online ahead of print.
ABSTRACT
OBJECTIVE: Microwave thermotherapy is a promising approach for cancer treatment, but accurate noninvasive temperature monitoring remains challenging. This study aims to achieve accurate temperature prediction during microwave thermotherapy by efficiently integrating multi-feature data, thereby improving the accuracy and reliability of noninvasive thermometry techniques.
METHODS: We proposed an enhanced recurrent neural network architecture, CirnetamorNet. The experimental data acquisition system was developed using a tissue-mimicking material to construct a body phantom. Ultrasonic image data at different temperatures were collected, and five parameters with high temperature correlation were extracted from the gray-level co-occurrence matrix and the Homodyned-K distribution. Using the multi-feature data as input and temperature prediction as output, the CirnetamorNet model was built around a multi-head attention mechanism. Model performance was evaluated by analyzing training loss, prediction mean squared error, and accuracy, and ablation experiments were performed to evaluate the contribution of each module.
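For the texture-feature step, here is a minimal scikit-image sketch of parameters derived from a gray-level co-occurrence matrix; the distances, angles, and property list are illustrative assumptions, and the Homodyned-K parameters the paper also uses are not shown.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(patch_u8: np.ndarray) -> dict:
    """Texture descriptors from a gray-level co-occurrence matrix of an
    8-bit ultrasound patch; property names follow scikit-image."""
    glcm = graycomatrix(patch_u8, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    props = ("contrast", "homogeneity", "energy", "correlation")
    return {p: float(graycoprops(glcm, p).mean()) for p in props}
```

Parameters of this kind are the temperature-correlated inputs that the multi-feature model consumes.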
RESULTS: Compared with common models, the CirnetamorNet model performs well, with a training loss as low as 1.4589 and a mean squared error of only 0.1856. Its temperature prediction accuracy of 0.3 °C surpasses that of many advanced models. Ablation experiments show that removing any key module degrades performance, demonstrating that the collaboration of all modules is essential to the model's performance.
CONCLUSION: The proposed CirnetamorNet model exhibits exceptional performance in noninvasive thermometry for microwave thermotherapy. It offers a novel approach to multi-feature data fusion in the medical field and holds significant practical application value.
PMID:40350037 | DOI:10.1016/j.slast.2025.100297
Deep-Learning Method for the Diagnosis and Classification of Orbital Blowout Fracture Based on Computed Tomography
J Oral Maxillofac Surg. 2025 Apr 23:S0278-2391(25)00243-5. doi: 10.1016/j.joms.2025.04.010. Online ahead of print.
ABSTRACT
BACKGROUND: Blowout fractures (BOFs) are common injuries. Accurate and rapid diagnosis based on computed tomography (CT) is important for proper management. Deep-learning techniques can contribute to accelerating the diagnostic process and supporting timely and accurate management, particularly in environments with limited medical resources.
PURPOSE: The purpose of this retrospective in-silico cohort study was to develop deep-learning models for detecting and classifying BOF using facial CT.
STUDY DESIGN, SETTING, AND SAMPLE: We conducted a retrospective analysis of facial CT from patients diagnosed with BOF involving the medial wall, orbital floor, or both at Konkuk University Hospital between December 2005 and April 2024. Patients with other facial fractures or those involving the superior or lateral orbital walls were excluded.
PREDICTOR VARIABLE: The predictor variables were the outputs of the deep-learning models, namely each model's predicted category: 1) fracture status (normal or BOF), 2) fracture location (medial, inferior, or inferomedial), and 3) fracture timing (acute or old).
MAIN OUTCOME VARIABLES: The main outcomes were the human assessments serving as the gold standard, including the presence or absence of BOF, fracture location, and timing.
COVARIATES: The covariates were age and sex.
ANALYSES: Model performance was evaluated using the following metrics: 1) accuracy, 2) positive predictive value (PPV), 3) sensitivity, 4) F1 score (harmonic average between PPV and sensitivity), and 5) area under the receiver operating characteristic curve (AUC) for classification models.
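These five metrics map directly onto scikit-learn calls; a minimal sketch for the binary detection task (variable names assumed):

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

def bof_metrics(y_true, y_pred, y_score) -> dict:
    """y_true: gold labels, y_pred: predicted classes, y_score: BOF probability."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "ppv": precision_score(y_true, y_pred),        # positive predictive value
        "sensitivity": recall_score(y_true, y_pred),   # a.k.a. recall
        "f1": f1_score(y_true, y_pred),                # harmonic mean of PPV/sens.
        "auc": roc_auc_score(y_true, y_score),
    }
```

The multiclass location and timing models would use the same calls with an `average=` argument for per-class aggregation.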
RESULTS: This study analyzed 1,264 facial CT images from 233 patients, with multiple CT slices taken from each patient in various coronal views (mean age: 37.5 ± 17.9 years; 79.8% male, 186 subjects). Based on these data, 3 deep-learning models were developed for 1) BOF detection (accuracy 99.5%, PPV 99.2%, sensitivity 99.6%, F1 score 99.4%, AUC 0.9999), 2) BOF location (medial, inferior, or inferomedial; accuracy 97.4%, PPV 92.7%, sensitivity 89.0%, F1 score 90.8%), and 3) BOF timing (accuracy 96.8%, PPV 90.1%, sensitivity 89.7%, F1 score 89.9%).
CONCLUSIONS AND RELEVANCE: Deep-learning models developed with Neuro-T (Neurocle Inc, Seoul, Republic of Korea) can reliably diagnose and classify BOF in CT, distinguishing acute from old fractures and aiding clinical decision-making.
PMID:40349723 | DOI:10.1016/j.joms.2025.04.010
AI Applications in Transfusion Medicine: Opportunities, Challenges, and Future Directions
Acta Haematol. 2025 May 9:1-20. doi: 10.1159/000546303. Online ahead of print.
ABSTRACT
Artificial intelligence (AI) is reshaping healthcare, with its applications in transfusion medicine showing great promise to address longstanding challenges. This review explores the integration of AI-driven tools, including Machine Learning (ML), Deep Learning, Natural Language Processing (NLP), and predictive analytics, across various domains of transfusion medicine. From enhancing donor management and optimizing blood product quality to predicting transfusion needs and assessing bleeding risks, AI has demonstrated its potential to improve operational efficiency, patient safety, and resource allocation. Additionally, AI-powered systems enable more accurate blood antigen phenotyping, automate hemovigilance workflows, and streamline inventory management through advanced forecasting models. While these advancements are largely exploratory, early studies highlight the growing importance of AI in improving patient outcomes and advancing precision medicine. However, challenges such as variability in clinical workflows, algorithmic transparency, equitable access, and ethical concerns around data privacy and bias must be addressed to ensure responsible integration. Future directions in this rapidly evolving field include refining AI models for scalability and exploring emerging areas such as federated learning and AI-driven clinical trials. By addressing these challenges, AI has the potential to redefine transfusion medicine, delivering safer, more efficient, and equitable practices worldwide.
PMID:40349705 | DOI:10.1159/000546303
Machine learning-based approaches for distinguishing viral and bacterial pneumonia in paediatrics: A scoping review
Comput Methods Programs Biomed. 2025 May 8;268:108802. doi: 10.1016/j.cmpb.2025.108802. Online ahead of print.
ABSTRACT
BACKGROUND AND OBJECTIVE: Pneumonia is the leading cause of hospitalisation and mortality among children under five, particularly in low-resource settings. Accurate differentiation between viral and bacterial pneumonia is essential for guiding appropriate treatment, yet it remains challenging due to overlapping clinical and radiographic features. Advances in machine learning (ML), particularly deep learning (DL), have shown promise in classifying pneumonia using chest X-ray (CXR) images. This scoping review summarises the evidence on ML techniques for classifying viral and bacterial pneumonia using CXR images in paediatric patients.
METHODS: This scoping review was conducted following the Joanna Briggs Institute methodology and the PRISMA-ScR guidelines. A comprehensive search was performed in PubMed, Embase, and Scopus to identify studies involving children (0-18 years) with pneumonia diagnosed through CXR, using ML models for binary or multiclass classification. Data extraction included ML models, dataset characteristics, and performance metrics.
RESULTS: A total of 35 studies, published between 2018 and 2025, were included in this review. Of these, 31 studies used the publicly available Kermany dataset, raising concerns about overfitting and limited generalisability to broader, real-world clinical populations. Most studies (n=33) used convolutional neural networks (CNNs) for pneumonia classification. While many models demonstrated promising performance, significant variability was observed due to differences in methodologies, dataset sizes, and validation strategies, complicating direct comparisons. For binary classification (viral vs bacterial pneumonia), a median accuracy of 92.3% (range: 80.8% to 97.9%) was reported. For multiclass classification (healthy, viral pneumonia, and bacterial pneumonia), the median accuracy was 91.8% (range: 76.8% to 99.7%).
CONCLUSIONS: Current evidence is constrained by a predominant reliance on a single dataset and variability in methodologies, which limit the generalisability and clinical applicability of findings. To address these limitations, future research should focus on developing diverse and representative datasets while adhering to standardised reporting guidelines. Such efforts are essential to improve the reliability, reproducibility, and translational potential of machine learning models in clinical settings.
PMID:40349546 | DOI:10.1016/j.cmpb.2025.108802
Intelligent transformation of ultrasound-assisted novel solvent extraction plant active ingredients: Tools for machine learning and deep learning
Food Chem. 2025 May 7;486:144649. doi: 10.1016/j.foodchem.2025.144649. Online ahead of print.
ABSTRACT
Ultrasound-assisted novel solvent extraction enhances plant bioactive compound yield via cavitation, mechanical, and thermal mechanisms. However, the high designability of novel solvents, the many factors influencing extraction outcomes, the complexity of extraction mechanisms, and the safety of extraction equipment still pose many challenges for ultrasound-assisted extraction (UAE). This review highlights advances in using machine learning and deep learning models to provide actionable solutions to these challenges, including accelerating novel solvent screening, promoting the discovery of active ingredients, optimizing complex extraction processes, enabling in-depth analysis of extraction mechanisms, and supporting real-time monitoring of ultrasound equipment. Challenges such as model interpretability, dataset standardization, and industrial scalability are discussed. Future opportunities lie in developing universal predictive frameworks for ultrasound-related technologies and fostering cross-disciplinary integration of AI, computational chemistry, and sustainable engineering. This interdisciplinary approach aligns with the goals of Industry 5.0, fostering a transition toward digitized, eco-efficient, and intelligent extraction systems.
PMID:40349518 | DOI:10.1016/j.foodchem.2025.144649
Automated vertebrae identification and segmentation with structural uncertainty analysis in longitudinal CT scans of patients with multiple myeloma
Eur J Radiol. 2025 May 3;188:112160. doi: 10.1016/j.ejrad.2025.112160. Online ahead of print.
ABSTRACT
OBJECTIVES: To optimize deep learning-based vertebrae segmentation in longitudinal CT scans of multiple myeloma patients using structural uncertainty analysis.
MATERIALS & METHODS: Retrospective CT scans from 474 multiple myeloma patients were divided into train (179 patients, 349 scans, 2005-2011) and test cohort (295 patients, 671 scans, 2012-2020). An enhanced segmentation pipeline was developed on the train cohort. It integrated vertebrae segmentation using an open-source deep learning method (Payer's) with a post-hoc structural uncertainty analysis. This analysis identified inconsistencies, automatically correcting them or flagging uncertain regions for human review. Segmentation quality was assessed through vertebral shape analysis using topology. Metrics included 'identification rate', 'longitudinal vertebral match rate', 'success rate' and 'series success rate' and evaluated across age/sex subgroups. Statistical analysis included McNemar and Wilcoxon signed-rank tests, with p < 0.05 indicating significant improvement.
RESULTS: Payer's method achieved an identification rate of 95.8% and success rate of 86.7%. The proposed pipeline automatically improved these metrics to 98.8% and 96.0%, respectively (p < 0.001). Additionally, 3.6% of scans were marked for human inspection, increasing the success rate from 96.0% to 98.8% (p < 0.001). The vertebral match rate increased from 97.0% to 99.7% (p < 0.001), and the series success rate from 80.0% to 95.4% (p < 0.001). Subgroup analysis showed more consistent performance across age and sex groups.
CONCLUSION: The proposed pipeline significantly outperforms Payer's method, enhancing segmentation accuracy and reducing longitudinal matching errors while minimizing evaluation workload. Its uncertainty analysis ensures robust performance, making it a valuable tool for longitudinal studies in multiple myeloma.
PMID:40349413 | DOI:10.1016/j.ejrad.2025.112160
iEnhancer-DS: Attention-based improved densenet for identifying enhancers and their strength
Comput Biol Chem. 2025 May 5;118:108484. doi: 10.1016/j.compbiolchem.2025.108484. Online ahead of print.
ABSTRACT
Enhancers are short DNA fragments that enhance gene expression by binding to transcription factors. Accurately identifying enhancers and their strength is crucial for understanding gene regulation mechanisms. However, traditional enhancer sequencing techniques are costly and time-consuming. Therefore, it is necessary to develop computational methods to quickly and accurately identify enhancers and their strength. Given the limitations of existing computational methods, such as low performance and complex encoding, this study proposes a deep learning-based multi-task framework, iEnhancer-DS, for enhancer identification and their strength classification. First, feature embeddings characterizing DNA sequences are obtained using one-hot encoding and nucleotide chemical properties (NCP). Next, an improved DenseNet module is applied to learn implicit high-order features from the concatenated feature embeddings. Subsequently, the self-attention mechanism is used to dynamically assess the importance of features and assign weights to them, and then the features are passed to the multilayer perceptron (MLP) to calculate the prediction probabilities. Experimental results show that iEnhancer-DS achieves state-of-the-art performance in both enhancer identification and strength prediction. In the enhancer identification task, iEnhancer-DS improves ACC and MCC by 4.03% and 8.47% respectively over the current state-of-the-art methods. Similarly, in the enhancer strength prediction task, the ACC and MCC values of iEnhancer-DS increased by 1.40% and 3.81%, respectively. In addition, we used the t-SNE method to perform an interpretable analysis of the mechanism of action of iEnhancer-DS. The detailed code and raw data of iEnhancer-DS can be obtained from https://github.com/zha12ja/iEnhancer-DS.
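The encoding step is concrete enough to sketch. Below is one common convention for combining one-hot and nucleotide chemical property (NCP) channels, assuming the widely used NCP triplet (ring structure, hydrogen bond, chemical functionality) with A=(1,1,1), C=(0,1,0), G=(1,0,0), T=(0,0,1); the paper's exact ordering may differ.

```python
import numpy as np

NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "T": (0, 0, 1)}
ONE_HOT = {b: np.eye(4)[i] for i, b in enumerate("ACGT")}

def encode(seq: str) -> np.ndarray:
    """Encode a DNA sequence as an (L, 7) matrix: 4 one-hot + 3 NCP channels."""
    return np.array([np.concatenate([ONE_HOT[b], NCP[b]]) for b in seq.upper()])

x = encode("GATTACA")   # shape (7, 7), ready for a DenseNet-style encoder
```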
PMID:40349379 | DOI:10.1016/j.compbiolchem.2025.108484
Enhancing segmentation accuracy of the common iliac vein in OLIF51 surgery in intraoperative endoscopic video through gamma correction: a deep learning approach
Int J Comput Assist Radiol Surg. 2025 May 11. doi: 10.1007/s11548-025-03388-z. Online ahead of print.
ABSTRACT
PURPOSE: The principal objective of this study was to develop and evaluate a deep learning model for segmenting the common iliac vein (CIV) from intraoperative endoscopic videos during oblique lateral interbody fusion for L5/S1 (OLIF51), a minimally invasive surgical procedure for degenerative lumbosacral spine diseases. The study aimed to address the challenge of intraoperative differentiation of the CIV from surrounding tissues to minimize the risk of vascular damage during the surgery.
METHODS: We employed two convolutional neural network (CNN) architectures: U-Net and U-Net++ with a ResNet18 backbone, for semantic segmentation. Gamma correction was applied during image preprocessing to improve luminance contrast between the CIV and adjacent tissues. We used a dataset of 614 endoscopic images from OLIF51 surgeries for model training, validation, and testing.
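Gamma correction itself is a one-line lookup-table transform; a minimal OpenCV sketch (the gamma value would be tuned to the endoscope's luminance range):

```python
import cv2
import numpy as np

def gamma_correct(img_u8: np.ndarray, gamma: float) -> np.ndarray:
    """Apply out = in**gamma (on [0, 1]) to an 8-bit frame via a LUT.

    gamma < 1 brightens dark regions and gamma > 1 darkens bright ones,
    either of which can raise the vein-to-background luminance contrast.
    """
    lut = (255.0 * (np.arange(256) / 255.0) ** gamma).astype(np.uint8)
    return cv2.LUT(img_u8, lut)
```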
RESULTS: The U-Net++/ResNet18 model outperformed the U-Net/ResNet18 model, achieving a Dice score of 0.70 versus 0.59, indicating a superior ability to delineate the position and shape of the CIV. Gamma correction increased the differentiation between the CIV and the artery, improving the Dice score from 0.44 to 0.70.
CONCLUSION: The findings demonstrate that deep learning models, especially the U-Net++ with ResNet18 enhanced by gamma correction preprocessing, can effectively segment the CIV in intraoperative videos. This approach has the potential to significantly improve intraoperative assistance and reduce the risk of vascular injury during OLIF51 procedures, despite the need for further research and refinement of the model for clinical application.
PMID:40349282 | DOI:10.1007/s11548-025-03388-z
Ultrasound-based deep learning radiomics for enhanced axillary lymph node metastasis assessment: a multicenter study
Oncologist. 2025 May 8;30(5):oyaf090. doi: 10.1093/oncolo/oyaf090.
ABSTRACT
BACKGROUND: Accurate preoperative assessment of axillary lymph node metastasis (ALNM) in breast cancer is crucial for guiding treatment decisions. This study aimed to develop a deep-learning radiomics model for assessing ALNM and to evaluate its impact on radiologists' diagnostic accuracy.
METHODS: This multicenter study included 866 breast cancer patients from 6 hospitals. The data were categorized into training, internal test, external test, and prospective test sets. Deep learning and handcrafted radiomics features were extracted from ultrasound images of primary tumors and lymph nodes. The tumor score and LN score were calculated following feature selection, and a clinical-radiomics model was constructed based on these scores along with clinical-ultrasonic risk factors. The model's performance was validated across the 3 test sets. Additionally, the diagnostic performance of radiologists, with and without model assistance, was evaluated.
RESULTS: The clinical-radiomics model demonstrated robust discrimination with AUCs of 0.94, 0.92, 0.91, and 0.95 in the training, internal test, external test, and prospective test sets, respectively. It surpassed the clinical model and single score in all sets (P < .05). Decision curve analysis and clinical impact curves validated the clinical utility of the clinical-radiomics model. Moreover, the model significantly improved radiologists' diagnostic accuracy, with AUCs increasing from 0.71 to 0.82 for the junior radiologist and from 0.75 to 0.85 for the senior radiologist.
CONCLUSIONS: The clinical-radiomics model effectively predicts ALNM in breast cancer patients using noninvasive ultrasound features. Additionally, it enhances radiologists' diagnostic accuracy, potentially optimizing resource allocation in breast cancer management.
PMID:40349137 | DOI:10.1093/oncolo/oyaf090
A novel framework for esophageal cancer grading: combining CT imaging, radiomics, reproducibility, and deep learning insights
BMC Gastroenterol. 2025 May 10;25(1):356. doi: 10.1186/s12876-025-03952-6.
ABSTRACT
OBJECTIVE: This study aims to create a reliable framework for grading esophageal cancer. The framework combines feature extraction, deep learning with attention mechanisms, and radiomics to ensure accuracy, interpretability, and practical use in tumor analysis.
MATERIALS AND METHODS: This retrospective study used data from 2,560 esophageal cancer patients across multiple clinical centers, collected from 2018 to 2023. The dataset included CT scan images and clinical information, representing a variety of cancer grades and types. Standardized CT imaging protocols were followed, and experienced radiologists manually segmented the tumor regions. Only high-quality data were used in the study. A total of 215 radiomic features were extracted using the SERA platform. The study used two deep learning models, DenseNet121 and EfficientNet-B0, enhanced with attention mechanisms to improve accuracy. A combined classification approach used both radiomic and deep learning features, and machine learning models such as Random Forest, XGBoost, and CatBoost were applied. These models were validated with strict training and testing procedures to ensure effective cancer grading.
RESULTS: This study analyzed the reliability and performance of radiomic and deep learning features for grading esophageal cancer. Radiomic features were classified into four reliability levels based on their intraclass correlation coefficient (ICC) values; most features had excellent (ICC > 0.90) or good (0.75 < ICC ≤ 0.90) reliability. Deep learning features extracted from DenseNet121 and EfficientNet-B0 were also categorized, and some showed poor reliability. The machine learning models, including XGBoost and CatBoost, were tested for their ability to grade cancer. XGBoost with Recursive Feature Elimination (RFE) gave the best results for radiomic features, with an AUC (area under the curve) of 91.36%. For deep learning features, XGBoost with Principal Component Analysis (PCA) performed best with DenseNet121, while CatBoost with RFE performed best with EfficientNet-B0, achieving an AUC of 94.20%. Combining radiomic and deep features led to significant improvements, with XGBoost achieving the highest AUC of 96.70%, accuracy of 96.71%, and sensitivity of 95.44%. Ensembling the DenseNet121 and EfficientNet-B0 models achieved the best performance among the deep-feature models, with an AUC of 95.14% and accuracy of 94.88%.
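The ICC binning described above is easy to make explicit; a minimal sketch using the stated cut-offs, with the moderate/poor boundaries assumed from the common Koo & Li convention (the ICC values themselves can be computed with, e.g., pingouin.intraclass_corr):

```python
def icc_reliability(icc: float) -> str:
    """Bin an intraclass correlation coefficient into reliability levels."""
    if icc > 0.90:
        return "excellent"      # stated in the abstract
    if icc > 0.75:
        return "good"           # stated in the abstract
    if icc >= 0.50:
        return "moderate"       # assumed Koo & Li cut-off
    return "poor"               # assumed Koo & Li cut-off

assert icc_reliability(0.93) == "excellent"
```

Features falling in the lower bins are natural candidates to drop before RFE or PCA feature selection.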
CONCLUSIONS: This study improves esophageal cancer grading by combining radiomics and deep learning. It enhances diagnostic accuracy, reproducibility, and interpretability, while also helping in personalized treatment planning through better tumor characterization.
CLINICAL TRIAL NUMBER: Not applicable.
PMID:40348987 | DOI:10.1186/s12876-025-03952-6
Research and application of deep learning object detection methods for forest fire smoke recognition
Sci Rep. 2025 May 10;15(1):16328. doi: 10.1038/s41598-025-98086-w.
ABSTRACT
Forest fires are severe ecological disasters worldwide that cause extensive ecological destruction and economic losses while threatening biodiversity and human safety. With the escalation of climate change, the frequency and intensity of forest fires are increasing annually, underscoring the urgent need for effective monitoring and early warning systems. This study investigates the effectiveness of deep learning-based object detection for forest fire smoke recognition, using the YOLOv11x algorithm to develop an efficient fire detection model, with the objective of enhancing early fire detection and mitigating potential damage. To improve the model's applicability and generalizability, two publicly available fire image datasets, WD (Wildfire Dataset) and FFS (Forest Fire Smoke), encompassing various complex scenarios and external conditions, were employed. After 501 training epochs, the model's detection performance was comprehensively evaluated via multiple metrics, including precision, recall, and mean average precision (mAP50 and mAP50-95). The results demonstrate that YOLOv11x excels in bounding box loss (box loss), classification loss (cls loss), and distribution focal loss (dfl loss), indicating effective optimization of object detection performance across multiple dimensions. Specifically, the model achieved a precision of 0.949, a recall of 0.850, an mAP50 of 0.901, and an mAP50-95 of 0.786, highlighting its high detection accuracy and stability. Analysis of the precision-recall (PR) curve yielded an average mAP@0.5 of 0.901, further confirming the effectiveness of YOLOv11x in fire smoke detection. Notably, the mAP@0.5 for the smoke category reached 0.962, whereas for the flame category it was 0.841, indicating superior performance in smoke detection compared with flame detection. This disparity reflects the distinct visual characteristics of the two classes: flames possess vivid colors and defined shapes, whereas smoke exhibits more ambiguous and variable textures and shapes, so the two pose different detection challenges. In the test set, 86.89% of the samples had confidence scores exceeding 0.85, further validating the model's reliability. In summary, the YOLOv11x algorithm demonstrates excellent performance and broad application potential in forest fire smoke recognition, providing robust technical support for early fire warning systems and offering valuable insights for the design of intelligent monitoring systems in related fields.
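For readers who want to reproduce the training setup conceptually, a minimal sketch with the Ultralytics package that ships YOLO11 models; the dataset YAML name and image path are placeholders, and the study's hyperparameters beyond the 501 epochs are not specified in the abstract.

```python
from ultralytics import YOLO

model = YOLO("yolo11x.pt")                  # pre-trained YOLO11x weights
model.train(data="fire_smoke.yaml",         # placeholder WD+FFS dataset config
            epochs=501, imgsz=640)

metrics = model.val()                       # precision / recall / mAP50 / mAP50-95
results = model.predict("forest_cam.jpg",   # placeholder test image
                        conf=0.85)          # keep only confident detections
```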
PMID:40348915 | DOI:10.1038/s41598-025-98086-w
A new deep learning-based fast transcoding for internet of things applications
Sci Rep. 2025 May 10;15(1):16325. doi: 10.1038/s41598-025-99533-4.
ABSTRACT
To achieve low-power video communication in the Internet of Things, this study presents a new deep learning-based fast transcoding algorithm from distributed video coding (DVC) to high efficiency video coding (HEVC). The proposed method accelerates transcoding by minimizing HEVC encoding complexity. Specifically, it models the selection of coding unit (CU) partitions and prediction unit (PU) partition modes as classification tasks. To address these tasks, a novel lightweight deep learning network was developed to act as the classifier in a top-down transcoding strategy for improved efficiency. The proposed transcoding algorithm operates efficiently at both the CU and PU levels. At the CU level, it reduces HEVC encoding complexity by accurately predicting CU partitions. At the PU level, predicting PU partition modes for non-split CUs further streamlines the encoding process. Experimental results demonstrate that the proposed CU-level transcoding reduces complexity overhead by 45.69%, with a 1.33% average Bjøntegaard delta bit-rate (BD-BR) increase. At the PU level, the transcoding achieves an even greater complexity reduction, averaging 60.97%, with a 2.16% average BD-BR increase. These results highlight the algorithm's efficiency in balancing computational cost and compression performance. The proposed method provides a promising low-power video coding scheme for resource-constrained terminals in both upstream and downstream video communication scenarios.
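To make the classification framing concrete, here is a toy PyTorch sketch of a lightweight CU-split classifier; the architecture, block size, and channel counts are illustrative stand-ins for the paper's unspecified network.

```python
import torch
import torch.nn as nn

class CUSplitNet(nn.Module):
    """Binary classifier: should a 64x64 luma coding unit be split?"""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 2)          # classes: {no-split, split}

    def forward(self, cu_luma):               # (N, 1, 64, 64) luma blocks
        return self.head(self.features(cu_luma).flatten(1))

logits = CUSplitNet()(torch.randn(8, 1, 64, 64))   # -> (8, 2)
```

In a top-down strategy the same idea recurses: only CUs classified as "split" are subdivided and re-classified, and PU mode prediction runs on the CUs that stop splitting.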
PMID:40348899 | DOI:10.1038/s41598-025-99533-4
Performance of fully automated deep-learning-based coronary artery calcium scoring in ECG-gated calcium CT and non-gated low-dose chest CT
Eur Radiol. 2025 May 10. doi: 10.1007/s00330-025-11559-4. Online ahead of print.
ABSTRACT
OBJECTIVES: This study aimed to validate the agreement and diagnostic performance of a deep-learning-based coronary artery calcium scoring (DL-CACS) system for ECG-gated and non-gated low-dose chest CT (LDCT) across multivendor datasets.
MATERIALS AND METHODS: In this retrospective study, datasets from Seoul National University Hospital (SNUH, 652 paired ECG-gated and non-gated CT scans) and the Stanford public dataset (425 ECG-gated and 199 non-gated CT scans) were analyzed. Agreement metrics included intraclass correlation coefficient (ICC), coefficient of determination (R²), and categorical agreement (κ). Diagnostic performance was assessed using categorical accuracy and the area under the receiver operating characteristic curve (AUROC).
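Two of the agreement metrics are quick to compute once per-scan scores are paired; a minimal scikit-learn sketch, with the Agatston risk-category cut-offs (0, 1-100, 101-400, >400) assumed rather than taken from the paper:

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, r2_score

def cac_category(scores: np.ndarray) -> np.ndarray:
    """Map Agatston scores to categories 0-3 (assumed cut-offs)."""
    return np.digitize(scores, bins=[0.5, 100.5, 400.5])

ref = np.array([0.0, 12.0, 250.0, 800.0])    # toy reference scores
ai = np.array([0.0, 15.0, 230.0, 760.0])     # toy DL-CACS scores
print(r2_score(ref, ai),                     # continuous agreement (R²)
      cohen_kappa_score(cac_category(ref), cac_category(ai),
                        weights="linear"))   # categorical agreement (κ)
```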
RESULTS: DL-CACS demonstrated excellent performance for ECG-gated CT in both datasets (SNUH: R² = 0.995, ICC = 0.997, κ = 0.97, AUROC = 0.99; Stanford: R² = 0.989, ICC = 0.990, κ = 0.97, AUROC = 0.99). For non-gated CT using manual LDCT CAC scores as a reference, performance was similarly high (R² = 0.988, ICC = 0.994, κ = 0.96, AUROC = 0.98-0.99). When using ECG-gated CT scores as the reference, performance for non-gated CT was slightly lower but remained robust (SNUH: R² = 0.948, ICC = 0.968, κ = 0.88, AUROC = 0.98-0.99; Stanford: R² = 0.949, ICC = 0.948, κ = 0.71, AUROC = 0.89-0.98).
CONCLUSION: DL-CACS provides a reliable and automated solution for CACS, potentially reducing workload while maintaining robust performance in both ECG-gated and non-gated CT settings.
KEY POINTS:
Question: How accurate and reliable is deep-learning-based coronary artery calcium scoring (DL-CACS) in ECG-gated CT and non-gated low-dose chest CT (LDCT) across multivendor datasets?
Findings: DL-CACS showed near-perfect performance for ECG-gated CT. For non-gated LDCT, performance was excellent using manual scores as the reference and lower but reliable when using ECG-gated CT scores.
Clinical relevance: DL-CACS provides a reliable and automated solution for CACS, potentially reducing workload and improving diagnostic workflow. It supports cardiovascular risk stratification and broader clinical adoption, especially in settings where ECG-gated CT is unavailable.
PMID:40348882 | DOI:10.1007/s00330-025-11559-4
Multimodal anomaly detection in complex environments using video and audio fusion
Sci Rep. 2025 May 10;15(1):16291. doi: 10.1038/s41598-025-01146-4.
ABSTRACT
Due to complex environmental conditions and varying noise levels, traditional models are limited in their effectiveness for detecting anomalies in video sequences. To address the accuracy, robustness, and real-time processing requirements of image and video processing, this study proposes a deep learning-based anomaly detection and recognition algorithm for video image data. The algorithm combines innovative spatio-temporal feature extraction and noise suppression methods and aims to improve processing performance, especially in complex environments, by introducing an improved Variational Autoencoder (VAE) structure. The model, named Spatio-Temporal Anomaly Detection Network (STADNet), captures the spatio-temporal features of video images through a multi-scale three-dimensional (3D) convolution module and a spatio-temporal attention mechanism, improving the accuracy of anomaly detection. A multi-stream network architecture and a cross-attention fusion mechanism are also adopted to jointly consider factors such as color, texture, and motion, further improving the robustness and generalization ability of the model. The experimental results show that, compared with existing models, the new model has clear advantages in performance stability and real-time processing under different noise levels. Specifically, the AUC of the proposed model is 0.95 on the UCSD Ped2 dataset, about 10% higher than that of other models, and 0.93 on the Avenue dataset, about 12% higher. This study not only proposes an effective image and video processing scheme but also demonstrates wide practical potential, providing a new perspective and methodological basis for future research and application in related fields.
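As a rough illustration of the multi-scale 3D convolution plus spatio-temporal attention idea, a toy PyTorch block follows; STADNet's actual layer structure, VAE components, and multi-stream fusion are not public in this abstract, so everything here is an assumed sketch.

```python
import torch
import torch.nn as nn

class SpatioTemporalBlock(nn.Module):
    """Two parallel 3D convolutions (different receptive fields) whose
    concatenated features are re-weighted by a per-voxel attention gate."""
    def __init__(self, c_in=3, c_out=32):
        super().__init__()
        self.small = nn.Conv3d(c_in, c_out // 2, kernel_size=3, padding=1)
        self.large = nn.Conv3d(c_in, c_out // 2, kernel_size=5, padding=2)
        self.gate = nn.Sequential(nn.Conv3d(c_out, 1, kernel_size=1),
                                  nn.Sigmoid())   # attention over (T, H, W)

    def forward(self, clip):                      # (N, C, T, H, W) video clip
        feats = torch.cat([self.small(clip), self.large(clip)], dim=1)
        return feats * self.gate(feats)           # attention-weighted features

out = SpatioTemporalBlock()(torch.randn(2, 3, 8, 64, 64))  # (2, 32, 8, 64, 64)
```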
PMID:40348836 | DOI:10.1038/s41598-025-01146-4