Deep learning
Using Traditional and Deep Machine Learning to Predict Emergency Room Triage Levels
J Comput Biol. 2025 May 22. doi: 10.1089/cmb.2024.0632. Online ahead of print.
ABSTRACT
Accurate triage in emergency rooms is crucial for efficient patient care and resource allocation. We developed methods to predict triage levels using several traditional machine learning methods (logistic regression, random forest, XGBoost) and neural network deep learning-based approaches. These models were tested on a dataset from emergency department visits of patients at a local Turkish hospital; this dataset consists of both structured and unstructured data. Compared with previous work, our challenge was to build a predictive model that uses documents written in the Turkish language and that handles specific aspects of the Turkish medical system. Text embedding techniques such as Bag of Words, Word2Vec, and BERT-based embedding were used to process the unstructured patient complaints. We used a comprehensive set of features including patient history data and disease diagnosis within our predictive models, which included advanced neural network architectures such as convolutional neural networks, attention mechanisms, and long short-term memory networks. Our results revealed that BERT embeddings significantly enhanced the performance of neural network models, while Word2Vec embeddings showed slightly better results in traditional machine learning models. The most effective model was XGBoost combined with Word2Vec embeddings, achieving 86.7% AUC, 81.5% accuracy, and 68.7% weighted F1 score. We conclude that text embedding methods and machine learning methods are effective tools to predict emergency room triage levels. The integration of patient history into the models, alongside the strategic use of text embeddings, significantly improves predictive accuracy.
PMID:40401726 | DOI:10.1089/cmb.2024.0632
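To make the winning pipeline concrete, below is a minimal sketch, assuming a toy tokenized-complaint corpus and placeholder triage labels, of mean-pooled Word2Vec features feeding an XGBoost classifier (this is not the authors' code; in practice, structured features such as vitals and history would be concatenated to the embedding):

```python
# Sketch: Word2Vec complaint embeddings + XGBoost triage classifier.
import numpy as np
from gensim.models import Word2Vec
from xgboost import XGBClassifier

# Hypothetical tokenized complaints with triage labels (0=green, 1=yellow, 2=red).
complaints = [["chest", "pain", "sweating"],
              ["mild", "headache"],
              ["ankle", "sprain"]]
labels = [2, 1, 0]

# Train a small Word2Vec model on the complaint corpus.
w2v = Word2Vec(sentences=complaints, vector_size=50, min_count=1, epochs=20)

def embed(tokens):
    """Mean-pool word vectors into one fixed-size feature per complaint."""
    vecs = [w2v.wv[t] for t in tokens if t in w2v.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.vector_size)

X = np.vstack([embed(c) for c in complaints])
clf = XGBClassifier(n_estimators=100, max_depth=4, eval_metric="mlogloss")
clf.fit(X, labels)
print(clf.predict(X))
```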
Machine learning models for pharmacogenomic variant effect predictions - recent developments and future frontiers
Pharmacogenomics. 2025 May 22:1-12. doi: 10.1080/14622416.2025.2504863. Online ahead of print.
ABSTRACT
Pharmacogenomic variations in genes involved in drug disposition and in drug targets are a major determinant of inter-individual differences in drug response and toxicity. While the effects of common variants are well established, millions of rare variations remain functionally uncharacterized, posing a challenge for the implementation of precision medicine. Recent advances in machine learning (ML) have significantly enhanced the prediction of variant effects by considering DNA as well as protein sequences, along with their evolutionary conservation and haplotype structures. Emerging deep learning models utilize techniques to capture evolutionary conservation and biophysical properties, and ensemble approaches that integrate multiple predictive models exhibit increased accuracy, robustness, and interpretability. This review explores the current landscape of ML-based variant effect predictors. We discuss key methodological differences and highlight their strengths and limitations for pharmacogenomic applications. We furthermore discuss emerging methodologies for the prediction of substrate specificity and for consideration of variant epistasis. Combined, these tools improve the functional effect prediction of drug-related variants and offer a viable strategy that could in the foreseeable future translate comprehensive genomic information into pharmacogenetic recommendations.
PMID:40401639 | DOI:10.1080/14622416.2025.2504863
iPSC-RPE patch restores photoreceptors and regenerates choriocapillaris in a pig retinal degeneration model
JCI Insight. 2025 May 22;10(10):e179246. doi: 10.1172/jci.insight.179246. eCollection 2025 May 22.
ABSTRACT
Dry age-related macular degeneration (AMD) is a leading cause of untreatable vision loss. In advanced cases, retinal pigment epithelium (RPE) cell loss occurs alongside photoreceptor and choriocapillaris degeneration. We hypothesized that an RPE patch would mitigate photoreceptor and choriocapillaris degeneration to restore vision. An induced pluripotent stem cell-derived RPE (iRPE) patch was developed using a clinically compatible manufacturing process by maturing iRPE cells on a biodegradable poly(lactic-co-glycolic acid) (PLGA) scaffold. To compare outcomes, we developed a surgical procedure for immediate sequential delivery of PLGA-iRPE and/or PLGA-only patches in the subretinal space of a pig model of laser-induced outer retinal degeneration. Deep learning algorithm-based optical coherence tomography (OCT) image segmentation verified preservation of the photoreceptors over the areas of PLGA-iRPE-transplanted retina and not in laser-injured or PLGA-only-transplanted retina. Adaptive optics imaging of individual cone photoreceptors further supported this finding. OCT-angiography revealed choriocapillaris regeneration in PLGA-iRPE- and not in PLGA-only-transplanted retinas. Our data, obtained using clinically relevant techniques, verified that PLGA-iRPE supports photoreceptor survival and regenerates choriocapillaris in a laser-injured pig retina. Sequential delivery of two 8 mm² transplants allows for testing of surgical feasibility and safety of the double dose. This work allows one surgery to treat larger and noncontiguous retinal degeneration areas.
PMID:40401519 | DOI:10.1172/jci.insight.179246
Convolutional autoencoder-based deep learning for intracerebral hemorrhage classification using brain CT images
Cogn Neurodyn. 2025 Dec;19(1):77. doi: 10.1007/s11571-025-10259-5. Epub 2025 May 19.
ABSTRACT
Intracerebral haemorrhage (ICH) is a common form of stroke that affects millions of people worldwide. Its incidence is associated with high rates of mortality and morbidity. Accurate diagnosis using brain non-contrast computed tomography (NCCT) is crucial for decision-making on potentially life-saving surgery. Limited access to expert readers and inter-observer variability impose barriers to timely and accurate ICH diagnosis. We propose a hybrid deep learning model for automated ICH diagnosis using NCCT images, which comprises a convolutional autoencoder (CAE) to extract features with reduced data dimensionality and a dense neural network (DNN) for classification. To ensure that the model generalizes to new data, we trained it using tenfold cross-validation and holdout methods. Principal component analysis (PCA)-based dimensionality reduction and classification was systematically implemented for comparison. The study dataset comprises 1645 labelled images in the "ICH" class and 1648 in the "Normal" class (patients with non-hemorrhagic stroke), obtained from 108 patients who had undergone CT examination on a 64-slice computed tomography scanner at Kalinga Institute of Medical Sciences between 2020 and 2023. Our CAE-DNN hybrid model attained 99.84% accuracy, 99.69% sensitivity, 100% specificity, 100% precision, and a 99.84% F1-score, outperforming the comparator PCA-DNN model as well as published results in the literature. In addition, using saliency maps, our CAE-DNN model can highlight areas on the images that are closely correlated with regions of ICH manually contoured by expert readers. The CAE-DNN model demonstrates a proof-of-concept for accurate ICH detection and localization, which can potentially be implemented to prioritize treatment using NCCT images in clinical settings.
PMID:40401248 | PMC:PMC12089006 | DOI:10.1007/s11571-025-10259-5
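A minimal Keras sketch of the CAE-DNN idea follows; the slice size and layer widths are assumptions, since the abstract does not give the exact architecture. The autoencoder learns to reconstruct the CT slice, and a dense head then classifies its low-dimensional bottleneck code as ICH versus normal:

```python
# Sketch: convolutional autoencoder (CAE) + dense classifier (DNN).
from tensorflow.keras import layers, models

inp = layers.Input(shape=(128, 128, 1))              # assumed slice size
x = layers.Conv2D(16, 3, activation="relu", padding="same")(inp)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
x = layers.MaxPooling2D()(x)
encoded = layers.Flatten(name="code")(x)             # bottleneck features

# Decoder, used only for the reconstruction objective.
y = layers.Reshape((32, 32, 8))(encoded)
y = layers.UpSampling2D()(y)
y = layers.Conv2D(16, 3, activation="relu", padding="same")(y)
y = layers.UpSampling2D()(y)
out = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(y)
autoencoder = models.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")

# Dense head on the learned code: ICH vs. normal.
encoder = models.Model(inp, encoded)
clf = models.Sequential([encoder,
                         layers.Dense(64, activation="relu"),
                         layers.Dense(1, activation="sigmoid")])
clf.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```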
Data source and utilization of artificial intelligence technologies in vascular surgery-a scoping review
Front Cardiovasc Med. 2025 May 7;12:1497822. doi: 10.3389/fcvm.2025.1497822. eCollection 2025.
ABSTRACT
OBJECTIVE: The goals of this scoping review were to determine the source of data used to develop AI-based algorithms with emphasis on natural language processing, establish their application in different areas of vascular surgery and identify a target audience of published journals.
MATERIALS AND METHODS: A literature search was carried out using established databases from January 1996 to March 2023.
RESULTS: 342 peer-reviewed articles met the eligibility criteria. NLP algorithms were described in 34 papers, while 115 and 193 papers focused on machine learning (ML) and deep learning (DL), respectively. The AI-based algorithms found widest application in research related to the aorta (126 articles), carotid disease (85), and peripheral arterial disease (65). Image-based data were utilised in 216 articles, while 153 and 85 papers relied on medical records and clinical parameters, respectively. The AI algorithms were used for predictive modelling (123 papers), medical image segmentation (118), and to aid identification, detection, and diagnosis (103).
DISCUSSION: Applications of Artificial Intelligence (AI) are gaining traction in healthcare, including vascular surgery. While most healthcare data are in the form of narrative text or audio recordings, natural language processing (NLP) offers the ability to extract information from unstructured medical records. This can be used to develop more accurate risk prediction models, support shared decision-making, and identify patients for trials to improve recruitment.
CONCLUSION: Utilisation of different data sources and AI technologies depends on the purpose of the undertaken research. Despite the abundance of available textual data, NLP remains a disproportionately underutilised AI sub-domain in vascular surgery.
PMID:40401223 | PMC:PMC12093488 | DOI:10.3389/fcvm.2025.1497822
Brain-wide 3D neuron detection and mapping with deep learning
Neurophotonics. 2025 Apr;12(2):025012. doi: 10.1117/1.NPh.12.2.025012. Epub 2025 May 20.
ABSTRACT
SIGNIFICANCE: Mapping the spatial distribution of specific neurons across the entire brain is essential for understanding the neural circuits associated with various brain functions, which in turn requires automated and reliable neuron detection and mapping techniques.
AIM: To accurately identify somatic regions from 3D imaging data and generate reliable soma locations for mapping to diverse brain regions, we introduce NeuronMapper, a brain-wide 3D neuron detection and mapping approach that leverages the power of deep learning.
APPROACH: NeuronMapper is implemented as a four-stage framework encompassing preprocessing, classification, detection, and mapping. Initially, whole-brain imaging data is divided into 3D sub-blocks during the preprocessing phase. A lightweight classification network then identifies the sub-blocks containing somata. Following this, a Video Swin Transformer-based segmentation network delineates the soma regions within the identified sub-blocks. Last, the locations of the somata are extracted and registered with the Allen Brain Atlas for comprehensive whole-brain neuron mapping.
RESULTS: Through accurate detection and localization of somata, we mapped somata at the scale of one million cells within the mouse brain. Comparative analyses with other soma detection techniques demonstrated that our method exhibits markedly superior performance for whole-brain 3D soma detection.
CONCLUSIONS: Our approach has demonstrated its effectiveness in detecting and mapping somata within whole-brain imaging data. This method can serve as a computational tool to facilitate a deeper understanding of the brain's complex networks and functions.
PMID:40401216 | PMC:PMC12093273 | DOI:10.1117/1.NPh.12.2.025012
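The preprocessing stage lends itself to a short sketch: a stand-in whole-brain volume is tiled into 3D sub-blocks (the block size here is an assumption) of the kind the lightweight classifier would then screen for somata:

```python
# Sketch: divide a whole-brain volume into 3D sub-blocks for screening.
import numpy as np

def tile_volume(volume, block=(64, 64, 64)):
    """Yield (origin, sub-block) pairs covering the whole volume."""
    bz, by, bx = block
    Z, Y, X = volume.shape
    for z in range(0, Z, bz):
        for y in range(0, Y, by):
            for x in range(0, X, bx):
                yield (z, y, x), volume[z:z+bz, y:y+by, x:x+bx]

brain = np.random.rand(256, 512, 512).astype(np.float32)   # stand-in volume
blocks = list(tile_volume(brain))
print(len(blocks), blocks[0][1].shape)                      # 256 blocks of (64, 64, 64)
```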
Real-time monitoring of molten zinc splatter using machine learning-based computer vision
J Intell Manuf. 2025;36(5):3399-3425. doi: 10.1007/s10845-024-02418-y. Epub 2024 May 22.
ABSTRACT
During steel galvanisation, immersing steel strip into molten zinc forms a protective coating. Uniform coating thickness is crucial for quality and is achieved using air knives which wipe off excess zinc. At high strip speeds, zinc splatters onto equipment, causing defects and downtime. Parameters such as knife positioning and air pressure influence splatter severity and can be optimised to reduce it. Therefore, this paper proposes a system that converges computer vision and manufacturing whilst addressing some challenges of real-time monitoring in harsh industrial environments, such as the extreme heat, metallic dust, dynamic machinery and high-speed processing at the galvanising site. The approach primarily comprises the Counting (CNT) background subtraction algorithm and YOLOv5, which together ensure robustness to noise produced by heat distortion and dust, as well as adaptability to the highly dynamic environment. The YOLOv5 element achieved precision, recall and mean average precision (mAP) values of 1. When validated against operator judgement using mean absolute error (MAE), interquartile range, median and scatter plot analysis, it was found that there was more discrepancy between the two operators than between the operators and the model. This research also strategises the deployment process for integration into the galvanising line. The model proposed allows real-time monitoring and quantification of splatter severity, which provides valuable insights into root-cause analysis, process optimisation and maintenance strategies. This research contributes to the digital transformation of manufacturing and, whilst solving a current problem, also plants the seed for many other novel applications.
PMID:40401169 | PMC:PMC12089258 | DOI:10.1007/s10845-024-02418-y
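The two-stage design pairs background subtraction with object detection. A hedged sketch follows, using OpenCV's CNT subtractor (available in opencv-contrib-python) and a YOLOv5 model loaded via torch.hub; the weights file and video path are hypothetical, and the real system's glue logic is certainly more involved:

```python
# Sketch: CNT background subtraction feeding YOLOv5 splatter detection.
import cv2
import torch

subtractor = cv2.bgsegm.createBackgroundSubtractorCNT()
model = torch.hub.load("ultralytics/yolov5", "custom",
                       path="splatter_weights.pt")   # hypothetical weights

cap = cv2.VideoCapture("galvanising_line.mp4")       # hypothetical footage
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)                   # suppress static background
    moving = cv2.bitwise_and(frame, frame, mask=mask)
    detections = model(moving)                       # YOLOv5 inference
    print("splatter events:", len(detections.xyxy[0]))
cap.release()
```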
Advances in functional magnetic resonance imaging-based brain function mapping: a deep learning perspective
Psychoradiology. 2025 Apr 29;5:kkaf007. doi: 10.1093/psyrad/kkaf007. eCollection 2025.
ABSTRACT
Functional magnetic resonance imaging (fMRI) provides a powerful tool for studying brain function by capturing neural activity in a non-invasive manner. Mapping brain function from fMRI data enables researchers to investigate the spatial and temporal dynamics of neural processes, providing insights into how the brain responds to various tasks and stimuli. In this review, we explore the evolution of deep learning-based methods for brain function mapping using fMRI. We begin by discussing various network architectures such as convolutional neural networks, recurrent neural networks, and transformers. We further examine supervised, unsupervised, and self-supervised learning paradigms for fMRI-based brain function mapping, highlighting the strengths and limitations of each approach. Additionally, we discuss emerging trends such as fMRI embedding, brain foundation models, and brain-inspired artificial intelligence, emphasizing their potential to revolutionize brain function mapping. Finally, we delve into the real-world applications and prospective impact of these advancements, particularly in the diagnosis of neural disorders, neuroscientific research, and brain-computer interfaces for decoding brain activity. This review aims to provide a comprehensive overview of current techniques and future directions in the field of deep learning and fMRI-based brain function mapping.
PMID:40401160 | PMC:PMC12093097 | DOI:10.1093/psyrad/kkaf007
Cutting-edge AI tools revolutionizing scientific research in life sciences
BioTechnologia (Pozn). 2025 Mar 31;106(1):77-102. doi: 10.5114/bta/200803. eCollection 2025.
ABSTRACT
Artificial intelligence (AI) is becoming a transformative force in the life sciences, pushing the boundaries of possibility. Imagine AI automating time-consuming tasks, uncovering hidden patterns in vast datasets, designing proteins in minutes instead of years, and even predicting disease outbreaks before they occur. This review explores the latest AI tools revolutionizing scientific fields, including research and data analysis, healthcare, and tools supporting scientific writing. Beyond data processing, AI is reshaping how scientists draft and share their findings, enhancing processes ranging from literature reviews to citation management. However, with great power comes great responsibility. Are we prepared for this leap? This review delves into the forefront of AI in the life sciences, where innovation meets responsibility.
PMID:40401131 | PMC:PMC12089930 | DOI:10.5114/bta/200803
Integration of magnetic resonance imaging and deep learning for prostate cancer detection: a systematic review
Am J Clin Exp Urol. 2025 Apr 25;13(2):69-91. doi: 10.62347/CSIJ8326. eCollection 2025.
ABSTRACT
OBJECTIVES: This study aims to evaluate the overall impact of incorporating deep learning (DL) with magnetic resonance imaging (MRI) for improving diagnostic performance in the detection and stratification of prostate cancer (PC).
METHODS: A systematic search was conducted in the PubMed database to identify relevant studies. The QUADAS-2 tool was employed to assess the scientific quality, risk of bias, and applicability of primary diagnostic accuracy studies. Additionally, adherence to the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) guidelines was evaluated to determine the extent of heterogeneity among the included studies. The systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines.
RESULTS: A total of 29 articles involving 17,954 participants were included in the study. The median agreement to the 42 CLAIM checklist items across studies was 61.90% (IQR: 57.14-66.67, range: 40.48-80.95). Most studies utilized T2-weighted imaging (T2WI) and/or apparent diffusion coefficient (ADC) derived from diffusion-weighted imaging (DWI) as input for evaluating the performance of DL-based architectures. Notably, the detection and stratification of PC in the transition zone was the least explored area.
CONCLUSIONS: DL demonstrates significant advancements in the rapid, sensitive, specific, and robust detection and stratification of PC. Promising applications include enhancing the quality of DWI, developing advanced DL models, and designing innovative nomograms or diagnostic tools to improve clinical decision-making.
PMID:40400999 | PMC:PMC12089223 | DOI:10.62347/CSIJ8326
Pleural invasion of peripheral cT1 lung cancer by deep learning analysis of thoracoscopic images: a retrospective pilot study
J Thorac Dis. 2025 Apr 30;17(4):1991-1999. doi: 10.21037/jtd-24-1510. Epub 2025 Apr 28.
ABSTRACT
BACKGROUND: Sublobar resection for small (≤2 cm) peripheral non-small cell lung cancer (NSCLC) has become one of the standard procedures. Retrospective studies demonstrated that pathological pleural invasion (pPL) is associated with a higher risk of local recurrence after sublobar resection. If pPL can be properly assessed intraoperatively, converting to lobectomy may reduce the risk of local recurrence associated with sublobar resection. The study objective was to develop a deep learning algorithm predicting pPL from thoracoscopic images.
METHODS: Among consecutive patients who underwent radical thoracoscopic surgery for cT1N0M0 NSCLC (TNM 8th edition) from 5/2020 to 3/2022, 80 patients with pleural surface changes due to tumor (excluding cTis/1mi or peritumoral adhesions) were included. A tumor recognition deep learning model using the ResNet50 architecture was constructed from the images, and its focus was visualized using gradient-weighted class activation mapping (Grad-CAM). The presence of pPL was then predicted from images in which a tumor was visible (trained on 64 patients, validated on 16). Predictive ability was compared with the surgeons' intraoperative evaluation using McNemar's test.
RESULTS: Among 80 patients (age 69±10 years, 42.5% female, tumor diameter 20±7 mm), pPL was found in 22 patients. Compared with the pPL- group, the pPL+ group was significantly older, with larger solid diameter, more pure-solid nodules, and higher SUVmax. Among the 422,873 images extracted from all 80 videos, 2,074 images showed tumors, of which 608 images were pPL+. The tumor recognition algorithm had an image-level accuracy of 0.78 and F1 score of 0.60. The pPL model had a patient-level accuracy of 0.69, while the accuracy of thoracic surgeons was 0.75 (P=0.32).
CONCLUSIONS: Deep learning analysis of thoracoscopic images of lung cancer surgery showed the possibility of prediction of pPL to a comparable degree to surgeons.
PMID:40400958 | PMC:PMC12090174 | DOI:10.21037/jtd-24-1510
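Grad-CAM, the visualization method named in the methods, can be sketched in a few lines of PyTorch. The example below assumes a torchvision ResNet50 with a two-class (pPL-/pPL+) head and a stand-in input frame:

```python
# Sketch: Grad-CAM heatmap over the last convolutional stage of ResNet50.
import torch
import torch.nn.functional as F
from torchvision.models import resnet50

model = resnet50(num_classes=2).eval()
feats, grads = {}, {}
model.layer4.register_forward_hook(lambda m, i, o: feats.update(a=o))
model.layer4.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

x = torch.randn(1, 3, 224, 224)                  # stand-in thoracoscopic frame
model(x)[0, 1].backward()                        # backprop the pPL+ logit

w = grads["a"].mean(dim=(2, 3), keepdim=True)    # channel importance weights
cam = F.relu((w * feats["a"]).sum(dim=1))        # weighted feature-map sum
cam = F.interpolate(cam[None], size=(224, 224), mode="bilinear")[0, 0]
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)   # normalize to [0, 1]
```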
Multimodal LLMs for retinal disease diagnosis via OCT: few-shot versus single-shot learning
Ther Adv Ophthalmol. 2025 May 20;17:25158414251340569. doi: 10.1177/25158414251340569. eCollection 2025 Jan-Dec.
ABSTRACT
BACKGROUND AND AIM: Multimodal large language models (LLMs) have shown potential in processing both text and image data for clinical applications. This study evaluated their diagnostic performance in identifying retinal diseases from optical coherence tomography (OCT) images.
METHODS: We assessed the diagnostic accuracy of GPT-4o and Claude Sonnet 3.5 using two public OCT datasets (OCTID, OCTDL) containing expert-labeled images of four pathological conditions and normal retinas. Both models were tested using single-shot and few-shot prompts, with a total of 3,088 model API calls. Statistical analyses were performed to evaluate differences in overall and condition-specific performance.
RESULTS: GPT-4o's accuracy improved from 56.29% with single-shot prompts to 73.08% with few-shot prompts (p < 0.001). Similarly, Claude Sonnet 3.5 increased from 40.03% to 70.98% using the same approach (p < 0.001). Condition-specific analyses revealed similar trends, with absolute improvements ranging from 2% to 64%. These findings were consistent across the validation dataset.
CONCLUSION: Few-shot prompted multimodal LLMs show promise for clinical integration, particularly in identifying normal retinas, which could help streamline referral processes in primary care. While these models fall short of the diagnostic accuracy reported in established deep learning literature, they offer simple, effective tools for assisting in routine retinal disease diagnosis. Future research should focus on further validation and integrating clinical text data with imaging.
PMID:40400723 | PMC:PMC12093016 | DOI:10.1177/25158414251340569
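The few-shot setup can be sketched against the OpenAI Python SDK: labelled example images precede the query image in a single multimodal prompt. The file names, prompt wording, and labels below are illustrative assumptions, not the study's protocol:

```python
# Sketch: few-shot multimodal prompting for OCT classification.
import base64
from openai import OpenAI

client = OpenAI()

def img(path):
    """Encode a local OCT image as a data URL for the chat API."""
    b64 = base64.b64encode(open(path, "rb").read()).decode()
    return {"type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{b64}"}}

messages = [
    {"role": "system",
     "content": "You are shown retinal OCT scans. Reply with one diagnosis label."},
    {"role": "user", "content": [
        {"type": "text", "text": "Example 1 (label: AMD):"}, img("example_amd.png"),
        {"type": "text", "text": "Example 2 (label: normal):"}, img("example_normal.png"),
        {"type": "text", "text": "Now classify this scan:"}, img("query.png"),
    ]},
]
resp = client.chat.completions.create(model="gpt-4o", messages=messages)
print(resp.choices[0].message.content)
```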
Assessing Self-supervised xLSTM-UNet Architectures for Head and Neck Tumor Segmentation in MR-Guided Applications
Head Neck Tumor Segm MR Guid Appl (2024). 2025;15273:166-178. doi: 10.1007/978-3-031-83274-1_12. Epub 2025 Mar 3.
ABSTRACT
Radiation therapy (RT) plays a pivotal role in treating head and neck cancer (HNC), with MRI-guided approaches offering superior soft tissue contrast and daily adaptive capabilities that significantly enhance treatment precision while minimizing side effects. To optimize MRI-guided adaptive RT for HNC, we propose a novel two-stage model for Head and Neck Tumor Segmentation. In the first stage, we leverage a Self-Supervised 3D Student-Teacher Learning Framework, specifically utilizing the DINOv2 architecture, to learn effective representations from a limited unlabeled dataset. This approach effectively addresses the challenge posed by the scarcity of annotated data, enabling the model to generalize better in tumor identification and segmentation. In the second stage, we fine-tune an xLSTM-based UNet model that is specifically designed to capture both spatial and sequential features of tumor progression. This hybrid architecture improves segmentation accuracy by integrating temporal dependencies, making it particularly well-suited for MRI-guided adaptive RT planning in HNC. The model's performance is rigorously evaluated on a diverse set of HNC cases, demonstrating significant improvements over state-of-the-art deep learning models in accurately segmenting tumor structures. Our proposed solution achieved an impressive mean aggregated Dice Coefficient of 0.81 for pre-RT segments and 0.65 for mid-RT segments, underscoring its effectiveness in automated segmentation tasks. This work advances the field of HNC imaging by providing a robust, generalizable solution for automated Head and Neck Tumor Segmentation, ultimately enhancing the quality of care for patients undergoing RT. Our team name is DeepLearnAI (CEMRG). The code for this work is available at https://github.com/RespectKnowledge/SSL-based-DINOv2_Vision-LSTM_Head-and-Neck-Tumor_Segmentation.
PMID:40400661 | PMC:PMC12091698 | DOI:10.1007/978-3-031-83274-1_12
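The headline metric is the Dice coefficient. A quick sketch of its standard form on binary masks (the challenge's aggregated variant pools intersections and volumes over the whole cohort before taking the ratio):

```python
# Sketch: Dice coefficient on stand-in binary tumor masks.
import torch

def dice(pred, target, eps=1e-6):
    """Dice = 2|A ∩ B| / (|A| + |B|) over flattened binary masks."""
    inter = (pred * target).sum()
    return (2 * inter + eps) / (pred.sum() + target.sum() + eps)

pred = (torch.rand(1, 128, 128, 64) > 0.5).float()
target = (torch.rand(1, 128, 128, 64) > 0.5).float()
print(dice(pred, target).item())
```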
Synthesizing [18F]PSMA-1007 PET bone images from CT images with GAN for early detection of prostate cancer bone metastases: a pilot validation study
BMC Cancer. 2025 May 21;25(1):907. doi: 10.1186/s12885-025-14301-x.
ABSTRACT
BACKGROUND: [18F]FDG PET/CT scan combined with [18F]PSMA-1007 PET/CT scan is commonly conducted for detecting bone metastases in prostate cancer (PCa). However, it is expensive and may expose patients to more radiation hazards. This study explores deep learning (DL) techniques to synthesize [18F]PSMA-1007 PET bone images from CT bone images for the early detection of bone metastases in PCa, which may reduce additional PET/CT scans and relieve the burden on patients.
METHODS: We retrospectively collected paired whole-body (WB) [18F]PSMA-1007 PET/CT images from 152 patients with clinical and pathological diagnosis results, including 123 PCa cases and 29 cases of benign lesions. The average age of the patients was 67.48 ± 10.87 years, and the average lesion size was 8.76 ± 15.5 mm. The paired low-dose CT and PET images were preprocessed and segmented to construct the WB bone structure images. The 152 subjects were randomly stratified into training, validation, and test groups in a 92:41:19 split. Two generative adversarial network (GAN) models, Pix2pix and CycleGAN, were trained to synthesize [18F]PSMA-1007 PET bone images from paired CT bone images. The performance of the two synthesis models was evaluated using the quantitative metrics of mean absolute error (MAE), mean squared error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity index metrics (SSIM), as well as the target-to-background ratio (TBR).
RESULTS: The results of DL-based image synthesis indicated that synthesizing [18F]PSMA-1007 PET bone images from low-dose CT bone images is highly feasible. The Pix2pix model performed better, with an SSIM of 0.97, PSNR of 44.96, MSE of 0.80, and MAE of 0.10. The TBRs of bone metastasis lesions calculated on DL-synthesized PET bone images were highly correlated with those of real PET bone images (Pearson's r > 0.90) and showed no significant differences (p > 0.05).
CONCLUSIONS: It is feasible to generate synthetic [18F]PSMA-1007 PET bone images from CT bone images by using DL techniques with reasonable accuracy, which can provide information for early detection of PCa bone metastases.
PMID:40399853 | DOI:10.1186/s12885-025-14301-x
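The four image-similarity metrics reported above are standard and easy to reproduce; a sketch with scikit-image and NumPy on stand-in arrays:

```python
# Sketch: MAE, MSE, PSNR, and SSIM between a reference and a synthetic image.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

real = np.random.rand(256, 256).astype(np.float32)   # stand-in real PET bone image
synth = real + 0.05 * np.random.randn(256, 256).astype(np.float32)

mae = np.abs(real - synth).mean()
mse = ((real - synth) ** 2).mean()
psnr = peak_signal_noise_ratio(real, synth, data_range=1.0)
ssim = structural_similarity(real, synth, data_range=1.0)
print(f"MAE={mae:.4f} MSE={mse:.4f} PSNR={psnr:.2f} dB SSIM={ssim:.4f}")
```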
Prediction of B/T Subtype and ETV6-RUNX1 Translocation in Pediatric Acute Lymphoblastic Leukemia by Deep Learning Analysis of Giemsa-Stained Whole Slide Images of Bone Marrow Aspirates
Pediatr Blood Cancer. 2025 May 21:e31797. doi: 10.1002/pbc.31797. Online ahead of print.
ABSTRACT
BACKGROUND: Accurate determination of B/T-cell lineage and the presence of the ETV6-RUNX1 translocation is critical for diagnosing acute lymphoblastic leukemia (ALL), as these factors influence treatment decisions and outcomes. However, these diagnostic processes often rely on advanced tools unavailable in low-resource settings, creating a need for alternative solutions.
PROCEDURE: We developed a deep learning pipeline to analyze Giemsa-stained bone marrow (BM) aspirate smears. The models were trained to distinguish between ALL, acute myeloid leukemia (AML), and non-leukemic BM samples, predict B- and T-cell lineage in ALL, and detect the presence of the ETV6-RUNX1 translocation. The performance was evaluated using cross-validation (CV) and an external validation cohort.
RESULTS: The models achieved a statistically significant area under the curve (AUC) of 0.99 in distinguishing ALL from AML and control samples. In CV, the models achieved an AUC of 0.74 for predicting B/T subtype and an AUC of 0.80 for predicting the ETV6-RUNX1 translocation. External cohort validation confirmed significant AUCs of 0.72 for B/T subtype classification and 0.69 for ETV6-RUNX1 translocation prediction.
CONCLUSIONS: Convolutional neural networks (CNNs) demonstrate potential as a diagnostic tool for pediatric ALL, enabling the identification of B/T lineage and ETV6-RUNX1 translocation from Giemsa-stained smears. These results pave the way for future utilization of CNNs as a diagnostic modality for pediatric leukemia in low-resource settings, where access to advanced diagnostic techniques is limited.
PMID:40399768 | DOI:10.1002/pbc.31797
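The reported figures are cross-validated AUCs. A schematic scikit-learn example on stand-in features, with a simple classifier substituting for the CNN backbone described above:

```python
# Sketch: 5-fold cross-validated AUC on stand-in features and labels.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X = np.random.randn(200, 32)          # stand-in per-sample image features
y = np.random.randint(0, 2, 200)      # stand-in B vs. T lineage labels
aucs = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                       cv=5, scoring="roc_auc")
print(aucs.mean())
```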
Unveiling Spectrum-Structure Correlation in Vibrational Spectroscopy: Task-Driven Deep Learning Classification Balancing Global Fusion and Local Extraction
Anal Chem. 2025 May 21. doi: 10.1021/acs.analchem.4c05842. Online ahead of print.
ABSTRACT
Spectrum-structure correlation is crucial for identifying and quantifying chemicals, with the classification of mixtures and the identification of functional groups as two central tasks. Deep learning-driven algorithms have made significant strides in these two tasks. However, many of these algorithms are merely adaptations of models originally designed for computer vision applications. As a result, the models often suffer from either low accuracy or limited generality when applied to spectral data, owing to overlooked inherent limitations in the feature richness and volume of spectral data. Here, in light of the distinctive difference in the attention to global and local information between these two tasks, we developed two complementary CNN-based algorithms, incorporating multiscale convolution and attention mechanisms, to address the unique requirements of each task. We found that the lightweight CNN-Peak algorithm is favored for the classification of mixtures, a single-label task in which the feature fusion of global information is more important. Meanwhile, the more complex ResNet-ResPeak algorithm is ideally suited for the identification of functional groups, a multilabel task in which the feature extraction of local information takes precedence. The task-oriented, conceptual design of deep learning algorithms not only enhances the efficacy and accuracy of spectrum-structure correlation analysis but also feeds back into more rigorous experimental design and implementation, forming a closed loop of AI for Science.
PMID:40399767 | DOI:10.1021/acs.analchem.4c05842
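The multiscale-convolution idea reads naturally as parallel 1D branches with different kernel widths, narrow kernels for local peak shapes and wide ones for global context, fused before classification. A conceptual PyTorch sketch with all sizes assumed:

```python
# Sketch: multiscale 1D CNN for spectrum classification.
import torch
import torch.nn as nn

class MultiScaleSpectrumNet(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        # Parallel branches with small, medium, and large receptive fields.
        self.branches = nn.ModuleList(
            [nn.Conv1d(1, 16, kernel_size=k, padding=k // 2) for k in (3, 9, 27)])
        self.pool = nn.AdaptiveAvgPool1d(1)        # global fusion per branch
        self.head = nn.Linear(16 * 3, n_classes)

    def forward(self, x):                          # x: (batch, 1, length)
        feats = [self.pool(torch.relu(b(x))).squeeze(-1) for b in self.branches]
        return self.head(torch.cat(feats, dim=1))

net = MultiScaleSpectrumNet()
print(net(torch.randn(2, 1, 1024)).shape)          # torch.Size([2, 10])
```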
Emotion-Aware RoBERTa enhanced with emotion-specific attention and TF-IDF gating for fine-grained emotion recognition
Sci Rep. 2025 May 21;15(1):17617. doi: 10.1038/s41598-025-99515-6.
ABSTRACT
Emotion recognition in text is a fundamental task in natural language processing, underpinning applications such as sentiment analysis, mental health monitoring, and content moderation. Although transformer-based models like RoBERTa have advanced contextual understanding in text, they still face limitations in identifying subtle emotional cues, handling class imbalances, and processing noisy or informal input. To address these challenges, this paper introduces Emotion-Aware RoBERTa, an enhanced framework that integrates an Emotion-Specific Attention (ESA) layer and a TF-IDF based gating mechanism. These additions are designed to dynamically prioritize emotionally salient tokens while suppressing irrelevant content, thereby improving both classification accuracy and robustness. The model achieved 96.77% accuracy and a weighted F1-score of 0.97 on the primary dataset, outperforming baseline RoBERTa and other benchmark models such as DistilBERT and ALBERT with a relative improvement ranging from 9.68% to 10.87%. Its generalization capability was confirmed across two external datasets, achieving 88.03% on a large-scale corpus and 65.67% on a smaller, noisier dataset. An ablation study revealed the complementary impact of the ESA and TF-IDF components, balancing performance and inference efficiency. Attention heatmaps were used to visualize ESA's ability to focus on key emotional expressions, while inference-time optimizations using FP16 and Automatic Mixed Precision (AMP) reduced memory consumption and latency. Additionally, McNemar's statistical test confirmed the significance of the improvements over the baseline. These findings demonstrate that Emotion-Aware RoBERTa offers a scalable, interpretable, and deployment-friendly solution for fine-grained emotion recognition, making it well-suited for real-world NLP applications in emotion-aware systems.
PMID:40399457 | DOI:10.1038/s41598-025-99515-6
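The TF-IDF gating idea can be approximated as follows: per-token IDF weights, normalized to (0, 1], scale the transformer's hidden states so that informative tokens pass through while filler tokens are damped. The names and wiring are assumptions, not the paper's implementation:

```python
# Sketch: TF-IDF-derived gates scaling stand-in RoBERTa hidden states.
import torch
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = ["i am so happy today", "this is terrible and sad", "just an ordinary day"]
vectorizer = TfidfVectorizer().fit(corpus)
idf = dict(zip(vectorizer.get_feature_names_out(), vectorizer.idf_))

def gate_weights(tokens):
    """Map each token to a gate in (0, 1] from its normalized IDF weight."""
    max_idf = max(idf.values())
    return torch.tensor([idf.get(t, max_idf) / max_idf for t in tokens])

tokens = "i am so happy today".split()
hidden = torch.randn(len(tokens), 768)                 # stand-in hidden states
gated = gate_weights(tokens).unsqueeze(-1) * hidden    # damp uninformative tokens
print(gated.shape)                                     # torch.Size([5, 768])
```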
Deep learning based multi attribute evaluation for holistic student assessment in physical education
Sci Rep. 2025 May 21;15(1):17698. doi: 10.1038/s41598-025-02168-8.
ABSTRACT
The evaluation of students in physical education remains a formidable challenge due to the limitations of traditional assessment approaches, which are often excessively one-dimensional. This study proposes a solution utilizing deep learning via multi-attribute user evaluation modelling to address these issues. The proposed approach utilizes all available data, including physical activities, cognitive tasks, emotional responses, and social interactions, for a comprehensive assessment of student performance. The methodology comprises a ten-step process involving information collection, preparation, model construction, and deployment, followed by regular review and adjustment. The model shows considerable efficacy, with a high level of accuracy and reduced errors, and an experimental investigation illustrates its robustness, with the model attaining a low mean score. The analysis indicates that the current models exhibit more flexibility in providing personalized feedback to improve educational outcomes and enhance decision-making. Moreover, the model incorporates visualization tools such as heatmaps, which affirm the system's ability to monitor performance and progressively adjust to the dynamics of students. The developed approach incorporates automated, objective, and scalable attributes that improve student assessment. This also aids in tackling the many multi-faceted challenges in physical education while formulating effective interventions for student advancement. Subsequent research may focus on the integration of real-time sensor data, enhancement of computational efficiency, and wider application across diverse educational organizations.
PMID:40399440 | DOI:10.1038/s41598-025-02168-8
FasNet: a hybrid deep learning model with attention mechanisms and uncertainty estimation for liver tumor segmentation on LiTS17
Sci Rep. 2025 May 21;15(1):17697. doi: 10.1038/s41598-025-98427-9.
ABSTRACT
Liver cancer, especially hepatocellular carcinoma (HCC), remains one of the most fatal cancers globally, emphasizing the critical need for accurate tumor segmentation to enable timely diagnosis and effective treatment planning. Traditional imaging techniques, such as CT and MRI, rely on manual interpretation, which can be both time-intensive and subject to variability. This study introduces FasNet, an innovative hybrid deep learning model that combines ResNet-50 and VGG-16 architectures, incorporating channel and spatial attention mechanisms alongside Monte Carlo Dropout to improve segmentation precision and reliability. FasNet leverages ResNet-50's robust feature extraction and VGG-16's detailed spatial feature capture to deliver superior liver tumor segmentation accuracy. The channel and spatial attention mechanisms selectively focus on the most relevant features and spatial regions, supporting accurate and reliable segmentation, while Monte Carlo Dropout estimates uncertainty and adds robustness, which is critical for high-stakes medical applications. Tested on the LiTS17 dataset, FasNet achieved a Dice Coefficient of 0.8766 and a Jaccard Index of 0.8487, surpassing several state-of-the-art methods. These results position FasNet as a powerful diagnostic tool, offering precise and automated liver tumor segmentation that aids early detection and precise treatment, ultimately enhancing patient outcomes.
PMID:40399406 | DOI:10.1038/s41598-025-98427-9
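Monte Carlo Dropout, used here for uncertainty estimation, keeps dropout active at inference and treats the variance across repeated stochastic passes as a per-pixel uncertainty map. A minimal sketch with an assumed toy segmentation head:

```python
# Sketch: Monte Carlo Dropout for segmentation uncertainty.
import torch
import torch.nn as nn

seg_head = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Dropout2d(p=0.3),                        # stays active while sampling
    nn.Conv2d(16, 1, 1), nn.Sigmoid())

def mc_dropout_predict(model, x, n_samples=20):
    model.train()                               # keep dropout layers on
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(0), samples.var(0)      # mean mask, uncertainty map

x = torch.randn(1, 3, 128, 128)                 # stand-in CT slice
mean_mask, uncertainty = mc_dropout_predict(seg_head, x)
print(mean_mask.shape, uncertainty.shape)       # both (1, 1, 128, 128)
```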
Towards precision agriculture tea leaf disease detection using CNNs and image processing
Sci Rep. 2025 May 21;15(1):17571. doi: 10.1038/s41598-025-02378-0.
ABSTRACT
In this study, we introduce a deep learning (DL) model designed for the precise classification of common diseases in tea leaves, leveraging advanced image analysis techniques. Our model is distinguished by its multi-layer architecture, crafted to handle 256 × 256 pixel images across three color channels (RGB). Beginning with an input layer complemented by a ZeroPadding2D layer to preserve spatial dimensions, our model ensures the retention of crucial spatial information across its depth. The use of a convolutional layer with 64 filters of size 7 × 7, followed by batch normalization and ReLU activation, allows for the extraction and representation of intricate patterns from the input data. Key to our model's design is the incorporation of residual blocks, which facilitate the learning of deeper networks by alleviating the vanishing gradient problem. These blocks combine Conv2D layers, batch normalization, activation layers, and shortcut connections, ensuring robust and efficient feature extraction at various levels of abstraction. The GlobalAveragePooling2D layer towards the model's end succinctly summarizes the extracted features, preparing the model for the final classification stage. This stage features a dropout layer for regularization, a dense layer with 512 units for further pattern learning, and a final dense layer with 8 units and a softmax activation function, producing a probability distribution across the disease classes. Beyond the sophistication of the architecture itself, the novelty lies in applying such structures to the challenges of agricultural disease detection. We utilized a dataset consisting of 4000 high-resolution images of tea leaves, encompassing both diseased and healthy states, meticulously captured in the tea gardens of Pathantula, Sylhet, Bangladesh. Employing a Canon EOS 250D camera ensured the detailed representation crucial for training a robust deep learning model for disease detection in tea plants. By achieving remarkable accuracy in identifying diseases in tea leaves, this research not only sets a new benchmark for precision in agricultural diagnostics but also opens avenues for future innovations in the field of precision agriculture.
PMID:40399405 | DOI:10.1038/s41598-025-02378-0
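The abstract narrates its architecture in enough detail for a faithful sketch: the pieces below (a 7 × 7 stem with 64 filters after zero padding, batch normalization and ReLU, residual blocks, then GAP, dropout, a 512-unit dense layer, and an 8-way softmax) come from the text, while the residual-block internals, strides, and block widths are assumptions:

```python
# Sketch: the tea-leaf classifier as narrated, reconstructed in Keras.
from tensorflow.keras import layers, models

def residual_block(x, filters):
    shortcut = x
    y = layers.Conv2D(filters, 3, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    if shortcut.shape[-1] != filters:            # match channel counts
        shortcut = layers.Conv2D(filters, 1)(shortcut)
    return layers.Activation("relu")(layers.Add()([y, shortcut]))

inp = layers.Input(shape=(256, 256, 3))
x = layers.ZeroPadding2D(3)(inp)
x = layers.Conv2D(64, 7, strides=2)(x)           # 7x7 stem, 64 filters
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
x = layers.MaxPooling2D(3, strides=2)(x)
for f in (64, 128):                              # assumed block widths
    x = residual_block(x, f)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(512, activation="relu")(x)
out = layers.Dense(8, activation="softmax")(x)   # 8 disease classes
model = models.Model(inp, out)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```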