Deep learning
AI-driven multi-omics integration for multi-scale predictive modeling of genotype-environment-phenotype relationships
Comput Struct Biotechnol J. 2025 Jan 2;27:265-277. doi: 10.1016/j.csbj.2024.12.030. eCollection 2025.
ABSTRACT
Despite the wealth of single-cell multi-omics data, it remains challenging to predict the consequences of novel genetic and chemical perturbations in the human body. Such prediction requires knowledge of molecular interactions at all biological levels, from disease models to humans. Current machine learning methods primarily establish statistical correlations between genotypes and phenotypes but struggle to identify physiologically significant causal factors, limiting their predictive power. Key challenges in predictive modeling include the scarcity of labeled data, generalization across different domains, and disentangling causation from correlation. In light of recent advances in multi-omics data integration, we propose a new artificial intelligence (AI)-powered, biology-inspired multi-scale modeling framework to tackle these issues. This framework will integrate multi-omics data across biological levels, organism hierarchies, and species to predict genotype-environment-phenotype relationships under various conditions. Biology-inspired AI models may identify novel molecular targets, biomarkers, pharmaceutical agents, and personalized medicines for presently unmet medical needs.
PMID:39886532 | PMC:PMC11779603 | DOI:10.1016/j.csbj.2024.12.030
Automated Quantitative Assessment of Retinal Vascular Tortuosity in Patients with Sickle Cell Disease
Ophthalmol Sci. 2024 Nov 22;5(2):100658. doi: 10.1016/j.xops.2024.100658. eCollection 2025 Mar-Apr.
ABSTRACT
OBJECTIVE: To quantitatively assess the retinal vascular tortuosity of patients with sickle cell disease (SCD) and retinopathy (SCR) using an automated deep learning (DL)-based pipeline.
DESIGN: Cross-sectional study.
SUBJECTS: Patients diagnosed with SCD and screened for SCR at an academic eye center between January 2015 and November 2022 were identified using electronic health records. Eyes of unaffected matched patients (i.e., no history of SCD, hypertension, diabetes mellitus, or retinal occlusive disorder) served as controls.
METHODS: For each patient, demographic data, sickle cell diagnosis, types and total number of sickle cell crises, SCD medications used, ocular and systemic comorbidities, and history of intraocular treatment were extracted. A previously published DL algorithm was used to calculate retinal microvascular tortuosity using ultrawidefield pseudocolor fundus imaging among patients with SCD vs. controls.
MAIN OUTCOME MEASURES: Cumulative tortuosity index (CTI).
RESULTS: Overall, 64 patients (119 eyes) with SCD and 57 age- and race-matched controls (106 eyes) were included. The majority of the patients with SCD were female (65.6%) and of Black or African descent (78.1%), with an average age of 35.1 ± 20.1 years. The mean number of crises per patient was 3.4 ± 5.2, and patients took 0.7 ± 0.9 medications. The mean CTI for eyes with SCD was higher than that of controls (1.06 vs. 1.03 ± 0.02, P < 0.001). On subgroup analysis, the hemoglobin S, hemoglobin C, and HbS/beta-thalassemia variants had significantly higher CTIs than controls (1.07 vs. 1.03, P < 0.001), but the sickle cell trait variant did not (1.04 vs. 1.03, P = 0.2). Univariable analysis showed a higher CTI in patients diagnosed with proliferative SCR, most significantly among those with sea-fan neovascularization (1.06 ± 0.02 vs. 1.04 ± 0.01, P < 0.001) and those with >3 sickle cell crises (1.07 ± 0.02 vs. 1.05 ± 0.02, P < 0.001).
CONCLUSIONS: A DL-based metric of cumulative vascular tortuosity is associated with SCD and SCR disease severity and may serve as a biomarker for both.
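The abstract does not give the exact formula behind the cumulative tortuosity index, so the following is only a minimal sketch of one common tortuosity definition, the arc-to-chord ratio of each vessel centerline, aggregated here as a length-weighted mean; the function names and the weighting choice are assumptions, not the authors' pipeline.

```python
import numpy as np

def arc_chord_tortuosity(centerline):
    """Arc-to-chord ratio of one vessel centerline given as an (N, 2) array of points."""
    steps = np.diff(centerline, axis=0)
    arc_length = np.sum(np.linalg.norm(steps, axis=1))
    chord_length = np.linalg.norm(centerline[-1] - centerline[0])
    return arc_length / max(chord_length, 1e-9)

def cumulative_tortuosity_index(centerlines):
    """Hypothetical CTI: length-weighted mean tortuosity over all vessels in one eye."""
    lengths = [np.sum(np.linalg.norm(np.diff(c, axis=0), axis=1)) for c in centerlines]
    ratios = [arc_chord_tortuosity(c) for c in centerlines]
    return float(np.average(ratios, weights=lengths))
```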
FINANCIAL DISCLOSURES: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
PMID:39886358 | PMC:PMC11780102 | DOI:10.1016/j.xops.2024.100658
Deep Learning-Based Identification of Echocardiographic Abnormalities From Electrocardiograms
JACC Asia. 2024 Dec 10;5(1):88-98. doi: 10.1016/j.jacasi.2024.10.012. eCollection 2025 Jan.
ABSTRACT
BACKGROUND: Heart failure should be diagnosed as early as possible. Although deep learning models can predict one or more echocardiographic findings from electrocardiograms (ECGs), such analyses are not comprehensive.
OBJECTIVES: This study aimed to develop a deep learning model for comprehensive prediction of echocardiographic findings from ECGs.
METHODS: We obtained 229,439 paired ECG and echocardiography data sets from 8 centers. Six centers contributed to model development and 2 to external validation. We identified 12 echocardiographic findings related to left-sided cardiac abnormalities, valvular heart diseases, and right-sided cardiac abnormalities. These findings were predicted using convolutional neural networks, and a composite label was analyzed using logistic regression. A positive composite label indicated positivity in any of the 12 findings.
RESULTS: For the composite findings label, the area under the receiver-operating characteristic curve (AUC) was 0.80 (95% CI: 0.80-0.81) on hold-out validation and 0.78 (95% CI: 0.78-0.79) on external validation. The logistic regression-derived composite findings label had an AUC of 0.80 (95% CI: 0.80-0.81), with an accuracy of 73.8% (95% CI: 73.2-74.4), a sensitivity of 81.1% (95% CI: 80.5-81.8), and a specificity of 60.7% (95% CI: 59.6-61.8).
CONCLUSIONS: We have developed convolutional neural network models that predict a wide range of echocardiographic findings, including left-sided cardiac abnormalities, valvular heart diseases, and right-sided cardiac abnormalities from ECGs and created a model to predict a composite findings label by logistic regression analysis. This model has potential to serve as an adjunct for early diagnosis and treatment of previously undetected cardiac disease.
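The abstract describes the fusion step concretely: twelve per-finding CNN probabilities feed a logistic regression that predicts the composite label. A minimal sketch of that second stage with scikit-learn follows; the array shapes and the synthetic placeholder data are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Placeholder data: (n, 12) per-finding probabilities from the 12 finding-specific CNNs,
# and a composite label that is positive when any finding is positive (synthetic here).
rng = np.random.default_rng(0)
cnn_probs = rng.random((1000, 12))
y_composite = (cnn_probs.max(axis=1) > 0.8).astype(int)

# Second stage: logistic regression over the 12 CNN outputs -> composite findings label.
clf = LogisticRegression(max_iter=1000).fit(cnn_probs[:800], y_composite[:800])
auc = roc_auc_score(y_composite[800:], clf.predict_proba(cnn_probs[800:])[:, 1])
print(f"hold-out AUC: {auc:.2f}")
```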
PMID:39886205 | PMC:PMC11775793 | DOI:10.1016/j.jacasi.2024.10.012
Emerging Potential and Challenges of AI-Based ECG Analysis in Clinical Medicine
JACC Asia. 2025 Jan 7;5(1):99-100. doi: 10.1016/j.jacasi.2024.11.017. eCollection 2025 Jan.
NO ABSTRACT
PMID:39886196 | PMC:PMC11775781 | DOI:10.1016/j.jacasi.2024.11.017
Feasibility and time gain of implementing artificial intelligence-based delineation tools in daily magnetic resonance image-guided adaptive prostate cancer radiotherapy
Phys Imaging Radiat Oncol. 2024 Dec 28;33:100694. doi: 10.1016/j.phro.2024.100694. eCollection 2025 Jan.
ABSTRACT
BACKGROUND AND PURPOSE: Daily magnetic resonance image (MRI)-guided radiotherapy plan adaptation requires time-consuming manual contour edits of targets and organs at risk in the online workflow. Recent advances in auto-segmentation promise to deliver high-quality delineations within a short time frame. However, the actual time benefit in a clinical setting is unknown. The current study investigated the feasibility and time gain of implementing online artificial intelligence (AI)-based delineations at a 1.5 T MRI-Linac.
MATERIALS AND METHODS: Fifteen consecutive prostate cancer patients, treated to 60 Gy in 20 fractions at a 1.5 T MRI-Linac, were included in the study. The first 5 patients (Group 1) were treated using the standard contouring workflow for all fractions. The last 10 patients (Group 2) were treated with the standard workflow for fractions 1 to 3 (Group 2 - Standard) and an AI-based workflow for the remaining fractions (Group 2 - AI). AI delineations were delivered using an in-house developed AI inference service and an in-house trained nnU-Net.
RESULTS: The AI-based workflow reduced delineation time from 9.8 to 5.3 min. The variance in delineation time appeared to increase during the treatment course for Group 1, while the delineation time for the AI-based workflow remained constant (Group 2 - AI). Readaptation due to target movement occurred less often with the AI-based workflow.
CONCLUSION: Implementing an AI-based workflow at the 1.5 T MRI-Linac is feasible and reduces the delineation time. Lower variance in delineation duration supports better planning of daily treatment schedules and avoids delays.
PMID:39885904 | PMC:PMC11780162 | DOI:10.1016/j.phro.2024.100694
Virtual staining from bright-field microscopy for label-free quantitative analysis of plant cell structures
Plant Mol Biol. 2025 Jan 31;115(1):29. doi: 10.1007/s11103-025-01558-w.
ABSTRACT
The applicability of a deep learning model for the virtual staining of plant cell structures using bright-field microscopy was investigated. The training dataset consisted of microscopy images of tobacco BY-2 cells with the plasma membrane stained with the fluorescent dye PlasMem Bright Green and the cell nucleus labeled with Histone-red fluorescent protein. The trained models successfully detected the expansion of cell nuclei upon aphidicolin treatment and a decrease in the cell aspect ratio upon propyzamide treatment, demonstrating their utility in cell morphometry. The model also accurately documented the shape of Arabidopsis pavement cells in both wild type and the bpp125 triple mutant, which has an altered pavement cell phenotype. Metrics such as cell area, circularity, and solidity obtained from virtual staining analyses were highly correlated with those obtained by manual measurements of cell features from microscopy images. Furthermore, the versatility of virtual staining was highlighted by its application to track chloroplast movement in Egeria densa. The method was also effective for classifying live and dead BY-2 cells using texture-based machine learning, suggesting that virtual staining can be applied beyond typical segmentation tasks. Although this method still has some limitations, its non-invasive nature and efficiency make it highly suitable for label-free, dynamic, and high-throughput analyses in quantitative plant cell biology.
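The abstract does not name the network architecture, so the sketch below shows only the general virtual-staining recipe: a paired image-to-image encoder-decoder trained with a pixelwise loss to map a bright-field input to a fluorescence-like target. The input size, channel counts, and L1 loss are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

def tiny_virtual_stainer(size=256):
    """Minimal encoder-decoder mapping a bright-field image (1 channel) to a
    fluorescence-like stain channel; a sketch, not the paper's architecture."""
    inp = layers.Input((size, size, 1))
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(inp)
    x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
    out = layers.Conv2DTranspose(1, 3, strides=2, padding="same", activation="sigmoid")(x)
    return tf.keras.Model(inp, out)

model = tiny_virtual_stainer()
model.compile(optimizer="adam", loss="mae")  # pixelwise L1 against the real fluorescence image
# model.fit(brightfield_images, fluorescence_images, ...)  # paired training data
```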
PMID:39885095 | DOI:10.1007/s11103-025-01558-w
Automatic Identification of Adenoid Hypertrophy via Ensemble Deep Learning Models Employing X-ray Adenoid Images
J Imaging Inform Med. 2025 Jan 30. doi: 10.1007/s10278-025-01423-8. Online ahead of print.
ABSTRACT
Adenoid hypertrophy, characterized by the abnormal enlargement of adenoid tissue, is a condition that can cause significant breathing and sleep disturbances, particularly in children. Accurate diagnosis of adenoid hypertrophy is critical, yet traditional methods, such as imaging and manual interpretation, are prone to errors. This study uses an ensemble deep learning approach for adenoid classification, drawing on a unique dataset of masked and non-masked X-ray images from Batman Training and Research Hospital to train and compare multiple convolutional neural network (CNN) models. By comparing classification accuracy between the masked and non-masked datasets, the study reveals the importance of image preprocessing. Six deep learning models are tested (EfficientNet, MobileNet, ResNet50, ResNet152, VGG16, and Xception), with ResNet50 achieving the highest accuracy (100% on masked images) and Xception performing worst (65% F1-score). The results indicate that masking significantly enhances the accuracy and reliability of adenoid classification. ResNet50 and EfficientNet show strong generalization capabilities, whereas the lower performance of models like Xception highlights the variability in model suitability for this task. This research provides valuable insights into optimizing deep learning models for medical image classification and advances AI-based adenoid detection.
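As a rough illustration of the transfer-learning setup such a comparison implies (the paper's exact heads, input sizes, and training schedules are not given in the abstract), here is a minimal Keras sketch of one ensemble member, a ResNet50 fine-tuned for binary adenoid classification on masked images; the dataset objects are placeholders.

```python
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.applications import ResNet50

# One ensemble member: ImageNet-pretrained ResNet50 with a fresh binary head.
base = ResNet50(include_top=False, weights="imagenet", input_shape=(224, 224, 3), pooling="avg")
base.trainable = False  # warm-up: train only the classification head first
out = layers.Dense(1, activation="sigmoid")(base.output)
model = tf.keras.Model(base.input, out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(masked_train_ds, validation_data=masked_val_ds, epochs=10)
```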
PMID:39885079 | DOI:10.1007/s10278-025-01423-8
Predictors and Implications of Myocardial Injury in Intracerebral Hemorrhage
Clin Neuroradiol. 2025 Jan 30. doi: 10.1007/s00062-025-01498-4. Online ahead of print.
ABSTRACT
PURPOSE: Myocardial injury, indicated by an elevation of high-sensitivity cardiac troponin T (hs-cTnT), is a frequent stroke-related complication. Most studies have investigated patients with ischemic stroke, but little is known about its occurrence in patients with intracerebral hemorrhage (ICH). This study aimed to assess the frequency, predictors, and implications of myocardial injury in ICH patients.
METHODS: Our retrospective analysis included 322 ICH patients. We defined myocardial injury as an elevation of hs-cTnT above the 99th percentile (i.e. 14 ng/L). Acute myocardial injury was defined as either a changing pattern of > 50% within 24 h or an excessive elevation of initial hs-cTnT (> 52 ng/L). 3D brain scans were assessed for ICH visually and quantitatively by a deep learning algorithm. Multiple regression models and Voxel-based Lesion-Symptom Mapping (VLSM) were applied.
RESULTS: 63.0% (203/322) of patients presented with myocardial injury, which was associated with more severe strokes and worse outcomes during the in-hospital phase (P < 0.01). Acute myocardial injury occurred in 24.5% (79/322) of patients. The only imaging finding associated with acute myocardial injury was midline shift (69.8% vs. 44.6% for normal or stable hs-cTnT, P < 0.01), which also independently predicted it (odds ratio 3.29, confidence interval 1.38-7.87, P < 0.01). In contrast, VLSM did not identify any specific brain region significantly associated with acute myocardial injury. Acute myocardial injury did not correlate with preexisting cardiac diseases; however, the frequency of adverse cardiac events was higher in the acute myocardial injury group (11.4% vs. 4.1% in patients with normal and/or stable patterns of hs-cTnT, P < 0.05).
CONCLUSION: Myocardial injury occurs frequently in ICH and is linked to poor outcomes. Acute myocardial injury primarily correlates with the space-occupying effects of ICH and is less dependent on premorbid cardiac status. Nonetheless, it is associated with a higher rate of adverse cardiac events.
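The hs-cTnT criteria in the methods translate directly into a small decision rule; a sketch follows, with one reading of the ">50% changing pattern within 24 h" criterion (relative change between two samples) noted as an assumption.

```python
def classify_hs_ctnt(initial_ng_l, followup_ng_l=None):
    """Apply the study's criteria: injury above the 99th percentile (14 ng/L); acute
    injury on an excessive initial value (>52 ng/L) or a >50% change within 24 h
    (read here as relative change between two samples, which is an assumption)."""
    if followup_ng_l is not None:
        change = abs(followup_ng_l - initial_ng_l) / max(initial_ng_l, 1e-9)
        if change > 0.5:
            return "acute myocardial injury"
    if initial_ng_l > 52:
        return "acute myocardial injury"
    if initial_ng_l > 14:
        return "myocardial injury (stable pattern)"
    return "no myocardial injury"
```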
PMID:39884976 | DOI:10.1007/s00062-025-01498-4
A deep learning approach for classifying and predicting children's nutritional status in Ethiopia using LSTM-FC neural networks
BioData Min. 2025 Jan 30;18(1):11. doi: 10.1186/s13040-025-00425-0.
ABSTRACT
BACKGROUND: This study employs a long short-term memory network with fully connected layers (LSTM-FC) to address the critical public health issue of child undernutrition in Ethiopia, aiming to classify children's nutritional status and predict transitions between different undernutrition states over time. The analysis is based on longitudinal data from the Young Lives cohort study, which tracked 1,997 Ethiopian children across five survey rounds conducted from 2002 to 2016. Rigorous data preprocessing, including handling of missing values, normalization, and class balancing, was applied to ensure optimal model performance. Feature selection was performed using SHapley Additive exPlanations (SHAP) to identify key factors influencing nutritional status predictions, and hyperparameters were tuned during model training. The performance of the LSTM-FC was compared with existing baseline models to demonstrate its superiority. We used Python's TensorFlow and Keras libraries on a GPU-equipped system for model training.
RESULTS: LSTM-FC demonstrated superior predictive accuracy and long-term forecasting compared to baseline models for assessing child nutritional status. The classification and prediction performance of the model showed high accuracy rates above 93%, with perfect predictions for the Normal (N) and Stunted & Wasted (SW) categories, minimal errors in most other nutritional statuses, and slight over- or underestimations in a few instances. The LSTM-FC model demonstrates strong generalization performance across multiple folds, with high recall and consistent F1-scores, indicating its robustness in predicting nutritional status. We analyzed the prevalence of children's nutritional status during their transition from late adolescence to early adulthood. The results show a notable decline in normal nutritional status among males, from 58.3% at age 5 to 33.5% by age 25. At the same time, the risk of severe undernutrition, including being underweight, stunted, and wasted (USW), increased from 1.3% to 9.4%.
CONCLUSIONS: The LSTM-FC model outperforms baseline methods in classifying and predicting Ethiopian children's nutritional statuses. The findings reveal a critical rise in undernutrition, emphasizing the need for urgent public health interventions.
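Since the study reports using TensorFlow and Keras, a minimal sketch of an LSTM-FC classifier over five survey rounds is shown below; the feature width, hidden sizes, and number of nutritional states are placeholders, not the authors' configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

N_ROUNDS, N_FEATURES, N_STATES = 5, 20, 8  # placeholders except the five survey rounds
model = tf.keras.Sequential([
    layers.Input((N_ROUNDS, N_FEATURES)),
    layers.LSTM(64),                      # summarizes each child's longitudinal trajectory
    layers.Dense(32, activation="relu"),  # fully connected head
    layers.Dense(N_STATES, activation="softmax"),  # nutritional-status classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```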
PMID:39885567 | DOI:10.1186/s13040-025-00425-0
Biomedical named entity recognition using improved green anaconda-assisted Bi-GRU-based hierarchical ResNet model
BMC Bioinformatics. 2025 Jan 30;26(1):34. doi: 10.1186/s12859-024-06008-w.
ABSTRACT
BACKGROUND: Biomedical text mining is a technique that extracts essential information from scientific articles using named entity recognition (NER). Traditional NER methods rely on dictionaries, rules, or curated corpora, which may not always be accessible. To overcome these challenges, deep learning (DL) methods have emerged. However, DL-based NER methods can struggle to capture long-distance relationships within text and require large annotated datasets.
RESULTS: This research proposes a novel model to address these challenges in natural language processing: the Improved Green Anaconda-assisted Bi-GRU-based Hierarchical ResNet BNER model (IGa-BiHR BNERM), which shows promising results in accurately identifying named entities. The MACCROBAT dataset was obtained from Kaggle and underwent several preprocessing steps (stop-word filtering, WordNet processing, removal of non-alphanumeric characters, stemming, segmentation, and tokenization) to standardize the text and improve its quality. The preprocessed text was fed into a feature extraction model, the Robustly Optimized BERT Whole Word Masking model, which provides word embeddings with semantic information. The BNER step then applies the IGa-BiHR BNERM to these embeddings.
CONCLUSION: To improve the training phase of the IGa-BiHR BNERM, the Improved Green Anaconda Optimization technique was used to select optimal weight coefficients for the model parameters. Tested on the MACCROBAT dataset, the model outperformed previous models with an accuracy of 99.11%. It effectively and accurately identifies biomedical names within text, significantly advancing this field.
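For orientation, a minimal Keras sketch of the Bi-GRU tagging stage alone is given below, taking pretrained word embeddings in and emitting per-token entity tags; the sequence length, tag count, and hidden size are placeholders, and the hierarchical ResNet and Green Anaconda optimization stages are not reproduced.

```python
import tensorflow as tf
from tensorflow.keras import layers

MAX_LEN, EMB_DIM, N_TAGS = 128, 768, 9  # placeholders
model = tf.keras.Sequential([
    layers.Input((MAX_LEN, EMB_DIM)),   # e.g., contextual embeddings from the BERT stage
    layers.Bidirectional(layers.GRU(128, return_sequences=True)),
    layers.TimeDistributed(layers.Dense(N_TAGS, activation="softmax")),  # per-token tags
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```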
PMID:39885428 | DOI:10.1186/s12859-024-06008-w
Versatile waste sorting in small batch and flexible manufacturing industries using deep learning techniques
Sci Rep. 2025 Jan 30;15(1):3756. doi: 10.1038/s41598-025-87226-x.
ABSTRACT
The expansion of LEAN and small-batch manufacturing demands flexible automated workstations capable of switching between sorting various wastes over time. To address this challenge, our study assesses the ability of the Segment Anything Model (SAM) family of deep learning architectures to separate highly variable objects during robotic waste sorting. The proposed two-step procedure for generic, versatile visual waste sorting uses SAM architectures (original SAM, FastSAM, MobileSAMv2, and EfficientSAM) for waste object extraction from raw images, followed by a classification architecture (MobileNetV2, VGG19, DenseNet, SqueezeNet, ResNet, or Inception-v3) for accurate waste sorting. Such a pipeline brings two key advantages that make it more applicable in industry practice: 1) it eliminates the need to develop dedicated waste detection and segmentation algorithms for waste object localization, and 2) it significantly reduces the time and costs required for adapting the solution to different use cases. With the proposed procedure, switching to a new waste type is reduced to only two steps: using SAM for automatic object extraction, then separating the extracted objects into the classes used to fine-tune the classifier. Validation on four use cases (floating waste, municipal waste, e-waste, and smart bins) shows robust results, with accuracy ranging from 86% to 97% when using MobileNetV2 with the SAM and FastSAM architectures. The proposed approach has high potential to facilitate deployment, increase productivity, lower expenses, and minimize errors in robotic waste sorting while enhancing overall recycling and material utilization in the manufacturing industry.
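A minimal sketch of the two-step pipeline using the published segment-anything API follows; the checkpoint file, input frame, and the downstream `classifier` object are assumptions standing in for whichever SAM variant and fine-tuned backbone a deployment chooses.

```python
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# Step 1: class-agnostic object extraction with SAM.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

image = cv2.cvtColor(cv2.imread("conveyor_frame.jpg"), cv2.COLOR_BGR2RGB)
for m in mask_generator.generate(image):
    x, y, w, h = map(int, m["bbox"])      # XYWH box around one extracted object
    crop = image[y:y + h, x:x + w]
    # Step 2: assign a waste class to the crop with the fine-tuned backbone.
    # label = classifier.predict(preprocess(crop))
```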
PMID:39885307 | DOI:10.1038/s41598-025-87226-x
Hierarchical image classification using transfer learning to improve deep learning model performance for Amazon parrots
Sci Rep. 2025 Jan 30;15(1):3790. doi: 10.1038/s41598-025-88103-3.
ABSTRACT
Numerous studies have demonstrated the potential of deep learning models for classifying wildlife. Such models can reduce the workload of experts by automating species classification to monitor wild populations and global trade. Although deep learning models typically perform better with more input data, the available wildlife data are ordinarily limited, especially for rare or endangered species. Recently, citizen science programs have helped accumulate valuable wildlife data, but such datasets remain far smaller than the benchmark datasets on which deep learning models perform best. Recent studies have applied hierarchical classification to a given wildlife dataset to improve model performance and classification accuracy. This study applied hierarchical classification by transfer learning to the classification of Amazon parrot species. Specifically, a hierarchy was built based on diagnostic morphological features. Upon evaluation, the hierarchical model outperformed the non-hierarchical model in detecting and classifying Amazon parrots, achieving a mean Average Precision (mAP) of 0.944 versus 0.908 for the non-hierarchical model. Moreover, the hierarchical model improved classification accuracy between morphologically similar species. The outcomes of this study may facilitate the monitoring of wild populations and the global trade of Amazon parrots for conservation purposes.
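The inference pattern implied by such a hierarchy is simple to state in code: a parent model assigns a morphology-based group, then a group-specific model separates the similar species within it. The sketch below is generic; the model objects and group labels are hypothetical.

```python
def classify_hierarchically(image, group_model, species_models):
    """Parent model picks a morphology-based group; a per-group model then
    separates the visually similar species within it."""
    group = group_model.predict(image)              # e.g., a plumage-defined group
    species = species_models[group].predict(image)  # fine-grained decision inside the group
    return group, species
```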
PMID:39885290 | DOI:10.1038/s41598-025-88103-3
Improved lung nodule segmentation with a squeeze excitation dilated attention based residual UNet
Sci Rep. 2025 Jan 30;15(1):3770. doi: 10.1038/s41598-025-85199-5.
ABSTRACT
The diverse types and sizes of lung nodules, their proximity to non-nodule structures, and their similar shape characteristics make them challenging for segmentation methods. Although many efforts have been made in automatic lung nodule segmentation, most have not sufficiently addressed the challenges related to nodule type and size, such as juxta-pleural and juxta-vascular nodules. The current research introduces a Squeeze-Excitation Dilated Attention-based Residual U-Net (SEDARU-Net) with a robust intensity normalization technique to address these challenges and achieve improved lung nodule segmentation. After the images are preprocessed with the intensity normalization method, regions of interest are extracted by YOLOv3 and fed into the SEDARU-Net, whose encoder uses dilated convolutions. The extracted features are then passed to the decoder, which combines transposed convolutions, Squeeze-Excitation Dilated Residual blocks, and skip connections equipped with an Attention Gate, to decode the feature maps and construct the segmentation mask. The proposed model was evaluated on the publicly available Lung Nodule Analysis 2016 (LUNA16) dataset, achieving a Dice Similarity Coefficient of 97.86%, an IoU of 96.40%, a sensitivity of 96.54%, and a precision of 98.84%. Finally, it was shown that each component added to the U-Net's structure, as well as the intensity normalization technique, increased the Dice Similarity Coefficient by more than 2%. The proposed method suggests a potential clinical tool for segmenting lung nodules of different types located in the proximity of non-nodule structures.
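As a reference point for the squeeze-excitation component named in the architecture, here is a minimal Keras sketch of the generic formulation, not the authors' exact block:

```python
import tensorflow as tf
from tensorflow.keras import layers

def squeeze_excitation(x, ratio=8):
    """Channel attention: squeeze to global statistics, excite per-channel gates."""
    channels = x.shape[-1]
    s = layers.GlobalAveragePooling2D()(x)
    s = layers.Dense(channels // ratio, activation="relu")(s)
    s = layers.Dense(channels, activation="sigmoid")(s)
    return layers.Multiply()([x, layers.Reshape((1, 1, channels))(s)])
```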
PMID:39885263 | DOI:10.1038/s41598-025-85199-5
Optimized deep learning model with integrated spectrum focus transformer for pavement distress recognition and classification
Sci Rep. 2025 Jan 30;15(1):3803. doi: 10.1038/s41598-025-88251-6.
ABSTRACT
In pavement distress recognition and classification, the complexity of the pavement environment, the small proportion of distresses in images, significant variation in distress scales, and the presence of confounding features such as vehicles and traffic signs make distress feature extraction challenging. This paper proposes a spectrum focus transformer (SFT) layer, which processes the signal spectrum and focuses on important frequency components. First, the frequency-domain characteristics of the image data are analyzed to obtain the distribution of frequency values, enabling fine-grained adjustment of individual frequency components. The frequency information and images are then learned and weighted in the frequency domain, enhancing the model's ability to capture pavement distress regions. In experiments on the road pavement distress dataset, heatmap analysis showed that distress regions received increased attention, and the model achieved an accuracy of 97.73%, higher than that of other models.
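The abstract's description (analyze the spectrum, then weight frequency components) matches a simple pattern: FFT, learnable per-frequency gains, inverse FFT. The layer below is a generic TensorFlow sketch of that idea, not the published SFT layer.

```python
import tensorflow as tf

class SpectrumFocus(tf.keras.layers.Layer):
    """FFT -> learnable per-frequency gains -> inverse FFT (a generic sketch)."""
    def build(self, input_shape):
        h, w = input_shape[1], input_shape[2]
        # rfft2d keeps w // 2 + 1 frequency bins along the last spatial axis
        self.gain = self.add_weight(name="gain", shape=(h, w // 2 + 1), initializer="ones")

    def call(self, x):
        x = tf.transpose(x, [0, 3, 1, 2])             # (batch, c, h, w) for the 2-D FFT
        spec = tf.signal.rfft2d(x)                    # complex spectrum per channel
        spec = spec * tf.cast(self.gain, spec.dtype)  # emphasize informative frequencies
        x = tf.signal.irfft2d(spec, fft_length=tf.shape(x)[2:4])
        return tf.transpose(x, [0, 2, 3, 1])
```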
PMID:39885250 | DOI:10.1038/s41598-025-88251-6
Deep learning-based malaria parasite detection: convolutional neural networks model for accurate species identification of Plasmodium falciparum and Plasmodium vivax
Sci Rep. 2025 Jan 30;15(1):3746. doi: 10.1038/s41598-025-87979-5.
ABSTRACT
Accurate malaria diagnosis with precise identification of Plasmodium species is crucial for effective treatment. While microscopy is still the gold standard in malaria diagnosis, it relies heavily on trained personnel. Advances in artificial intelligence (AI), particularly convolutional neural networks (CNNs), have significantly improved diagnostic capabilities and accuracy by enabling the automated analysis of medical images. Previous models efficiently detected malaria parasites in red blood cells but had difficulty differentiating between species. We propose a CNN-based model for classifying P. falciparum-infected cells, P. vivax-infected cells, and uninfected white blood cells in thick blood smears. Our best-performing model utilizes a seven-channel input and correctly predicted 12,876 out of 12,954 cases. A five-fold cross-validation confusion matrix showed 63,654 true predictions out of 64,126 cases. The model reached an accuracy of 99.51%, a precision of 99.26%, a recall of 99.26%, a specificity of 99.63%, an F1 score of 99.26%, and a loss of 2.3%. We are now developing a system based on real-world-quality images to create a comprehensive detection tool for remote regions where trained microscopists are unavailable.
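The abstract specifies a seven-channel input but not what the channels contain, so the sketch below only fixes the input shape and the three-way softmax it implies; the patch size, depth, and every other hyperparameter are placeholders.

```python
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input((64, 64, 7)),  # seven-channel cell patch, per the abstract
    layers.Conv2D(32, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(3, activation="softmax"),  # P. falciparum / P. vivax / uninfected WBC
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```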
PMID:39885248 | DOI:10.1038/s41598-025-87979-5
A deep learning based model for diabetic retinopathy grading
Sci Rep. 2025 Jan 30;15(1):3763. doi: 10.1038/s41598-025-87171-9.
ABSTRACT
Diabetic retinopathy (DR) is a leading cause of blindness. Manual examination of DR images is labor-intensive and prone to error, and existing detection methods often rely on handcrafted features, which limit adaptability and classification accuracy. The aim of this research is therefore to develop an automated, efficient system for early detection and accurate grading of DR severity. We developed a deep neural network named RSG-Net (Retinopathy Severity Grading) to classify DR into 4 stages (multi-class classification) and 2 stages (binary classification), using the Messidor-1 dataset. In preprocessing, we used histogram equalization to improve image contrast and denoising techniques to remove noise and artifacts, enhancing the clarity of the fundus images. We applied data augmentation to the preprocessed images to tackle class imbalance; augmentation techniques included flipping, rotation, zooming, and adjustment of color, contrast, and brightness. The proposed RSG-Net model contains convolutional layers for automatic feature extraction from the input images, batch normalization layers to improve training speed and performance, and max pooling, dropout, and fully connected layers. RSG-Net achieved a testing accuracy of 99.36%, a specificity of 99.79%, and a sensitivity of 99.41% in classifying diabetic retinopathy into 4 grades, and 99.37% accuracy, 100% sensitivity, and 98.62% specificity in classifying DR into 2 grades, outperforming other state-of-the-art methodologies.
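A minimal sketch of the named preprocessing and augmentation steps follows (histogram equalization, denoising, and the listed augmentations); the green-channel equalization and the specific denoising filter are assumptions, since the abstract does not specify them.

```python
import cv2
import tensorflow as tf
from tensorflow.keras import layers

def preprocess_fundus(path):
    """Contrast enhancement plus denoising; the green-channel choice and the
    non-local-means filter are assumptions, not the paper's stated pipeline."""
    img = cv2.imread(path)
    img[:, :, 1] = cv2.equalizeHist(img[:, :, 1])  # histogram equalization
    return cv2.fastNlMeansDenoisingColored(img)    # noise/artifact removal

# Augmentations named in the abstract: flips, rotation, zoom, brightness/contrast jitter.
augment = tf.keras.Sequential([
    layers.RandomFlip("horizontal_and_vertical"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    layers.RandomBrightness(0.1),
    layers.RandomContrast(0.1),
])
```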
PMID:39885230 | DOI:10.1038/s41598-025-87171-9
An optimized lightweight real-time detection network model for IoT embedded devices
Sci Rep. 2025 Jan 30;15(1):3839. doi: 10.1038/s41598-025-88439-w.
ABSTRACT
With the rapid development of Internet of Things (IoT) technology, embedded devices in various computer vision scenarios can perform real-time target detection and recognition tasks in settings such as intelligent manufacturing, autonomous driving, and smart homes. YOLOv8, an advanced deep learning model in the field of target detection, has attracted much attention for its excellent detection speed, high precision, and multi-task processing capability. However, since IoT embedded devices typically have limited computing resources, directly deploying YOLOv8 is a big challenge, especially for real-time detection tasks. To address this issue, this work proposes and deploys an optimized lightweight real-time detection network model well suited to IoT embedded devices, denoted FRYOLO. To evaluate its performance, a case study on real-time detection of fresh and defective fruit on a production line is performed. Characterized by low training cost and high detection performance, the model accurately detects various types of fruits and their states: FRYOLO achieves a recall of 84.7%, a mean Average Precision (mAP) of 89.0%, and a precision of 92.5%, along with a detection frame rate of up to 33 frames per second (FPS), satisfying the real-time requirement. Finally, an intelligent production line system based on FRYOLO was implemented, which not only provides robust technical support for the efficient operation of fruit production processes but also demonstrates the applicability of the proposed network model in practical IoT settings.
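FRYOLO itself is not published with the abstract, so the sketch below shows only the deployment pattern it implies: a real-time inference loop with the ultralytics YOLOv8 API, with a stock nano checkpoint standing in for the optimized model.

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # stock nano checkpoint standing in for FRYOLO
# stream=True yields results frame by frame, as needed for live camera input
for result in model.predict(source=0, stream=True, imgsz=640, conf=0.5):
    for box in result.boxes:
        print(model.names[int(box.cls)], float(box.conf))  # e.g., fruit class + confidence
```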
PMID:39885208 | DOI:10.1038/s41598-025-88439-w
MMFW-UAV dataset: multi-sensor and multi-view fixed-wing UAV dataset for air-to-air vision tasks
Sci Data. 2025 Jan 30;12(1):185. doi: 10.1038/s41597-025-04482-2.
ABSTRACT
We present MMFW-UAV, an air-to-air multi-sensor and multi-view fixed-wing UAV dataset. MMFW-UAV contains a total of 147,417 images of fixed-wing UAVs captured by multiple types of sensors (zoom, wide-angle, and thermal imaging), displaying the flight status of fixed-wing UAVs of different sizes, appearances, structures, and stabilized flight velocities from multiple aerial perspectives (top-down, horizontal, and bottom-up views), aiming to cover the full range of perspectives with multi-modal image data. Each image passed a quality control process of semi-automatic annotation, manual checking, and secondary refinement. To the best of our knowledge, MMFW-UAV is the first one-to-one multi-modal image dataset for fixed-wing UAVs with high-quality annotations. Several mainstream deep learning-based object detection architectures were evaluated on MMFW-UAV, and the experimental results demonstrate that it can be utilized for fixed-wing UAV identification, detection, and monitoring. We believe that MMFW-UAV will contribute to various fixed-wing UAV-based research and applications.
PMID:39885165 | DOI:10.1038/s41597-025-04482-2
Identification of Intracranial Germ Cell Tumors Based on Facial Photos: Exploratory Study on the Use of Deep Learning for Software Development
J Med Internet Res. 2025 Jan 30;27:e58760. doi: 10.2196/58760.
ABSTRACT
BACKGROUND: Primary intracranial germ cell tumors (iGCTs) are highly malignant brain tumors that predominantly occur in children and adolescents, with an incidence rate ranking third among primary brain tumors in East Asia (8%-15%). Due to their insidious onset and impact on critical functional areas of the brain, these tumors often result in irreversible abnormalities in growth and development, as well as cognitive and motor impairments in affected children. Therefore, early diagnosis through advanced screening techniques is vital for improving patient outcomes and quality of life.
OBJECTIVE: This study aimed to investigate the application of facial recognition technology in the early detection of iGCTs in children and adolescents.
METHODS: A multicenter, phased approach was adopted for the development and validation of a deep learning model, GVisageNet, dedicated to the screening of midline brain tumors from normal controls (NCs) and iGCTs from other midline brain tumors. The study comprised the collection and division of datasets into training (n=847, iGCTs=358, NCs=300, other midline brain tumors=189) and testing (n=212, iGCTs=79, NCs=70, other midline brain tumors=63), with an additional independent validation dataset (n=336, iGCTs=130, NCs=100, other midline brain tumors=106) sourced from 4 medical institutions. A regression model using clinically relevant, statistically significant data was developed and combined with GVisageNet outputs to create a hybrid model. This integration sought to assess the incremental value of clinical data. The model's predictive mechanisms were explored through correlation analyses with endocrine indicators and stratified evaluations based on the degree of hypothalamic-pituitary-target axis damage. Performance metrics included area under the curve (AUC), accuracy, sensitivity, and specificity.
RESULTS: On the independent validation dataset, GVisageNet achieved an AUC of 0.938 (P<.01) in distinguishing midline brain tumors from NCs. GVisageNet also demonstrated significant diagnostic capability in distinguishing iGCTs from the other midline brain tumors, achieving an AUC of 0.739, superior to the regression model alone (AUC=0.632, P<.001) but lower than the hybrid model (AUC=0.789, P=.04). Significant correlations were found between GVisageNet's outputs and 7 endocrine indicators. Performance varied with the degree of hypothalamic-pituitary-target axis damage, offering further insight into GVisageNet's working mechanism.
CONCLUSIONS: GVisageNet, capable of high accuracy both independently and with clinical data, shows substantial potential for early iGCT detection, highlighting the importance of combining deep learning with clinical insights for personalized health care.
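The hybrid step described here, fusing the deep model's output with clinically relevant variables in a regression, reduces to a few lines; the sketch below uses hypothetical placeholder data with scikit-learn, and the seven-covariate width merely echoes the seven endocrine indicators.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)        # placeholder data throughout
deep_score = rng.random((500, 1))     # per-patient probability from the deep model
clinical = rng.normal(size=(500, 7))  # e.g., seven endocrine indicators
y = rng.integers(0, 2, 500)           # iGCT vs. other midline tumor

hybrid = LogisticRegression(max_iter=1000)
hybrid.fit(np.hstack([deep_score, clinical]), y)  # fused deep + clinical features
```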
PMID:39883924 | DOI:10.2196/58760
A comparative analysis of CNNs and LSTMs for ECG-based diagnosis of arrhythmia and congestive heart failure
Comput Methods Biomech Biomed Engin. 2025 Jan 30:1-29. doi: 10.1080/10255842.2025.2456487. Online ahead of print.
ABSTRACT
Cardiac arrhythmias are a major global health concern, and their early detection is critical for diagnosis. This study comprehensively evaluates the effectiveness of CNNs and LSTMs for the classification of cardiac arrhythmias using three PhysioNet datasets. ECG records are segmented into windows of approximately 10 s. The segments are transformed into scalograms using the DWT to train VGG-16, while WTS is used for feature extraction and dimensionality reduction to train the LSTM network. VGG-16 achieved 96.44% test accuracy, while the LSTM achieved 92%. The results also highlight the effectiveness of VGG-16 for short-duration ECG analysis, while the LSTM excels in long-term monitoring on edge devices for personalized healthcare.
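For illustration, a sketch of the scalogram step feeding the VGG-16 branch is given below. The abstract names the DWT, but a scalogram image is conventionally produced with a continuous wavelet transform, so this sketch uses pywt.cwt with a Morlet wavelet; the sampling rate and the segment itself are placeholders.

```python
import numpy as np
import pywt

fs = 360                             # sampling rate in Hz (placeholder)
segment = np.random.randn(10 * fs)   # stands in for one ~10 s ECG segment
scales = np.arange(1, 128)
coeffs, _ = pywt.cwt(segment, scales, "morl", sampling_period=1 / fs)
scalogram = np.abs(coeffs)           # 2-D time-frequency image for the CNN branch
```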
PMID:39883911 | DOI:10.1080/10255842.2025.2456487