Deep learning
Design and experimental research of on-device style transfer models for mobile environments
Sci Rep. 2025 Apr 21;15(1):13724. doi: 10.1038/s41598-025-98545-4.
ABSTRACT
This study develops a neural style transfer (NST) model optimized for real-time execution on mobile devices through on-device AI, eliminating reliance on cloud servers. By embedding AI models directly into mobile hardware, this approach reduces operational costs and enhances user privacy. However, designing deep learning models for mobile deployment presents a trade-off between computational efficiency and visual quality, as reducing model size often leads to performance degradation. To address this challenge, we propose a set of lightweight NST models incorporating depthwise separable convolutions, residual bottlenecks, and optimized upsampling techniques inspired by MobileNet and ResNet architectures. Five model variations are designed and evaluated based on parameters, floating-point operations, memory usage, and image transformation quality. Experimental results demonstrate that our optimized models achieve a balance between efficiency and performance, enabling high-quality real-time style transfer in resource-constrained mobile environments. These findings highlight the feasibility of deploying NST applications on mobile devices, paving the way for advancements in real-time artistic image processing in mobile photography, augmented reality, and creative applications.
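To make the named building blocks concrete, here is a minimal PyTorch sketch of a depthwise separable convolution and a residual bottleneck, the two components the abstract highlights; the channel counts, expansion factor, and layer ordering are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """3x3 depthwise convolution followed by a 1x1 pointwise convolution."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, stride, 1, groups=in_ch, bias=False),
            nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class ResidualBottleneck(nn.Module):
    """Expand -> depthwise -> project, with a skip connection (shapes match)."""
    def __init__(self, ch, expansion=4):
        super().__init__()
        hidden = ch * expansion
        self.block = nn.Sequential(
            nn.Conv2d(ch, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, hidden, 3, 1, 1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, ch, 1, bias=False),
            nn.BatchNorm2d(ch),
        )
    def forward(self, x):
        return x + self.block(x)

y = ResidualBottleneck(32)(DepthwiseSeparableConv(3, 32)(torch.randn(1, 3, 128, 128)))
print(y.shape)  # torch.Size([1, 32, 128, 128])
```

The depthwise/pointwise split is what cuts parameters and FLOPs relative to a standard convolution, which is the efficiency lever the study exploits.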
PMID:40259046 | DOI:10.1038/s41598-025-98545-4
DSIT-UNet: a dual-stream iterative transformer-based UNet architecture for segmenting brain tumors from FLAIR MRI images
Sci Rep. 2025 Apr 22;15(1):13815. doi: 10.1038/s41598-025-98464-4.
ABSTRACT
Brain tumor segmentation, a prerequisite for planning conventional therapies and rehabilitation, remains challenging in medical imaging owing to the complex morphology and heterogeneous nature of tumors. Although convolutional neural networks (CNNs) have advanced medical image segmentation, they struggle with long-range dependencies because of their limited receptive fields. We propose the Dual-Stream Iterative Transformer UNet (DSIT-UNet), a novel framework that combines Iterative Transformer (IT) modules with a dual-stream encoder-decoder architecture. Our model incorporates a transformed spatial-hybrid attention optimization (TSHAO) module to enhance multiscale feature interactions and balance local details with the global context. We evaluated DSIT-UNet using three benchmark datasets: The Cancer Imaging Archive (TCIA) from The Cancer Genome Atlas (TCGA), BraTS2020, and BraTS2021. On TCIA, our model achieved a mean Intersection over Union (mIoU) of 95.21%, a mean Dice coefficient (mDice) of 96.23%, a precision of 95.91%, and a recall of 96.55%. On BraTS2020, it attained an mIoU of 95.88%, an mDice of 96.32%, a precision of 96.21%, and a recall of 96.44%, surpassing existing methods. The superior results of DSIT-UNet demonstrate its effectiveness in capturing tumor boundaries and improving segmentation robustness through hierarchical attention mechanisms and multiscale feature extraction. This architecture advances automated brain tumor segmentation, with potential applications in clinical neuroimaging and future extensions to 3D volumetric segmentation.
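For reference, a minimal NumPy sketch of the two headline metrics (Dice coefficient and IoU) on binary masks; the paper's exact multi-class averaging scheme is not reproduced here.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    inter = np.logical_and(pred, target).sum()
    return (2 * inter + eps) / (pred.sum() + target.sum() + eps)

def iou(pred, target, eps=1e-7):
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (inter + eps) / (union + eps)

pred = np.array([[0, 1, 1], [0, 1, 0]])
gt   = np.array([[0, 1, 0], [0, 1, 1]])
print(f"Dice={dice_coefficient(pred, gt):.3f}, IoU={iou(pred, gt):.3f}")
# Dice = 2*2/(3+3) = 0.667; IoU = 2/4 = 0.500
```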
PMID:40259039 | DOI:10.1038/s41598-025-98464-4
The development of a CC-TF-BiGRU model for enhancing accuracy in photovoltaic power forecasting
Sci Rep. 2025 Apr 21;15(1):13790. doi: 10.1038/s41598-025-99109-2.
ABSTRACT
In the face of escalating global energy crises and the pressing challenges of environmental pollution, the imperative for sustainable energy solutions has never been more pronounced. Photovoltaic (PV) power generation is recognized as a cornerstone of the transition toward a clean energy paradigm. This study introduces a short-term PV power forecasting methodology based on teacher forcing (TF) integrated with a bi-directional gated recurrent unit (BiGRU). First, chaotic feature extraction is employed in conjunction with the C-C method to discern the pivotal factors that shape PV power dynamics, complemented by the inclusion of solar radiation data as an additional input. Second, a fusion of gradient boosting decision trees (GBDT) and the BiGRU is leveraged to process the time-series data. Finally, teacher forcing is integrated into the model to bolster forecasting accuracy and stability. Experimental validation demonstrates strong performance of the proposed method under complex and diverse weather conditions, offering a practical technical approach and theoretical framework for PV power forecasting.
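A hedged sketch of the forecasting core described above: a BiGRU encoder with a decoder trained by teacher forcing, i.e. the ground-truth PV power is fed back as the previous-step decoder input with some probability during training. The C-C/GBDT feature-selection stages, all dimensions, and the teacher-forcing ratio are assumptions for illustration.

```python
import torch
import torch.nn as nn

class BiGRUForecaster(nn.Module):
    def __init__(self, n_feat, hidden=64, horizon=16):
        super().__init__()
        self.encoder = nn.GRU(n_feat, hidden, batch_first=True, bidirectional=True)
        self.decoder = nn.GRUCell(1, 2 * hidden)   # previous power value as input
        self.head = nn.Linear(2 * hidden, 1)
        self.horizon = horizon

    def forward(self, x, target=None, tf_ratio=0.5):
        _, h = self.encoder(x)                     # h: (2, batch, hidden)
        state = torch.cat([h[0], h[1]], dim=1)     # merge the two directions
        prev = x.new_zeros(x.size(0), 1)
        outputs = []
        for t in range(self.horizon):
            state = self.decoder(prev, state)
            pred = self.head(state)
            outputs.append(pred)
            # teacher forcing: sometimes feed the ground truth instead of the prediction
            use_truth = target is not None and torch.rand(1).item() < tf_ratio
            prev = target[:, t:t + 1] if use_truth else pred.detach()
        return torch.cat(outputs, dim=1)

model = BiGRUForecaster(n_feat=8)
x, y = torch.randn(4, 48, 8), torch.randn(4, 16)   # 48 past steps -> 16-step forecast
loss = nn.functional.mse_loss(model(x, y), y)      # teacher forcing during training
```

At inference time `target` is omitted, so the decoder runs purely on its own predictions.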
PMID:40258997 | DOI:10.1038/s41598-025-99109-2
Improving deep learning-based neural distinguisher with multiple ciphertext pairs for Speck and Simon
Sci Rep. 2025 Apr 21;15(1):13696. doi: 10.1038/s41598-025-98251-1.
ABSTRACT
The neural network-based differential distinguisher has attracted significant interest from researchers due to its high efficiency in cryptanalysis since its introduction by Gohr in 2019. However, the accuracy of existing neural distinguishers remains limited for high-round-reduced cryptosystems. In this work, we explore the design principles of neural networks and propose a novel neural distinguisher based on a multi-scale convolutional block and dense residual connections. Two different ablation schemes are designed to verify the efficiency of the proposed neural distinguisher. Additionally, the concept of a linear attack is introduced to optimize the input dataset for the neural distinguisher. By combining ciphertext pairs, the differences between ciphertext pairs, the keys, and the differences between the keys, a novel dataset model is designed. The results show that the accuracy of the proposed neural distinguisher, utilizing the novel neural network and dataset, is 0.15-0.45% higher than that of Gohr's distinguisher for Speck 32/64 when using a single ciphertext pair as input. When using multiple ciphertext pairs as input, it is 1.24-3.5% higher than the best distinguishers for Speck 32/64 and 0.32-1.83% higher than the best distinguishers for Simon 32/64. Finally, a key recovery attack based on the proposed neural distinguisher using a single ciphertext pair is implemented, achieving a success rate of 61.8%, which is 9.7% higher than that of the distinguisher proposed by Gohr. Therefore, the proposed neural distinguisher demonstrates significant advantages in both accuracy and key recovery rate.
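To make the data setup concrete, here is a hedged sketch of Gohr-style dataset generation for Speck32/64 with multiple ciphertext pairs per sample: label-1 samples encrypt plaintext pairs with the standard input difference (0x0040, 0x0000), label-0 samples use random pairs. The paper's additional key/difference channels and the bit-level encoding fed to the network are omitted.

```python
import numpy as np

def ror(x, r): return ((x >> r) | (x << (16 - r))) & 0xFFFF
def rol(x, r): return ((x << r) | (x >> (16 - r))) & 0xFFFF

def expand_key(k, rounds):
    # Speck32/64 key schedule, key words given as (l2, l1, l0, k0)
    ks, l = [k[-1]], list(reversed(k[:-1]))
    for i in range(rounds - 1):
        nl = ((ks[i] + ror(l[i], 7)) % 65536) ^ i
        ks.append(rol(ks[i], 2) ^ nl)
        l.append(nl)
    return ks

def speck_encrypt(x, y, ks):
    for k in ks:                       # Speck32 round function
        x = ((ror(x, 7) + y) % 65536) ^ k
        y = rol(y, 2) ^ x
    return x, y

def make_dataset(n, pairs=8, rounds=7, seed=0):
    rng = np.random.default_rng(seed)
    X, Y = [], rng.integers(0, 2, n)   # 1 = real difference, 0 = random
    for label in Y:
        ks = expand_key(list(rng.integers(0, 65536, 4)), rounds)
        sample = []
        for _ in range(pairs):
            p0x, p0y = rng.integers(0, 65536, 2)
            p1x, p1y = (p0x ^ 0x0040, p0y) if label else rng.integers(0, 65536, 2)
            sample += [*speck_encrypt(p0x, p0y, ks), *speck_encrypt(p1x, p1y, ks)]
        X.append(sample)
    return np.array(X, dtype=np.uint16), Y

X, Y = make_dataset(1000)
print(X.shape, Y.mean())   # (1000, 32): 8 ciphertext pairs x 4 words per sample
```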
PMID:40258982 | DOI:10.1038/s41598-025-98251-1
Securing the CAN bus using deep learning for intrusion detection in vehicles
Sci Rep. 2025 Apr 22;15(1):13820. doi: 10.1038/s41598-025-98433-x.
ABSTRACT
The Controller Area Network (CAN) bus protocol is the essential communication backbone in vehicles within the Intelligent Transportation System (ITS), enabling interaction between electronic control units (ECUs). However, CAN messages lack authentication and security, making the system vulnerable to attacks such as DoS, fuzzing, impersonation, and spoofing. This paper evaluates deep learning methods to detect intrusions in the CAN bus network. Using the Car Hacking, Survival Analysis, and OTIDS datasets, we train and test models to identify automotive cyber threats. We explore recurrent neural network (RNN) variants, including LSTM, GRU, and Bi-LSTM, alongside the convolutional VGG-16 network, to analyze temporal and spatial features in the data. LSTMs and GRUs handle long-term dependencies in sequential data, making them suitable for analyzing CAN messages. Bi-LSTMs enhance this by processing sequences in both directions, learning from past and future contexts to improve anomaly detection. Our results show that LSTM achieves 99.89% accuracy in binary classification, while VGG-16 reaches 100% accuracy in multiclass classification. These findings demonstrate the potential of deep learning techniques in improving the security and resilience of ITS by effectively detecting and mitigating CAN bus network attacks.
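A minimal sketch of the binary LSTM detector, assuming each sample is a sliding window of CAN frames encoded as numeric vectors (arbitration ID plus 8 data bytes, scaled to [0, 1]); the window length and layer widths are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class CANLSTMDetector(nn.Module):
    def __init__(self, n_feat=9, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(n_feat, hidden, num_layers=2, batch_first=True)
        self.classifier = nn.Sequential(
            nn.Linear(hidden, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, x):                      # x: (batch, window, n_feat)
        out, _ = self.lstm(x)
        return self.classifier(out[:, -1])     # attack logit from last time step

model = CANLSTMDetector()
frames = torch.rand(16, 50, 9)                 # 16 windows of 50 frames each
logits = model(frames)
labels = torch.randint(0, 2, (16, 1)).float()  # 1 = injected/attack traffic
loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
print(torch.sigmoid(logits[:3]).flatten())     # attack probabilities
```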
PMID:40258975 | DOI:10.1038/s41598-025-98433-x
Mitigating side-channel attacks on FPGA through deep learning and dynamic partial reconfiguration
Sci Rep. 2025 Apr 21;15(1):13745. doi: 10.1038/s41598-025-98473-3.
ABSTRACT
This paper introduces a framework that combines Deep Learning (DL) models and Dynamic Partial Reconfiguration (DPR) in Field Programmable Gate Arrays (FPGAs) to mitigate Side Channel Attacks (SCA). Traditional static defense mechanisms often fail to fully mitigate SCA because they cannot adapt dynamically to attacks. The proposed approach overcomes this limitation by adaptively reconfiguring the FPGA resources in real time, disrupting SCA patterns and reducing the effectiveness of potential attacks. A notable advantage of this approach is its ability to defend against side-channel attacks while the FPGA design is operational. The framework accomplishes this by reconfiguring the FPGA resources to optimize response times, achieving latency levels beyond the reach of traditional static defense mechanisms. In particular, this study concentrates on mitigating power side-channel attacks, highlighting the resilience of the DL-DPR integration. Beyond its demonstrated efficacy against power SCA, the proposed framework can be extended to other types of side-channel attacks, making it a promising solution for hardware security. The integration of DL models allows for sophisticated threat analysis, while DPR provides the flexibility to implement countermeasures dynamically. Experimental results show that the latency from detection to mitigation is within 20 clock cycles. This combination represents a paradigm shift in securing hardware systems, moving from reactive to proactive defense mechanisms. The framework's real-time adaptability ensures that it stays ahead of attackers, continuously evolving to neutralize new threats. The findings presented in this paper underscore the potential of combining Artificial Intelligence (AI) and FPGA technologies to redefine hardware security. By addressing detection and mitigation in a unified framework, the proposed methodology significantly enhances the resilience of FPGA designs and lays the groundwork for future research in adaptive security mechanisms.
PMID:40258964 | DOI:10.1038/s41598-025-98473-3
Using deep learning for estimation of time-since-injury in pediatric accidental fractures
Pediatr Radiol. 2025 Apr 22. doi: 10.1007/s00247-025-06223-4. Online ahead of print.
ABSTRACT
BACKGROUND: Estimating time-since-injury of healing fractures is imprecise, encompassing excessively wide timeframes. Most injured children are evaluated at non-children's hospitals, yet pediatric radiologists can disagree with up to one in six skeletal imaging interpretations from referring community hospitals. There is a need to improve image interpretation by considering additional methods for fracture dating.
OBJECTIVE: To train and validate deep learning models to correctly estimate the age of pediatric accidental long bone fractures.
MATERIALS AND METHODS: This secondary data analysis used radiographic images of accidental long bone fractures in children <6 years at the time of injury seen at a large Midwestern children's hospital between 2000 and 2016. We built deep learning models both to classify fracture images into different age groups and to directly estimate fracture age (time-since-injury). We used cross-validation to evaluate model performance across various metrics, including confusion matrices, sensitivity/specificity, and activation maps for age classification, and mean absolute error (MAE) and root mean squared error (RMSE) for age estimation.
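A minimal sketch of the evaluation protocol only: cross-validated MAE/RMSE plus a within-7-days rate for a fracture-age regressor, with a generic sklearn model and synthetic features standing in for the paper's deep network. In practice the folds should be grouped by patient (e.g. GroupKFold), since the cohort has multiple radiographs per patient.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error

rng = np.random.default_rng(0)
X = rng.random((200, 64))                 # placeholder image features
y = rng.uniform(0, 60, 200)               # fracture age in days (synthetic)

maes, rmses, within7 = [], [], []
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = GradientBoostingRegressor().fit(X[train], y[train])
    pred = model.predict(X[test])
    maes.append(mean_absolute_error(y[test], pred))
    rmses.append(np.sqrt(mean_squared_error(y[test], pred)))
    within7.append(np.mean(np.abs(pred - y[test]) <= 7))  # share within 7 days

print(f"MAE={np.mean(maes):.1f}d  RMSE={np.mean(rmses):.1f}d  "
      f"within-7-days={np.mean(within7):.0%}")
```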
RESULTS: Our study cohort contained 2,328 radiographs from 399 patients. Overall, our models performed above baselines for fracture age classification and estimation, both when trained/validated across all bones and on specific bone types. The best model was able to estimate fracture age for any long bone with an MAE of 6.2 days and with 68% of estimates falling within 7 days of the correct fracture age.
CONCLUSION: Our study successfully demonstrated that, for radiographic dating of accidental fractures of long bones, deep learning models can estimate time-since-injury with above-baseline accuracy.
PMID:40258953 | DOI:10.1007/s00247-025-06223-4
A novel deep learning approach to classify 3D foot types of diabetic patients
Sci Rep. 2025 Apr 22;15(1):13819. doi: 10.1038/s41598-025-98471-5.
ABSTRACT
Diabetes mellitus is a worldwide epidemic that leads to significant changes in foot shape, deformities, and ulcers. Precise classification of the diabetic foot not only helps identify foot abnormalities but also facilitates personalized treatment and preventive measures through the engineering design of foot orthoses. In this study, we propose a novel deep learning method based on DiffusionNet, which incorporates a self-attention mechanism and external features to classify the foot types of diabetic patients into six categories directly from simple 3D foot images. Our approach achieves a high accuracy of 82.9%, surpassing existing machine learning and deep learning methods. The proposed model offers a cost-effective way to analyse foot shapes and facilitate the customization process for both the footwear industry and medical applications.
PMID:40258927 | DOI:10.1038/s41598-025-98471-5
Bio inspired multi agent system for distributed power and interference management in MIMO OFDM networks
Sci Rep. 2025 Apr 21;15(1):13740. doi: 10.1038/s41598-025-97944-x.
ABSTRACT
MIMO-OFDM systems are essential for high-capacity wireless networks, offering improved data throughput and spectral efficiency necessary for dense user environments. Effective power and interference management are pivotal for maintaining signal quality and enhancing resource utilization. Existing techniques for resource allocation and interference control in massive MIMO-OFDM networks face challenges related to scalability, adaptability, and energy efficiency. To address these limitations, this work proposes a novel bio-inspired Termite Colony Optimization-based Multi-Agent System (TCO-MAS) integrated with an LSTM model for predictive adaptability. The deep learning LSTM model aids agents in forecasting future network conditions, enabling dynamic adjustment of pheromone levels for optimized power allocation and interference management. By simulating termite behavior, agents utilize pheromone-based feedback to achieve localized optimization decisions with minimal communication overhead. Experimental analyses evaluated the proposed TCO-MAS across key metrics such as Sum Rate, Energy Efficiency, Spectral Efficiency, Latency, and Fairness Index. Results demonstrate that TCO-MAS outperformed conventional algorithms, achieving a 20% higher sum rate and 15% better energy efficiency under high-load conditions. Limitations include dependency on specific pheromone adjustment parameters, which may require fine-tuning for diverse scenarios. Practical implications highlight its potential for scalable and adaptive deployment in ultra-dense wireless networks, though additional field testing is recommended to ensure robustness in varied real-world environments.
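A hedged sketch of the pheromone mechanics behind the TCO-MAS idea: each agent (e.g. a base station) keeps pheromone levels over discrete transmit-power levels, reinforces the level that maximized its local utility (rate minus an interference penalty), and evaporates the rest. The utility function, constants, and toy channel model are assumptions, not the paper's formulation, and the LSTM forecasting stage is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)
n_agents, n_levels = 4, 8
power_levels = np.linspace(0.1, 1.0, n_levels)
pheromone = np.ones((n_agents, n_levels))
rho, Q = 0.1, 1.0                         # evaporation rate, deposit constant

for step in range(100):
    # sample one power level per agent, proportional to its pheromone
    probs = pheromone / pheromone.sum(axis=1, keepdims=True)
    choice = np.array([rng.choice(n_levels, p=p) for p in probs])
    p = power_levels[choice]
    gain = rng.uniform(0.5, 1.5, n_agents)        # toy channel gains
    interference = p.sum() - p                    # total power of the others
    utility = np.log2(1 + gain * p / (0.1 + interference))
    # evaporate everywhere, deposit on the chosen level
    pheromone *= (1 - rho)
    pheromone[np.arange(n_agents), choice] += Q * utility

print("converged power levels:", power_levels[pheromone.argmax(axis=1)])
```

The appeal of this scheme for dense networks is that each agent only needs its own utility feedback, keeping communication overhead minimal.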
PMID:40258916 | DOI:10.1038/s41598-025-97944-x
Sharper insights: Adaptive ellipse-template for robust fovea localization in challenging retinal landscapes
Comput Biol Med. 2025 Apr 20;191:110125. doi: 10.1016/j.compbiomed.2025.110125. Online ahead of print.
ABSTRACT
Automated identification of retinal landmarks, particularly the fovea, is crucial for diagnosing diabetic retinopathy and other ocular diseases. However, accurate identification is challenging due to varying contrast, color irregularities, anatomical variation, and the presence of lesions near the macula in fundus images. Existing methods often struggle to maintain accuracy under these complex conditions, particularly when lesions obscure vital regions. To overcome these limitations, this paper introduces a novel adaptive ellipse-template-based approach for fovea localization, leveraging mathematical modeling of blood vessel (BV) trajectories and optic disc (OD) positioning. Unlike traditional fixed-template models, our method dynamically adjusts the ellipse parameters based on the OD diameter, ensuring a generalized and adaptable template. This flexibility enables consistent detection performance, even in challenging images with significant lesion interference. Extensive validation on ten publicly available databases, including MESSIDOR, DRIVE, DIARETDB0, DIARETDB1, HRF, IDRiD, HEIMED, ROC, GEI, and NETRALAYA, demonstrates a superior detection efficiency of 99.5%. Additionally, the method achieves a low mean Euclidean distance of 13.48 pixels, with a standard deviation of 15.5 pixels, between the actual and detected fovea locations, highlighting its precision and reliability. The proposed approach significantly outperforms conventional template-based and deep learning methods, particularly in lesion-rich and low-contrast conditions. It is computationally efficient, interpretable, and robust, making it a valuable tool for automated retinal image analysis in clinical settings.
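A minimal sketch of what an adaptive ellipse template can look like: an ellipse whose semi-axes scale with the measured OD diameter, centred at the clinically typical fovea location (roughly 2.5 disc diameters temporal to the OD). The scale factors here are illustrative assumptions, not the paper's fitted values.

```python
import numpy as np

def ellipse_template(od_center, od_diameter, temporal_sign=+1, n=360):
    """Return n points of a search ellipse scaled by the optic-disc diameter."""
    cx = od_center[0] + temporal_sign * 2.5 * od_diameter   # fovea-ward shift
    cy = od_center[1]
    a, b = 1.5 * od_diameter, 1.0 * od_diameter             # adaptive semi-axes
    t = np.linspace(0, 2 * np.pi, n)
    return np.stack([cx + a * np.cos(t), cy + b * np.sin(t)], axis=1)

pts = ellipse_template(od_center=(120, 240), od_diameter=60)
print(pts.shape)   # (360, 2): candidate search contour in pixel coordinates
```

Because every parameter is derived from the detected OD, the same template generalizes across images with different resolutions and fields of view, which is the flexibility the abstract claims over fixed templates.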
PMID:40258324 | DOI:10.1016/j.compbiomed.2025.110125
Advances in artificial intelligence for diabetes prediction: insights from a systematic literature review
Artif Intell Med. 2025 Apr 15;164:103132. doi: 10.1016/j.artmed.2025.103132. Online ahead of print.
ABSTRACT
Diabetes mellitus (DM), a prevalent metabolic disorder, has significant global health implications. The advent of machine learning (ML) has revolutionized the ability to predict and manage diabetes early, offering new avenues to mitigate its impact. This systematic review examined 53 articles on ML applications for diabetes prediction, focusing on datasets, algorithms, training methods, and evaluation metrics. Various datasets, such as the Singapore National Diabetic Retinopathy Screening Program, REPLACE-BG, National Health and Nutrition Examination Survey (NHANES), and Pima Indians Diabetes Database (PIDD), have been explored, highlighting their unique features and challenges, such as class imbalance. This review assesses the performance of various ML algorithms, such as Convolutional Neural Networks (CNN), Support Vector Machines (SVM), Logistic Regression, and XGBoost, for the prediction of diabetes outcomes from multiple datasets. In addition, it explores explainable AI (XAI) methods such as Grad-CAM, SHAP, and LIME, which improve the transparency and clinical interpretability of AI models in assessing diabetes risk and detecting diabetic retinopathy. Techniques such as cross-validation, data augmentation, and feature selection are discussed in terms of their influence on the versatility and robustness of the model. Evaluation techniques involving k-fold cross-validation, external validation, and performance indicators such as accuracy, area under the curve, sensitivity, and specificity are also presented. The findings highlight the usefulness of ML in addressing the challenges of diabetes prediction, the value of sourcing different data types, the need to make models explainable, and the need to keep models clinically relevant. This study highlights significant implications for healthcare professionals, policymakers, technology developers, patients, and researchers, advocating interdisciplinary collaboration and ethical considerations when implementing ML-based diabetes prediction models. By consolidating existing knowledge, this SLR outlines future research directions aimed at improving diagnostic accuracy, patient care, and healthcare efficiency through advanced ML applications. This comprehensive review contributes to the ongoing efforts to utilize artificial intelligence technology for a better prediction of diabetes, ultimately aiming to reduce the global burden of this widespread disease.
PMID:40258308 | DOI:10.1016/j.artmed.2025.103132
An interpretable artificial intelligence approach to differentiate between blastocysts with similar or same morphological grades
Hum Reprod. 2025 Apr 21:deaf066. doi: 10.1093/humrep/deaf066. Online ahead of print.
ABSTRACT
STUDY QUESTION: Can a quantitative method be developed to differentiate between blastocysts with similar or same inner cell mass (ICM) and trophectoderm (TE) grades, while also reflecting their potential for live birth?
SUMMARY ANSWER: We developed BlastScoringNet, an interpretable deep-learning model that quantifies blastocyst ICM and TE morphology with continuous scores, enabling finer differentiation between blastocysts with similar or same grades, with higher scores significantly correlating with higher live birth rates.
WHAT IS KNOWN ALREADY: While the Gardner grading system is widely used by embryologists worldwide, blastocysts with similar or the same ICM and TE grades pose challenges for embryologists in decision-making. Furthermore, human assessment is subjective and inconsistent in predicting which blastocysts have higher potential to result in live birth.
STUDY DESIGN, SIZE, DURATION: The study design consists of three main steps. First, BlastScoringNet was developed using a grading dataset of 2760 blastocysts with majority-voted Gardner grades. Second, the model was applied to a live birth dataset of 15 228 blastocysts with known live birth outcomes to generate blastocyst scores. Finally, the correlation between these scores and live birth outcomes was assessed. The blastocysts were collected from patients who underwent IVF treatments between 2016 and 2018. For the external application study, an additional grading dataset of 1455 blastocysts and a live birth dataset of 476 blastocysts were collected from patients who underwent IVF treatments between 2021 and 2023 at an external IVF institution.
PARTICIPANTS/MATERIALS, SETTING, METHODS: In this retrospective study, we developed BlastScoringNet, an interpretable deep-learning model that outputs an expansion degree grade and continuous scores quantifying a blastocyst's ICM morphology and TE morphology, based on the Gardner grading system. The continuous ICM and TE scores were calculated by weighting each base grade's predicted probability and summing the weighted probabilities. To represent each blastocyst's overall potential for live birth, we combined the ICM and TE scores using their odds ratios (ORs) for live birth. We further assessed the correlation between live birth rates and the ICM score, TE score, and the OR-combined score (adjusted for expansion degree) by applying BlastScoringNet to blastocysts with known live birth outcomes. To test its generalizability, we also applied BlastScoringNet to an external IVF institution, accounting for variations in imaging conditions, live birth rates, and embryologists' experience levels.
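A minimal sketch of the scoring arithmetic described above, with assumed numeric weights for the Gardner base grades (A > B > C) and assumed odds ratios; the paper's fitted values are not reproduced here.

```python
GRADE_WEIGHTS = {"A": 1.0, "B": 0.5, "C": 0.0}   # assumed grade encoding

def continuous_score(probs):
    """Weight each base grade's predicted probability and sum the weighted probabilities."""
    return sum(GRADE_WEIGHTS[g] * p for g, p in probs.items())

def combined_score(icm_score, te_score, or_icm=2.0, or_te=1.6):
    """Combine the ICM and TE scores using their (assumed) odds ratios for live birth."""
    return or_icm * icm_score + or_te * te_score

icm = continuous_score({"A": 0.6, "B": 0.3, "C": 0.1})   # 0.75
te  = continuous_score({"A": 0.2, "B": 0.5, "C": 0.3})   # 0.45
print(icm, te, combined_score(icm, te))
```

The point of the weighted sum is that two blastocysts both graded "B" can still receive different continuous scores when the model's probability mass leans toward A in one case and toward C in the other.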
MAIN RESULTS AND THE ROLE OF CHANCE: BlastScoringNet was developed using data from 2760 blastocysts with majority-voted grades for expansion degree, ICM, and TE. The model achieved mean area under the receiver operating characteristic curve values of 0.997 (SD 0.004) for expansion degree, 0.903 (SD 0.031) for ICM, and 0.943 (SD 0.040) for TE, based on predicted probabilities for each base grade. From these predicted probabilities, BlastScoringNet generated continuous ICM and TE scores, as well as expansion degree grades, for an additional 15 228 blastocysts with known live birth outcomes. Higher ICM and TE scores, along with their OR-combined scores, were significantly correlated with increased live birth rates (P < 0.0001). By fine-tuning, BlastScoringNet was applied to an external IVF institution, where higher OR-combined ICM and TE scores also significantly correlated with increased live birth rates (P = 0.00078), demonstrating consistent results across both institutions.
LIMITATIONS, REASONS FOR CAUTION: This study is limited by its retrospective nature. Further prospective randomized trials are required to confirm the clinical impact of BlastScoringNet in assisting embryologists in blastocyst selection.
WIDER IMPLICATIONS OF THE FINDINGS: BlastScoringNet provides an interpretable and quantitative method for evaluating blastocysts, aligned with the widely used Gardner grading system. Higher OR-combined ICM and TE scores, representing each blastocyst's overall potential for live birth, were significantly correlated with increased live birth rates. The model's demonstrated generalizability across two institutions further supports its clinical utility. These findings suggest that BlastScoringNet is a valuable tool for assisting embryologists in selecting blastocysts with the highest potential for live birth. The code and pre-trained models are publicly available to facilitate further research and widespread implementation.
STUDY FUNDING/COMPETING INTEREST(S): This work was supported by the Vector Institute and the Temerty Faculty of Medicine at the University of Toronto, Toronto, Ontario, Canada, via a Clinical AI Integration Grant, and the Natural Science Foundation of Hunan Province of China (2023JJ30714). The authors declare no competing interests.
TRIAL REGISTRATION NUMBER: N/A.
PMID:40258298 | DOI:10.1093/humrep/deaf066
Use of deep learning model for paediatric elbow radiograph binomial classification: initial experience, performance and lessons learnt
Singapore Med J. 2025 Apr 1;66(4):208-214. doi: 10.4103/singaporemedj.SMJ-2022-078. Epub 2023 Nov 29.
ABSTRACT
INTRODUCTION: In this study, we aimed to compare the performance of a convolutional neural network (CNN)-based deep learning model that was trained on a dataset of normal and abnormal paediatric elbow radiographs with that of paediatric emergency department (ED) physicians on a binomial classification task.
METHODS: A total of 1,314 paediatric elbow lateral radiographs (patient mean age 8.2 years) were retrospectively retrieved and classified based on annotation as normal or abnormal (with pathology). They were then randomly partitioned into a development set (993 images); first and second tuning (validation) sets (109 and 100 images, respectively); and a test set (112 images). An artificial intelligence (AI) model was trained on the development set using the EfficientNet B1 network architecture. Its performance on the test set was compared to that of five physicians (inter-rater agreement: fair). Performance of the AI model and the physician group was tested using the McNemar test.
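A hedged sketch of the paired comparison: McNemar's exact test on the 2x2 agreement table between AI and physician decisions over the same test images. The counts below are invented for illustration only.

```python
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

# rows: AI correct / AI wrong; columns: physicians correct / physicians wrong
table = np.array([[70, 20],
                  [10, 12]])
result = mcnemar(table, exact=True)   # binomial test on the discordant cells
print(f"discordant 20 vs 10, p = {result.pvalue:.3f}")
```

Only the discordant cells (one rater right, the other wrong) drive the test, which is why it suits paired classifier-versus-clinician comparisons on a shared test set.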
RESULTS: The accuracy of the AI model on the test set was 80.4% (95% confidence interval [CI] 71.8%-87.3%), and the area under the receiver operating characteristic curve (AUROC) was 0.872 (95% CI 0.831-0.947). The performance of the AI model vs. the physician group on the test set was: sensitivity 79.0% (95% CI: 68.4%-89.5%) vs. 64.9% (95% CI: 52.5%-77.3%; P = 0.088); and specificity 81.8% (95% CI: 71.6%-92.0%) vs. 87.3% (95% CI: 78.5%-96.1%; P = 0.439).
CONCLUSION: The AI model showed good AUROC values and higher sensitivity, although its difference from the physician group did not reach nominal significance (P = 0.088).
PMID:40258236 | DOI:10.4103/singaporemedj.SMJ-2022-078
NeuroPred-AIMP: Multimodal Deep Learning for Neuropeptide Prediction via Protein Language Modeling and Temporal Convolutional Networks
J Chem Inf Model. 2025 Apr 21. doi: 10.1021/acs.jcim.5c00444. Online ahead of print.
ABSTRACT
Neuropeptides are key signaling molecules that regulate fundamental physiological processes ranging from metabolism to cognitive function. However, their accurate identification is a major challenge due to sequence heterogeneity, obscured functional motifs, and limited experimentally validated data. Accurate identification of neuropeptides is critical for advancing neurological disease therapeutics and peptide-based drug design. Existing neuropeptide identification methods rely on manual features combined with traditional machine learning and struggle to capture the deep patterns in sequences. To address these limitations, we propose NeuroPred-AIMP (adaptive integrated multimodal predictor), an interpretable model that synergizes the global semantic representation of the protein language model (ESM) with the multiscale structural features of a temporal convolutional network (TCN). The model introduces an adaptive feature-fusion mechanism with residual enhancement to dynamically recalibrate feature contributions and achieve robust integration of evolutionary and local sequence information. Experimental results demonstrate that the proposed model achieves excellent overall performance on an independent test set, with an accuracy of 92.3% and an AUROC of 0.974. The model is also well balanced in identifying positive and negative samples, with a sensitivity of 92.6% and a specificity of 92.1%, a difference of less than 0.5 percentage points. These results confirm the effectiveness of the multimodal feature strategy for neuropeptide recognition.
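A hedged sketch of the two fusion ingredients named above: a dilated temporal convolution block over per-residue embeddings (e.g. precomputed ESM features) and a residual gate that adaptively recalibrates how much each modality contributes. The dimensions and the exact gating form are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TCNBlock(nn.Module):
    def __init__(self, ch, dilation):
        super().__init__()
        pad = dilation  # keeps sequence length for kernel_size=3
        self.net = nn.Sequential(
            nn.Conv1d(ch, ch, 3, padding=pad, dilation=dilation), nn.ReLU(),
            nn.Conv1d(ch, ch, 3, padding=pad, dilation=dilation),
        )
    def forward(self, x):
        return torch.relu(x + self.net(x))     # residual connection

class AdaptiveFusion(nn.Module):
    """Gated residual fusion of a global (ESM) and a local (TCN) feature vector."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
    def forward(self, global_feat, local_feat):
        g = self.gate(torch.cat([global_feat, local_feat], dim=-1))
        return global_feat + g * local_feat    # residual-enhanced blend

esm = torch.randn(8, 256, 100)                 # (batch, channels, residues)
local = TCNBlock(256, dilation=2)(esm).mean(dim=2)
fused = AdaptiveFusion(256)(esm.mean(dim=2), local)
print(fused.shape)                             # torch.Size([8, 256])
```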
PMID:40258183 | DOI:10.1021/acs.jcim.5c00444
Estimating oxygen uptake in simulated team sports using machine learning models and wearable sensor data: A pilot study
PLoS One. 2025 Apr 21;20(4):e0319760. doi: 10.1371/journal.pone.0319760. eCollection 2025.
ABSTRACT
Accurate assessment of training status in team sports is crucial for optimising performance and reducing injury risk. This pilot study investigates the feasibility of using machine learning (ML) models to estimate oxygen uptake (VO2) with wearable sensors during team sports activities. Six healthy male team sports athletes participated in the study. Data were collected using inertial measurement units (IMUs), heart rate monitors, and breathing rate sensors during incremental fitness tests. The performance of different ML models, including multiple linear regression (MLR), XGBoost, and deep learning models (LSTM, CNN, MLP), was compared using raw and engineered features from the IMU data. Results indicate that while LSTM models with raw IMU data provided the most accurate predictions (RMSE: 4.976, MAE: 3.698), MLR models remained competitive, especially with engineered features. Multi-sensor configurations, particularly those including sensors on the torso and limbs, enhanced prediction accuracy. The findings demonstrate the potential of ML models to monitor VO2 noninvasively in real time, offering valuable insights into the internal physiological demand during team sports activities.
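A minimal sketch of the MLR baseline under the study's metrics (RMSE, MAE), using synthetic stand-ins for the engineered IMU/heart-rate features and the VO2 target; the dataset size and feature set are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error

rng = np.random.default_rng(0)
X = rng.random((500, 12))                          # engineered sensor features
vo2 = 10 + 30 * X[:, 0] + rng.normal(0, 3, 500)    # synthetic oxygen uptake

X_tr, X_te, y_tr, y_te = train_test_split(X, vo2, test_size=0.3, random_state=0)
pred = LinearRegression().fit(X_tr, y_tr).predict(X_te)
print(f"MLR RMSE={np.sqrt(mean_squared_error(y_te, pred)):.3f} "
      f"MAE={mean_absolute_error(y_te, pred):.3f}")
```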
PMID:40258017 | DOI:10.1371/journal.pone.0319760
An Interventional Brain-Computer Interface for Long-Term EEG Collection and Motion Classification of a Quadruped Mammal
IEEE Trans Neural Syst Rehabil Eng. 2025 Apr 21;PP. doi: 10.1109/TNSRE.2025.3562922. Online ahead of print.
ABSTRACT
Brain-computer interfaces (BCIs) acquire electroencephalogram (EEG) signals that can help address postoperative motor dysfunction in stroke patients by discerning their motor intentions during significant movements. Traditionally, noninvasive BCIs have been constrained by limitations in their usage environments, whereas invasive BCIs damage neural tissue permanently. We therefore propose a novel interventional BCI, in which electrodes are implanted along the veins into the brain to acquire intracerebral EEG signals without an open craniotomy. We collected EEG signals from the primary motor cortex via the superior sagittal sinus of sheep during three different movements: lying down, standing, and walking. Data from the first three months were used to train the neural network, and data from the fourth month were used for validation. The deep learning model achieved an 86% accuracy rate in classifying motion states on the validation data. Furthermore, the power spectral density (PSD) results show that the signal power in the main frequency band did not decrease over a period of five months, demonstrating that the interventional BCI can effectively capture EEG signals over long periods of time.
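A hedged sketch of the PSD stability check described above: Welch band power in a band of interest, compared across months. The sampling rate, band edges, and synthetic signals are assumptions for illustration.

```python
import numpy as np
from scipy.signal import welch

FS = 1000                                     # Hz, assumed sampling rate

def band_power(eeg, lo=12.0, hi=30.0):        # e.g. beta band over motor cortex
    f, pxx = welch(eeg, fs=FS, nperseg=2 * FS)
    mask = (f >= lo) & (f <= hi)
    return np.trapz(pxx[mask], f[mask])       # integrate PSD over the band

month1 = np.random.randn(60 * FS)             # placeholder 60-s recordings
month5 = np.random.randn(60 * FS)
print(f"band-power ratio month5/month1: {band_power(month5) / band_power(month1):.2f}")
```

A ratio near 1 across months would support the claim that electrode signal quality is not degrading over time.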
PMID:40257874 | DOI:10.1109/TNSRE.2025.3562922
Fine extraction of multi-crop planting area based on deep learning with Sentinel-2 time-series data
Environ Sci Pollut Res Int. 2025 Apr 21. doi: 10.1007/s11356-025-36405-4. Online ahead of print.
ABSTRACT
Accurate and timely access to the spatial distribution of crops is crucial for sustainable agricultural development and food security. However, extracting multi-crop areas based on high-resolution time-series data and deep learning still faces challenges. Therefore, this study aims to provide an effective model for multi-crop classification using high-resolution remote sensing time-series data. We designed two deep learning models based on convolutional neural network-long short-term memory (CNN-LSTM) and bidirectional long short-term memory (Bi-LSTM). Monthly synthetic time series of the normalized difference vegetation index (NDVI) from Sentinel-2 data were used as input features to extract the multi-crop planting area in Shandong province's northwestern, southwestern, and eastern regions. The results showed that the deep learning models achieved higher accuracy than the random forest (RF) and extreme gradient boosting (XGBoost) models, with CNN-LSTM achieving the highest overall accuracy of 96.48%. At the county level, the coefficients of determination (R2) for the CNN-LSTM model were 0.91 for wheat, 0.88 for maize, and 0.73 for spring cotton. This study demonstrates that the CNN-LSTM model combined with monthly synthetic time-series NDVI provides a feasible approach for accurately mapping high-resolution multi-crop planting areas and contributes significantly to decision support and resource management in agricultural production.
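A hedged sketch of a CNN-LSTM per-pixel classifier over a 12-step monthly NDVI series: a 1-D convolution extracts local temporal patterns, an LSTM models the seasonal sequence, and a linear head outputs crop-class logits. The widths and class count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CNNLSTMCropClassifier(nn.Module):
    def __init__(self, n_classes=4, hidden=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=3, padding=1), nn.ReLU())
        self.lstm = nn.LSTM(32, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, ndvi):                    # ndvi: (batch, 12) monthly values
        x = self.conv(ndvi.unsqueeze(1))        # (batch, 32, 12)
        out, _ = self.lstm(x.transpose(1, 2))   # (batch, 12, hidden)
        return self.head(out[:, -1])            # crop-class logits

model = CNNLSTMCropClassifier()
logits = model(torch.rand(8, 12))               # 8 pixels, 12 monthly NDVI values
print(logits.argmax(dim=1))                     # predicted crop class per pixel
```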
PMID:40257731 | DOI:10.1007/s11356-025-36405-4
Ultrasound detection of nonalcoholic steatohepatitis using convolutional neural networks with dual-branch global-local feature fusion architecture
Med Biol Eng Comput. 2025 Apr 21. doi: 10.1007/s11517-025-03361-7. Online ahead of print.
ABSTRACT
Nonalcoholic steatohepatitis (NASH) is a contributing factor to liver cancer, with ultrasound B-mode imaging as the first-line diagnostic tool. This study applied deep learning to ultrasound B-scan images for NASH detection and introduced an ultrasound-specific data augmentation (USDA) technique with a dual-branch global-local feature fusion architecture (DG-LFFA) to improve model performance and adaptability across imaging conditions. A total of 137 participants were included. Ultrasound images underwent data augmentation (rotation and USDA) for training and testing convolutional neural networks: AlexNet, Inception V3, VGG16, VGG19, ResNet50, and DenseNet201. Gradient-weighted class activation mapping (Grad-CAM) analyzed model attention patterns, guiding the selection of the optimal backbone for DG-LFFA implementation. The models achieved testing accuracies of 0.81-0.83 with rotation-based data augmentation. Grad-CAM analysis showed that ResNet50 and DenseNet201 exhibited stronger liver attention. When USDA was used to simulate datasets from different imaging conditions, DG-LFFA (based on ResNet50 and DenseNet201) improved accuracy (from 0.79 to 0.84 and from 0.78 to 0.83), recall (from 0.72 to 0.81 and from 0.70 to 0.78), and F1 score (from 0.80 to 0.84 for both models). In conclusion, deep architectures (ResNet50 and DenseNet201) enable focused analysis of liver regions for NASH detection. Under USDA-simulated imaging variations, the proposed DG-LFFA framework further improves diagnostic performance.
PMID:40257712 | DOI:10.1007/s11517-025-03361-7
Early operative difficulty assessment in laparoscopic cholecystectomy via snapshot-centric video analysis
Int J Comput Assist Radiol Surg. 2025 Apr 21. doi: 10.1007/s11548-025-03372-7. Online ahead of print.
ABSTRACT
PURPOSE: Laparoscopic cholecystectomy (LC) operative difficulty (LCOD) is highly variable and influences outcomes. Despite extensive LC studies in surgical workflow analysis, limited efforts explore LCOD using intraoperative video data. Early recognition of LCOD could allow prompt review by expert surgeons, enhance operating room (OR) planning, and improve surgical outcomes.
METHODS: We propose the clinical task of early LCOD assessment using limited video observations. We design SurgPrOD, a deep learning model to assess LCOD by analyzing features from global and local temporal resolutions (snapshots) of the observed LC video. Also, we propose a novel snapshot-centric attention (SCA) module, acting across snapshots, to enhance LCOD prediction. We introduce the CholeScore dataset, featuring video-level LCOD labels to validate our method.
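A hedged sketch of attention acting across snapshots: each snapshot (one temporal resolution of the observed video) contributes a feature vector, and multi-head self-attention lets the snapshots exchange context before a difficulty grade is predicted. The dimensions, pooling, and grade count are assumptions, not the SurgPrOD implementation.

```python
import torch
import torch.nn as nn

class SnapshotAttentionHead(nn.Module):
    def __init__(self, dim=256, n_heads=4, n_grades=3):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Linear(dim, n_grades)

    def forward(self, snaps):                        # (batch, n_snapshots, dim)
        ctx, _ = self.attn(snaps, snaps, snaps)      # snapshots attend to each other
        fused = self.norm(snaps + ctx).mean(dim=1)   # pool over snapshots
        return self.head(fused)                      # difficulty-grade logits

snaps = torch.randn(2, 5, 256)                       # 5 global/local snapshots per video
print(SnapshotAttentionHead()(snaps).shape)          # torch.Size([2, 3])
```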
RESULTS: We evaluate SurgPrOD on 3 LCOD assessment scales in the CholeScore dataset. On our new metric assessing early and stable correct predictions, SurgPrOD surpasses baselines by at least 0.22 points. SurgPrOD improves over baselines by at least 9 and 5 percentage points in F1 score and top-1 accuracy, respectively, demonstrating its effectiveness in correct predictions.
CONCLUSION: We propose a new task for early LCOD assessment and a novel model, SurgPrOD, analyzing surgical video from global and local perspectives. Our results on the CholeScore dataset establish a new benchmark to study LCOD using intraoperative video data.
PMID:40257703 | DOI:10.1007/s11548-025-03372-7
Interpretable AI-assisted clinical decision making for treatment selection for brain metastases in radiation therapy
Med Phys. 2025 Apr 21. doi: 10.1002/mp.17844. Online ahead of print.
ABSTRACT
BACKGROUND: AI modeling of clinical decision-making (CDM) can improve the quality and efficiency of clinical practice or provide secondary-opinion consultations for patients with limited medical resources, helping to address healthcare disparities.
PURPOSE: In this study, we developed an interpretable AI model to select radiotherapy treatment options, that is, whole-brain radiation therapy (WBRT) versus stereotactic radiosurgery (SRS), for patients with brain metastases.
MATERIALS/METHODS: Data from a total of 232 patients with brain metastases treated by radiation therapy from 2018 to 2023 were obtained. CT/MR images with contoured target lesions and organs-at-risk (OARs) as well as non-image-based clinical parameters were extracted and digitized as inputs to the model. These parameters included (1) tumor size, shape, location, and proximity of lesions to OARs; (2) age; (3) the number of brain metastases; (4) Eastern Cooperative Oncology Group (ECOG) performance status; (5) presence of neurologic symptoms; (6) whether surgery was performed (either pre- or post-op RT); (7) newly diagnosed cancer with brain metastases (de-novo) versus re-treatment (either local or distant in the brain); (8) primary cancer histology; (9) presence of extracranial metastases; (10) extent of extracranial disease (progression vs. stable); and (11) receipt of systemic therapy. One vanilla and two interpretable 3D convolutional neural network (CNN) models were developed. The vanilla one-path model (VM-1) uses only images as input, while the two interpretable models use both images and clinical parameters as inputs with two (IM-2) and 11 (IM-11) independent paths, respectively. This novel design allowed the model to calculate a class activation score for each input to interpret its relative weighting and importance in decision-making. The actual radiotherapy treatment (WBRT or SRS) used for the patients served as the ground truth for model training. Model performance was assessed by stratified 10-fold cross-validation, with each fold consisting of 184 training, 24 validation, and 24 testing subjects.
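A hedged sketch of the multi-path idea: one small subnetwork per input (an image-feature branch plus one branch per clinical variable), with per-path activation magnitudes normalized into relative importances of the kind reported below. The 3D-CNN image branch is reduced to a linear placeholder, and the sizes and scoring rule are assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn

class MultiPathClassifier(nn.Module):
    def __init__(self, img_dim=128, n_clinical=10, hidden=16):
        super().__init__()
        self.img_path = nn.Sequential(nn.Linear(img_dim, hidden), nn.ReLU())
        self.clin_paths = nn.ModuleList(
            nn.Sequential(nn.Linear(1, hidden), nn.ReLU()) for _ in range(n_clinical))
        self.head = nn.Linear(hidden * (n_clinical + 1), 2)   # WBRT vs SRS logits

    def forward(self, img_feat, clinical):      # clinical: (batch, n_clinical)
        paths = [self.img_path(img_feat)]
        paths += [p(clinical[:, i:i + 1]) for i, p in enumerate(self.clin_paths)]
        # per-path activation score, normalized into relative input importances
        scores = torch.stack([p.abs().mean(dim=1) for p in paths], dim=1)
        weights = scores / scores.sum(dim=1, keepdim=True)
        return self.head(torch.cat(paths, dim=1)), weights

model = MultiPathClassifier()
logits, w = model(torch.randn(4, 128), torch.randn(4, 10))
print(logits.shape, w[0])   # class logits plus the relative input weightings
```

Keeping each input in its own path is what makes the per-input weighting readable: no hidden layer mixes the inputs before their contributions are scored.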
RESULT: A total of 232 brain metastases patients treated by WBRT or SRS were evaluated, including 80 WBRT and 152 SRS patients. Based on the images alone, the VM-1 model prescribed correctly for 143 (94%) SRS and 67 (84%) WBRT cases. Based on both images and clinical parameters, the IM-2 model prescribed correctly for 149 (98%) SRS and 74 (93%) WBRT cases. IM-11 provided the most interpretability with a relative weighting for each input as follows: CT image (59.5%), ECOG performance status (7.5%), re-treatment (5%), extracranial metastases (1.5%), number of brain metastases (9.5%), neurologic symptoms (3%), pre/post-surgery (2%), primary cancer histology (2%), age (1%), progressive extracranial disease (6%), and receipt of systemic therapy (4.5%), reflecting the importance of all these inputs in clinical decision-making.
CONCLUSION: Interpretable CNN models were successfully developed to use CT/MR images and non-image-based clinical parameters to predict the treatment selection between WBRT and SRS for brain metastases patients. The interpretability makes the model more transparent, carrying profound importance for the prospective integration of these models into routine clinical practice, particularly for informing real-time clinical decision-making.
PMID:40257121 | DOI:10.1002/mp.17844