Deep learning
Feature-Reinforced Strategy for Enhancing the Accuracy of Triboelectric Vibration Sensing Toward Mechanical Equipment Monitoring
Small. 2025 May 24:e2503997. doi: 10.1002/smll.202503997. Online ahead of print.
ABSTRACT
With the advancement of intelligent and refined manufacturing, the demand for vibration sensors in smart equipment has surged. Traditional commercial vibration sensors and triboelectric nanogenerator (TENG)-based sensors are limited to basic amplitude and frequency recognition, failing to address both self-powering and diagnostic needs due to inherent design constraints. To overcome these limitations, this study introduces a novel mechanism combining interface dipole energy and vacuum level optimization in triboelectric materials to explain charge generation and separation under vibration. A TENG device with a polydimethylsiloxane (PDMS)-encapsulated metal electrode is designed and developed, enabling precise recognition of equipment operating status through vibration waveform analysis. By optimizing the interface contact area and electron transfer capacity, the device achieves enhanced signal clarity and reveals subtler characteristics in the signal waveform. Furthermore, the integration of a deep learning algorithm enables high-resolution classification of vibration states with an accuracy of approximately 98.3%, achieving effective monitoring of the operating status of a jaw crusher and a vibrating screen. This work not only verifies the feasibility of designing a self-powered vibration sensor but also demonstrates its potential for real-time monitoring and diagnostic applications in smart equipment.
PMID:40411864 | DOI:10.1002/smll.202503997
Deep ensemble framework with Bayesian optimization for multi-lesion recognition in capsule endoscopy images
Med Biol Eng Comput. 2025 May 24. doi: 10.1007/s11517-025-03380-4. Online ahead of print.
ABSTRACT
To address the challenges posed by the large number of images acquired during wireless capsule endoscopy examinations and the fatigue-induced missed diagnoses and misdiagnoses that result, a deep ensemble framework is proposed, consisting of CA-EfficientNet-B0, ECA-RegNetY, and Swin Transformer as base learners. The ensemble model aims to automatically recognize four lesions in capsule endoscopy images: angioectasia, bleeding, erosions, and polyps. All three base learners employed transfer learning, with attention modules added to EfficientNet-B0 and RegNetY for optimization. The recognition outcomes from the three base learners were then combined and weighted to enable automatic recognition of multi-lesion images and normal images of the gastrointestinal (GI) tract, with the weights determined through Bayesian optimization. A total of 8358 images from 281 cases were collected at Shanghai East Hospital between 2017 and 2021; these images were organized and labeled by clinicians to verify the performance of the algorithm. The experimental results showed that the model achieved an accuracy of 84.31%, m-Precision of 88.60%, m-Recall of 79.36%, and m-F1-score of 81.08%. Compared to mainstream deep learning models, the ensemble model effectively improves the classification performance of GI diseases and can assist clinicians in making initial diagnoses.
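To make the weighted-combination step concrete, the sketch below fuses three base learners' class-probability outputs and tunes the fusion weights by Bayesian optimization. The use of scikit-optimize's gp_minimize, the validation-accuracy objective, and the simulated probability arrays are assumptions for illustration, not the authors' implementation.

```python
# Sketch: weight three base learners' softmax outputs and tune the weights with
# Bayesian optimization (scikit-optimize assumed); probabilities are simulated here.
import numpy as np
from skopt import gp_minimize
from sklearn.metrics import accuracy_score

def ensemble_predict(probs_list, weights):
    """Weighted average of per-model class-probability arrays of shape (n_samples, n_classes)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                   # normalize so weights sum to 1
    fused = sum(wi * p for wi, p in zip(w, probs_list))
    return fused.argmax(axis=1)

def make_objective(probs_list, y_val):
    # gp_minimize minimizes, so return the negative validation accuracy
    def objective(weights):
        return -accuracy_score(y_val, ensemble_predict(probs_list, weights))
    return objective

# Hypothetical validation-set softmax outputs of the three base learners
# (five classes: four lesion types plus normal) and the true labels.
rng = np.random.default_rng(0)
probs_list = [rng.dirichlet(np.ones(5), size=200) for _ in range(3)]
y_val = rng.integers(0, 5, size=200)

result = gp_minimize(make_objective(probs_list, y_val),
                     dimensions=[(0.05, 1.0)] * 3, n_calls=30, random_state=0)
print("optimized weights:", result.x, "validation accuracy:", -result.fun)
```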
PMID:40411689 | DOI:10.1007/s11517-025-03380-4
Deep learning reconstruction combined with contrast-enhancement boost in dual-low dose CT pulmonary angiography: a two-center prospective trial
Eur Radiol. 2025 May 24. doi: 10.1007/s00330-025-11681-3. Online ahead of print.
ABSTRACT
PURPOSE: To investigate whether the deep learning reconstruction (DLR) combined with contrast-enhancement-boost (CE-boost) technique can improve the diagnostic quality of CT pulmonary angiography (CTPA) at low radiation and contrast doses, compared with routine CTPA using hybrid iterative reconstruction (HIR).
MATERIALS AND METHODS: This prospective two-center study included 130 patients who underwent CTPA for suspected pulmonary embolism (PE). Patients were randomly divided into two groups: the routine CTPA group, reconstructed using HIR; and the dual-low dose CTPA group, reconstructed using HIR and DLR, each additionally combined with CE-boost to generate HIR-boost and DLR-boost images. Signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) of the pulmonary arteries were quantitatively assessed. Two experienced radiologists independently ranked the CT image sets (5, best; 1, worst) based on overall image noise and vascular contrast. Diagnostic performance for PE detection was calculated for each dataset.
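As a point of reference for the quantitative assessment mentioned above, here is a minimal sketch of the conventional ROI-based SNR and CNR definitions (mean attenuation over noise standard deviation); the HU values are hypothetical, not measurements from this study.

```python
# Sketch of ROI-based SNR/CNR definitions used for CTPA image-quality comparison;
# the ROI statistics below are hypothetical numbers, not study data.
import numpy as np

def snr(roi_mean_hu, noise_sd_hu):
    """Signal-to-noise ratio: mean vessel attenuation / image noise (SD)."""
    return roi_mean_hu / noise_sd_hu

def cnr(vessel_mean_hu, background_mean_hu, noise_sd_hu):
    """Contrast-to-noise ratio: (vessel - background) attenuation / image noise (SD)."""
    return (vessel_mean_hu - background_mean_hu) / noise_sd_hu

# Hypothetical pulmonary-artery ROI and paraspinal-muscle ROI measurements (HU)
pa_mean, muscle_mean, noise_sd = 420.0, 55.0, 12.5
print(f"SNR = {snr(pa_mean, noise_sd):.1f}, CNR = {cnr(pa_mean, muscle_mean, noise_sd):.1f}")
```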
RESULTS: Patient demographics were similar between groups. Compared to HIR images of the routine group, DLR-boost images of the dual-low dose group achieved significantly better qualitative scores (p < 0.001). The CT values of the pulmonary arteries were comparable between the DLR-boost and HIR images (p > 0.05), whereas the SNRs and CNRs of the pulmonary arteries in the DLR-boost images were the highest among all five datasets (p < 0.001). The AUCs of DLR, HIR-boost, and DLR-boost were 0.933, 0.924, and 0.986, respectively (all p > 0.05).
CONCLUSION: DLR combined with CE-boost technique can significantly improve the image quality of CTPA with reduced radiation and contrast doses, facilitating a more accurate diagnosis of pulmonary embolism.
KEY POINTS: Question The dual-low dose protocol is essential for detecting pulmonary emboli (PE) in follow-up CT pulmonary angiography (CTPA), yet effective solutions are still lacking. Findings Deep learning reconstruction (DLR)-boost with reduced radiation and contrast doses demonstrated higher quantitative and qualitative image quality than hybrid iterative reconstruction in routine CTPA. Clinical relevance A DLR-boost-based low-radiation, low-contrast-dose CTPA protocol offers a novel strategy to further enhance image quality and diagnostic accuracy for pulmonary embolism patients.
PMID:40411550 | DOI:10.1007/s00330-025-11681-3
Deep learning-based classification and segmentation of interictal epileptiform discharges using multichannel electroencephalography
Epilepsia. 2025 May 24. doi: 10.1111/epi.18463. Online ahead of print.
ABSTRACT
OBJECTIVE: This study was undertaken to develop a deep learning framework that can classify and segment interictal epileptiform discharges (IEDs) in multichannel electroencephalographic (EEG) recordings with high accuracy, preserving both spatial information and interchannel interactions.
METHODS: We proposed a novel deep learning framework, U-IEDNet, for detecting IEDs in multichannel EEG. The U-IEDNet framework employs convolutional layers and bidirectional gated recurrent units as a temporal encoder to extract temporal features from single-channel EEG, followed by the use of transformer networks as a spatial encoder to fuse multichannel features and extract interchannel interaction information. Transposed convolutional layers form a temporal decoder, creating a U-shaped architecture with the encoder. This upsamples features to estimate the probability of each EEG sampling point falling within the IED range, enabling segmentation of IEDs from background activity. Two datasets, a public database with 370 patient recordings and our own annotated database with 43 patient recordings, were used for model establishment and validation.
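For readers who want a concrete picture of the pipeline described above, the sketch below assembles the named components (a shared CNN-plus-BiGRU temporal encoder, a transformer spatial encoder over channels, and a transposed-convolution temporal decoder) in PyTorch. Layer widths, depths, and the 19-channel toy input are assumptions for illustration, not the published U-IEDNet configuration.

```python
# Simplified sketch of the U-IEDNet components described above; sizes are assumptions.
import torch
import torch.nn as nn

class UIEDNetSketch(nn.Module):
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        # Temporal encoder, shared across EEG channels (downsamples time by 4x)
        self.conv = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(32, d_model, kernel_size=7, stride=2, padding=3), nn.ReLU(),
        )
        self.bigru = nn.GRU(d_model, d_model // 2, batch_first=True, bidirectional=True)
        # Spatial encoder: self-attention over channels at each down-sampled time step
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=128,
                                           batch_first=True)
        self.spatial = nn.TransformerEncoder(layer, num_layers=2)
        # Temporal decoder: upsample back to the original sampling rate
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(d_model, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose1d(32, 1, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, x):                      # x: (batch, channels, samples), samples % 4 == 0
        b, c, t = x.shape
        h = self.conv(x.reshape(b * c, 1, t))             # (b*c, d, t/4)
        h, _ = self.bigru(h.transpose(1, 2))              # (b*c, t/4, d)
        t4, d = h.shape[1], h.shape[2]
        h = h.reshape(b, c, t4, d).permute(0, 2, 1, 3)    # (b, t/4, c, d)
        h = self.spatial(h.reshape(b * t4, c, d))         # attention across channels
        h = h.reshape(b, t4, c, d).permute(0, 2, 1, 3)    # (b, c, t/4, d)
        logits = self.decoder(h.reshape(b * c, t4, d).transpose(1, 2))  # (b*c, 1, t)
        return torch.sigmoid(logits).reshape(b, c, t)     # per-sample IED probability

probs = UIEDNetSketch()(torch.randn(2, 19, 256))          # 19-channel EEG, 256 samples
print(probs.shape)                                        # torch.Size([2, 19, 256])
```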
RESULTS: The results showed a clear advantage over existing methods. U-IEDNet achieved a recall of .916, precision of .911, F1-score of .912, and false positive rate (FPR) of .030 on the public database. The classification performance on our own annotated database achieved a recall of .905, a precision of .902, an F1-score of .903, and an FPR of .072. The segmentation performance had a recall of .903, a precision of .916, and an F1-score of .909. Additionally, this study analyzes attention weights in the transformer network based on brain network theory to elucidate the spatial feature fusion process, enhancing the interpretability of the IED detection model.
SIGNIFICANCE: In this paper, we aim to present an artificial intelligence-based toolbox for IED detection, which may facilitate epilepsy diagnosis at the bedside in the future. U-IEDNet demonstrates great potential to improve the accuracy and efficiency of IED detection in multichannel EEG recordings.
PMID:40411529 | DOI:10.1111/epi.18463
Prostate cancer prediction through a hybrid deep learning method applied to histopathological image
Expert Rev Anticancer Ther. 2025 May 24. doi: 10.1080/14737140.2025.2512040. Online ahead of print.
ABSTRACT
BACKGROUND: Prostate Cancer (PCa) is a severe disease that affects males globally. The Gleason grading system is a widely recognized method for diagnosing the aggressiveness of PCa using histopathological images. This system evaluates prostate tissue to determine the severity of the disease and guide treatment decisions. However, manual analysis of histopathological images requires highly skilled professionals and is time-consuming.
METHODS: To address these challenges, deep learning (DL) is utilized, as it has shown promising results in medical image analysis. Although numerous DL networks have been developed for Gleason grading, many existing methods have limitations such as suboptimal accuracy and high computational complexity. The proposed network integrates MobileNet, an Attention Mechanism (AM), and a capsule network. MobileNet efficiently extracts features from images while addressing computational complexity. The AM focuses on selecting the most relevant features, enhancing the accuracy of Gleason grading. Finally, the capsule network classifies the Gleason grades from histopathological images.
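The sketch below shows one way the described pipeline could be wired together: MobileNet features, a channel-attention block, and a classification head. A squeeze-and-excitation-style block stands in for the paper's attention mechanism and a plain linear layer stands in for the capsule network, so this is an illustrative approximation rather than the authors' model.

```python
# Illustrative sketch: MobileNet feature extractor -> channel attention -> classifier.
import torch
import torch.nn as nn
from torchvision import models

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                        # x: (batch, channels, h, w)
        w = self.fc(x.mean(dim=(2, 3)))          # squeeze: global average pooling
        return x * w[:, :, None, None]           # excite: reweight feature maps

class GleasonGraderSketch(nn.Module):
    def __init__(self, n_grades=5):
        super().__init__()
        # ImageNet weights would be loaded here for transfer learning in practice
        backbone = models.mobilenet_v2(weights=None)
        self.features = backbone.features         # outputs (batch, 1280, h/32, w/32)
        self.attention = ChannelAttention(1280)
        self.head = nn.Linear(1280, n_grades)     # stand-in for the capsule network

    def forward(self, x):
        f = self.attention(self.features(x))
        return self.head(f.mean(dim=(2, 3)))

logits = GleasonGraderSketch()(torch.randn(2, 3, 224, 224))
print(logits.shape)                                # torch.Size([2, 5])
```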
RESULTS: The proposed network was validated on two datasets, PANDA and Gleason-2019. Ablation studies were conducted to evaluate the contribution of each component of the proposed architecture. The results highlight the effectiveness of the proposed network.
CONCLUSIONS: The proposed network outperformed existing approaches, achieving an accuracy of 98.08% on the PANDA dataset and 97.07% on the Gleason-2019 dataset.
PMID:40411485 | DOI:10.1080/14737140.2025.2512040
deepTFBS: Improving within- and Cross-Species Prediction of Transcription Factor Binding Using Deep Multi-Task and Transfer Learning
Adv Sci (Weinh). 2025 May 24:e03135. doi: 10.1002/advs.202503135. Online ahead of print.
ABSTRACT
The precise prediction of transcription factor binding sites (TFBSs) is crucial for understanding gene regulation. In this study, deepTFBS is presented, a comprehensive deep learning (DL) framework that builds a robust DNA language model of TF binding grammar for accurately predicting TFBSs within and across plant species. Taking advantage of multi-task DL and transfer learning, deepTFBS is capable of leveraging the knowledge learned from large-scale TF binding profiles to enhance the prediction of TFBSs under small-sample training and cross-species prediction tasks. When tested using available information on 359 Arabidopsis TFs, deepTFBS outperformed previously described prediction strategies, including the position weight matrix, deepSEA, and DanQ, with improvements in the area under the precision-recall curve (PRAUC) of 244.49%, 49.15%, and 23.32%, respectively. Further cross-species prediction of TFBSs in wheat showed that deepTFBS yielded a significant PRAUC improvement of 30.6% over these three baseline models. deepTFBS can also utilize information from gene conservation and binding motifs, enabling efficient TFBS prediction in species where experimental data availability is limited. A case study focusing on the WUSCHEL (WUS) transcription factor illustrated the potential use of deepTFBS in cross-species applications, in our example between Arabidopsis and wheat. deepTFBS is publicly available at https://github.com/cma2015/deepTFBS.
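To illustrate the multi-task and transfer-learning ideas described above, the sketch below uses a shared sequence encoder with one binary head per transcription factor, then freezes the encoder and attaches a fresh head for a small-sample or cross-species task. Layer sizes and the fine-tuning step are illustrative assumptions, not the published deepTFBS architecture.

```python
# Sketch of multi-task TFBS prediction with a shared encoder and per-TF heads.
import torch
import torch.nn as nn

class MultiTaskTFBS(nn.Module):
    def __init__(self, n_tfs=359, seq_len=200):
        super().__init__()
        self.encoder = nn.Sequential(                # shared "binding grammar" encoder
            nn.Conv1d(4, 64, kernel_size=11, padding=5), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(64, 128, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1), nn.Flatten(),
        )
        self.heads = nn.Linear(128, n_tfs)           # one binding logit per TF

    def forward(self, onehot_dna):                   # (batch, 4, seq_len), A/C/G/T one-hot
        return self.heads(self.encoder(onehot_dna))

model = MultiTaskTFBS()
logits = model(torch.randn(8, 4, 200))
print(logits.shape)                                  # torch.Size([8, 359])

# Transfer learning for a new species or small-sample TF: keep the pretrained
# encoder (optionally frozen) and fit a fresh single-task head.
for p in model.encoder.parameters():
    p.requires_grad = False
new_head = nn.Linear(128, 1)                         # hypothetical wheat TF head
```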
PMID:40411397 | DOI:10.1002/advs.202503135
Individually optimized dynamic parallel transmit pulses for 3D high-resolution SPACE imaging at 7T
Magn Reson Med. 2025 May 24. doi: 10.1002/mrm.30565. Online ahead of print.
ABSTRACT
PURPOSE: Although clinical 7T MRI offers various advantages over lower field strengths, achieving spatially uniform flip angle distributions remains a challenge. Sampling Perfection with Application optimized Contrast using different flip angle Evolution (SPACE) sequences, which employ a long train of refocusing pulses with varying flip angles, pose a particular challenge in this regard. In this study, we investigate scalable dynamic parallel transmission (pTx) pulses to achieve homogeneous 3D high-resolution SPACE brain imaging at 7T.
METHODS: Non-parametrized and scalable dynamic pTx pulses were designed for excitation, refocusing, and inversion in SPACE sequences using fast online customization (FOCUS). First, a database of B0 and multi-channel B1+ maps was used to optimize universal pulses and parameters for flip angle homogeneity under strict specific absorption rate (SAR) constraints. During each new examination, B0 and B1+ maps were acquired as an additional calibration step, and the pTx pulses were tailored to the subject. For scalability, a symmetry condition was enforced. T1, T2, fluid-attenuated inversion recovery (FLAIR), and double inversion recovery (DIR) SPACE images were acquired in five healthy subjects at 7T using the proposed FOCUS pulses and conventional circularly polarized (CP) pulses for comparison.
RESULTS: Improved SNR and better image homogeneity were observed in every image acquired with FOCUS pulses in comparison to CP. Quantitative analysis showed a significant reduction in the coefficient of variation (COV) of image intensities in the cerebellum, a region notably affected by B1+ inhomogeneities, across all contrasts. FLAIR images, for example, exhibited a 46% COV reduction.
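The coefficient of variation used above is simply the standard deviation of the ROI intensities divided by their mean; the short sketch below shows the calculation with hypothetical values, not the study's measurements.

```python
# Sketch of the coefficient-of-variation (COV) comparison; intensity values are hypothetical.
import numpy as np

def cov(intensities):
    intensities = np.asarray(intensities, dtype=float)
    return intensities.std() / intensities.mean()

cerebellum_cp = np.array([520.0, 830.0, 610.0, 905.0, 700.0])     # CP-mode ROI intensities
cerebellum_focus = np.array([760.0, 810.0, 790.0, 845.0, 775.0])  # FOCUS-pulse ROI intensities
reduction = 1 - cov(cerebellum_focus) / cov(cerebellum_cp)
print(f"COV reduction: {reduction:.0%}")
```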
CONCLUSION: Individually optimized dynamic pTx pulses for 3D high-resolution SPACE imaging delivered clinically acceptable image homogeneity, enabling the application of widely used clinical contrasts at 7T.
PMID:40411368 | DOI:10.1002/mrm.30565
Harnessing deep learning for wheat variety classification: a convolutional neural network and transfer learning approach
J Sci Food Agric. 2025 May 24. doi: 10.1002/jsfa.14378. Online ahead of print.
ABSTRACT
BACKGROUND: Computer vision and image-based solutions are gaining traction as non-destructive food assessment methods because of the low cost of computational equipment. Previous research on wheat classification models has been based on limited data and far fewer classes than the number of available wheat varieties. To assess the applicability of convolutional neural network (CNN) models, the present study prepared multi-view images of 124 wheat varieties. Using deep learning (DL) methods, a four-layered CNN model was developed from scratch, and popular architectures (DenseNet201, MobileNet, and InceptionV3) were trained using transfer learning.
RESULTS: The proposed CNN, DenseNet201, MobileNet, and InceptionV3 models achieved classification accuracies of 95.40%, 92.41%, 90.54%, and 83.47%, respectively, which are promising results. Despite the challenges related to high computational resource demands, the newly proposed CNN model outperformed the pretrained models. It can be inferred that the multi-view, large-image dataset contributed significantly to the model's success in achieving promising accuracy in the challenging task of classifying 124 wheat varieties.
CONCLUSION: The present study recommends further fine-tuning of hyperparameters to improve the accuracy of the proposed CNN model and to identify better configurations. In addition, other popular models should be evaluated, and fine-tuning with specific early layers frozen should be performed to maximize accuracy. The image datasets used will also be made publicly available to allow researchers to explore new methodologies for classifying wheat varieties. © 2025 The Author(s). Journal of the Science of Food and Agriculture published by John Wiley & Sons Ltd on behalf of Society of Chemical Industry.
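A hedged sketch of the fine-tuning strategy suggested in the conclusion: load a pretrained backbone, freeze the early feature-extraction layers, and retrain the remaining layers on the 124-class wheat dataset. The split point, optimizer, and learning rate are assumptions for illustration.

```python
# Sketch of transfer learning with frozen early layers on a DenseNet201 backbone.
import torch.nn as nn
import torch.optim as optim
from torchvision import models

model = models.densenet201(weights="IMAGENET1K_V1")                # ImageNet-pretrained backbone
model.classifier = nn.Linear(model.classifier.in_features, 124)    # 124 wheat varieties

# Freeze roughly the first half of the feature blocks; later blocks stay trainable.
blocks = list(model.features.children())
for block in blocks[: len(blocks) // 2]:
    for p in block.parameters():
        p.requires_grad = False

trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.Adam(trainable, lr=1e-4)
criterion = nn.CrossEntropyLoss()
# ... a standard training loop over the multi-view wheat images would follow ...
```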
PMID:40411235 | DOI:10.1002/jsfa.14378
Multimodal attention fusion deep self-reconstruction presentation model for Alzheimer's disease diagnosis and biomarker identification
Artif Cells Nanomed Biotechnol. 2025 Dec;53(1):231-243. doi: 10.1080/21691401.2025.2506591. Epub 2025 May 23.
ABSTRACT
The unknown pathogenic mechanisms of Alzheimer's disease (AD) make treatment challenging. Neuroimaging genetics offers a way to identify disease biomarkers for early diagnosis, but traditional association analysis methods struggle with complex nonlinear, multimodal, and multi-expression data. Therefore, a multimodal attention fusion deep self-reconstruction presentation (MAFDSRP) model is proposed to solve this problem. First, multimodal brain imaging data are processed through a novel histogram-matching multiple attention mechanism to dynamically adjust the weight of each input brain image. Simultaneously, the genetic data are preprocessed to remove low-quality samples. Subsequently, the genetic data and fused neuroimaging data are separately input into the self-reconstruction network to learn nonlinear relationships and perform subspace clustering at the top layer of the network. Finally, the learned genetic data and fused neuroimaging data are analysed through expression association analysis to identify AD-related biomarkers. The identified biomarkers underwent systematic multi-level analysis, revealing their roles at the molecular, tissue, and functional levels and highlighting processes linked to AD such as inflammation, lipid metabolism, memory, and emotional processing. The experimental results show that MAFDSRP achieved 0.58 in the association analysis, demonstrating its great potential for accurately identifying AD-related biomarkers.
PMID:40411137 | DOI:10.1080/21691401.2025.2506591
Screening of oral potentially malignant disorders and oral cancer using deep learning models
Sci Rep. 2025 May 23;15(1):17949. doi: 10.1038/s41598-025-02802-5.
ABSTRACT
Oral cancer, though preventable, shows high mortality and affects overall quality of life when detected at late stages. Screening techniques that enable early diagnosis are the need of the hour. The present work aims to evaluate the effectiveness of AI screening tools for the diagnosis of oral potentially malignant disorders (OPMDs) and oral cancers via native or web-based (cloud) applications on smartphone devices. We trained and tested two deep learning models, DenseNet201 and FixCaps, using 518 images of the oral cavity. While DenseNet201 is a pre-trained model, we adapted the FixCaps model from the capsule network and trained it from the ground up. Standardized protocols were used to annotate and classify the lesions (suspicious vs. non-suspicious). In terms of model performance, DenseNet201 achieved an F1 score of 87.50% and an AUC of 0.97, while FixCaps exhibited an F1 score of 82.8% and an AUC of 0.93. The DenseNet201 model (20 M parameters) serves as a robust screening model (accuracy 88.6%) that can be hosted as a web application on cloud servers, while the adapted FixCaps model, with its low parameter count of 0.83 M, exhibits comparable accuracy (83.8%), allowing easy transition into a native phone-based screening application.
PMID:40410364 | DOI:10.1038/s41598-025-02802-5
Development and validation of a radiomics model using plain radiographs to predict spine fractures with posterior wall injury
Eur Spine J. 2025 May 23. doi: 10.1007/s00586-025-08948-0. Online ahead of print.
ABSTRACT
PURPOSE: When spine fractures involve posterior wall damage, they pose a heightened risk of instability, consequently influencing treatment strategies. To enhance early diagnosis and refine treatment planning for these fractures, we implemented a radiomics analysis using deep learning techniques, based on both anteroposterior and lateral plain X-ray images.
METHODS: Retrospective data were collected for 130 patients with spine fractures who underwent anteroposterior and lateral imaging at two centers (Center 1, training cohort; Center 2, validation cohort) between January 2010 and June 2024. The Vision Transformer (ViT) technique was employed to extract imaging features. The features selected through multiple methods were then used to construct machine learning models using Naive Bayes and a support vector machine (SVM). Model performance was evaluated using the area under the curve (AUC).
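An illustrative sketch of the described pipeline follows: extract deep features from the radiographs with a Vision Transformer, select a small subset, and feed them to an SVM. torchvision's vit_b_16 is used here as a generic ViT and univariate feature selection as a generic selector; the authors' exact ViT variant and selection methods are not reproduced, and the images and labels below are placeholders.

```python
# Sketch: ViT feature extraction from AP and lateral radiographs, then SVM classification.
import numpy as np
import torch
import torch.nn as nn
from torchvision import models
from sklearn.svm import SVC
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

vit = models.vit_b_16(weights="IMAGENET1K_V1")
vit.heads = nn.Identity()                        # keep the 768-dim [CLS] embedding
vit.eval()

@torch.no_grad()
def extract_features(batch):                     # batch: (n, 3, 224, 224) radiograph crops
    return vit(batch).numpy()

# Hypothetical anteroposterior and lateral image tensors for the same patients.
X_ap, X_lat = torch.randn(20, 3, 224, 224), torch.randn(20, 3, 224, 224)
y = np.array([0, 1] * 10)                        # posterior wall injury: no / yes
features = np.hstack([extract_features(X_ap), extract_features(X_lat)])

clf = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=12), SVC(probability=True))
clf.fit(features, y)
```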
RESULTS: Twelve features were selected to form the deep learning feature set. The SVM model using a combination of anteroposterior and lateral plain images showed good performance in both centers, with a high AUC for predicting spine fractures with posterior wall injury (Center 1, AUC: 0.909, 95% CI: 0.763-1.000; Center 2, AUC: 0.837, 95% CI: 0.678-0.996). The SVM model based on the combined images outperformed both the models built from either view alone and a spine surgeon with 3 years of clinical experience in classification performance.
CONCLUSIONS: Our study demonstrates that a radiomic model created by integrating anteroposterior and lateral plain X-ray images of the spine can more effectively predict spine fractures with posterior wall injury, aiding clinicians in making accurate diagnoses and treatment decisions.
PMID:40410361 | DOI:10.1007/s00586-025-08948-0
Efficient adaptation of deep neural networks for semantic segmentation in space applications
Sci Rep. 2025 May 23;15(1):18046. doi: 10.1038/s41598-025-99192-5.
ABSTRACT
In recent years, the application of deep learning techniques has shown remarkable success in various computer vision tasks, paving the way for their deployment in extraterrestrial exploration. Transfer learning has emerged as a powerful strategy for addressing the scarcity of labeled data in these novel environments. This paper represents one of the first efforts to evaluate the feasibility of employing adapters for efficient transfer learning for rock segmentation in extraterrestrial landscapes, focusing mainly on lunar and Martian terrains. Our work suggests that adapters, strategically integrated into a pre-trained backbone model, can successfully reduce both bandwidth and memory requirements for the target extraterrestrial device. In this study, we considered two memory-saving strategies: layer fusion (to reduce the inference overhead to zero) and "adapter ranking" (to also reduce the transmission cost). Finally, we evaluate these results in terms of task performance, memory, and computation on embedded devices, evidencing trade-offs that open the road to further research in the field. The code will be open-sourced upon acceptance of the article.
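A minimal sketch of the adapter idea evaluated in the paper is shown below: a small bottleneck module appended to a frozen backbone block, so only the adapter weights need to be trained and transmitted to the target device. The module sizes, placement, and the toy backbone stage are illustrative assumptions.

```python
# Sketch of bottleneck adapters attached to frozen backbone blocks.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter with a residual connection: down-project, nonlinearity, up-project."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.down = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.up = nn.Conv2d(channels // reduction, channels, kernel_size=1)
        self.act = nn.ReLU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))

class AdaptedBlock(nn.Module):
    """Wraps a frozen pretrained block and appends a trainable adapter."""
    def __init__(self, block, channels):
        super().__init__()
        for p in block.parameters():
            p.requires_grad = False                  # backbone stays frozen
        self.block, self.adapter = block, Adapter(channels)

    def forward(self, x):
        return self.adapter(self.block(x))

# Example: adapt one convolutional stage of a hypothetical segmentation backbone.
stage = nn.Sequential(nn.Conv2d(64, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU())
adapted = AdaptedBlock(stage, channels=64)
out = adapted(torch.randn(1, 64, 128, 128))
trainable = sum(p.numel() for p in adapted.parameters() if p.requires_grad)
print(out.shape, "trainable params:", trainable)     # only the adapter's parameters train
```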
PMID:40410339 | DOI:10.1038/s41598-025-99192-5
End-to-end prognostication in pancreatic cancer by multimodal deep learning: a retrospective, multicenter study
Eur Radiol. 2025 May 23. doi: 10.1007/s00330-025-11694-y. Online ahead of print.
ABSTRACT
OBJECTIVES: Pancreatic cancer treatment plans involving surgery and/or chemotherapy are highly dependent on disease stage. However, current staging systems are ineffective and poorly correlated with survival outcomes. We investigate how artificial intelligence (AI) can enhance prognostic accuracy in pancreatic cancer by integrating multiple data sources.
MATERIALS AND METHODS: Patients with histopathology and/or radiology/follow-up confirmed pancreatic ductal adenocarcinoma (PDAC) from a Dutch center (2004-2023) were included in the development cohort. Two additional PDAC cohorts from a Dutch and Spanish center were used for external validation. Prognostic models including clinical variables, contrast-enhanced CT images, and a combination of both were developed to predict high-risk short-term survival. All models were trained using five-fold cross-validation and assessed by the area under the time-dependent receiver operating characteristic curve (AUC).
RESULTS: The models were developed on 401 patients (203 females, 198 males; median overall survival (OS) = 347 days, IQR: 171-585), with 98 (24.4%) short-term survivors (OS < 230 days) and 303 (75.6%) long-term survivors. The external validation cohorts included 361 patients (165 females, 138 males; median OS = 404 days, IQR: 173-736), with 110 (30.5%) short-term survivors and 251 (69.5%) long-term survivors. The best AUC for predicting short- vs. long-term survival was achieved with the multimodal model (AUC = 0.637 (95% CI: 0.500-0.774)) in the internal validation set. External validation showed AUCs of 0.571 (95% CI: 0.453-0.689) and 0.675 (95% CI: 0.593-0.757).
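The sketch below illustrates the evaluation setup described above for the clinical-variable arm: five-fold cross-validation of a binary short- vs. long-term survival classifier scored by ROC AUC. Feature names, the logistic-regression choice, and the simulated data are hypothetical; the study's models also ingest contrast-enhanced CT images and report a time-dependent AUC, which a plain binary AUC only approximates here.

```python
# Sketch: five-fold cross-validated AUC for short- vs long-term survival prediction.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(401, 8))                 # hypothetical clinical variables (age, CA19-9, ...)
y = (rng.random(401) < 0.244).astype(int)     # 1 = short-term survivor (OS < 230 days)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
aucs = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print(f"cross-validated AUC: {aucs.mean():.3f} +/- {aucs.std():.3f}")
```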
CONCLUSION: Multimodal AI can predict long vs. short-term survival in PDAC patients, showing potential as a prognostic tool in clinical decision-making.
KEY POINTS: Question Prognostic tools for pancreatic ductal adenocarcinoma (PDAC) remain limited, with TNM staging offering suboptimal accuracy in predicting patient survival outcomes. Findings The multimodal AI model demonstrated improved prognostic performance over TNM and unimodal models for predicting short- and long-term survival in PDAC patients. Clinical relevance Multimodal AI provides enhanced prognostic accuracy compared to current staging systems, potentially improving clinical decision-making and personalized management strategies for PDAC patients.
PMID:40410330 | DOI:10.1007/s00330-025-11694-y
Automated depression detection via cloud based EEG analysis with transfer learning and synchrosqueezed wavelet transform
Sci Rep. 2025 May 23;15(1):18008. doi: 10.1038/s41598-025-02452-7.
ABSTRACT
Post-COVID-19, depression rates have risen sharply, increasing the need for early diagnosis using electroencephalography (EEG) and deep learning. To tackle this, we developed a cloud-based computer-aided depression diagnostic (CCADD) system that utilizes EEG signals from local databases. This system was optimized through a series of experiments to identify the most accurate model. The experiments employed a pre-trained convolutional neural network, ResNet18, fine-tuned on time-frequency synchrosqueezed wavelet transform (SSWT) images derived from the EEG signals. Various data augmentation methods, including image processing techniques and added noise, were applied to identify the best model for CCADD. To allow the device to operate with a minimal number of electrodes, we aimed to balance high accuracy against electrode count. Two public databases were evaluated using this approach. Dataset I included 31 individuals diagnosed with major depressive disorder and a control group of 27 age-matched healthy subjects. Dataset II comprised 90 participants, 45 diagnosed with depression and 45 healthy controls. Leave-subjects-out cross-validation with 20 subjects was used to validate the proposed method. The highest average accuracies for the selected model are 98%, 97%, 91%, and 88% for the parietal and central lobes in Databases I and II, respectively. The corresponding highest F-scores are 96.27%, 94.87%, 90.56%, and 89.65%. The highest cross-database accuracy and F1-score are 75.10% and 73.56%, obtained when training with SSWT images from Database II and testing with parietal images from Database I. This study introduces a novel cloud-based model for depression detection, paving the way for effective diagnostic tools and potentially revolutionizing depression management.
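The key evaluation detail above is subject-wise ("leave-subjects-out") cross-validation, which keeps all images from one subject in either the training or the test fold. The sketch below shows that splitting principle with scikit-learn's GroupKFold; the ResNet18 fine-tuning on SSWT images is replaced by placeholder features and a generic classifier, so this illustrates only the validation scheme, not the study's model.

```python
# Sketch: subject-wise cross-validation so no subject's images leak between folds.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)
n_subjects, images_per_subject = 58, 10
groups = np.repeat(np.arange(n_subjects), images_per_subject)      # subject ID per image
y = np.repeat(rng.integers(0, 2, n_subjects), images_per_subject)  # depressed vs. control
X = rng.normal(size=(len(y), 64))       # stand-in for embeddings of SSWT images

scores = []
for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups):
    clf = RandomForestClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    scores.append(accuracy_score(y[test_idx], clf.predict(X[test_idx])))
print("subject-wise fold accuracies:", np.round(scores, 3))
```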
PMID:40410314 | DOI:10.1038/s41598-025-02452-7
Multimodal ultrasound-based radiomics and deep learning for differential diagnosis of O-RADS 4-5 adnexal masses
Cancer Imaging. 2025 May 23;25(1):64. doi: 10.1186/s40644-025-00883-z.
ABSTRACT
BACKGROUND: Accurate differentiation between benign and malignant adnexal masses is crucial for patients to avoid unnecessary surgical interventions. Ultrasound (US) is the most widely utilized diagnostic and screening tool for gynecological diseases, with contrast-enhanced US (CEUS) offering enhanced diagnostic precision by clearly delineating blood flow within lesions. According to the Ovarian and Adnexal Reporting and Data System (O-RADS), masses classified as categories 4 and 5 carry the highest risk of malignancy. However, the diagnostic accuracy of US remains heavily reliant on the expertise and subjective interpretation of radiologists. Radiomics has demonstrated significant value in tumor differential diagnosis by extracting microscopic information imperceptible to the human eye. Despite this, no studies to date have explored the application of CEUS-based radiomics for differentiating adnexal masses. This study aims to develop and validate a multimodal US-based nomogram that integrates clinical variables, radiomics, and deep learning (DL) features to effectively distinguish adnexal masses classified as O-RADS 4-5.
METHODS: From November 2020 to March 2024, we enrolled 340 patients who underwent two-dimensional US (2DUS) and CEUS and had masses categorized as O-RADS 4-5. These patients were randomly divided into a training cohort and a test cohort in a 7:3 ratio. Adnexal masses were manually segmented from 2DUS and CEUS images. Using machine learning (ML) and DL techniques, five models were developed and validated to differentiate adnexal masses. The diagnostic performance of these models was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity, specificity, precision, and F1-score. Additionally, a nomogram was constructed to visualize outcome measures.
RESULTS: The CEUS-based radiomics model outperformed the 2DUS model (AUC: 0.826 vs. 0.737). Similarly, the CEUS-based DL model surpassed the 2DUS model (AUC: 0.823 vs. 0.793). The ensemble model combining clinical variables, radiomics, and DL features achieved the highest AUC (0.929).
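The late-fusion idea behind the ensemble model above can be sketched as follows: concatenate clinical variables, radiomics features, and deep-learning features, then fit a single classifier whose coefficients could be rendered as a nomogram. The feature blocks, their sizes, and the logistic-regression choice are illustrative assumptions on simulated data.

```python
# Sketch: late fusion of clinical, radiomics, and DL feature blocks for O-RADS 4-5 masses.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 340
clinical = rng.normal(size=(n, 4))        # hypothetical clinical variables (age, CA-125, ...)
radiomics_2d = rng.normal(size=(n, 20))   # 2DUS radiomics features
radiomics_ceus = rng.normal(size=(n, 20)) # CEUS radiomics features
dl_features = rng.normal(size=(n, 32))    # deep-learning features
y = (rng.random(n) < 0.5).astype(int)     # benign (0) vs. malignant (1)

X = np.hstack([clinical, radiomics_2d, radiomics_ceus, dl_features])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_tr, y_tr)
print("test AUC:", round(roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]), 3))
```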
CONCLUSIONS: Our study confirms the effectiveness of CEUS-based radiomics for distinguishing adnexal masses: the multimodal US-based radiomics and DL nomogram achieved high accuracy and specificity. This approach holds significant promise for improving the diagnostic precision of adnexal masses classified as O-RADS 4-5.
PMID:40410823 | DOI:10.1186/s40644-025-00883-z
A deep learning model integrating domain-specific features for enhanced glaucoma diagnosis
BMC Med Inform Decis Mak. 2025 May 23;25(1):195. doi: 10.1186/s12911-025-02925-9.
ABSTRACT
Glaucoma is a group of serious eye diseases that can cause incurable blindness. Despite the critical need for early detection, over 60% of cases remain undiagnosed, especially in less developed regions. Glaucoma diagnosis is costly, and several models have been proposed to automate diagnosis from images of the retina, specifically the area known as the optic cup and the associated disc where retinal blood vessels and nerves enter and leave the eye. However, diagnosis is complicated because both normal and glaucoma-affected eyes can vary greatly in appearance. Some normal cases, like glaucoma, exhibit a larger cup-to-disc ratio, one of the main diagnostic criteria, making it challenging to distinguish between them. We propose a deep learning model with domain features (DLMDF) that combines unstructured and structured features to distinguish between glaucoma and physiologic large cups. The structured features were based upon the known cup-to-disc ratios of the four quadrants of the optic disc in normal, physiologic large-cup, and glaucomatous eyes. We segmented each cup and disc using a fully convolutional neural network and then calculated the cup size, disc size, and cup-to-disc ratio of each quadrant. The unstructured features were learned by a deep convolutional neural network. The average precision (AP) was 98.52% for disc segmentation and 98.57% for cup segmentation. These relatively high AP values enabled us to calculate 15 reliable features from each segmented disc and cup. In classification tasks, the DLMDF outperformed other models, achieving superior accuracy, precision, and recall. These results validate the effectiveness of combining deep learning-derived features with domain-specific structured features, underscoring the potential of this approach to advance glaucoma diagnosis.
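The structured-feature computation described above can be sketched as follows: given binary cup and disc segmentation masks, measure cup area, disc area, and cup-to-disc area ratio in each quadrant around the disc centre. The angular quadrant definition and the toy concentric masks are assumptions for illustration, and nasal/temporal labels would depend on eye laterality.

```python
# Sketch: per-quadrant cup size, disc size, and cup-to-disc ratio from binary masks.
import numpy as np

def quadrant_cdr_features(cup_mask, disc_mask):
    """Return per-quadrant (cup_area, disc_area, cup_to_disc_ratio) from boolean masks."""
    ys, xs = np.nonzero(disc_mask)
    cy, cx = ys.mean(), xs.mean()                          # disc centre
    rows = np.arange(disc_mask.shape[0])[:, None]
    cols = np.arange(disc_mask.shape[1])[None, :]
    angles = np.degrees(np.arctan2(cy - rows, cols - cx))  # 0 deg = temporal for a right eye (assumed)
    quadrants = {
        "temporal": (angles >= -45) & (angles < 45),
        "superior": (angles >= 45) & (angles < 135),
        "nasal": (angles >= 135) | (angles < -135),
        "inferior": (angles >= -135) & (angles < -45),
    }
    features = {}
    for name, region in quadrants.items():
        cup_area = float(np.sum(cup_mask & region))
        disc_area = float(np.sum(disc_mask & region))
        features[name] = (cup_area, disc_area, cup_area / disc_area if disc_area else 0.0)
    return features

# Toy example: concentric circular disc (radius 40) and cup (radius 18).
yy, xx = np.mgrid[0:128, 0:128]
disc = ((yy - 64) ** 2 + (xx - 64) ** 2) <= 40 ** 2
cup = ((yy - 64) ** 2 + (xx - 64) ** 2) <= 18 ** 2
print(quadrant_cdr_features(cup, disc))
```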
PMID:40410768 | DOI:10.1186/s12911-025-02925-9
Artificial intelligence automated measurements of spinopelvic parameters in adult spinal deformity-a systematic review
Spine Deform. 2025 May 23. doi: 10.1007/s43390-025-01111-1. Online ahead of print.
ABSTRACT
PURPOSE: This review evaluates advances made in deep learning (DL) applications to automatic spinopelvic parameter estimation, comparing their accuracy to manual measurements performed by surgeons.
METHODS: The PubMed database was queried for studies on DL measurement of adult spinopelvic parameters between 2014 and 2024. Studies were excluded if they focused on pediatric patients, non-deformity-related conditions, non-human subjects, or if they lacked sufficient quantitative data comparing DL models to human measurements. Included studies were assessed based on model architecture, patient demographics, training, validation, testing methods, and sample sizes, as well as performance compared to manual methods.
RESULTS: Of 442 screened articles, 16 were included, with sample sizes ranging from 15 to 9,832 radiographs and reported intraclass correlation coefficients (ICCs) of 0.56 to 1.00. Measurements of pelvic tilt, pelvic incidence, T4-T12 kyphosis, L1-L4 lordosis, and sagittal vertical axis (SVA) showed consistently high ICCs (>0.80) and low mean absolute deviations (MADs <6°), with a substantial number of studies reporting pelvic tilt ICCs of 0.90 or greater. In contrast, T1-T12 kyphosis and L4-S1 lordosis exhibited lower ICCs and higher measurement errors. Overall, most DL models demonstrated strong correlations (>0.80) with clinician measurements and minimal differences from manual references, except for T1-T12 kyphosis (average Pearson correlation: 0.68), L1-L4 lordosis (average Pearson correlation: 0.75), and L4-S1 lordosis (average Pearson correlation: 0.65).
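Two of the agreement metrics summarised above, Pearson correlation and mean absolute deviation (MAD) between automated and manual angle measurements, are shown in the short sketch below; the paired pelvic-tilt values are hypothetical, not data from the included studies.

```python
# Sketch: agreement between manual and automated spinopelvic measurements.
import numpy as np
from scipy.stats import pearsonr

manual = np.array([12.5, 18.0, 22.3, 9.8, 30.1, 15.6, 25.4, 11.2])   # surgeon, degrees
auto = np.array([13.1, 17.2, 23.0, 10.5, 29.0, 16.4, 24.1, 12.0])    # DL model, degrees

r, p = pearsonr(manual, auto)
mad = np.mean(np.abs(manual - auto))
print(f"Pearson r = {r:.2f} (p = {p:.3f}), MAD = {mad:.1f} deg")
```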
CONCLUSION: Novel computer vision algorithms show promising accuracy in measuring spinopelvic parameters, comparable to manual surgeon measurements. Future research should focus on external validation, additional imaging modalities, and the feasibility of integration in clinical settings to assess model reliability and predictive capacity.
PMID:40410653 | DOI:10.1007/s43390-025-01111-1
Evaluation of a deep-learning segmentation model for patients with colorectal cancer liver metastases (COALA) in the radiological workflow
Insights Imaging. 2025 May 23;16(1):110. doi: 10.1186/s13244-025-01984-w.
ABSTRACT
OBJECTIVES: For patients with colorectal liver metastases (CRLM), total tumor volume (TTV) is prognostic. A deep-learning segmentation model for CRLM to assess TTV called COlorectal cAncer Liver metastases Assessment (COALA) has been developed. This study evaluated COALA's performance and practical utility in the radiological picture archiving and communication system (PACS). A secondary aim was to provide lessons for future researchers on the implementation of artificial intelligence (AI) models.
METHODS: Patients discussed between January and December 2023 in a multidisciplinary meeting for CRLM were included. In those patients, CRLM was automatically segmented in portal-venous phase CT scans by COALA and integrated with PACS. Eight expert abdominal radiologists completed a questionnaire addressing segmentation accuracy and PACS integration. They were also asked to write down general remarks.
RESULTS: In total, 57 patients were evaluated, for whom 112 contrast-enhanced portal-venous phase CT scans were analyzed. Of the eight radiologists, six (75%) rated the model as user-friendly in their radiological workflow. Areas for improvement of the COALA model were the segmentation of small lesions, heterogeneous lesions, and lesions at the border of the liver with involvement of the diaphragm or heart. Key lessons for implementation were a multidisciplinary approach, a robust methodology prior to model development, and evaluation sessions with end-users early in the development phase.
CONCLUSION: This study demonstrates that the deep-learning segmentation model for patients with CRLM (COALA) is user-friendly in the radiologist's PACS. Future researchers striving for implementation should have a multidisciplinary approach, propose a robust methodology and involve end-users prior to model development.
CRITICAL RELEVANCE STATEMENT: Many segmentation models are being developed, but none of those models are evaluated in the (radiological) workflow or clinically implemented. Our model is implemented in the radiological work system, providing valuable lessons for researchers to achieve clinical implementation.
KEY POINTS: Developed segmentation models should be implemented in the radiological workflow. Our implemented segmentation model provides valuable lessons for future researchers. If implemented in clinical practice, our model could allow for objective radiological evaluation.
PMID:40410643 | DOI:10.1186/s13244-025-01984-w
Facial emotion based smartphone addiction detection and prevention using deep learning and video based learning
Sci Rep. 2025 May 23;15(1):18025. doi: 10.1038/s41598-025-99681-7.
ABSTRACT
Smartphone addiction among students has emerged as a critical issue, negatively impacting their academic performance, emotional well-being, and social behavior. This paper introduces the Theory of Mind integrated with Video Modelling (TMVM) framework, a novel deep learning-based approach aimed at recognizing and mitigating smartphone addiction. The TMVM framework leverages Theory of Mind AI to analyze students' facial emotions via smartphone cameras while they watch videos. Based on detected emotions such as happiness, sadness, or anger, the system dynamically shuffles motivational videos using algorithms such as the Fisher-Yates and Durstenfeld shuffling techniques to promote behavioral change. The framework also incorporates Behavior Parameter (BHP) evaluation, grounded in the Social Identity Model of Deindividuation Effects (SIDE) theory, to assess key behavioral metrics such as social identity, self-awareness, anonymity, responsibility, and accountability. Additionally, facial emotion detection algorithms tuned with MnasNet-Teaching Learning Based Optimization (TLBO) and Convolutional Neural Network (CNN)-Cuckoo Search Optimization (CSO) are employed for accurate emotion recognition. Experimental results demonstrate significant improvements in students' behavior and reductions in smartphone usage post-intervention. The TMVM system achieves high accuracy in emotion detection and behavioral outcome prediction while fostering engagement in school and social activities. The TMVM method was tested on 750 students with low BHP scores, and their behavioral parameters were evaluated. After the TMVM intervention, the students showed more than 90% improvement in their BHP parameters. A paired-sample t-test revealed notable reductions in mean scores from pre- to post-intervention across all measured dimensions. Social identity decreased from 4.07 to 2.21 (t(55) = 16.125, p < 0.001), anonymity from 4.11 to 2.01 (t(55) = 15.699, p < 0.001), self-awareness from 3.95 to 1.93 (t(55) = 15.103, p < 0.001), and loss of individuality from 4.04 to 2.07 (t(55) = 13.364, p < 0.001), while sense of responsibility and accountability improved with mean differences of 1.18 and 2.0, respectively, both statistically significant at p < 0.001. The results also showed an 85% improvement in students' knowledge and attitudes.
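Two concrete pieces mentioned above can be sketched briefly: the Fisher-Yates/Durstenfeld shuffle used to reorder the motivational-video playlist, and the paired-sample t-test used to compare pre- vs. post-intervention behaviour scores. The playlist names and score arrays below are hypothetical, not the study's data.

```python
# Sketch: Durstenfeld (in-place Fisher-Yates) shuffle and a paired t-test on pre/post scores.
import random
import numpy as np
from scipy.stats import ttest_rel

def durstenfeld_shuffle(items):
    """In-place Fisher-Yates shuffle: swap each slot with a random index at or before it."""
    for i in range(len(items) - 1, 0, -1):
        j = random.randint(0, i)
        items[i], items[j] = items[j], items[i]
    return items

playlist = ["video_%02d" % k for k in range(1, 11)]    # hypothetical motivational videos
print(durstenfeld_shuffle(playlist))

# Paired t-test on one behaviour dimension (hypothetical pre/post scores)
pre = np.array([4.1, 3.9, 4.3, 4.0, 4.2, 3.8, 4.1, 4.0])
post = np.array([2.2, 2.0, 2.4, 2.1, 2.3, 1.9, 2.2, 2.1])
t_stat, p_value = ttest_rel(pre, post)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```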
PMID:40410532 | DOI:10.1038/s41598-025-99681-7
Vibration area localization and event recognition for underground power optical cable in multiple laying scenarios based on deep learning
Sci Rep. 2025 May 23;15(1):17920. doi: 10.1038/s41598-025-99588-3.
ABSTRACT
Current ϕ-OTDR vibration localization and recognition methods predominantly rely on assumptions such as bare-fiber sensing, simulated experimental environments, or a single known laying scenario. Most focus on either the localization or the recognition of events, and even studies that consider both neglect the performance improvements needed to meet real-time requirements, which limits their practical application in multiple laying scenarios. To solve these problems, we propose a method for vibration area localization and event recognition for underground power optical cables based on PGSD-YOLO and 1DCNN-BiGRU-AFM. First, a ϕ-OTDR system is built that uses an underground power optical cable as the distributed optical fiber vibration sensor to collect vibration-event signals in real laying scenarios, namely direct burial and manholes. High-pass and low-pass filters are then combined for denoising to improve signal quality. Secondly, PGSD-YOLO is designed to localize the vibration area and identify its laying scenario. PGSD-YOLO combines YOLOv11 with the multi-scale attention of PMSAM to enhance feature extraction; the dynamic sampling strategy of DySample reduces the information loss of signals, and GSConv and VoVGSCSP are used to optimize feature fusion. Finally, based on the obtained scenario labels and the time-domain signals, 1DCNN-BiGRU-AFM is designed to recognize vibration events. 1DCNN-BiGRU-AFM combines the feature extraction ability of 1DCNN with the temporal analysis ability of BiGRU, and optimizes feature fusion through the AFM mechanism. Experimental results show that both PGSD-YOLO and 1DCNN-BiGRU-AFM meet the real-time and performance requirements in multiple scenarios.
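The denoising step described above, combining high-pass and low-pass filtering, can be sketched as a zero-phase Butterworth band-pass applied to a vibration trace. The sampling rate, cutoff frequencies, and toy signal below are hypothetical, not the system's actual parameters.

```python
# Sketch: combined high-pass + low-pass Butterworth denoising of a vibration trace.
import numpy as np
from scipy.signal import butter, filtfilt

fs = 2000.0                                        # trace sampling rate in Hz (assumed)
t = np.arange(0, 1, 1 / fs)
signal = np.sin(2 * np.pi * 80 * t) + 0.5 * np.random.randn(t.size)  # toy vibration trace

def bandpass(x, low_hz, high_hz, fs, order=4):
    b_hp, a_hp = butter(order, low_hz, btype="highpass", fs=fs)   # remove slow drift
    b_lp, a_lp = butter(order, high_hz, btype="lowpass", fs=fs)   # remove high-frequency noise
    return filtfilt(b_lp, a_lp, filtfilt(b_hp, a_hp, x))          # zero-phase filtering

denoised = bandpass(signal, low_hz=5.0, high_hz=400.0, fs=fs)
print(denoised.shape)
```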
PMID:40410528 | DOI:10.1038/s41598-025-99588-3