Deep learning
A novel deep learning-based model for automated tooth detection and numbering in mixed and permanent dentition in occlusal photographs
BMC Oral Health. 2025 Mar 29;25(1):455. doi: 10.1186/s12903-025-05803-y.
ABSTRACT
BACKGROUND: While artificial intelligence-driven approaches have shown great promise in dental diagnosis and treatment planning, most research focuses on dental radiographs. Only three studies have explored automated tooth numbering in oral photographs, all focusing on permanent dentition. Our study aimed to introduce an automated system for detection and numbering of teeth across mixed and permanent dentitions in occlusal photographs.
METHODS: A total of 3215 occlusal view images of maxilla and mandible were included. Five senior dental students, trained under the guidance of an associate professor in dental public health, annotated the dataset. Samples were distributed across the training, validation, and test sets using a ratio of 7:1.5:1.5, respectively. We employed two separate convolutional neural network (CNN) models working in conjunction. The first model detected tooth presence and position, generating bounding boxes, while the second model localized these boxes, conducted classification, and assigned tooth numbers. Python and YOLOv8 were utilized in model development. Overall performance was assessed using sensitivity, precision, and F1 score.
RESULTS: The model demonstrated a sensitivity of 99.89% and an overall precision of 95.72% across all tooth types, with an F1 score of 97.76%. Misclassifications were primarily observed in underrepresented teeth, including primary incisors and permanent third molars. Among primary teeth, maxillary molars showed the highest performance, with precisions above 94%, 100% sensitivities, and F1 scores exceeding 97%. The mandibular primary canine showed the lowest results, with a precision of 88.52% and an F1 score of 93.91%.
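For readers checking the arithmetic, the F1 score is the harmonic mean of precision and sensitivity; the minimal sketch below (not the authors' code) reproduces the overall figure from the reported precision and sensitivity:

```python
def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)

# Overall figures reported in the abstract, expressed as fractions.
overall_f1 = f1_score(0.9572, 0.9989)  # ≈ 0.9776, matching the reported 97.76%
```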
CONCLUSION: Our study advances dental diagnostics by developing a highly precise artificial intelligence model for detecting and numbering primary and permanent teeth on occlusal photographs. The model's performance highlights its potential for real-world applications, including tele-dentistry and epidemiological studies in underserved areas. The model could be integrated with other systems to identify dental problems such as caries and orthodontic issues.
PMID:40158107 | DOI:10.1186/s12903-025-05803-y
N6-methyladenine identification using deep learning and discriminative feature integration
BMC Med Genomics. 2025 Mar 29;18(1):58. doi: 10.1186/s12920-025-02131-6.
ABSTRACT
N6-methyladenine (6mA) is a pivotal DNA modification that plays a crucial role in epigenetic regulation, gene expression, and various biological processes. With advancements in sequencing technologies and computational biology, there is an increasing focus on developing accurate methods for 6mA site identification to enhance early detection and understand its biological significance. Despite the rapid progress of machine learning in bioinformatics, accurately detecting 6mA sites remains a challenge due to the limited generalizability and efficiency of existing approaches. In this study, we present Deep-N6mA, a novel Deep Neural Network (DNN) model incorporating optimal hybrid features for precise 6mA site identification. The proposed framework captures complex patterns from DNA sequences through a comprehensive feature extraction process, leveraging k-mer, Dinucleotide-based Cross Covariance (DCC), Trinucleotide-based Auto Covariance (TAC), Pseudo Single Nucleotide Composition (PseSNC), Pseudo Dinucleotide Composition (PseDNC), and Pseudo Trinucleotide Composition (PseTNC). To optimize computational efficiency and eliminate irrelevant or noisy features, an unsupervised Principal Component Analysis (PCA) algorithm is employed, ensuring the selection of the most informative features. A multilayer DNN serves as the classification algorithm to identify N6-methyladenine sites accurately. The robustness and generalizability of Deep-N6mA were rigorously validated using fivefold cross-validation on two benchmark datasets. Experimental results reveal that Deep-N6mA achieves an average accuracy of 97.70% on the F. vesca dataset and 95.75% on the R. chinensis dataset, outperforming existing methods by 4.12% and 4.55%, respectively. These findings underscore the effectiveness of Deep-N6mA as a reliable tool for early 6mA site detection, contributing to epigenetic research and advancing the field of computational biology.
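As an illustration of the simplest of the descriptors listed above, a k-mer composition feature maps a DNA sequence to a normalized count vector over all 4^k possible k-mers. The sketch below is a toy illustration, not the Deep-N6mA implementation:

```python
from itertools import product

def kmer_composition(seq: str, k: int = 2) -> list[float]:
    """Normalized frequency of each of the 4^k DNA k-mers in a sequence."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = len(seq) - k + 1
    for i in range(windows):
        km = seq[i:i + k]
        if km in counts:  # skip windows containing ambiguous bases (e.g. 'N')
            counts[km] += 1
    return [counts[km] / windows for km in kmers]

# 16-dimensional dinucleotide feature vector for a short toy sequence.
vec = kmer_composition("ACGTACGA", k=2)
```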
PMID:40158097 | DOI:10.1186/s12920-025-02131-6
An integration of ensemble deep learning with hybrid optimization approaches for effective underwater object detection and classification model
Sci Rep. 2025 Mar 29;15(1):10902. doi: 10.1038/s41598-025-95596-5.
ABSTRACT
Underwater object detection (UOD) is essential to marine environmental research and underwater species protection, and the development of the associated technology holds real-world importance. While current object recognition methods attain outstanding performance in terrestrial scenes, they are less suitable in underwater conditions because of two restrictions: underwater objects are generally smaller, closely spaced, and prone to occlusion, and underwater embedded devices have limited storage and computation capacity. Image-based UOD has nevertheless progressed rapidly in recent years, driven by deep learning (DL) applications and advances in computer vision (CV). Investigators utilize DL models to identify possible objects within an image; the convolutional neural network (CNN), the principal DL technique, enhances feature learning. In this manuscript, an Underwater Object Detection and Classification Utilizing the Ensemble Deep Learning Approach and Hybrid Optimization Algorithms (UODC-EDLHOA) technique is developed. The UODC-EDLHOA technique detects and classifies underwater objects using advanced DL and hyperparameter-optimization models. Initially, the UODC-EDLHOA model applies several levels of pre-processing and noise removal to improve the clarity of the underwater images. An EfficientNetB7 backbone with an attention mechanism is employed for feature extraction, and YOLOv9-based object detection is utilized. For classification of the detected objects, an ensemble of three techniques, namely deep neural network (DNN), deep belief network (DBN), and long short-term memory (LSTM), is implemented. Finally, hyperparameter selection uses the hybrid Siberian tiger and sand cat swarm optimization (STSC) method. Extensive experimentation is conducted on the UOD dataset to illustrate the robust classification performance of the UODC-EDLHOA model.
The performance validation of the UODC-EDLHOA model demonstrated a superior accuracy of 92.78% over existing techniques.
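The abstract does not specify how the DNN, DBN, and LSTM outputs are combined; a common choice is soft voting (averaging per-class probabilities), sketched below as a hypothetical illustration with toy numbers:

```python
def soft_vote(model_probs: list[list[float]]) -> tuple[int, list[float]]:
    """Average class-probability vectors from several base models and
    return (predicted class index, fused probabilities)."""
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    fused = [sum(p[c] for p in model_probs) / n_models for c in range(n_classes)]
    return fused.index(max(fused)), fused

# Hypothetical per-class outputs from three base models (DNN, DBN, LSTM).
label, fused = soft_vote([[0.7, 0.2, 0.1],
                          [0.4, 0.5, 0.1],
                          [0.6, 0.3, 0.1]])
```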
PMID:40158003 | DOI:10.1038/s41598-025-95596-5
Impact of optimized and conventional facility designs on outpatient abdominal MRI workflow efficiency
Sci Rep. 2025 Mar 30;15(1):10942. doi: 10.1038/s41598-025-94799-0.
ABSTRACT
PURPOSE: The goal of this study was to evaluate the outpatient workflow efficiency of an optimized facility (OF) compared to an established reference facility (RF) for abdominal magnetic resonance imaging (MRI).
METHODS: In this retrospective study, we analyzed 2,723 contrast-enhanced liver and prostate MRI examinations conducted between March 2022 and April 2024. All examinations were performed on 3T scanners (MAGNETOM Vida, Siemens Healthineers) at two different imaging facilities within our institution. The optimized facility featured a three-bay setup, with each bay consisting of one magnet, two dockable tables, and one dedicated preparation room, while the reference facility utilized a single scanner-single table setup with one dedicated preparation room. Workflow metrics were extracted from scanner logs and electronic health records. Three-way ANOVA and chi-square tests were used to assess the impact of facility design, body region, and date on workflow metrics.
RESULTS: The OF significantly reduced mean table turnaround times (4.6 min vs. 8.3 min, p < 0.001) and achieved shorter total process cycle times for both liver (30.6 min vs. 32.7 min, p < 0.01) and prostate exams (32.5 min vs. 36.4 min, p < 0.001) compared to the RF. Additionally, the OF achieved turnaround times of ≤ 1 min in 37.2% of exams, compared to just 0.6% at the RF (p < 0.001). On-time performance was also notably higher at the OF (79.4% vs. 66.0%, p < 0.001). Furthermore, the mean time from patient arrival to exam start was reduced by 9 min at the OF (p < 0.001). Minor differences in acquisition times were observed between facilities, with both benefiting from deep learning reconstruction techniques.
CONCLUSION: The optimized MRI facility demonstrated superior outpatient workflow efficiency compared to an already efficient reference facility, particularly in table turnover time, resulting in increased patient throughput for abdominal MRI examinations. These findings highlight that even highly efficient MRI facilities can significantly benefit from comprehensive redesign strategies.
PMID:40157988 | DOI:10.1038/s41598-025-94799-0
An ESG-ConvNeXt network for steel surface defect classification based on hybrid attention mechanism
Sci Rep. 2025 Mar 29;15(1):10926. doi: 10.1038/s41598-025-88958-6.
ABSTRACT
Defect recognition is crucial in steel production and quality control, but performing this detection task accurately presents significant challenges. ConvNeXt, a modern convolutional model inspired by Transformer design, has shown excellent performance in image classification tasks. To further enhance ConvNeXt's ability to classify defects on steel surfaces, we propose a network architecture called ESG-ConvNeXt. First, in the image processing stage, we introduce a serial multi-attention mechanism approach. This method fully leverages the extracted information and improves image information retention by combining the strengths of each module. Second, we design a parallel multi-scale residual module to adaptively extract diverse discriminative features from the input image, thereby enhancing the model's feature extraction capability. Finally, in the downsampling stage, we incorporate a PReLU activation function to mitigate the problem of neuron death during downsampling. We conducted extensive experiments using the NEU-CLS-64 steel surface defect dataset, and the results demonstrate that our model outperforms other methods in terms of detection performance, achieving an average recognition accuracy of 97.5%. Through ablation experiments, we validated the effectiveness of each module; through visualization experiments, our model exhibited strong classification capability. Additionally, experiments on the X-SDD dataset confirm that the ESG-ConvNeXt network achieves solid classification results. Therefore, the proposed ESG-ConvNeXt network shows great potential in steel surface defect classification.
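For reference, a short sketch of the PReLU activation mentioned above; the slope alpha is learned during training, but is shown here with a fixed illustrative value:

```python
def prelu(x: float, alpha: float = 0.25) -> float:
    """PReLU: identity for x >= 0, slope alpha for x < 0.
    Unlike ReLU, negative inputs keep a nonzero gradient (alpha),
    so units cannot permanently 'die' during downsampling."""
    return x if x >= 0 else alpha * x
```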
PMID:40157949 | DOI:10.1038/s41598-025-88958-6
Multimodal Deep Learning for Grading Carpal Tunnel Syndrome: A Multicenter Study in China
Acad Radiol. 2025 Mar 28:S1076-6332(25)00187-4. doi: 10.1016/j.acra.2025.02.043. Online ahead of print.
ABSTRACT
RATIONALE AND OBJECTIVES: Ultrasound (US)-based deep learning (DL) models for grading the severity of carpal tunnel syndrome (CTS) are scarce. We aimed to advance CTS grading by developing a joint-DL model integrating clinical information and multimodal US features.
MATERIALS AND METHODS: A retrospective dataset of CTS patients from three hospitals was randomly divided into the training (n=680) and internal validation (n=173) sets. An external validation set was prospectively recruited from another hospital (n=174). To further test the model's generalizability, cross-vendor testing was conducted at three additional hospitals utilizing different US systems in the external validation set 2 (n=224). An US-based model was developed to grade CTS severity utilizing multimodal sonographic features, including cross-sectional area [CSA], echogenicity, longitudinal nerve appearance, and intraneural vascularity. A joint-DL model (CTSGrader) was constructed integrating sonographic features and clinical information. Diagnostic performance of both models was verified based on electrophysiological results. In the validation sets, the better-performing model was compared to two junior and two senior radiologists. Additionally, the radiologists' diagnostic performance with artificial intelligence (AI) assistance was evaluated in external validation sets.
RESULTS: CTSGrader achieved areas under the curve (AUCs) of 0.951, 0.910, and 0.897 in the validation sets. The accuracies of CTSGrader were 0.849, 0.833, and 0.827, higher than those of the US-based model (all p<.05). CTSGrader outperformed the two junior radiologists and one senior radiologist (all p<.05) and was equivalent to the other senior radiologist (all p>.05). With its assistance, the accuracies of the two junior radiologists and one senior radiologist improved (all p<.05).
CONCLUSION: The joint-DL model (CTSGrader) developed in our study outperformed the single-modality model. The AI-aided strategy suggested its potential to support clinical decision-making for grading CTS severity.
PMID:40157849 | DOI:10.1016/j.acra.2025.02.043
GPT4LFS (generative pre-trained transformer 4 omni for lumbar foramina stenosis): enhancing lumbar foraminal stenosis image classification through large multimodal models
Spine J. 2025 Mar 27:S1529-9430(25)00165-2. doi: 10.1016/j.spinee.2025.03.011. Online ahead of print.
ABSTRACT
BACKGROUND CONTEXT: Lumbar foraminal stenosis (LFS) is a common spinal condition that requires accurate assessment. Current magnetic resonance imaging (MRI) reporting processes are often inefficient, and while deep learning has potential for improvement, challenges in generalization and interpretability limit its diagnostic effectiveness compared to physician expertise.
PURPOSE: The present study aimed to leverage a multimodal large language model to improve the accuracy and efficiency of LFS image classification, thereby enabling rapid and precise automated diagnosis, reducing the dependence on manually annotated data, and enhancing diagnostic efficiency.
STUDY DESIGN/SETTING: Retrospective study conducted from April 2017 to March 2023.
PATIENT SAMPLE: Sagittal T1-weighted MRI data for the lumbar spine were collected from 1,200 patients across three medical centers. A total of 810 patient cases were included in the final analysis, with data collected from seven different MRI devices.
OUTCOME MEASURES: Automated classification of LFS using the multimodal large language model. Accuracy, sensitivity, specificity, and Cohen's kappa coefficient were calculated.
METHODS: An advanced multimodal fusion framework GPT4LFS was developed with the primary objective of integrating imaging data and natural language descriptions to comprehensively capture the complex LFS features. The model employed a pre-trained ConvNeXt as the image processing module for extracting high-dimensional imaging features. Concurrently, medical descriptive texts generated by the multimodal large language model GPT-4o and encoded and feature-extracted using RoBERTa were utilized to optimize the model's contextual understanding capabilities. The Mamba architecture was implemented during the feature fusion stage, effectively integrating imaging and textual features and thereby enhancing the performance of the classification task. Finally, the stability of the model's detection results was validated by evaluating classification task metrics, such as the accuracy, sensitivity, specificity, and Kappa coefficients.
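The exact Mamba-based fusion is not detailed in the abstract; the toy sketch below illustrates only the general late-fusion idea of concatenating image and text embeddings before a scoring head, with all names and numbers hypothetical:

```python
def fuse_and_score(img_feat: list[float], txt_feat: list[float],
                   weights: list[float], bias: float = 0.0) -> float:
    """Late-fusion sketch: concatenate image and text embeddings into one
    joint vector, then apply a linear scoring head."""
    joint = list(img_feat) + list(txt_feat)
    assert len(weights) == len(joint), "head size must match joint vector"
    return sum(w * x for w, x in zip(weights, joint)) + bias

# Toy embeddings standing in for ConvNeXt image features and RoBERTa text features.
score = fuse_and_score([1.0, 2.0], [3.0], weights=[0.5, 0.5, 1.0])
```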
RESULTS: The training set comprised 6,299 images from 635 patients, the internal test set included 820 images from 82 patients, and the external test set was composed of 930 images from 93 patients. The GPT4LFS model demonstrated an overall accuracy of 93.7%, sensitivity of 95.8%, and specificity of 94.5% in the internal test set (Kappa = 0.89, 95% confidence interval (CI): 0.84-0.96, p<.001). In the external test set, the overall accuracy was 92.2%, with a sensitivity of 92.2% and a specificity of 97.4% (Kappa = 0.88, 95% CI: 0.84-0.89, p<.001). The model showed excellent consistency in both the internal and external test sets. After the article is published, we will make the full code publicly available on GitHub.
CONCLUSIONS: Using the GPT4LFS model for LFS image categorization demonstrated accuracy and the capacity for feature description at a level commensurate with that of professional clinicians.
PMID:40157428 | DOI:10.1016/j.spinee.2025.03.011
Near-term prediction of sustained ventricular arrhythmias applying artificial intelligence to single-lead ambulatory electrocardiogram
Eur Heart J. 2025 Mar 30:ehaf073. doi: 10.1093/eurheartj/ehaf073. Online ahead of print.
ABSTRACT
BACKGROUND AND AIMS: Accurate near-term prediction of life-threatening ventricular arrhythmias would enable pre-emptive actions to prevent sudden cardiac arrest/death. A deep learning-enabled single-lead ambulatory electrocardiogram (ECG) may identify an ECG profile of individuals at imminent risk of sustained ventricular tachycardia (VT).
METHODS: This retrospective study included 247,254 14-day ambulatory ECG recordings from six countries. The first 24 h were used to identify patients likely to experience sustained VT occurrence (primary outcome) in the subsequent 13 days using a deep learning-based model. The development set consisted of 183,177 recordings. Performance was evaluated using internal (n = 43,580) and external (n = 20,497) validation data sets. Saliency mapping visualized features influencing the model's risk predictions.
RESULTS: Among all recordings, 1104 (0.5%) had sustained ventricular arrhythmias. In the internal and external validation sets, respectively, the model achieved an area under the receiver operating characteristic curve of 0.957 [95% confidence interval (CI) 0.943-0.971] and 0.948 (95% CI 0.926-0.967). For a specificity fixed at 97.0%, the sensitivity reached 70.6% and 66.1% in the internal and external validation sets, respectively. The model accurately predicted future VT occurrence in 80.7% and 81.1% of recordings with rapid sustained VT (≥180 b.p.m.), respectively, and in 90.0% of VT episodes that degenerated into ventricular fibrillation. Saliency maps suggested the role of premature ventricular complex burden and early depolarization time as predictors of VT.
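Operating points such as "sensitivity at a fixed 97.0% specificity" come from sweeping the decision threshold over the model's risk scores; a toy sketch with hypothetical scores, not the study's data:

```python
import math

def threshold_at_specificity(neg_scores: list[float], target: float = 0.97) -> float:
    """Smallest threshold such that at least `target` of the negative-class
    scores fall below it (i.e., are correctly called negative)."""
    s = sorted(neg_scores)
    k = math.ceil(target * len(s))
    return s[min(k, len(s) - 1)]

def sensitivity_at(threshold: float, pos_scores: list[float]) -> float:
    """Fraction of positive-class scores at or above the threshold."""
    return sum(p >= threshold for p in pos_scores) / len(pos_scores)

neg = [i / 100 for i in range(100)]   # hypothetical negative-class risk scores
pos = [0.50, 0.975, 0.98, 0.99]       # hypothetical positive-class risk scores
thr = threshold_at_specificity(neg, target=0.97)
sens = sensitivity_at(thr, pos)
```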
CONCLUSIONS: A novel deep learning model utilizing dynamic single-lead ambulatory ECGs accurately identifies patients at near-term risk of ventricular arrhythmias. It also uncovers an early depolarization pattern as a potential determinant of ventricular arrhythmia events.
PMID:40157386 | DOI:10.1093/eurheartj/ehaf073
Enhancing visual speech perception through deep automatic lipreading: A systematic review
Comput Biol Med. 2025 Mar 28;190:110019. doi: 10.1016/j.compbiomed.2025.110019. Online ahead of print.
ABSTRACT
Communication involves exchanging information between individuals or groups through various media sources. However, limitations such as hearing loss can make it difficult for some individuals to understand the information delivered during speech communication. Conventional methods, including sign language, written text, and manual lipreading, offer some solutions; however, emerging software-based tools using artificial intelligence (AI) are introducing more effective approaches. Many approaches rely on AI to improve communication quality, with the current trend of leveraging deep learning being a particularly effective tool. This paper presents a comprehensive Systematic Literature Review (SLR) of research trends in automatic lipreading technologies, a critical field in enhancing communication among individuals with hearing impairments. The SLR, which followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) protocol, identified 114 original research articles published between 2014 and mid-2024. The essential information from these articles was summarized, including the trends in automatic lipreading research, dataset availability, task categories, existing approaches, and architectures for automatic lipreading systems. The results showed that various techniques and advanced deep learning models achieved convincing performance and have become state-of-the-art in automatic lipreading tasks. However, several challenges, such as insufficient data, inadequate recording conditions, and language diversity, must still be resolved. Continued improvements to deep learning models are expected to overcome these challenges and make automatic lipreading a practical, widely deployed solution in the near future.
PMID:40157316 | DOI:10.1016/j.compbiomed.2025.110019
ResTransUNet: A hybrid CNN-transformer approach for liver and tumor segmentation in CT images
Comput Biol Med. 2025 Mar 28;190:110048. doi: 10.1016/j.compbiomed.2025.110048. Online ahead of print.
ABSTRACT
BACKGROUND AND OBJECTIVE: Accurate medical tumor segmentation is critical for early diagnosis and treatment planning, significantly improving patient outcomes. This study aims to enhance liver and tumor segmentation from CT and liver images by developing a novel model, ResTransUNet, which combines convolutional and transformer blocks to improve segmentation accuracy.
METHODS: The proposed ResTransUNet model is a custom implementation inspired by the TransUNet architecture, featuring a Standalone Transformer Block and ResNet50 as the backbone for the encoder. The hybrid architecture leverages the strengths of Convolutional Neural Networks (CNNs) and Transformer blocks to capture both local features and global context effectively. The encoder utilizes a pre-trained ResNet50 to extract rich hierarchical features, with key feature maps preserved as skip connections. The Standalone Transformer Block, integrated into the model, employs multi-head attention mechanisms to capture long-range dependencies across the image, enhancing segmentation performance in complex cases. The decoder reconstructs the segmentation mask by progressively upsampling encoded features while integrating skip connections, ensuring both semantic information and spatial details are retained. This process culminates in a precise binary segmentation mask that effectively distinguishes liver and tumor regions.
RESULTS: The ResTransUNet model achieved superior Dice Similarity Coefficient (DSC) for liver segmentation (98.3% on LiTS and 98.4% on 3D-IRCADb-01) and for tumor segmentation from CT images (94.7% on LiTS and 89.8% on 3D-IRCADb-01) as well as from liver images (94.6% on LiTS and 91.1% on 3D-IRCADb-01). The model also demonstrated high precision, sensitivity, and specificity, outperforming current state-of-the-art methods in these tasks.
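For reference, the Dice similarity coefficient (DSC) reported above is twice the overlap of two binary masks divided by their total size; a minimal sketch with toy masks:

```python
def dice(mask_a: list[int], mask_b: list[int]) -> float:
    """Dice similarity coefficient between two flat binary masks."""
    inter = sum(a & b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    return 2 * inter / total if total else 1.0

# Toy 4-pixel masks: 2 overlapping foreground pixels out of 3 + 2.
dsc = dice([1, 1, 0, 1], [1, 0, 0, 1])  # 2*2 / (3+2) = 0.8
```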
CONCLUSIONS: The ResTransUNet model demonstrates robust and accurate performance in complex medical image segmentation tasks, particularly in liver and tumor segmentation. These findings suggest that ResTransUNet has significant potential for improving the precision of surgical interventions and therapy planning in clinical settings.
PMID:40157314 | DOI:10.1016/j.compbiomed.2025.110048
Proton dose calculation with transformer: Transforming spot map to dose
Med Phys. 2025 Mar 29. doi: 10.1002/mp.17794. Online ahead of print.
ABSTRACT
BACKGROUND: Conventional proton dose calculation methods are either time- and resource-intensive, like Monte Carlo (MC) simulations, or they sacrifice accuracy, as seen with analytical methods. This trade-off between computational efficiency and accuracy highlights the need for improved dose calculation approaches in clinical settings.
PURPOSE: This study aims to develop a deep-learning-based model that calculates dose-to-water (DW) and dose-to-medium (DM) from patient anatomy and the proton spot map (PSM), approaching MC-level accuracy with significantly reduced computation time. Additionally, the study seeks to generalize the model to different treatment sites using transfer learning.
METHODS: A SwinUNetr model was developed using 259 four-field prostate proton stereotactic body radiation therapy (SBRT) plans to calculate patient-specific DW and DM distributions from CT and projected PSM (PPSM). The PPSM was created by projecting PSM into the CT scans using spot coordinates, stopping power ratio, beam divergence, and water-equivalent thickness. Fine-tuning was then performed for the central nervous system (CNS) site using 84 CNS plans. The model's accuracy was evaluated against MC simulation benchmarks using mean absolute error (MAE), gamma analysis (2% local dose difference, 2-mm distance-to-agreement, 10% low dose threshold), and relevant clinical indices on the test dataset.
RESULTS: The trained model achieved a single-field dose calculation time of 0.07 s on an NVIDIA A100 GPU, over 100 times faster than MC simulators. For the prostate site, the best-performing model showed an average MAE of 0.26 ± 0.17 Gy and a gamma index of 92.2% ± 3.1% in dose regions above 10% of the maximum dose for DW calculations, and an MAE of 0.30 ± 0.19 Gy with a gamma index of 89.7% ± 3.9% for DM calculations. After transfer learning for CNS plans, the model achieved an MAE of 0.49 ± 0.24 Gy and a gamma index of 90.1% ± 2.7% for DW computations, and an MAE of 0.47 ± 0.25 Gy with a gamma index of 85.4% ± 7.1% for DM computations.
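The gamma analysis cited above combines a local dose-difference criterion with a distance-to-agreement criterion; the 1D sketch below is a simplified illustration (toy profiles, not the study's 3D implementation):

```python
import math

def gamma_pass_rate(ref: list[float], ev: list[float], spacing: float = 1.0,
                    tol: float = 0.02, dta: float = 2.0, cutoff: float = 0.10) -> float:
    """1D gamma-analysis sketch: local dose difference (tol, fractional),
    distance to agreement (dta, mm), evaluated only where the reference dose
    exceeds `cutoff` of its maximum. A point passes when gamma <= 1."""
    dmax = max(ref)
    passed = total = 0
    for i, d_ev in enumerate(ev):
        if ref[i] < cutoff * dmax:
            continue  # below the low-dose threshold
        total += 1
        g2 = min(
            ((d_ev - d_ref) / (tol * d_ref)) ** 2 + ((i - j) * spacing / dta) ** 2
            for j, d_ref in enumerate(ref) if d_ref > 0
        )
        if math.sqrt(g2) <= 1.0:
            passed += 1
    return passed / total if total else 1.0
```

Identical profiles pass everywhere; a uniform 5% dose error fails the 2% local criterion at every point.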
CONCLUSIONS: The SwinUNetr model provides an efficient and accurate method for computing dose distributions in proton therapy. It also opens the possibility of reverse-engineering PSM from DW, potentially speeding up treatment planning while maintaining accuracy.
PMID:40156258 | DOI:10.1002/mp.17794
Deep Learning Based on Ultrasound Images Differentiates Parotid Gland Pleomorphic Adenomas and Warthin Tumors
Ultrason Imaging. 2025 Mar 29:1617346251319410. doi: 10.1177/01617346251319410. Online ahead of print.
ABSTRACT
Exploring the clinical significance of employing deep learning methodologies on ultrasound images for the development of an automated model to accurately identify pleomorphic adenomas and Warthin tumors in salivary glands. A retrospective study was conducted on 91 patients who underwent ultrasonography examinations between January 2016 and December 2023 and were subsequently diagnosed with pleomorphic adenoma or Warthin's tumor based on postoperative pathological findings. A total of 526 ultrasonography images were collected for analysis. Convolutional neural network (CNN) models, including ResNet18, MobileNetV3Small, and InceptionV3, were trained and validated using these images for the differentiation of pleomorphic adenoma and Warthin's tumor. Performance evaluation metrics such as receiver operating characteristic (ROC) curves, area under the curve (AUC), sensitivity, specificity, positive predictive value, and negative predictive value were utilized. Two ultrasound physicians, with varying levels of expertise, conducted independent evaluations of the ultrasound images. Subsequently, a comparative analysis was performed between the diagnostic outcomes of the ultrasound physicians and the results obtained from the best-performing model. Inter-rater agreement between routine ultrasonography interpretation by the two expert ultrasonographers and the automatic identification diagnosis of the best model in relation to pathological results was assessed using kappa tests. The deep learning models achieved favorable performance in differentiating pleomorphic adenoma from Warthin's tumor. The ResNet18, MobileNetV3Small, and InceptionV3 models exhibited diagnostic accuracies of 82.4% (AUC: 0.932), 87.0% (AUC: 0.946), and 77.8% (AUC: 0.811), respectively. Among these models, MobileNetV3Small demonstrated the highest performance. 
The experienced ultrasonographer achieved a diagnostic accuracy of 73.5%, with sensitivity, specificity, positive predictive value, and negative predictive value of 73.7%, 73.3%, 77.8%, and 68.8%, respectively. The less-experienced ultrasonographer achieved a diagnostic accuracy of 69.0%, with sensitivity, specificity, positive predictive value, and negative predictive value of 66.7%, 71.4%, 71.4%, and 66.7%, respectively. The kappa test revealed strong consistency between the best-performing deep learning model and postoperative pathological diagnoses (kappa value: .778, p-value < .001). In contrast, the less-experienced ultrasonographer demonstrated poor consistency in image interpretations (kappa value: .380, p-value < .05). The diagnostic accuracy of the best deep learning model was significantly higher than that of the ultrasonographers, and the experienced ultrasonographer exhibited higher diagnostic accuracy than the less-experienced one. This study demonstrates the promising performance of a deep learning-based method utilizing ultrasonography images for the differentiation of pleomorphic adenoma and Warthin's tumor. The approach reduces subjective errors, provides decision support for clinicians, and improves diagnostic consistency.
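The kappa statistic used above corrects observed agreement for agreement expected by chance; a minimal sketch with hypothetical labels:

```python
def cohens_kappa(rater_a: list, rater_b: list) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    classes = set(rater_a) | set(rater_b)
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_chance = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in classes)
    return (p_obs - p_chance) / (1 - p_chance)

# Hypothetical example: model predictions vs. pathology labels for eight cases.
kappa = cohens_kappa([1, 1, 0, 1, 0, 0, 1, 0],
                     [1, 1, 0, 1, 0, 0, 0, 0])
```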
PMID:40156239 | DOI:10.1177/01617346251319410
Research on adversarial identification methods for AI-generated image software Craiyon V3
J Forensic Sci. 2025 Mar 29. doi: 10.1111/1556-4029.70034. Online ahead of print.
ABSTRACT
With the rapid development of diffusion models, AI generation technology can now produce highly realistic images. If such AI-generated images are used as evidence, they may threaten judicial fairness. Taking the adversarial identification of images generated by the Craiyon V3 software as an example, this paper studies adversarial identification methods for AI-generated image software. First, an AI-generated image set containing 18,000 images is constructed using Craiyon V3; then, a deep learning-based AI-generated image detection model is selected, and a score-based likelihood ratio method is introduced to evaluate the strength of evidence. Experimental results show that the proposed method achieves an accuracy of over 99% with multiple classifiers, including Swin-Transformer and ResNet-18, and the fitted likelihood ratio model also passes a series of validation criteria, including Tippett plots. The results of this paper are expected to be applied in judicial practice, providing judges with a reliable and powerful decision-making basis and laying a foundation for further exploration of AI-generated image identification methods.
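A score-based likelihood ratio of the kind described can be illustrated by fitting simple Gaussians to detector scores from each class; the sketch below uses hypothetical scores and is not the paper's calibration procedure:

```python
import math
import statistics

def gaussian_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a normal distribution at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def score_based_lr(score: float, gen_scores: list[float], real_scores: list[float]) -> float:
    """Score-based likelihood ratio: density of the detector score under the
    'AI-generated' model divided by its density under the 'camera-original'
    model (both fitted here as simple Gaussians)."""
    num = gaussian_pdf(score, statistics.mean(gen_scores), statistics.stdev(gen_scores))
    den = gaussian_pdf(score, statistics.mean(real_scores), statistics.stdev(real_scores))
    return num / den

# Hypothetical detector scores for known AI-generated and camera-original images.
gen = [0.90, 0.92, 0.95, 0.88]
real = [0.10, 0.12, 0.15, 0.08]
lr_high = score_based_lr(0.90, gen, real)  # strength of evidence for a high score
```

A likelihood ratio above 1 supports the AI-generated hypothesis; below 1, the camera-original hypothesis.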
PMID:40156229 | DOI:10.1111/1556-4029.70034
A deep learning model for classification of chondroid tumors on CT images
BMC Cancer. 2025 Mar 28;25(1):561. doi: 10.1186/s12885-025-13951-1.
ABSTRACT
BACKGROUND: Differentiating chondroid tumors is crucial for proper patient management. This study aimed to develop a deep learning model (DLM) for classifying enchondromas, atypical cartilaginous tumors (ACT), and high-grade chondrosarcomas using CT images.
METHODS: This retrospective study analyzed chondroid tumors from two independent cohorts. Tumors were segmented on CT images. A 2D convolutional neural network was developed and tested using split-sample and geographical validation. Four radiologists blinded to patient data and the DLM results with various levels of experience performed readings of the external test dataset for comparison. Performance metrics included accuracy, sensitivity, specificity, and area under the curve (AUC).
RESULTS: CTs from 344 patients (175 women; age = 50.3 ± 14.3 years) with diagnosed enchondroma (n = 124), ACT (n = 92), or high-grade chondrosarcoma (n = 128) were analyzed. The DLM demonstrated comparable performance to radiologists (p > 0.05), achieving an AUC of 0.88 for distinguishing enchondromas from chondrosarcomas and 0.82 for differentiating enchondromas from ACTs. The DLM and the musculoskeletal expert showed similar performance in differentiating ACTs from high-grade chondrosarcomas (p = 0.26), with AUCs of 0.64 and 0.56, respectively.
CONCLUSIONS: The DLM reliably differentiates benign from malignant cartilaginous tumors and is particularly useful for distinguishing ACTs from enchondromas, which is challenging on CT images alone. However, differentiating ACTs from high-grade chondrosarcomas remains difficult, reflecting known diagnostic challenges in radiology.
PMID:40155859 | DOI:10.1186/s12885-025-13951-1
Author Correction: Deep learning based decision-making and outcome prediction for adolescent idiopathic scoliosis patients with posterior surgery
Sci Rep. 2025 Mar 28;15(1):10764. doi: 10.1038/s41598-025-95425-9.
NO ABSTRACT
PMID:40155745 | DOI:10.1038/s41598-025-95425-9
Atomic context-conditioned protein sequence design using LigandMPNN
Nat Methods. 2025 Mar 28. doi: 10.1038/s41592-025-02626-1. Online ahead of print.
ABSTRACT
Protein sequence design in the context of small molecules, nucleotides and metals is critical to enzyme and small-molecule binder and sensor design, but current state-of-the-art deep-learning-based sequence design methods are unable to model nonprotein atoms and molecules. Here we describe a deep-learning-based protein sequence design method called LigandMPNN that explicitly models all nonprotein components of biomolecular systems. LigandMPNN significantly outperforms Rosetta and ProteinMPNN on native backbone sequence recovery for residues interacting with small molecules (63.3% versus 50.4% and 50.5%), nucleotides (50.5% versus 35.2% and 34.0%) and metals (77.5% versus 36.0% and 40.6%). LigandMPNN generates not only sequences but also sidechain conformations to allow detailed evaluation of binding interactions. LigandMPNN has been used to design over 100 experimentally validated small-molecule and DNA-binding proteins with high affinity and high structural accuracy (as indicated by four X-ray crystal structures), and redesign of Rosetta small-molecule binder designs has increased binding affinity by as much as 100-fold. We anticipate that LigandMPNN will be widely useful for designing new binding proteins, sensors and enzymes.
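The headline numbers above are native sequence recovery, i.e. the fraction of designed residues that match the native amino acid at the same position, restricted to residues contacting the ligand. A minimal sketch of that metric (the sequences and contact positions below are made up for illustration):

```python
# Native sequence recovery: percent identity between native and designed
# sequences, optionally restricted to a subset of positions
# (e.g. residues contacting a small molecule, nucleotide, or metal).
def sequence_recovery(native: str, designed: str, positions=None) -> float:
    pos = list(positions) if positions is not None else list(range(len(native)))
    matches = sum(native[i] == designed[i] for i in pos)
    return 100.0 * matches / len(pos)

native = "MKTAYIAKQR"    # hypothetical native sequence
designed = "MKTGYIAKQR"  # hypothetical redesign (one substitution, A4G)
print(sequence_recovery(native, designed))                   # 90.0 overall
print(sequence_recovery(native, designed, [0, 1, 2, 3, 4]))  # 80.0 at contacts
```

In the paper's comparison, this is computed over residues whose native sidechains interact with the nonprotein component, which is exactly where modeling ligand atoms should help most.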
PMID:40155723 | DOI:10.1038/s41592-025-02626-1
Comparative analysis of daily global solar radiation prediction using deep learning models inputted with stochastic variables
Sci Rep. 2025 Mar 28;15(1):10786. doi: 10.1038/s41598-025-95281-7.
ABSTRACT
Photovoltaic power plant output depends on daily global solar radiation (DGSR). Measured DGSR data, however, are often imprecise, and data may be unavailable at many sites because measuring instruments are costly and equipment malfunctions interrupt the time series. Accurate DGSR prediction is therefore essential for photovoltaic power production. Different artificial neural network (ANN) models yield DGSR predictions of varying accuracy, so it is essential to compare ANN models driven by different sets of meteorological stochastic variables. In this study, a radial basis function neural network (RBFNN), long short-term memory neural network (LSTMNN), modular neural network (MNN), and transformer model (TM) are developed to investigate their performance in DGSR prediction using different combinations of meteorological stochastic variables. The models employ five stochastic variables: wind speed, relative humidity, and minimum, maximum, and average temperatures. The mean absolute relative error of the transformer model with average, maximum, and minimum temperatures as inputs is 1.98. The ANN models outperform traditional models in predictive accuracy.
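The comparison protocol implied above can be sketched as follows: enumerate input-variable subsets, train one model per subset, and rank results by mean absolute relative error (MARE). This is a hypothetical sketch of that protocol, not the study's code; the subset enumeration is an assumption about how "different combinations" were formed.

```python
# Enumerate candidate input combinations and define the MARE metric.
from itertools import combinations

def mare(actual, predicted):
    """Mean absolute relative error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

variables = ["wind_speed", "rel_humidity", "t_min", "t_max", "t_avg"]
# All non-empty subsets of the five stochastic inputs: 2**5 - 1 = 31.
subsets = [c for r in range(1, 6) for c in combinations(variables, r)]
print(len(subsets))  # 31 candidate input combinations

# For each subset, one would train each model (RBFNN, LSTMNN, MNN, TM)
# on those inputs and compare mare(observed_dgsr, predicted_dgsr).
```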
PMID:40155686 | DOI:10.1038/s41598-025-95281-7
Fine-tuned deep learning models for early detection and classification of kidney conditions in CT imaging
Sci Rep. 2025 Mar 28;15(1):10741. doi: 10.1038/s41598-025-94905-2.
ABSTRACT
The kidney plays a vital role in maintaining homeostasis, but lifestyle factors and disease can lead to kidney failure. Early detection of kidney disease is crucial for effective intervention but is often challenging because symptoms go unnoticed in the initial stages. Computed tomography (CT) imaging aids specialists in detecting various kidney conditions. This research classifies CT images of cysts, normal kidneys, stones, and tumors using hyperparameter-fine-tuned convolutional neural networks (CNNs) and the VGG16, ResNet50, CNNAlexnet, and InceptionV3 transfer learning models. It introduces a methodology that integrates fine-tuned transfer learning, advanced image processing, and hyperparameter optimization to enhance the accuracy of kidney condition classification, aiming to improve diagnostic precision and reliability and, ultimately, patient outcomes. Feature maps are derived through data normalization and augmentation (zoom, rotation, shear, brightness adjustment, and horizontal/vertical flips), refined with watershed segmentation and Otsu's binarization thresholding, and then optimized and combined using the relief method. Wide neural network classifiers achieve the highest accuracy of 99.96% across models. This performance positions the proposed approach as a high-performance solution for automatic and accurate kidney CT image classification, addressing the pressing need for early kidney disease detection.
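One of the preprocessing steps named above, Otsu's binarization thresholding, picks the intensity cutoff that maximizes between-class variance in the grayscale histogram. A minimal pure-Python sketch on a toy bimodal pixel set (in practice one would use a library routine such as OpenCV's `cv2.threshold` with the Otsu flag):

```python
# Otsu's method: scan all thresholds t and keep the one maximizing
# between-class variance w_bg * w_fg * (mean_bg - mean_fg)^2.
def otsu_threshold(pixels):
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * hist[i] for i in range(256))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0.0
    for t in range(256):
        w_bg += hist[t]          # background weight: pixels <= t
        if w_bg == 0:
            continue
        w_fg = total - w_bg      # foreground weight: pixels > t
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Toy image: a dark background cluster and a bright foreground cluster.
pixels = [10, 12, 11, 13, 10] * 20 + [200, 210, 205, 198, 202] * 20
t = otsu_threshold(pixels)
print(10 < t < 200)  # True: the threshold falls between the two clusters
```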
PMID:40155680 | DOI:10.1038/s41598-025-94905-2
An optimized deep learning based hybrid model for prediction of daily average global solar irradiance using CNN SLSTM architecture
Sci Rep. 2025 Mar 28;15(1):10761. doi: 10.1038/s41598-025-95118-3.
ABSTRACT
Global horizontal irradiance prediction is essential for balancing supply and demand and minimizing energy costs when integrating solar photovoltaic systems into the electric power grid. However, its stochastic nature makes accurate prediction difficult. This study develops a hybrid deep learning model that integrates a convolutional neural network and stacked long short-term memory (CNN-SLSTM) to predict daily average global solar irradiance from real-time meteorological parameters and daily solar irradiance data recorded at the study site. First, 14 significant features were selected from the dataset using recursive feature elimination. The hyperparameters of the developed models were optimized with a metaheuristic Slime Mould Optimization algorithm, and model performance was evaluated using tenfold cross-validation. Using statistical performance metrics, the predictive performance of the developed model was compared with Gated Recurrent Unit, LSTM, CNN-LSTM, and SLSTM networks and with machine learning regressors such as Support Vector Machine, Decision Tree, and Random Forest. In the experiments, the CNN-SLSTM model outperformed the other models with an MSE, R2, and adjusted R2 of 0.0359, 0.9790, and 0.9789, respectively.
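The three reported metrics (MSE, R2, and adjusted R2, where the adjustment penalizes the number of predictors k) can be written out directly. The sample values below are made up for illustration; the study's k would correspond to its 14 selected features.

```python
# Evaluation metrics used above: MSE, R^2, and adjusted R^2.
def mse(y, yhat):
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)

def r2(y, yhat):
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot

def adj_r2(y, yhat, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1), k predictors."""
    n = len(y)
    return 1 - (1 - r2(y, yhat)) * (n - 1) / (n - k - 1)

y = [5.1, 6.3, 4.8, 7.0, 5.9, 6.5]     # hypothetical observed irradiance
yhat = [5.0, 6.1, 5.0, 6.8, 6.0, 6.4]  # hypothetical model predictions
print(round(mse(y, yhat), 4))          # 0.025
print(r2(y, yhat) > 0.9)               # True
```

Note that adjusted R2 is always at or below R2 for k >= 1, which is why the study reports both (0.9790 vs. 0.9789).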
PMID:40155655 | DOI:10.1038/s41598-025-95118-3
UrbanEV: An Open Benchmark Dataset for Urban Electric Vehicle Charging Demand Prediction
Sci Data. 2025 Mar 28;12(1):523. doi: 10.1038/s41597-025-04874-4.
ABSTRACT
The recent surge in electric vehicles (EVs), driven by a collective push to enhance global environmental sustainability, has underscored the significance of exploring EV charging prediction. To catalyze further research in this domain, we introduce UrbanEV - an open dataset showcasing EV charging space availability and electricity consumption in a pioneering city for vehicle electrification, namely Shenzhen, China. UrbanEV offers a rich repository of charging data (i.e., charging occupancy, duration, volume, and price) captured at hourly intervals across an extensive six-month span for over 20,000 individual charging stations. Beyond these core attributes, the dataset also encompasses diverse influencing factors like weather conditions and spatial proximity. Comprehensive experiments have been conducted to showcase the predictive capabilities of various models, including statistical, deep learning, and transformer-based approaches, using the UrbanEV dataset. This dataset is poised to propel advancements in EV charging prediction and management, positioning itself as a benchmark resource within this burgeoning field.
PMID:40155635 | DOI:10.1038/s41597-025-04874-4