Deep learning
Enhancing the visual environment of urban coastal roads through deep learning analysis of street-view images: A perspective of aesthetic and distinctiveness
PLoS One. 2025 Jan 14;20(1):e0317585. doi: 10.1371/journal.pone.0317585. eCollection 2025.
ABSTRACT
Urban waterfront areas, which are essential natural resources and highly visible public spaces in cities, play a crucial role in enhancing the urban environment. This study integrates deep learning with human perception data sourced from street-view images to study the relationship between visual landscape features and human perception of urban waterfront areas, employing linear regression and random forest models to predict human perception along urban coastal roads. Based on aesthetic and distinctiveness perception, urban coastal roads in Xiamen were classified into four types with different emphases and priorities for improvement. The results showed that: 1) the degree of coastal openness had the greatest influence on human perception, while coastal landscapes with a high green visual index decreased distinctiveness perception; 2) the random forest model effectively predicted human perception on urban coastal roads, with accuracy rates of 87% and 77%; 3) the proportion of low-perception road sections with potential for improvement was 60.6%, among which road sections with both low aesthetic and low distinctiveness perception accounted for 10.5%. These findings offer crucial evidence regarding human perception of urban coastal roads and can inform targeted recommendations for enhancing the visual environment of urban coastal road landscapes.
PMID:39808675 | DOI:10.1371/journal.pone.0317585
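The modeling step above can be illustrated with a small sketch: fit a random forest to visual indices extracted from street-view segmentation and inspect feature importances. The feature names, synthetic data, and the assumed relationship (openness dominant, greenery mildly negative) are illustrative stand-ins, not the study's data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 200
# Illustrative visual indices (fractions of pixels per class):
openness = rng.uniform(0, 1, n)     # sky/sea visibility ratio
green_view = rng.uniform(0, 1, n)   # vegetation pixel ratio
building = rng.uniform(0, 1, n)     # building pixel ratio
X = np.column_stack([openness, green_view, building])
# Assumed relationship: openness raises, heavy greenery lowers perception.
y = 0.7 * openness - 0.3 * green_view + rng.normal(0, 0.05, n)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
importances = model.feature_importances_  # sums to 1; openness dominates here
```

On data like this, the importance ranking mirrors the paper's finding that coastal openness is the strongest driver of perception.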
Metastatic Lung Lesion Changes in Follow-up Chest CT: The Advantage of Deep Learning Simultaneous Analysis of Prior and Current Scans With SimU-Net
J Thorac Imaging. 2024 Sep 20. doi: 10.1097/RTI.0000000000000808. Online ahead of print.
ABSTRACT
PURPOSE: Radiological follow-up of oncology patients requires the detection of metastatic lung lesions and the quantitative analysis of their changes in longitudinal imaging studies. Our aim was to evaluate SimU-Net, a novel deep learning method for the automatic analysis of metastatic lung lesions and their temporal changes in pairs of chest CT scans.
MATERIALS AND METHODS: SimU-Net is a simultaneous multichannel 3D U-Net model trained on pairs of registered prior and current scans of a patient. It is part of a fully automatic pipeline for the detection, segmentation, matching, and classification of metastatic lung lesions in longitudinal chest CT scans. A dataset of 5040 metastatic lung lesions in 344 pairs of 208 prior and current chest CT scans from 79 patients was used for training/validation (173 scans, 65 patients) and testing (35 scans, 14 patients) of a standalone 3D U-Net model and 3 simultaneous SimU-Net models. Outcome measures were lesion detection precision and recall, segmentation Dice score and average symmetric surface distance (ASSD), lesion matching, and classification of lesion changes, comparing computed results against manual ground-truth annotations by an expert radiologist.
RESULTS: SimU-Net achieved a mean lesion detection recall and precision of 0.93±0.13 and 0.79±0.24 and a mean lesion segmentation Dice and ASSD of 0.84±0.09 and 0.33±0.22 mm. These results outperformed the standalone 3D U-Net model by 9.4% in recall, 2.4% in Dice, and 15.4% in ASSD, with a minor 3.6% decrease in precision. The SimU-Net pipeline achieved perfect precision and recall (1.0±0.0) for lesion matching and classification of lesion changes.
CONCLUSIONS: Simultaneous deep learning analysis of metastatic lung lesions in prior and current chest CT scans with SimU-Net yields superior accuracy compared with individual analysis of each scan. Implementation of SimU-Net in the radiological workflow may enhance efficiency by automatically computing key metrics used to evaluate metastatic lung lesions and their temporal changes.
PMID:39808543 | DOI:10.1097/RTI.0000000000000808
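The Dice overlap reported as an outcome measure above is straightforward to compute on 3D binary masks; the sketch below is a didactic NumPy version on toy volumes, not the SimU-Net evaluation code.

```python
import numpy as np

def dice_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice = 2|A∩B| / (|A| + |B|) for binary volumes."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: perfect agreement by convention
    return 2.0 * np.logical_and(pred, truth).sum() / denom

# Toy volumes: two overlapping 4x4x4 cubes in a 10^3 grid.
a = np.zeros((10, 10, 10)); a[2:6, 2:6, 2:6] = 1
b = np.zeros((10, 10, 10)); b[3:7, 3:7, 3:7] = 1
dice = dice_score(a, b)  # 2*27 / (64 + 64) = 0.421875
```

ASSD additionally requires extracting mask surfaces and computing symmetric point-to-surface distances, which is omitted here for brevity.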
The Role of Artificial Intelligence in Predicting Optic Neuritis Subtypes From Ocular Fundus Photographs
J Neuroophthalmol. 2024 Dec 1;44(4):462-468. doi: 10.1097/WNO.0000000000002229. Epub 2024 Aug 1.
ABSTRACT
BACKGROUND: Optic neuritis (ON) is a complex clinical syndrome that has diverse etiologies and treatments based on its subtypes. Notably, ON associated with multiple sclerosis (MS ON) has a good prognosis for recovery irrespective of treatment, whereas ON associated with other conditions including neuromyelitis optica spectrum disorders or myelin oligodendrocyte glycoprotein antibody-associated disease is often associated with less favorable outcomes. Delay in treatment of these non-MS ON subtypes can lead to irreversible vision loss. It is important to distinguish MS ON from other ON subtypes early, to guide appropriate management. Yet, identifying ON and differentiating subtypes can be challenging as MRI and serological antibody test results are not always readily available in the acute setting. The purpose of this study is to develop a deep learning artificial intelligence (AI) algorithm to predict subtype based on fundus photographs, to aid the diagnostic evaluation of patients with suspected ON.
METHODS: This was a retrospective study of patients with ON seen at our institution between 2007 and 2022. A total of 1,599 fundus photographs were retrospectively collected from 321 patients classified into 2 groups: MS ON (262 patients; 1,114 photographs) and non-MS ON (59 patients; 485 photographs). The dataset was divided into training and holdout test sets with an 80%/20% ratio, using stratified sampling to ensure equal representation of MS ON and non-MS ON patients in both sets. Model hyperparameters were tuned using 5-fold cross-validation on the training dataset. The overall performance and generalizability of the model were subsequently evaluated on the holdout test set.
RESULTS: The receiver operating characteristic (ROC) curve for the developed model, evaluated on the holdout test dataset, yielded an area under the ROC curve of 0.83 (95% confidence interval [CI], 0.72-0.92). The model attained an accuracy of 76.2% (95% CI, 68.4-83.1), a sensitivity of 74.2% (95% CI, 55.9-87.4) and a specificity of 76.9% (95% CI, 67.6-85.0) in classifying images as non-MS-related ON.
CONCLUSIONS: This study provides preliminary evidence supporting a role for AI in differentiating non-MS ON subtypes from MS ON. Future work will aim to increase the size of the dataset and explore the role of combining clinical and paraclinical measures to refine deep learning models over time.
PMID:39808513 | DOI:10.1097/WNO.0000000000002229
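The 80%/20% stratified holdout described above can be sketched with scikit-learn; the feature matrix here is a synthetic stand-in, with class counts matching the study's 262 MS ON and 59 non-MS ON patients.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(321, 8))          # placeholder: one feature row per patient
y = np.array([0] * 262 + [1] * 59)     # 0 = MS ON, 1 = non-MS ON

# stratify=y keeps the MS/non-MS ratio (roughly) equal in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
```

In practice the split would be done at the patient level before assigning photographs, so that no patient contributes images to both sets.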
Characterization of adrenal glands on computed tomography with a 3D V-Net-based model
Insights Imaging. 2025 Jan 14;16(1):17. doi: 10.1186/s13244-025-01898-7.
ABSTRACT
OBJECTIVES: To evaluate the performance of a 3D V-Net-based segmentation model of adrenal lesions in characterizing adrenal glands as normal or abnormal.
METHODS: A total of 1086 CT image series with focal adrenal lesions were retrospectively collected, annotated, and used to train the adrenal lesion segmentation model. The Dice similarity coefficient (DSC) on the test set was used to evaluate segmentation performance. A second cohort, consisting of 959 patients with pathologically confirmed adrenal lesions (external validation dataset 1), was included to validate the classification performance of the model. Another consecutive cohort of patients with a history of malignancy (N = 479) was then used for validation in the screening population (external validation dataset 2). Sensitivity, accuracy, and related parameters were computed, and the model's performance was compared with the radiology reports in these validation settings.
RESULTS: The DSC on the test set of the segmentation model was 0.900 (median; interquartile range, 0.810-0.965). The model showed sensitivities and accuracies of 99.7% and 98.3% in external validation dataset 1 and 87.2% and 62.2% in external validation dataset 2, respectively. It showed no significant difference compared with the radiology reports in external validation dataset 1 and in the lesion-containing groups of external validation dataset 2 (p = 1.000 and p > 0.05, respectively).
CONCLUSION: The 3D V-Net-based segmentation model of adrenal lesions can be used for the binary classification of adrenal glands.
CRITICAL RELEVANCE STATEMENT: A 3D V-Net-based segmentation model of adrenal lesions can be used to detect adrenal gland abnormalities, with high accuracy in the presurgical setting and high sensitivity in the screening setting.
KEY POINTS: Adrenal lesions may be prone to inter-observer variability in routine diagnostic workflows. This study developed a 3D V-Net-based segmentation model of adrenal lesions with a DSC of 0.900 on the test set. The model showed high sensitivity and accuracy for abnormality detection in different settings.
PMID:39808346 | DOI:10.1186/s13244-025-01898-7
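The step from a segmentation output to the binary normal/abnormal call described above can be sketched as a volume threshold on the predicted lesion mask. The threshold value and mask shapes below are assumed example values, not the study's rule.

```python
import numpy as np

def classify_gland(lesion_mask: np.ndarray, voxel_mm3: float,
                   min_volume_mm3: float = 50.0) -> bool:
    """Return True (abnormal) if the segmented lesion volume passes the threshold."""
    volume_mm3 = lesion_mask.astype(bool).sum() * voxel_mm3
    return bool(volume_mm3 >= min_volume_mm3)

clean = np.zeros((32, 32, 32), dtype=np.uint8)   # empty prediction: normal gland
lesion = clean.copy()
lesion[10:16, 10:16, 10:16] = 1                  # 6^3 = 216 voxels flagged
```

Tuning the minimum-volume threshold trades sensitivity (screening) against accuracy (presurgical characterization), which matches the two validation settings reported.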
VirDetect-AI: a residual and convolutional neural network-based metagenomic tool for eukaryotic viral protein identification
Brief Bioinform. 2024 Nov 22;26(1):bbaf001. doi: 10.1093/bib/bbaf001.
ABSTRACT
This study addresses the challenging task of identifying viruses within metagenomic data, which encompass a broad array of biological samples, including animal reservoirs, environmental sources, and the human body. Traditional methods for virus identification often face limitations due to the diversity and rapid evolution of viral genomes. In response, recent efforts have focused on leveraging artificial intelligence (AI) techniques to enhance accuracy and efficiency in virus detection. However, existing AI-based approaches are primarily binary classifiers, lacking specificity in identifying viral types and relying on nucleotide sequences. To address these limitations, we introduce VirDetect-AI, a novel tool specifically designed for the identification of eukaryotic viruses within metagenomic datasets. The VirDetect-AI model employs a combination of convolutional neural networks and residual neural networks to effectively extract hierarchical features and detailed patterns from complex amino acid data. The model achieved outstanding results on all metrics, with a sensitivity of 0.97, a precision of 0.98, and an F1-score of 0.98. VirDetect-AI improves our comprehension of viral ecology and can accurately classify metagenomic sequences into 980 viral protein classes, enabling the identification of new viruses. These classes encompass an extensive array of viral genera and families, as well as protein functions and hosts.
PMID:39808116 | DOI:10.1093/bib/bbaf001
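A CNN/ResNet classifier over amino-acid sequences, as described above, needs the protein encoded as a numeric tensor first. The sketch below shows one common choice, one-hot encoding over the 20 standard residues with zero-padding; the alphabet, maximum length, and padding scheme are assumptions, not VirDetect-AI's actual preprocessing.

```python
import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids
IDX = {aa: i for i, aa in enumerate(ALPHABET)}

def encode(seq: str, max_len: int = 50) -> np.ndarray:
    """Return a (max_len, 20) one-hot matrix, zero-padded or truncated."""
    out = np.zeros((max_len, len(ALPHABET)), dtype=np.float32)
    for i, aa in enumerate(seq[:max_len]):
        if aa in IDX:               # unknown residues stay all-zero
            out[i, IDX[aa]] = 1.0
    return out

x = encode("MKTAYIAKQR")  # hypothetical 10-residue fragment
```

Convolutions over the first axis of this matrix then learn motif-like local patterns, with residual blocks stacking them into the hierarchical features the abstract mentions.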
Deep Learning to Simulate Contrast-Enhanced MRI for Evaluating Suspected Prostate Cancer
Radiology. 2025 Jan;314(1):e240238. doi: 10.1148/radiol.240238.
ABSTRACT
Background Multiparametric MRI, including contrast-enhanced sequences, is recommended for evaluating suspected prostate cancer, but concerns have been raised regarding potential contrast agent accumulation and toxicity. Purpose To evaluate the feasibility of generating simulated contrast-enhanced MRI from noncontrast MRI sequences using deep learning and to explore its potential value for assessing clinically significant prostate cancer using Prostate Imaging Reporting and Data System (PI-RADS) version 2.1. Materials and Methods Male patients with suspected prostate cancer who underwent multiparametric MRI were retrospectively included from three centers from April 2020 to April 2023. A deep learning model (pix2pix algorithm) was trained to synthesize contrast-enhanced MRI scans from four noncontrast MRI sequences (T1-weighted imaging, T2-weighted imaging, diffusion-weighted imaging, and apparent diffusion coefficient maps) and then tested on an internal and two external datasets. The reference standard for model training was the second postcontrast phase of the dynamic contrast-enhanced sequence. Similarity between simulated and acquired contrast-enhanced images was evaluated using the multiscale structural similarity index. Three radiologists independently scored T2-weighted and diffusion-weighted MRI with either simulated or acquired contrast-enhanced images using PI-RADS, version 2.1; agreement was assessed with Cohen κ. Results A total of 567 male patients (mean age, 66 years ± 11 [SD]) were divided into a training set (n = 244), internal test set (n = 104), external test set 1 (n = 143), and external test set 2 (n = 76). Simulated and acquired contrast-enhanced images demonstrated high similarity (multiscale structural similarity index: 0.82, 0.71, and 0.69 for internal test set, external test set 1, and external test set 2, respectively) with excellent reader agreement of PI-RADS scores (Cohen κ, 0.96; 95% CI: 0.94, 0.98).
When simulated contrast-enhanced imaging was added to biparametric MRI, 34 of 323 (10.5%) patients were upgraded to PI-RADS 4 from PI-RADS 3. Conclusion It was feasible to generate simulated contrast-enhanced prostate MRI using deep learning. The simulated and acquired contrast-enhanced MRI scans exhibited high similarity and demonstrated excellent agreement in assessing clinically significant prostate cancer based on PI-RADS, version 2.1. © RSNA, 2025 Supplemental material is available for this article. See also the editorial by Neji and Goh in this issue.
PMID:39807983 | DOI:10.1148/radiol.240238
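A model that maps four noncontrast sequences to one simulated contrast-enhanced image, as above, consumes the sequences as channels of a single input array. The sketch below shows that assembly with a per-sequence z-score normalization; the normalization choice, array sizes, and random stand-in data are assumptions, not the study's pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for one registered slice of each noncontrast sequence:
t1, t2, dwi, adc = (rng.normal(100.0, 20.0, (256, 256)).astype(np.float32)
                    for _ in range(4))

def zscore(img: np.ndarray) -> np.ndarray:
    """Per-sequence intensity normalization (a common, assumed choice)."""
    return (img - img.mean()) / (img.std() + 1e-8)

# Channel-stacked input for an image-to-image generator such as pix2pix:
x = np.stack([zscore(s) for s in (t1, t2, dwi, adc)], axis=0)  # (4, H, W)
```

The generator then learns a mapping from this 4-channel input to the single-channel postcontrast target defined by the reference standard.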
Erratum: Volumetric Breast Density Estimation From Three-Dimensional Reconstructed Digital Breast Tomosynthesis Images Using Deep Learning
JCO Clin Cancer Inform. 2025 Jan;9:e2400325. doi: 10.1200/CCI-24-00325. Epub 2025 Jan 14.
NO ABSTRACT
PMID:39807853 | DOI:10.1200/CCI-24-00325
Sleep stages classification based on feature extraction from music of brain
Heliyon. 2024 Dec 12;11(1):e41147. doi: 10.1016/j.heliyon.2024.e41147. eCollection 2025 Jan 15.
ABSTRACT
Sleep stage classification is one of the essential factors in sleep disorder diagnosis, which can contribute to the treatment of many functional diseases or prevent primary cognitive risks in daily activities. In this study, a novel method of mapping EEG signals to music is proposed to classify sleep stages. A total of 4,752 selected 1-min sleep records extracted from the CAP Sleep database were used as the statistical population for this assessment. In this process, the tempo and scale parameters are first extracted from the signal according to the rules of music; these are then applied, together with changes to the dominant frequency of the pre-processed single-channel EEG signal, to produce a sequence of musical notes. A total of 19 features are extracted from the note sequence and fed into feature reduction algorithms; the selected features are applied to a two-stage classification structure: 1) classification of 5 classes (merging S1 and REM; S2; S3; S4; W) with an accuracy of 89.5% (CAP Sleep database), 85.9% (Sleep-EDF database), and 86.5% (Sleep-EDF Expanded database), and 2) classification of 2 classes (S1 vs. REM) with an accuracy of 90.1% (CAP Sleep database), 88.9% (Sleep-EDF database), and 90.1% (Sleep-EDF Expanded database). The overall percentages of correct classification for the 6 sleep stages are 88.13%, 84.3%, and 86.1% for those databases, respectively. A further objective of this study is to present a new single-channel EEG sonification method. The classification accuracy obtained is higher than or comparable to that of contemporary methods, demonstrating the efficiency of the proposed approach.
PMID:39807512 | PMC:PMC11728888 | DOI:10.1016/j.heliyon.2024.e41147
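One elementary step in any EEG-to-music mapping like the one above is converting a frequency to the nearest musical note. The sketch below uses the standard MIDI convention; the choice of a fixed ×32 scaling factor to lift an alpha-band peak into the audible range is an assumption for illustration, not the paper's mapping rule.

```python
import math

def freq_to_midi(freq_hz: float) -> int:
    """Nearest MIDI note number for a frequency (A4 = 440 Hz = note 69)."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))

# e.g. a 10 Hz alpha-band peak scaled by an assumed factor of 32:
note = freq_to_midi(10 * 32)  # 320 Hz -> MIDI note 63 (D#4)
```

Repeating this over successive windows of the dominant-frequency trace yields a note sequence, from which tempo-, scale-, and interval-type features can then be extracted.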
AxonFinder: Automated segmentation of tumor innervating neuronal fibers
Heliyon. 2024 Dec 15;11(1):e41209. doi: 10.1016/j.heliyon.2024.e41209. eCollection 2025 Jan 15.
ABSTRACT
Neurosignaling is increasingly recognized as a critical factor in cancer progression, where neuronal innervation of primary tumors contributes to the disease's advancement. This study focuses on segmenting individual axons within the prostate tumor microenvironment, which have been challenging to detect and analyze due to their irregular morphologies. We present AxonFinder, a novel deep learning-based approach for the automated segmentation of axons, leveraging a U-Net model with a ResNet-101 encoder and based on a multiplexed imaging approach. Utilizing a dataset of whole-slide images from low-, intermediate-, and high-risk prostate cancer patients, we manually annotated axons to train our model, achieving significant accuracy in detecting axonal structures that were previously hard to segment. Our method achieves high performance, with a validation F1-score of 94% and an IoU of 90.78%. In addition, morphometric analysis shows strong alignment between manual annotations and automated segmentation, with nerve length and tortuosity closely matching manual measurements. Furthermore, our analysis includes a comprehensive assessment of axon density and morphological features across different CAPRA-S prostate cancer risk categories, revealing a significant decline in axon density with higher CAPRA-S risk scores. Our findings suggest the potential utility of neuronal markers in the prognostic assessment of prostate cancer, aiding the pathologist's assessment of tumor sections and advancing our understanding of neurosignaling in the tumor microenvironment.
PMID:39807499 | PMC:PMC11728976 | DOI:10.1016/j.heliyon.2024.e41209
An empirical study of LLaMA3 quantization: from LLMs to MLLMs
Vis Intell. 2024;2(1):36. doi: 10.1007/s44267-024-00070-x. Epub 2024 Dec 30.
ABSTRACT
The LLaMA family, a collection of foundation language models ranging from 7B to 65B parameters, has become one of the most powerful open-source large language models (LLMs) and a popular LLM backbone for multi-modal large language models (MLLMs), widely used in computer vision and natural language understanding tasks. In particular, LLaMA3 models have recently been released and have achieved impressive performance across domains with super-large-scale pre-training on over 15T tokens of data. Given the wide application of low-bit quantization for LLMs in resource-constrained scenarios, we explore LLaMA3's capabilities when quantized to low bit-widths. This exploration can provide new insights and challenges for the low-bit quantization of LLaMA3 and future LLMs, especially in addressing the performance degradation encountered in LLM compression. Specifically, we comprehensively evaluate 10 existing post-training quantization and LoRA fine-tuning (LoRA-FT) methods on LLaMA3 at 1-8 bits and on various datasets to reveal its low-bit quantization performance. To uncover the capabilities of low-bit quantized MLLMs, we assessed the performance of the LLaMA3-based LLaVA-Next-8B model under 2-4 ultra-low bit-widths with post-training quantization methods. Our experimental results indicate that LLaMA3 still suffers from non-negligible degradation in linguistic and visual contexts, particularly under ultra-low bit-widths. This highlights the significant performance gap at low bit-widths that needs to be addressed in future developments. We expect this empirical study to prove valuable in advancing future models, driving LLMs and MLLMs toward higher accuracy at lower bit-widths for greater practicality.
PMID:39807379 | PMC:PMC11728678 | DOI:10.1007/s44267-024-00070-x
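The baseline that the surveyed post-training methods build on is round-to-nearest (RTN) weight quantization. The sketch below implements a symmetric absmax RTN quantizer in NumPy as a didactic illustration; it is not any specific method from the study, and real quantizers work per-group or per-channel rather than per-tensor.

```python
import numpy as np

def quantize_rtn(w: np.ndarray, bits: int):
    """Quantize weights to signed b-bit integers; return (q, scale)."""
    qmax = 2 ** (bits - 1) - 1                  # e.g. 7 for 4-bit
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=(64, 64)).astype(np.float32)
q, s = quantize_rtn(w, bits=4)
err = np.abs(dequantize(q, s) - w).max()  # grows as bit-width shrinks
```

Repeating this at 8, 4, 3, and 2 bits makes the abstract's point concrete: the reconstruction error, and hence the model degradation, increases sharply at ultra-low bit-widths.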
Advances in modeling cellular state dynamics: integrating omics data and predictive techniques
Anim Cells Syst (Seoul). 2025 Jan 10;29(1):72-83. doi: 10.1080/19768354.2024.2449518. eCollection 2025.
ABSTRACT
Dynamic modeling of cellular states has emerged as a pivotal approach for understanding complex biological processes such as cell differentiation, disease progression, and tissue development. This review provides a comprehensive overview of current approaches for modeling cellular state dynamics, focusing on techniques ranging from dynamic or static biomolecular network models to deep learning models. We highlight how these approaches, integrated with various omics data such as transcriptomics and single-cell RNA sequencing, can be used to capture and predict cellular behavior and transitions. We also discuss applications of these modeling approaches in predicting gene knockout effects, designing targeted interventions, and simulating organ development. This review emphasizes the importance of selecting appropriate modeling strategies based on scalability and resolution requirements, which vary according to the complexity and size of the biological systems under study. By evaluating the strengths, limitations, and recent advancements of these methodologies, we aim to guide future research in developing more robust and interpretable models for understanding and manipulating cellular state dynamics in various biological contexts, ultimately advancing therapeutic strategies and precision medicine.
PMID:39807350 | PMC:PMC11727055 | DOI:10.1080/19768354.2024.2449518
Assessment of the Accuracy of a Deep Learning Algorithm- and Video-based Motion Capture System in Estimating Snatch Kinematics
Int J Exerc Sci. 2024 Dec 1;17(1):1629-1647. doi: 10.70252/PRVV4165. eCollection 2024.
ABSTRACT
In weightlifting, quantitative kinematic analysis is essential for evaluating snatch performance. While marker-based (MB) approaches are commonly used, they are impractical for training or competitions. Markerless video-based (VB) systems utilizing deep learning-based pose estimation algorithms could address this issue. This study assessed the comparability and applicability of VB systems for obtaining snatch kinematics by comparing their outcomes to an MB reference system. Twenty-one weightlifters (15 male, 6 female) performed 2-3 snatches at 65%, 75%, and 80% of their one-repetition maximum. Snatch kinematics were analyzed using an MB (Vicon Nexus) and a VB (Contemplas with Theia3D) system. Analysis of 131 trials revealed that corresponding lower-limb joint center positions of the two systems differed on average by 4.7 ± 1.2 cm, and upper-limb joint centers by 5.7 ± 1.5 cm. VB and MB lower-limb joint angles showed the highest agreement in the frontal plane (root mean square difference (RMSD): 11.2 ± 5.9°), followed by the sagittal plane (RMSD: 13.6 ± 4.7°). Statistical Parametric Mapping analysis revealed significant differences throughout most of the movement for all degrees of freedom. Maximum extension angles and velocities during the second pull displayed significant differences (p < .05) for the lower limbs. Our data showed significant differences in estimated kinematics between the two systems, indicating a lack of comparability. These differences are likely due to differing models and assumptions rather than measurement accuracy. However, given the rapid advancement of neural network-based approaches, VB systems hold promise as a suitable alternative to MB systems in weightlifting analysis.
PMID:39807293 | PMC:PMC11728585 | DOI:10.70252/PRVV4165
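The RMSD statistic used above to compare marker-based and video-based joint-angle curves is a one-liner; the sketch below applies it to synthetic angle traces (a sine-shaped "knee angle" with a constant 5° offset between systems), which are illustrative stand-ins for the measured data.

```python
import numpy as np

def rmsd(a: np.ndarray, b: np.ndarray) -> float:
    """Root mean square difference between two time-normalized angle curves."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

t = np.linspace(0.0, 1.0, 101)                      # 0-100% of the lift
angle_mb = 90.0 + 40.0 * np.sin(2.0 * np.pi * t)    # synthetic "marker-based" angle
angle_vb = angle_mb + 5.0                           # video-based curve, 5 deg offset
```

A constant offset yields an RMSD equal to that offset; Statistical Parametric Mapping, by contrast, tests for differences pointwise along the curve rather than with a single summary number.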
Glomerular and Nephron Size and Kidney Disease Outcomes: A Comparison of Manual Versus Deep Learning Methods in Kidney Pathology
Kidney Med. 2024 Nov 28;7(1):100939. doi: 10.1016/j.xkme.2024.100939. eCollection 2025 Jan.
NO ABSTRACT
PMID:39807248 | PMC:PMC11728938 | DOI:10.1016/j.xkme.2024.100939
Deep learning radiomics analysis for prediction of survival in patients with unresectable gastric cancer receiving immunotherapy
Eur J Radiol Open. 2024 Dec 19;14:100626. doi: 10.1016/j.ejro.2024.100626. eCollection 2025 Jun.
ABSTRACT
OBJECTIVE: Immunotherapy has become an option for the first-line therapy of advanced gastric cancer (GC), with improved survival. Our study aimed to investigate unresectable GC from an imaging perspective combined with clinicopathological variables to identify patients who were most likely to benefit from immunotherapy.
METHOD: Patients with unresectable GC who were consecutively treated with immunotherapy at two different medical centers of Chinese PLA General Hospital were included and divided into the training and validation cohorts, respectively. A deep learning neural network, using a multimodal ensemble approach based on CT imaging data before immunotherapy, was trained in the training cohort to predict survival, and an internal validation cohort was constructed to select the optimal ensemble model. Data from another cohort were used for external validation. The area under the receiver operating characteristic curve was analyzed to evaluate performance in predicting survival. Detailed clinicopathological data and peripheral blood prior to immunotherapy were collected for each patient. Univariate and multivariable logistic regression analysis of imaging models and clinicopathological variables was also applied to identify the independent predictors of survival. A nomogram based on multivariable logistic regression was constructed.
RESULT: A total of 79 GC patients in the training cohort and 97 patients in the external validation cohort were enrolled in this study. A multi-model ensemble approach was applied to train a model to predict the 1-year survival of GC patients. Compared to individual models, the ensemble model showed improved performance metrics in both the internal and external validation cohorts. There was a significant difference in overall survival (OS) among patients stratified by the imaging model based on the optimal cutoff score of 0.5 (HR = 0.20, 95 % CI: 0.10-0.37, P < 0.001). Multivariate Cox regression analysis revealed that the imaging model, PD-L1 expression, and the lung immune prognostic index were independent prognostic factors for OS. We combined these variables and built a nomogram. Calibration curves were plotted, and the C-index of the nomogram was 0.85 and 0.78 in the training and validation cohorts, respectively.
CONCLUSION: The deep learning model in combination with several clinical factors showed predictive value for survival in patients with unresectable GC receiving immunotherapy.
PMID:39807092 | PMC:PMC11728962 | DOI:10.1016/j.ejro.2024.100626
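The multi-model ensemble step described above can be sketched as averaging the per-model survival probabilities and applying the study's 0.5 cutoff. The three base "models" below are stand-in probability vectors, not the study's networks.

```python
import numpy as np

def ensemble_predict(prob_lists, threshold: float = 0.5):
    """Average per-model probabilities, then binarize at the cutoff."""
    probs = np.mean(prob_lists, axis=0)
    return probs, (probs >= threshold).astype(int)

# Hypothetical 1-year survival probabilities from three base models
# for the same three patients:
p_model_a = np.array([0.9, 0.2, 0.6])
p_model_b = np.array([0.8, 0.4, 0.4])
p_model_c = np.array([0.7, 0.3, 0.2])
probs, labels = ensemble_predict([p_model_a, p_model_b, p_model_c])
```

Averaging tends to cancel the idiosyncratic errors of individual models, which is consistent with the reported improvement of the ensemble over its components.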
A semi-supervised deep neuro-fuzzy iterative learning system for automatic segmentation of hippocampus brain MRI
Math Biosci Eng. 2024 Dec 11;21(12):7830-7853. doi: 10.3934/mbe.2024344.
ABSTRACT
The hippocampus is a small yet intricate, seahorse-shaped structure located deep within the brain's medial temporal lobe. It is a crucial component of the limbic system, which is responsible for regulating emotions, memory, and spatial navigation. This research focuses on automatic hippocampus segmentation from Magnetic Resonance (MR) images of the human head with high accuracy and low false positive and false negative rates. This segmentation technique is significantly faster than the manual segmentation methods used in clinics. Unlike existing approaches such as U-Net and Convolutional Neural Networks (CNNs), the proposed algorithm generates an image similar to a real image by learning the distribution much more quickly through the semi-supervised iterative learning algorithm of the Deep Neuro-Fuzzy (DNF) technique. To assess its effectiveness, the proposed segmentation technique was evaluated on a large dataset of 18,900 images from Kaggle, and the results were compared with those of existing methods. Based on the analysis reported in the experimental section, the proposed Semi-Supervised Deep Neuro-Fuzzy Iterative Learning System (SS-DNFIL) achieved a 0.97 Dice coefficient, a 0.93 Jaccard coefficient, a 0.95 sensitivity (true positive rate), a 0.97 specificity (true negative rate), a false positive rate of 0.09, and a false negative rate of 0.08 when compared to existing approaches. Thus, the proposed technique outperforms existing techniques and supports accurate diagnosis at the earliest possible stage.
PMID:39807055 | DOI:10.3934/mbe.2024344
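The evaluation statistics quoted above (sensitivity, specificity, Dice) all derive from the same confusion counts; the sketch below computes them from toy pixel counts, which are made-up values for illustration only.

```python
def seg_metrics(tp: int, fp: int, fn: int, tn: int):
    """Sensitivity, specificity, and Dice from pixel-level confusion counts."""
    sens = tp / (tp + fn)             # true positive rate
    spec = tn / (tn + fp)             # true negative rate
    dice = 2 * tp / (2 * tp + fp + fn)
    return sens, spec, dice

# Toy counts: 100 true hippocampus pixels, 900 background pixels.
sens, spec, dice = seg_metrics(tp=95, fp=9, fn=5, tn=891)
```

Note that specificity is dominated by the large background region in brain MR, so Dice is usually the more informative of the three for small structures like the hippocampus.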
Enhanced Pneumonia Detection in Chest X-Rays Using Hybrid Convolutional and Vision Transformer Networks
Curr Med Imaging. 2025 Jan 9. doi: 10.2174/0115734056326685250101113959. Online ahead of print.
ABSTRACT
OBJECTIVE: The objective of this research is to enhance pneumonia detection in chest X-rays by leveraging a novel hybrid deep learning model that combines Convolutional Neural Networks (CNNs) with modified Swin Transformer blocks. This study aims to significantly improve diagnostic accuracy, reduce misclassifications, and provide a robust, deployable solution for underdeveloped regions where access to conventional diagnostics and treatment is limited.
METHODS: The study developed a hybrid model architecture integrating CNNs with modified Swin Transformer blocks to work seamlessly within the same model. The CNN layers perform initial feature extraction, capturing local patterns within the images. At the same time, the modified Swin Transformer blocks handle long-range dependencies and global context through window-based self-attention mechanisms. Preprocessing steps included resizing images to 224x224 pixels and applying Contrast Limited Adaptive Histogram Equalization (CLAHE) to enhance image features. Data augmentation techniques, such as horizontal flipping, rotation, and zooming, were utilized to prevent overfitting and ensure model robustness. Hyperparameter optimization was conducted using Optuna, employing Bayesian optimization (Tree-structured Parzen Estimator) to fine-tune key parameters of both the CNN and Swin Transformer components, ensuring optimal model performance.
RESULTS: The proposed hybrid model was trained and validated on a dataset provided by the Guangzhou Women and Children's Medical Center. The model achieved an overall accuracy of 98.72% and a loss of 0.064 on an unseen dataset, significantly outperforming a baseline CNN model. Detailed performance metrics indicated a precision of 0.9738 for the normal class and 1.0000 for the pneumonia class, with an overall F1-score of 0.9872. The hybrid model consistently outperformed the CNN model across all performance metrics, demonstrating higher accuracy, precision, recall, and F1-score. Confusion matrices revealed high sensitivity and specificity with minimal misclassifications.
CONCLUSION: The proposed hybrid CNN-ViT model, which integrates modified Swin Transformer blocks within the CNN architecture, provides a significant advancement in pneumonia detection by effectively capturing both local and global features within chest X-ray images. The modifications to the Swin Transformer blocks enable them to work seamlessly with the CNN layers, enhancing the model's ability to understand complex visual patterns and dependencies. This results in superior classification performance. The lightweight design of the model eliminates the need for extensive hardware, facilitating easy deployment in resource-constrained settings. This innovative approach not only improves pneumonia diagnosis but also has the potential to enhance patient outcomes and support healthcare providers in underdeveloped regions. Future research will focus on further refining the model architecture, incorporating more advanced image processing techniques, and exploring explainable AI methods to provide deeper insights into the model's decision-making process.
PMID:39806960 | DOI:10.2174/0115734056326685250101113959
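The window-based self-attention mentioned above rests on partitioning a feature map into non-overlapping windows so attention is computed locally within each. The sketch below shows that partitioning step alone, in NumPy, for a 224x224 map with the 7x7 windows typical of Swin-style blocks; the channel count and data are illustrative.

```python
import numpy as np

def window_partition(x: np.ndarray, win: int) -> np.ndarray:
    """(H, W, C) feature map -> (num_windows, win*win, C) token groups."""
    H, W, C = x.shape
    x = x.reshape(H // win, win, W // win, win, C)
    x = x.transpose(0, 2, 1, 3, 4)        # group by (window row, window col)
    return x.reshape(-1, win * win, C)

feat = np.arange(224 * 224 * 3).reshape(224, 224, 3).astype(np.float32)
windows = window_partition(feat, win=7)   # 32 x 32 = 1024 windows of 49 tokens
```

Self-attention is then applied independently to each group of 49 tokens, and shifted windows in alternating blocks let information flow across window boundaries.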
Comparing prediction accuracy for 30-day readmission following primary total knee arthroplasty: the ACS-NSQIP risk calculator versus a novel artificial neural network model
Knee Surg Relat Res. 2025 Jan 13;37(1):3. doi: 10.1186/s43019-024-00256-z.
ABSTRACT
BACKGROUND: Unplanned readmission, a measure of surgical quality, occurs after 4.8% of primary total knee arthroplasties (TKA). Although the prediction of individualized readmission risk may inform appropriate preoperative interventions, current predictive models, such as the American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) surgical risk calculator (SRC), have limited utility. This study aims to compare the predictive accuracy of the SRC with a novel artificial neural network (ANN) algorithm for 30-day readmission after primary TKA, using the same set of clinical variables from a large national database.
METHODS: Patients undergoing primary TKA between 2013 and 2020 were identified from the ACS-NSQIP database and randomly stratified into training and validation cohorts. The ANN was developed using data from the training cohort with fivefold cross-validation performed five times. ANN and SRC performance were subsequently evaluated in the distinct validation cohort, and predictive performance was compared on the basis of discrimination, calibration, accuracy, and clinical utility.
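The validation scheme described here (fivefold cross-validation performed five times) can be sketched with scikit-learn's RepeatedStratifiedKFold; the synthetic data, network size, and scoring below are illustrative assumptions, not the study's actual variables or architecture.

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier

# Illustrative binary-outcome data standing in for the NSQIP variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

# Fivefold cross-validation repeated five times, as in the abstract.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=5, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")  # 25 AUC values
```

Stratified splits preserve the (here, roughly balanced) outcome rate in every fold, which matters when the event rate is low, as with 3.1% readmissions.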
RESULTS: The overall cohort consisted of 365,394 patients (training: n = 362,559; validation: n = 2,835), with 11,392 (3.1%) readmitted within 30 days. While the ANN demonstrated good discrimination and calibration (area under the curve (AUC) = 0.72, slope = 1.32, intercept = -0.09) in the validation cohort, the SRC demonstrated poor discrimination (AUC = 0.55) and underestimated readmission risk (slope = -0.21, intercept = 0.04). Although both models possessed similar accuracy (Brier score: ANN = 0.03; SRC = 0.02), only the ANN demonstrated a higher net benefit than intervening in all or no patients on decision curve analysis. The strongest predictors of readmission were body mass index (> 33.5 kg/m2), age (> 69 years), and male sex.
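The two comparison metrics reported here can be computed directly: the Brier score via scikit-learn, and net benefit (the quantity plotted in decision curve analysis) from the confusion counts at a given risk threshold. The labels, probabilities, and threshold below are invented for illustration.

```python
import numpy as np
from sklearn.metrics import brier_score_loss

# Hypothetical outcomes and predicted readmission risks.
y_true = np.array([0, 0, 1, 0, 1, 0, 0, 0, 1, 0])
y_prob = np.array([0.1, 0.2, 0.7, 0.1, 0.6, 0.3, 0.2, 0.1, 0.8, 0.2])

# Brier score: mean squared error of the predicted probabilities.
brier = brier_score_loss(y_true, y_prob)

def net_benefit(y_true, y_prob, threshold):
    # Net benefit at a risk threshold: true positives minus false positives
    # weighted by the odds of the threshold, per patient.
    pred = y_prob >= threshold
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    n = len(y_true)
    return tp / n - fp / n * (threshold / (1 - threshold))

nb = net_benefit(y_true, y_prob, 0.3)
```

A model is clinically useful at a threshold when its net benefit exceeds both "intervene in all" and "intervene in none" (the latter is always zero), which is the comparison the abstract reports.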
CONCLUSIONS: This study demonstrates the superior predictive ability and potential clinical utility of the ANN over the conventional SRC when constrained to the same variables. By identifying the most important predictors of readmission following TKA, our findings may assist in the development of novel clinical decision support tools, potentially improving preoperative counseling and postoperative monitoring practices in at-risk patients.
PMID:39806502 | DOI:10.1186/s43019-024-00256-z
Effect of flipped classroom method on the reflection ability in nursing students in the professional ethics course; Solomon four-group design
BMC Med Educ. 2025 Jan 13;25(1):56. doi: 10.1186/s12909-024-06556-y.
ABSTRACT
BACKGROUND AND PURPOSE: The purpose of reflection in the learning process is to create meaningful and deep learning. Given the emphasis on active, student-centered methods and the necessity of learners' participation in the education process, the present study investigated the effect of the flipped classroom teaching method on the reflection ability of nursing students in the professional ethics course.
STUDY METHOD: This quasi-experimental study used Solomon's four-group design. The statistical population included all nursing students taking the professional ethics course at Kermanshah University of Medical Sciences. The study instrument was a 26-item questionnaire with acceptable validity and reliability. A sample of 80 nursing students was selected by simple random sampling and divided into four groups: experimental group 1, experimental group 2, control group 1, and control group 2. The collected data were analyzed in SPSS using descriptive statistics, two-way analysis of variance, and analysis of covariance.
FINDINGS: The four groups did not differ significantly in gender composition (p = 0.599). There was no significant difference between the control and experimental groups in any of the five reflection components in the pre-test. A significant difference in reflection was observed between the experimental and control groups.
CONCLUSION: Given that the professional ethics course involves controversial issues, the flipped classroom method can be effective in promoting students' deep learning.
PMID:39806386 | DOI:10.1186/s12909-024-06556-y
Optimizing hip MRI: enhancing image quality and elevating inter-observer consistency using deep learning-powered reconstruction
BMC Med Imaging. 2025 Jan 13;25(1):17. doi: 10.1186/s12880-025-01554-y.
ABSTRACT
BACKGROUND: Conventional hip joint MRI scans necessitate lengthy scan durations, posing challenges for patient comfort and clinical efficiency. Previously, accelerated imaging techniques were constrained by a trade-off between noise and resolution. Leveraging deep learning-based reconstruction (DLR) holds the potential to mitigate scan time without compromising image quality.
METHODS: We enrolled sixty patients who underwent DL-MRI, conventional MRI, and No-DL MRI examinations to evaluate image quality. Key metrics included scan duration, overall image quality, quantitative measurements of relative signal-to-noise ratio (rSNR) and relative contrast-to-noise ratio (rCNR), and diagnostic efficacy. Two experienced radiologists independently assessed image quality using a 5-point scale (5 indicating the highest quality). Interobserver agreement for the assessed pathologies across image sets was gauged with weighted kappa statistics, and the Wilcoxon signed-rank test was used to compare image quality and quantitative rSNR and rCNR measurements.
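The two statistical procedures named here are standard and available off the shelf: weighted kappa via scikit-learn's cohen_kappa_score and the Wilcoxon signed-rank test via SciPy. The reader scores below are fabricated for illustration; the study's actual ratings are not reproduced here.

```python
import numpy as np
from scipy.stats import wilcoxon
from sklearn.metrics import cohen_kappa_score

# Hypothetical 5-point quality ratings from two readers on the same series.
reader1 = np.array([5, 4, 4, 3, 5, 4, 2, 5, 3, 4])
reader2 = np.array([5, 4, 3, 3, 5, 4, 3, 5, 4, 4])

# Weighted kappa penalises larger disagreements more heavily than exact misses.
kappa = cohen_kappa_score(reader1, reader2, weights="linear")

# Paired (same-patient) comparison of mean quality scores between sequences.
dl_scores   = np.array([4.5, 4.0, 4.5, 5.0, 4.0, 4.5, 4.0, 4.5])
nodl_scores = np.array([3.5, 3.5, 4.0, 4.0, 3.5, 4.0, 3.5, 4.0])
stat, p = wilcoxon(dl_scores, nodl_scores)
```

The Wilcoxon test is the appropriate paired nonparametric choice here because the same patients are scored under each reconstruction, and ordinal quality scores should not be assumed normal.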
RESULTS: Scan time was reduced by approximately 66.5% with DL-MRI. DL-MRI consistently exhibited superior image quality in both coronal T2WI and axial T2WI compared with both conventional MRI (p < 0.01) and No-DL-MRI (p < 0.01). Interobserver agreement was robust, with kappa values exceeding 0.735. For rSNR, coronal fat-saturated (FS) T2WI and axial FS T2WI in DL-MRI consistently outperformed No-DL-MRI, with statistical significance (p < 0.01) in all cases. Similarly, rCNR showed significant improvements (p < 0.01) in coronal FS T2WI of DL-MRI compared with No-DL-MRI. Importantly, DL-MRI demonstrated diagnostic performance comparable to conventional MRI.
CONCLUSION: Integrating deep learning-based reconstruction methods into standard clinical workflows promises to accelerate image acquisition, enhance image clarity, and increase patient throughput, thereby optimizing diagnostic efficiency.
TRIAL REGISTRATION: Retrospectively registered.
PMID:39806303 | DOI:10.1186/s12880-025-01554-y
MDFGNN-SMMA: prediction of potential small molecule-miRNA associations based on multi-source data fusion and graph neural networks
BMC Bioinformatics. 2025 Jan 13;26(1):13. doi: 10.1186/s12859-025-06040-4.
ABSTRACT
BACKGROUND: MicroRNAs (miRNAs) are pivotal in the initiation and progression of complex human diseases and have been identified as targets for small molecule (SM) drugs. However, the expensive and time-intensive characteristics of conventional experimental techniques for identifying SM-miRNA associations highlight the necessity for efficient computational methodologies in this field.
RESULTS: In this study, we proposed a deep learning method called Multi-source Data Fusion and Graph Neural Networks for Small Molecule-MiRNA Association (MDFGNN-SMMA) to predict potential SM-miRNA associations. Firstly, MDFGNN-SMMA extracted features of Atom Pairs fingerprints and Molecular ACCess System fingerprints to derive fusion feature vectors for small molecules (SMs). The K-mer features were employed to generate the initial feature vectors for miRNAs. Secondly, cosine similarity measures were computed to construct the adjacency matrices for SMs and miRNAs, respectively. Thirdly, these feature vectors and adjacency matrices were input into a model comprising GAT and GraphSAGE, which were utilized to generate the final feature vectors for SMs and miRNAs. Finally, the averaged final feature vectors were utilized as input for a multilayer perceptron to predict the associations between SMs and miRNAs.
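Two building blocks of this pipeline, cosine-similarity adjacency matrices and neighborhood aggregation, can be sketched in plain NumPy. This is a simplified stand-in: the feature dimensions, similarity threshold, and mean aggregation are assumptions, and the paper's GAT layers are omitted (only a GraphSAGE-style mean aggregator is shown).

```python
import numpy as np

def cosine_adjacency(feats, thresh=0.5):
    # Cosine similarity between all node pairs, thresholded into an
    # unweighted adjacency matrix (self-similarity of 1 keeps self-loops).
    unit = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    return (unit @ unit.T >= thresh).astype(float)

def sage_mean_aggregate(feats, adj):
    # GraphSAGE-style step: average neighbour features and concatenate
    # them with each node's own features.
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    neigh = (adj @ feats) / deg
    return np.concatenate([feats, neigh], axis=1)

rng = np.random.default_rng(0)
sm_feats = rng.normal(size=(6, 8))     # stand-in for fused fingerprint vectors
mirna_feats = rng.normal(size=(5, 8))  # stand-in for k-mer feature vectors

sm_final = sage_mean_aggregate(sm_feats, cosine_adjacency(sm_feats))
mi_final = sage_mean_aggregate(mirna_feats, cosine_adjacency(mirna_feats))

# Averaged final vectors for one SM-miRNA pair, as fed to the MLP scorer.
pair_vec = (sm_final[0] + mi_final[0]) / 2
```

In the full model the aggregation is learned (GAT attention weights plus GraphSAGE), but the data flow (per-modality similarity graph, message passing, averaged pair vector into an MLP) follows the four steps the abstract lists.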
CONCLUSIONS: The performance of MDFGNN-SMMA was assessed using 10-fold cross-validation, demonstrating superior performance compared with four state-of-the-art models in terms of both AUC and AUPR. Moreover, results on an independent test set confirmed the model's generalization capability. Additionally, the efficacy of MDFGNN-SMMA was substantiated through three case studies: among the top 50 predicted miRNAs associated with Cisplatin, 5-Fluorouracil, and Doxorubicin, 42, 36, and 36 miRNAs, respectively, were corroborated by existing literature and the RNAInter database.
PMID:39806287 | DOI:10.1186/s12859-025-06040-4