Deep learning
Opening the deep learning box
Nat Neurosci. 2025 Apr 4. doi: 10.1038/s41593-025-01938-x. Online ahead of print.
NO ABSTRACT
PMID:40186074 | DOI:10.1038/s41593-025-01938-x
Fast and Robust Single-Shot Cine Cardiac MRI Using Deep Learning Super-Resolution Reconstruction
Invest Radiol. 2025 Apr 7. doi: 10.1097/RLI.0000000000001186. Online ahead of print.
ABSTRACT
OBJECTIVE: The aim of the study was to compare the diagnostic quality of deep learning (DL) reconstructed balanced steady-state free precession (bSSFP) single-shot (SSH) cine images with standard, multishot (also: segmented) bSSFP cine (standard cine) in cardiac MRI.
METHODS AND MATERIALS: This prospective study was performed in a cohort of participants with clinical indication for cardiac MRI. SSH compressed-sensing bSSFP cine and standard multishot cine were acquired with breath-holding and electrocardiogram-gating in short-axis view at 1.5 Tesla. SSH cine images were reconstructed using an industry-developed DL super-resolution algorithm (DL-SSH cine). Two readers evaluated diagnostic quality (endocardial edge definition, blood pool to myocardium contrast and artifact burden) from 1 (nondiagnostic) to 5 (excellent). Functional left ventricular (LV) parameters were assessed in both sequences. Edge rise distance, apparent signal-to-noise ratio (aSNR) and contrast-to-noise ratio were calculated. Statistical analysis for the comparison of DL-SSH cine and standard cine included the Student's t-test, Wilcoxon signed-rank test, Bland-Altman analysis, and Pearson correlation.
RESULTS: Forty-five participants (mean age: 50 ± 18 years; 30 men) were included. Mean total scan time was 65% lower for DL-SSH cine compared to standard cine (92 ± 8 s vs 265 ± 33 s; P < 0.0001). DL-SSH cine showed high ratings for subjective image quality (eg, contrast: 5 [interquartile range {IQR}, 5-5] vs 5 [IQR, 5-5], P = 0.01; artifacts: 4.5 [IQR, 4-5] vs 5 [IQR, 4-5], P = 0.26), with superior values for sharpness parameters (endocardial edge definition: 5 [IQR, 5-5] vs 5 [IQR, 4-5], P < 0.0001; edge rise distance: 1.9 [IQR, 1.8-2.3] vs 2.5 [IQR, 2.3-2.6], P < 0.0001) compared to standard cine. No significant differences were found in the comparison of objective metrics between DL-SSH and standard cine (eg, aSNR: 49 [IQR, 38.5-70] vs 52 [IQR, 38-66.5], P = 0.74). Strong correlation was found between DL-SSH cine and standard cine for the assessment of functional LV parameters (eg, ejection fraction: r = 0.95). Subgroup analysis of participants with arrhythmia or unreliable breath-holding (n = 14/45, 31%) showed better image quality ratings for DL-SSH cine compared to standard cine (eg, artifacts: 4 [IQR, 4-5] vs 4 [IQR, 3-5], P = 0.04).
CONCLUSIONS: DL reconstruction of SSH cine sequence in cardiac MRI enabled accelerated acquisition times and noninferior diagnostic quality compared to standard cine imaging, with even superior diagnostic quality in participants with arrhythmia or unreliable breath-holding.
PMID:40184545 | DOI:10.1097/RLI.0000000000001186
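The objective metrics above (edge rise distance, aSNR, contrast-to-noise ratio) are ROI-based quantities; the exact definitions used in the study are not given in the abstract, so the sketch below uses common textbook forms (mean tissue signal over background SD for aSNR, 20-80% rise of an edge profile) purely for illustration.

```python
import numpy as np

def apparent_snr(tissue_roi, background_roi):
    # aSNR: mean signal in a tissue ROI over the SD of a background (air) ROI
    return tissue_roi.mean() / background_roi.std()

def cnr(blood_pool_roi, myocardium_roi, background_roi):
    # CNR: absolute blood-pool/myocardium signal difference over background SD
    return abs(blood_pool_roi.mean() - myocardium_roi.mean()) / background_roi.std()

def edge_rise_distance(profile, px_mm=1.0, lo=0.2, hi=0.8):
    # 20-80% rise distance along a line profile crossing the endocardial border;
    # assumes an approximately monotonic edge profile
    p = (profile - profile.min()) / (profile.max() - profile.min())
    idx = np.where((p >= lo) & (p <= hi))[0]
    return (idx[-1] - idx[0]) * px_mm

x = np.linspace(-5, 5, 101)
edge = 1 / (1 + np.exp(-2 * x))                 # synthetic border profile
print(edge_rise_distance(edge, px_mm=0.1))      # sharper edge -> smaller distance
```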
Relationships Between Familial Factors, Learning Motivation, Learning Approaches, and Cognitive Flexibility Among Vocational Education and Training Students
J Psychol. 2025 Apr 4:1-24. doi: 10.1080/00223980.2025.2456801. Online ahead of print.
ABSTRACT
This study investigated the relationships between familial factors, in terms of parental autonomy support and parental support, and Vocational Education and Training (VET) students' learning motivation, learning approaches, and cognitive flexibility. In this cross-sectional study, a convenience sample of 557 VET students (56.7% male, 43.3% female; mean age = 18.41 years, SD = 0.85) from ten vocational schools in the Bangkok area, Thailand, responded to a questionnaire of adapted scales on familial factors (i.e., parental autonomy support and parental support), learning motivation (i.e., intrinsic motivation, extrinsic motivation, and utility value), learning approaches (i.e., deep learning approaches and surface learning approaches), and cognitive flexibility (i.e., alternatives). Structural equation analyses revealed that parental autonomy support had an indirect relationship with alternatives via learning motivation and deep learning approaches, whereas parental support had both direct and indirect associations with alternatives through learning motivation and deep learning approaches. Surface learning approaches did not significantly predict alternatives. These findings suggest that a familial context that stresses autonomy support and helpful support from parents can motivate VET students to learn and adopt deep approaches to learning, which in turn encourages the development of their cognitive flexibility.
PMID:40184534 | DOI:10.1080/00223980.2025.2456801
MIST: An interpretable and flexible deep learning framework for single-T cell transcriptome and receptor analysis
Sci Adv. 2025 Apr 4;11(14):eadr7134. doi: 10.1126/sciadv.adr7134. Epub 2025 Apr 4.
ABSTRACT
Joint analysis of transcriptomic and T cell receptor (TCR) features at single-cell resolution provides a powerful approach for in-depth T cell immune function research. Here, we introduce a deep learning framework for single-T cell transcriptome and receptor analysis, MIST (Multi-insight for T cell). MIST features three latent spaces: gene expression, TCR, and a joint latent space. Through analyses of antigen-specific T cells and T cell datasets related to lung cancer immunotherapy and COVID-19, we demonstrate MIST's interpretability and flexibility. MIST easily and accurately resolves cell function and antigen specificity by vectorizing and integrating the transcriptome and TCR data of T cells. In addition, using MIST, we identified the heterogeneity of CXCL13+ subsets in lung cancer-infiltrating CD8+ T cells and their association with immunotherapy, providing additional insights into the functional transition of CXCL13+ T cells related to anti-PD-1 therapy that were not reported in the original study.
PMID:40184452 | DOI:10.1126/sciadv.adr7134
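The abstract specifies only that MIST couples a gene-expression latent, a TCR latent, and a joint latent space. As a minimal sketch of that dual-encoder-with-joint-latent idea (all layer sizes and encoder choices below are invented for illustration and are not MIST's actual architecture):

```python
import torch
import torch.nn as nn

class DualLatentModel(nn.Module):
    """Toy dual-encoder with a joint latent space (illustrative only)."""
    def __init__(self, n_genes=2000, tcr_vocab=25, d_gex=32, d_tcr=32, d_joint=32):
        super().__init__()
        self.gex_enc = nn.Sequential(nn.Linear(n_genes, 256), nn.ReLU(),
                                     nn.Linear(256, d_gex))      # expression latent
        self.tcr_emb = nn.Embedding(tcr_vocab, 16)               # amino-acid embedding
        self.tcr_enc = nn.GRU(16, d_tcr, batch_first=True)       # CDR3 sequence encoder
        self.joint = nn.Linear(d_gex + d_tcr, d_joint)           # fused joint latent

    def forward(self, gex, tcr_tokens):
        z_gex = self.gex_enc(gex)
        _, h = self.tcr_enc(self.tcr_emb(tcr_tokens))
        z_tcr = h[-1]
        z_joint = self.joint(torch.cat([z_gex, z_tcr], dim=-1))
        return z_gex, z_tcr, z_joint

model = DualLatentModel()
z_gex, z_tcr, z_joint = model(torch.randn(8, 2000),
                              torch.randint(0, 25, (8, 20)))
print(z_joint.shape)  # torch.Size([8, 32])
```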
Deep learning-based uncertainty quantification for quality assurance in hepatobiliary imaging-based techniques
Oncotarget. 2025 Apr 4;16:249-255. doi: 10.18632/oncotarget.28709.
ABSTRACT
Recent advances in deep learning models have transformed medical imaging analysis, particularly in radiology. This editorial outlines how uncertainty quantification through embedding-based approaches enhances diagnostic accuracy and reliability in hepatobiliary imaging, with a specific focus on oncological conditions and early detection of precancerous lesions. We explore modern architectures like the Anisotropic Hybrid Network (AHUNet), which leverages both 2D imaging and 3D volumetric data through innovative convolutional approaches. We consider the implications for quality assurance in radiological practice and discuss recent clinical applications.
PMID:40184325 | DOI:10.18632/oncotarget.28709
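The editorial does not spell out a specific uncertainty estimator; one common embedding-based approach scores a case by its distance to class-conditional Gaussians fitted on training embeddings (a Mahalanobis-style score). A minimal sketch, assuming 8-dimensional embeddings and binary labels:

```python
import numpy as np

def fit_gaussian_per_class(emb, labels):
    """Fit class means and a shared covariance in embedding space."""
    classes = np.unique(labels)
    means = {c: emb[labels == c].mean(axis=0) for c in classes}
    centered = np.vstack([emb[labels == c] - means[c] for c in classes])
    cov = np.cov(centered, rowvar=False) + 1e-6 * np.eye(emb.shape[1])
    return means, np.linalg.inv(cov)

def mahalanobis_uncertainty(emb, means, prec):
    """Distance to the nearest class mean; higher = less certain prediction."""
    d = [np.einsum('nd,dk,nk->n', emb - m, prec, emb - m) for m in means.values()]
    return np.min(np.stack(d, axis=1), axis=1)

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 8))
y = rng.integers(0, 2, 200)
means, prec = fit_gaussian_per_class(train, y)
print(mahalanobis_uncertainty(rng.normal(size=(5, 8)), means, prec))
```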
Hessian-Aware Zeroth-Order Optimization
IEEE Trans Pattern Anal Mach Intell. 2025 Mar 7;PP. doi: 10.1109/TPAMI.2025.3548810. Online ahead of print.
ABSTRACT
Zeroth-order optimization algorithms have recently emerged as a popular research theme in optimization and machine learning, playing important roles in many deep-learning-related tasks such as black-box adversarial attacks, deep reinforcement learning, and hyperparameter tuning. Mainstream zeroth-order optimization algorithms, however, concentrate on exploiting zeroth-order estimates of first-order gradient information of the objective landscape. In this paper, we propose a novel meta-algorithm called Hessian-Aware Zeroth-Order (ZOHA) optimization, which utilizes several canonical variants of zeroth-order-estimated second-order Hessian information of the objective: power-method-based and Gaussian-smoothing-based. We show theoretically that ZOHA enjoys an improved convergence rate compared with existing methods that do not incorporate second-order Hessian information into zeroth-order optimization. Empirical studies on logistic regression and black-box adversarial attacks validate the effectiveness of ZOHA, with improved success rates at reduced query complexity of the zeroth-order oracle.
PMID:40184293 | DOI:10.1109/TPAMI.2025.3548810
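The Gaussian-smoothing estimators the abstract refers to can be written compactly. The sketch below shows a generic zeroth-order gradient estimate and the standard unbiased Gaussian-smoothing Hessian estimate, not the ZOHA meta-algorithm itself; smoothing radius and query counts are arbitrary choices.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-2, q=100, rng=None):
    """Gaussian-smoothing zeroth-order gradient estimate from q+1 function queries."""
    rng = rng or np.random.default_rng()
    fx, g = f(x), np.zeros_like(x)
    for _ in range(q):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - fx) / mu * u      # forward-difference directional slope
    return g / q

def zo_hessian(f, x, mu=1e-2, q=100, rng=None):
    """Second-order estimate: E[(f(x+mu*u)+f(x-mu*u)-2f(x))/(2mu^2) * (uu^T - I)]."""
    rng = rng or np.random.default_rng()
    fx, h = f(x), np.zeros((x.size, x.size))
    for _ in range(q):
        u = rng.standard_normal(x.shape)
        d2 = (f(x + mu * u) + f(x - mu * u) - 2 * fx) / (2 * mu**2)
        h += d2 * (np.outer(u, u) - np.eye(x.size))
    return h / q

A = np.diag([1.0, 10.0])                        # quadratic test objective 0.5 x^T A x
f = lambda x: 0.5 * x @ A @ x
x = np.array([3.0, 3.0])
rng = np.random.default_rng(0)
print(zo_gradient(f, x, q=500, rng=rng), "vs true", A @ x)
print(np.round(zo_hessian(f, x, q=2000, rng=rng), 1), "vs true\n", A)
```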
Short-Term Residential Load Forecasting Framework Based on Spatial-Temporal Fusion Adaptive Gated Graph Convolution Networks
IEEE Trans Neural Netw Learn Syst. 2025 Apr 4;PP. doi: 10.1109/TNNLS.2025.3551778. Online ahead of print.
ABSTRACT
Enhancing the prediction of volatile and intermittent electric loads is pivotal to the smooth functioning of modern power grids. However, conventional deep learning-based forecasting techniques fall short of simultaneously taking into account both the temporal dependencies of historical loads and the spatial structure between residential units, resulting in subpar prediction performance. Furthermore, the representation of the spatial graph structure is frequently inadequate and constrained, and the complexities inherent in spatial-temporal data impede effective learning across households. To alleviate these shortcomings, this article proposes a novel framework, spatial-temporal fusion adaptive gated graph convolution networks (STFAG-GCNs), tailored for residential short-term load forecasting (STLF). Spatial-temporal fusion graph construction is introduced to compensate for correlations for which additional information is unknown or not reflected in advance. Through an innovative gated adaptive fusion graph convolution (AFG-Conv) mechanism, the spatial-temporal fusion graph convolution network (STFGCN) dynamically models spatial-temporal correlations implicitly. Meanwhile, by integrating a gated temporal convolutional network (Gated TCN) and multiple STFGCNs into a unified spatial-temporal fusion layer, STFAG-GCN handles long sequences by stacking layers. Experimental results on real-world datasets validate the accuracy and robustness of STFAG-GCN in forecasting short-term residential loads, highlighting its advancements over state-of-the-art methods. Ablation experiments further reveal its effectiveness and superiority.
PMID:40184286 | DOI:10.1109/TNNLS.2025.3551778
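The Gated TCN component is described only by name; a WaveNet-style gated causal convolution, which is the usual reading of that term, looks like the sketch below (channel counts and the tanh/sigmoid gating are generic choices, not the paper's exact design):

```python
import torch
import torch.nn as nn

class GatedTCN(nn.Module):
    """Gated temporal convolution: tanh(filter) * sigmoid(gate), WaveNet-style."""
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation          # causal left-padding
        self.filt = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.gate = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                                # x: (batch, channels, time)
        x = nn.functional.pad(x, (self.pad, 0))          # pad only the past side
        return torch.tanh(self.filt(x)) * torch.sigmoid(self.gate(x))

x = torch.randn(4, 16, 24)      # 4 households, 16 features, 24 time steps
print(GatedTCN(16)(x).shape)    # torch.Size([4, 16, 24])
```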
Unknown-Aware Bilateral Dependency Optimization for Defending Against Model Inversion Attacks
IEEE Trans Pattern Anal Mach Intell. 2025 Apr 4;PP. doi: 10.1109/TPAMI.2025.3558267. Online ahead of print.
ABSTRACT
By abusing access to a well-trained classifier, model inversion (MI) attacks pose a significant threat as they can recover the original training data, leading to privacy leakage. Previous studies mitigated MI attacks by imposing regularization to reduce the dependency between input features and outputs during classifier training, a strategy known as unilateral dependency optimization. However, this strategy contradicts the objective of minimizing the supervised classification loss, which inherently seeks to maximize the dependency between input features and outputs. Consequently, there is a trade-off between improving the model's robustness against MI attacks and maintaining its classification performance. To address this issue, we propose the bilateral dependency optimization strategy (BiDO), a dual-objective approach that minimizes the dependency between input features and latent representations, while simultaneously maximizing the dependency between latent representations and labels. BiDO is remarkable for its privacy-preserving capabilities. However, models trained with BiDO exhibit diminished capabilities in out-of-distribution (OOD) detection compared to models trained with standard classification supervision. Given the open-world nature of deep learning systems, this limitation could lead to significant security risks, as encountering OOD inputs, whose label spaces do not overlap with the in-distribution (ID) data used during training, is inevitable. To address this, we leverage readily available auxiliary OOD data to enhance the OOD detection performance of models trained with BiDO. This leads to an upgraded framework, unknown-aware BiDO (BiDO+), which mitigates both privacy and security concerns. As a highlight, with comparable model utility, BiDO-HSIC+ reduces the FPR95 by 55.02% and enhances the AUCROC by 9.52% compared to BiDO-HSIC, while also providing superior MI robustness.
PMID:40184277 | DOI:10.1109/TPAMI.2025.3558267
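The BiDO-HSIC variant named above suggests the dependencies are measured with the Hilbert-Schmidt independence criterion. Below is a minimal sketch of the standard empirical HSIC estimator and, in comments, how a bilateral objective might combine the two terms; the lam_x/lam_y weights are hypothetical placeholders, not values from the paper.

```python
import torch

def rbf_kernel(x, sigma=1.0):
    # pairwise RBF kernel matrix over the batch
    return torch.exp(-torch.cdist(x, x).pow(2) / (2 * sigma**2))

def hsic(x, y, sigma=1.0):
    """Empirical HSIC (Gretton et al.): tr(K H L H) / (n-1)^2, H the centering matrix."""
    n = x.size(0)
    h = torch.eye(n) - torch.ones(n, n) / n
    k, l = rbf_kernel(x, sigma), rbf_kernel(y, sigma)
    return torch.trace(k @ h @ l @ h) / (n - 1) ** 2

# Bilateral objective sketch (lam_x, lam_y are hypothetical trade-off weights):
# loss = cross_entropy(logits, labels) \
#        + lam_x * hsic(inputs_flat, z)    # minimize input <-> latent dependency
#        - lam_y * hsic(z, labels_onehot)  # maximize latent <-> label dependency
x, z = torch.randn(32, 784), torch.randn(32, 64)
print(hsic(x, z))
```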
LETA: Tooth Alignment Prediction Based on Dual-branch Latent Encoding
IEEE Trans Vis Comput Graph. 2024 Jun 20;PP. doi: 10.1109/TVCG.2024.3413857. Online ahead of print.
ABSTRACT
Accurately determining the clinical position of each tooth is essential in orthodontics, yet most existing solutions rely heavily on inefficient manual design. In this paper, we present LETA, a dual-branch latent-encoding-based 3D tooth alignment framework. Our system takes as input the segmented individual 3D tooth meshes from intra-oral scanner (IOS) dental surfaces and automatically predicts the proper 3D pose transformation for each tooth. LETA includes three components: an Encoder that learns a latent code of the dental point cloud, a Projector that transforms the latent code of misaligned teeth into that of the predicted aligned ones, and a Solver that estimates the transformation between different dental latent codes. A key novelty of LETA is that we extract features from the ground truth (GT) aligned teeth to guide network learning during training. To effectively learn tooth features, our Encoder employs an improved point-wise convolutional operation and an attention-based network to extract local shape features and global context features, respectively. Extensive experimental results on a large-scale dataset with 9,868 IOS surfaces demonstrate that LETA achieves state-of-the-art performance. A further clinical applicability study reveals that our method can reduce orthodontists' workload by over 60% compared with starting tooth alignment from scratch, demonstrating the strong potential of deep learning for future digital dentistry.
PMID:40184274 | DOI:10.1109/TVCG.2024.3413857
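The abstract defines the roles of the Encoder, Projector, and Solver but not their internals. The sketch below wires up those three roles with deliberately simple stand-ins (a PointNet-like encoder, an MLP projector, and a 6-DoF pose head), purely to illustrate the data flow; none of these layer choices come from the paper.

```python
import torch
import torch.nn as nn

class ToothEncoder(nn.Module):
    """PointNet-like stand-in: per-point MLP + max pool -> latent code."""
    def __init__(self, d=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, d))
    def forward(self, pts):                 # pts: (batch, n_points, 3)
        return self.mlp(pts).max(dim=1).values

class Projector(nn.Module):
    """Maps the latent code of misaligned teeth to a predicted aligned code."""
    def __init__(self, d=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))
    def forward(self, z):
        return self.net(z)

class Solver(nn.Module):
    """Estimates a 6-DoF pose (3 rotation + 3 translation params) between codes."""
    def __init__(self, d=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * d, 64), nn.ReLU(), nn.Linear(64, 6))
    def forward(self, z_src, z_tgt):
        return self.net(torch.cat([z_src, z_tgt], dim=-1))

enc, proj, solve = ToothEncoder(), Projector(), Solver()
z_mis = enc(torch.randn(2, 512, 3))         # latent code of misaligned tooth
pose = solve(z_mis, proj(z_mis))            # predicted per-tooth transformation
print(pose.shape)                           # torch.Size([2, 6])
```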
Using generative adversarial deep learning networks to synthesize cerebrovascular reactivity imaging from pre-acetazolamide arterial spin labeling in moyamoya disease
Neuroradiology. 2025 Apr 4. doi: 10.1007/s00234-025-03605-1. Online ahead of print.
ABSTRACT
BACKGROUND: Cerebrovascular reactivity (CVR) assesses vascular health in various brain conditions, but CVR measurement requires a challenge to cerebral perfusion, such as the administration of acetazolamide (ACZ), thus limiting widespread use. We determined whether generative adversarial networks (GANs) can create CVR images from baseline pre-ACZ arterial spin labeling (ASL) MRI.
METHODS: This study included 203 moyamoya cases with a total of 3248 pre- and post-ACZ ASL cerebral blood flow (CBF) images. Reference CVR maps were generated from these CBF slices. From this set, 2640 slices were used to train a Pixel-to-Pixel GAN consisting of a generator and a discriminator network, with the remaining 608 slices reserved as a testing set. Following training, the pre-ACZ CBF images in the testing set were introduced to the trained model to generate synthesized CVR. The quality of the synthesized CVR was evaluated with the structural similarity index (SSI), spatial correlation coefficient (SCC), and root mean squared error (RMSE), compared with the reference CVR. Segmentations of the low-CVR regions were compared using the Dice similarity coefficient (DSC). Reference and synthesized CVRs in single-slice and individual-hemisphere settings were reviewed to assess CVR status, with Cohen's kappa measuring consistency.
RESULTS: The mean SSIs of the CVR in the training and testing sets were 0.943 ± 0.019 and 0.943 ± 0.020, respectively. The mean SCCs were 0.988 ± 0.009 and 0.987 ± 0.011, and the mean RMSEs were 0.077 ± 0.015 and 0.079 ± 0.018. The mean DSC of the low-CVR area in the testing set was 0.593 ± 0.128. Visual interpretation yielded Cohen's kappa values of 0.896 and 0.813 for the training and testing sets in the single-slice setting, and 0.781 and 0.730 in the individual-hemisphere setting.
CONCLUSIONS: Synthesized CVR by GANs from baseline ASL without challenge may be a useful alternative in detecting vascular deficits in clinical applications when ACZ challenge is not feasible.
PMID:40183965 | DOI:10.1007/s00234-025-03605-1
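A "Pixel-to-Pixel GAN" ordinarily denotes the pix2pix conditional objective: the generator is trained against a discriminator that sees (input, output) pairs, plus an L1 term pulling the output toward the reference image. A sketch of those two losses, assuming single-channel CBF/CVR tensors; the lam = 100 weight is the default from the original pix2pix paper, not a value reported by this study.

```python
import torch
import torch.nn as nn

bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

def generator_loss(disc, pre_acz_cbf, fake_cvr, ref_cvr, lam=100.0):
    """Fool the conditional discriminator + L1 fidelity to the reference CVR."""
    pred_fake = disc(torch.cat([pre_acz_cbf, fake_cvr], dim=1))  # condition on input
    adv = bce(pred_fake, torch.ones_like(pred_fake))
    return adv + lam * l1(fake_cvr, ref_cvr)

def discriminator_loss(disc, pre_acz_cbf, fake_cvr, ref_cvr):
    """Real (input, reference) pairs vs. fake (input, synthesized) pairs."""
    pred_real = disc(torch.cat([pre_acz_cbf, ref_cvr], dim=1))
    pred_fake = disc(torch.cat([pre_acz_cbf, fake_cvr.detach()], dim=1))
    return 0.5 * (bce(pred_real, torch.ones_like(pred_real)) +
                  bce(pred_fake, torch.zeros_like(pred_fake)))

# Toy shapes: a tiny PatchGAN-style discriminator over 2-channel (CBF, CVR) input
disc = nn.Sequential(nn.Conv2d(2, 16, 4, 2, 1), nn.LeakyReLU(0.2),
                     nn.Conv2d(16, 1, 4, 2, 1))
cbf, cvr = torch.randn(2, 1, 64, 64), torch.randn(2, 1, 64, 64)
fake = torch.randn(2, 1, 64, 64)
print(discriminator_loss(disc, cbf, fake, cvr))
```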
Interpretable multimodal deep learning model for predicting post-surgical international society of urological pathology grade in primary prostate cancer
Eur J Nucl Med Mol Imaging. 2025 Apr 4. doi: 10.1007/s00259-025-07248-5. Online ahead of print.
ABSTRACT
PURPOSE: To address heterogeneity in prostate cancer (PCa) pathological grading, we developed an interpretable multimodal fusion model integrating 18F prostate-specific membrane antigen (18F-PSMA)-targeted positron emission tomography/computed tomography (18F-PSMA-PET/CT) imaging features with clinical variables for predicting post-surgical ISUP grade (psISUP ≥ 4 vs. < 4).
METHODS: This retrospective study analyzed 222 patients with PCa (2020-2024) undergoing 18F-PSMA PET/CT. We constructed a deep transfer learning framework incorporating radiomic features from PET/CT and clinical parameters. Model performance was validated against three established methods and preoperative biopsy Gleason scores. Additionally, SHapley Additive exPlanations (SHAP) values elucidated feature contributions, and a radiomic nomogram was developed for clinical translation.
RESULTS: The fusion model achieved superior discrimination in psISUP grading (test set area under the curve (AUC) = 0.850, 95% confidence interval [CI] 0.769-0.932; validation set AUC = 0.833, 95% CI 0.657-1.000), significantly outperforming preoperative Gleason scores. SHAP analysis identified PSMA uptake heterogeneity and PSA density as key predictive features. The nomogram demonstrated clinical interpretability through visualised risk stratification.
CONCLUSION: Our deep learning-based multimodal fusion model enables accurate preoperative prediction of aggressive PCa pathology (ISUP ≥ 4), potentially optimising surgical planning and personalised therapeutic strategies. The interpretable framework enhances clinical trustworthiness in artificial intelligence-assisted decision-making.
PMID:40183953 | DOI:10.1007/s00259-025-07248-5
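SHAP attribution as used here can be reproduced with the shap library on any fitted model. The sketch below runs it on a toy gradient-boosting classifier with hypothetical feature names echoing the abstract; the study's actual features, model, and data are not public.

```python
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical fused feature table: radiomic PET/CT features + clinical variables
rng = np.random.default_rng(0)
X = pd.DataFrame({"psma_uptake_heterogeneity": rng.normal(size=200),
                  "psa_density": rng.normal(size=200),
                  "suv_max": rng.normal(size=200),
                  "age": rng.normal(64, 8, size=200)})
y = (X["psma_uptake_heterogeneity"] + X["psa_density"] > 0).astype(int)  # toy label

model = GradientBoostingClassifier().fit(X, y)
explainer = shap.Explainer(model, X)   # a model-appropriate explainer is auto-selected
sv = explainer(X)
shap.plots.beeswarm(sv)                # global ranking of feature contributions
```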
Hypermetabolic pulmonary lesions detection and diagnosis based on PET/CT imaging and deep learning models
Eur J Nucl Med Mol Imaging. 2025 Apr 4. doi: 10.1007/s00259-025-07215-0. Online ahead of print.
ABSTRACT
PURPOSE: This study aims to develop and evaluate deep learning models for the detection and classification of hypermetabolic lung lesions into four categories: benign, lung cancer, pulmonary lymphoma, and metastasis. These categories are defined by their pathological origin, clinical relevance, and therapeutic implications.
METHODS: A lesion localisation model was first developed using manually annotated PET/CT images. For classification, a multi-dimensional joint network was employed, incorporating both image patches and two-dimensional projections. Classification performance was quantified by metrics such as accuracy and compared with that of a radiomics model. Additionally, false-positive segmentations were manually reviewed and analysed for clinical evaluation.
RESULTS: The study retrospectively included 647 cases (409 males/238 females) over more than 8 years from five centres, divided into an internal dataset (426 cases from Shanghai Ruijin Hospital), an external test set I (151 cases from four other institutions), and an external test set II (70 cases from a new imaging device). The localisation model achieved detection rates of 81.19%, 75.48%, and 77.59% on the internal, external test set I, and external test set II, respectively. The classification model outperformed the radiomics approach, with areas under the curve of 88.4%, 80.7%, and 66.6%, respectively. Most false-positive segmentations were clinically acceptable, corresponding to suspicious lesions in adjacent regions, particularly lymph nodes.
CONCLUSION: Deep learning models based on PET/CT imaging can effectively detect, segment, and classify hypermetabolic lung lesions, and identify suspicious adjacent lesions. These results highlight the potential of artificial intelligence in clinical decision-making and lung disease diagnosis.
PMID:40183951 | DOI:10.1007/s00259-025-07215-0
Intelligent meningioma grading based on medical features
Med Phys. 2025 Apr 4. doi: 10.1002/mp.17808. Online ahead of print.
ABSTRACT
BACKGROUND: Meningiomas are the most common primary intracranial tumors in adults. Low-grade meningiomas have a low recurrence rate, whereas high-grade meningiomas are highly aggressive and recurrent. Therefore, the pathological grading information is crucial for treatment, as well as follow-up and prognostic guidance. Most previous studies have used radiomics or deep learning methods to extract feature information for grading meningiomas. However, some radiomics features are pixel-level features that can be influenced by factors such as image resolution and sharpness. Additionally, deep learning models that perform grading directly from MRI images often rely on image features that are ambiguous and uncontrollable, which reduces the reliability of the results to a certain extent.
PURPOSE: We aim to validate that combining medical features with deep neural networks can effectively improve the accuracy and reliability of meningioma grading.
METHODS: We construct an SNN-Tran model for grading meningiomas by analyzing medical features including tumor volume, peritumoral edema volume, dural tail sign, tumor location, the ratio of peritumoral edema volume to tumor volume, age, and gender. This method better captures the complex relationships and interactions among the medical features and enhances the reliability of the prediction results.
RESULTS: Our model achieves an accuracy of 0.875, a sensitivity of 0.886, a specificity of 0.847, and an AUC of 0.872, outperforming deep learning, radiomics, and other state-of-the-art methods.
CONCLUSION: We demonstrate that combining medical features with SNN-Tran can effectively improve the accuracy and reliability of meningioma grading. The SNN-Tran model excels at capturing long-range dependencies in the medical feature sequence.
PMID:40183528 | DOI:10.1002/mp.17808
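The abstract does not expand "SNN-Tran" or describe its layers. Purely to illustrate one way a short sequence of medical features can be fed to a transformer encoder, here is a sketch in which each scalar feature becomes one token; all dimensions and the feature ordering are invented.

```python
import torch
import torch.nn as nn

class FeatureTokenTransformer(nn.Module):
    """Treats each scalar medical feature as a token; illustrative only."""
    def __init__(self, n_features=7, d=32, n_heads=4, n_layers=2):
        super().__init__()
        self.value_proj = nn.Linear(1, d)            # scalar -> token embedding
        self.feat_emb = nn.Embedding(n_features, d)  # identifies which feature it is
        layer = nn.TransformerEncoderLayer(d, n_heads, dim_feedforward=64,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d, 1)                  # low- vs high-grade logit

    def forward(self, x):                            # x: (batch, n_features)
        tok = self.value_proj(x.unsqueeze(-1)) + self.feat_emb.weight
        return self.head(self.encoder(tok).mean(dim=1)).squeeze(-1)

# 7 inputs from the abstract: tumor volume, edema volume, dural tail sign,
# location, edema/tumor ratio, age, gender (all encoded numerically)
print(FeatureTokenTransformer()(torch.randn(4, 7)).shape)   # torch.Size([4])
```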
Attention-based Vision Transformer Enables Early Detection of Radiotherapy-Induced Toxicity in Magnetic Resonance Images of a Preclinical Model
Technol Cancer Res Treat. 2025 Jan-Dec;24:15330338251333018. doi: 10.1177/15330338251333018. Epub 2025 Apr 4.
ABSTRACT
INTRODUCTION: Early identification of patients at risk for toxicity induced by radiotherapy (RT) is essential for developing personalized treatments and mitigation plans. Preclinical models with relevant endpoints are critical for systematic evaluation of normal tissue responses. This study aims to determine whether attention-based vision transformers can classify MR images of irradiated and control mice, potentially aiding early identification of individuals at risk of developing toxicity.
METHODS: C57BL/6J mice (n = 14) were subjected to 66 Gy of fractionated RT targeting the oral cavity, swallowing muscles, and salivary glands. A control group (n = 15) received no irradiation but was otherwise treated identically. T2-weighted MR images were obtained 3-5 days post-irradiation. Late toxicity in terms of saliva production in individual mice was assessed at day 105 after treatment. A pre-trained vision transformer model (ViT Base 16) was employed to classify the images into control and irradiated groups.
RESULTS: The ViT Base 16 model classified the MR images with an accuracy of 69%, with identical overall performance for control and irradiated animals. The ViT model's predictions showed a significant correlation with late toxicity (r = 0.65, p < 0.01). One of the attention maps from the ViT model highlighted the irradiated regions of the animals.
CONCLUSIONS: Attention-based vision transformers using MRI have the potential to predict individuals at risk of developing early toxicity. This approach may enhance personalized treatment and follow-up strategies in head and neck cancer radiotherapy.
PMID:40183426 | DOI:10.1177/15330338251333018
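Classifying MR slices with a pre-trained ViT Base 16 is straightforward with the timm library. A minimal sketch follows; the 2-class head, the input preprocessing, and the hook location for attention maps are assumptions, and pretrained=True downloads ImageNet weights.

```python
import timm
import torch

# ViT Base 16 with ImageNet weights, re-headed for control vs. irradiated (2 classes)
model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=2)

imgs = torch.randn(4, 3, 224, 224)      # MR slices replicated to 3 channels, resized
logits = model(imgs)
probs = logits.softmax(dim=-1)          # per-animal class probabilities
print(probs.shape)                      # torch.Size([4, 2])

# Attention maps for visualization can be read from the last block's attention,
# e.g., via forward hooks on model.blocks[-1].attn (layout depends on timm version).
```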
Brain tumor segmentation and detection in MRI using convolutional neural networks and VGG16
Cancer Biomark. 2025 Mar;42(3):18758592241311184. doi: 10.1177/18758592241311184. Epub 2025 Apr 4.
ABSTRACT
BACKGROUND: In this research, we explore the application of convolutional neural networks (CNNs) for the development of an automated cancer detection system, particularly for MRI images. By leveraging deep learning and image processing techniques, we aim to build a system that can accurately detect and classify tumors in medical images. The system's performance depends on several stages, including image enhancement, segmentation, data augmentation, feature extraction, and classification. Through these stages, CNNs can be effectively trained to detect tumors in MRI images with high accuracy. This automated cancer detection system can assist healthcare professionals in diagnosing cancer more quickly and accurately, improving patient outcomes. The integration of deep learning and image processing in medical diagnostics has the potential to revolutionize healthcare, making it more efficient and accessible.
METHODS: In this paper, we examine the failure of semantic segmentation by predicting the mean intersection over union (mIoU), a standard evaluation metric for segmentation tasks. mIoU calculates the overlap between the predicted segmentation map and the ground truth segmentation map, offering a way to evaluate the model's performance; a low mIoU indicates poor segmentation, suggesting that the model has failed to accurately classify parts of the image. To improve the robustness of the system, we introduce a deep neural network capable of predicting the mIoU of a segmentation map. The key innovation is the ability to predict the mIoU without needing access to ground truth data during testing. This allows the system to estimate how well the model is performing on a given image and detect potential failure cases early: if the predicted mIoU falls below a certain threshold, the system can flag this as a potential failure, prompting further investigation or triggering a safety mechanism in an autonomous vehicle, preventing the vehicle from making decisions based on faulty segmentation. Furthermore, the system is designed to handle imbalanced data, a common challenge in training deep learning models. In autonomous driving, certain objects, such as pedestrians or cyclists, appear much less frequently than vehicles or roads, which can bias the model toward the more frequent classes; by leveraging the expected mIoU, the method can balance the influence of different object classes, ensuring that the model does not overlook critical elements in the scene. This approach offers a novel way of not only training the model to be more accurate but also incorporating failure prediction as an additional layer of safety, a significant step toward ensuring that autonomous systems, especially self-driving cars, operate safely and reliably while minimizing the risk of accidents caused by misinterpretation of visual data. In summary, this research introduces a deep learning model that predicts segmentation performance and detects failure events using the mIoU metric. By improving both the accuracy of semantic segmentation and the detection of failures, we contribute to the development of more reliable autonomous driving systems. Moreover, the technique can be extended to other domains where segmentation plays a critical role, such as medical imaging or robotics, enhancing their ability to function safely and effectively in complex environments.
RESULTS AND DISCUSSION: Brain tumor detection from MRI images is a critical task in medical image analysis that can significantly impact patient outcomes. By leveraging a hybrid approach that combines traditional image processing techniques with modern deep learning methods, this research aims to create an automated system that can segment and identify brain tumors with high accuracy and efficiency. Deep learning techniques, particularly CNNs, have proven highly effective in medical image analysis due to their ability to learn complex features from raw image data. The use of deep learning for automated brain tumor segmentation provides several benefits, including faster processing times, higher accuracy, and more consistent results compared with traditional manual methods. As a result, this research not only contributes to the development of advanced methods for brain tumor detection but also demonstrates the potential of deep learning in revolutionizing medical image analysis and assisting healthcare professionals in diagnosing and treating brain tumors more effectively.
CONCLUSION: This research demonstrates the potential of deep learning techniques, particularly CNNs, in automating brain tumor detection from MRI images. By combining traditional image processing methods with deep learning, we have developed an automated system that can quickly and accurately segment tumors from MRI scans. This system has the potential to assist healthcare professionals in diagnosing and treating brain tumors more efficiently, ultimately improving patient outcomes. As deep learning continues to evolve, we expect these systems to become even more accurate, robust, and widely applicable in clinical settings. The use of deep learning for brain tumor detection represents a significant step forward in medical image analysis, and its integration into clinical workflows could greatly enhance the speed and accuracy of diagnosis, ultimately saving lives. The suggested plan also includes a CNN-based classification technique to improve accuracy and save computation time, with the classification results presented as images depicting either a healthy brain or a cancerous one. CNN, a form of deep learning, employs a number of feed-forward layers and is implemented in Python. The images are grouped via the ImageNet database, which has already undergone training and preparation, so only the final training layer is completed by us. Along with depth, width, and height feature information, CNN also extracts raw pixel values. We then use a gradient descent-based loss function to achieve a high degree of precision, determining the training accuracy, validation accuracy, and validation loss separately. Training accuracy is 98.5%; similarly, both validation accuracy and validation loss are high.
PMID:40183298 | DOI:10.1177/18758592241311184
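The mIoU metric that drives the failure-detection idea above is simple to compute when ground truth is available; the proposed network regresses this value without ground truth at test time. A reference implementation of the metric itself (the 0.5 failure threshold in the comment is an illustrative choice, not the paper's):

```python
import numpy as np

def mean_iou(pred, target, n_classes):
    """Mean intersection-over-union between predicted and ground-truth label maps."""
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                      # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

rng = np.random.default_rng(0)
pred = rng.integers(0, 3, (64, 64))
target = rng.integers(0, 3, (64, 64))
print(f"mIoU = {mean_iou(pred, target, n_classes=3):.3f}")
# A failure detector would flag images whose *predicted* mIoU falls below
# a chosen threshold, e.g. flag = predicted_miou < 0.5
```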
Code-Free Deep Learning Glaucoma Detection on Color Fundus Images
Ophthalmol Sci. 2025 Jan 30;5(4):100721. doi: 10.1016/j.xops.2025.100721. eCollection 2025 Jul-Aug.
ABSTRACT
OBJECTIVE: Code-free deep learning (CFDL) allows clinicians with no coding experience to build their own artificial intelligence models. This study assesses the performance of CFDL in glaucoma detection from fundus images in comparison to expert-designed models.
DESIGN: Deep learning model development, testing, and validation.
SUBJECTS: A total of 101 442 labeled fundus images from the Rotterdam EyePACS Artificial Intelligence for Robust Glaucoma Screening (AIROGS) dataset were included.
METHODS: Ophthalmology trainees without coding experience designed a CFDL binary model using the Rotterdam EyePACS AIROGS dataset of fundus images (101 442 labeled images) to differentiate glaucoma from normal optic nerves. We compared our results with bespoke models from the literature. We then proceeded to externally validate our model using 2 datasets, the Retinal Fundus Glaucoma Challenge (REFUGE) and the Glaucoma grading from Multi-Modality imAges (GAMMA) at 0.1, 0.3, and 0.5 confidence thresholds.
MAIN OUTCOME MEASURES: Area under the precision-recall curve (AuPRC), sensitivity at 95% specificity (SE@95SP), accuracy, area under the receiver operating curve (AUC), and positive predictive value (PPV).
RESULTS: The CFDL model showed high performance metrics that were comparable to the bespoke deep learning models. Our single-label classification model had an AuPRC of 0.988, an SE@95SP of 95%, and an accuracy of 91% (compared with 85% SE@95SP for the top bespoke models). Using the REFUGE dataset for external validation, our model had an SE@95SP, AUC, PPV, and accuracy of 83%, 0.960, 73% to 94%, and 95% to 98%, respectively, at the 0.1, 0.3, and 0.5 confidence threshold cutoffs. Using the GAMMA dataset for external validation at the same confidence threshold cutoffs, our model had an SE@95SP, AUC, PPV, and accuracy of 98%, 0.994, 94% to 96%, and 94% to 97%, respectively.
CONCLUSION: The capacity of CFDL models to perform glaucoma screening using fundus images presents a compelling proof of concept, empowering clinicians to explore innovative model designs for broad glaucoma screening in the near future.
FINANCIAL DISCLOSURES: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
PMID:40182983 | PMC:PMC11964632 | DOI:10.1016/j.xops.2025.100721
The Role of Artificial Intelligence in Epiretinal Membrane Care: A Scoping Review
Ophthalmol Sci. 2024 Dec 20;5(4):100689. doi: 10.1016/j.xops.2024.100689. eCollection 2025 Jul-Aug.
ABSTRACT
TOPIC: In ophthalmology, artificial intelligence (AI) demonstrates potential in using ophthalmic imaging across diverse diseases, often matching ophthalmologists' performance. However, the range of machine learning models for epiretinal membrane (ERM) management, which differ in methodology, application, and performance, remains largely unsynthesized.
CLINICAL RELEVANCE: Epiretinal membrane management relies on clinical evaluation and imaging, with surgical intervention considered in cases of significant impairment. AI analysis of ophthalmic images and clinical features could enhance ERM detection, characterization, and prognostication, potentially improving clinical decision-making. This scoping review aims to evaluate the methodologies, applications, and reported performance of AI models in ERM diagnosis, characterization, and prognostication.
METHODS: A comprehensive literature search was conducted across 5 electronic databases including Ovid MEDLINE, EMBASE, Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews, and Web of Science Core Collection from inception to November 14, 2024. Studies pertaining to AI algorithms in the context of ERM were included. The primary outcome measures were the reported design, application in ERM management, and performance of each AI model.
RESULTS: Three hundred ninety articles were retrieved, with 33 studies meeting inclusion criteria. There were 30 studies (91%) reporting their training and validation methods. Altogether, 61 distinct AI models were included. OCT scans and fundus photographs were used in 26 (79%) and 7 (21%) papers, respectively. Supervised learning and both supervised and unsupervised learning were used in 32 (97%) and 1 (3%) studies, respectively. Twenty-seven studies (82%) developed or adapted AI models using images, whereas 5 (15%) had models using both images and clinical features, and 1 (3%) used preoperative and postoperative clinical features without ophthalmic images. Study objectives were categorized into 3 stages of ERM care. Twenty-three studies (70%) implemented AI for diagnosis (stage 1), 1 (3%) identified ERM characteristics (stage 2), and 6 (18%) predicted vision impairment after diagnosis or postoperative vision outcomes (stage 3). No articles studied treatment planning. Three studies (9%) used AI in stages 1 and 2. Of the 16 studies comparing AI performance to human graders (i.e., retinal specialists, general ophthalmologists, and trainees), 10 (63%) reported equivalent or higher performance.
CONCLUSION: Artificial intelligence-driven assessments of ophthalmic images and clinical features demonstrated high performance in detecting ERM, identifying its morphological properties, and predicting visual outcomes following ERM surgery. Future research might consider the validation of algorithms for clinical applications in personal treatment plan development, ideally to identify patients who might benefit most from surgery.
FINANCIAL DISCLOSURES: The author(s) have no proprietary or commercial interest in any materials discussed in this article.
PMID:40182981 | PMC:PMC11964620 | DOI:10.1016/j.xops.2024.100689
MDNN-DTA: a multimodal deep neural network for drug-target affinity prediction
Front Genet. 2025 Mar 20;16:1527300. doi: 10.3389/fgene.2025.1527300. eCollection 2025.
ABSTRACT
Determining drug-target affinity (DTA) is a pivotal step in drug discovery, where in silico methods can significantly improve efficiency and reduce costs. Artificial intelligence (AI), especially deep learning models, can automatically extract high-dimensional features from the biological sequences of drug molecules and target proteins. This technology demonstrates lower complexity in DTA prediction compared to traditional experimental methods, particularly when handling large-scale data. In this study, we introduce a multimodal deep neural network model for DTA prediction, referred to as MDNN-DTA. This model employs Graph Convolutional Networks (GCN) and Convolutional Neural Networks (CNN) to extract features from the drug and protein sequences, respectively. One notable strength of our method is its ability to accurately predict DTA directly from the sequences of the target proteins, obviating the need for protein 3D structures, which are frequently unavailable in drug discovery. To comprehensively extract features from the protein sequence, we leverage an ESM pre-trained model for extracting biochemical features and design a specific Protein Feature Extraction (PFE) block for capturing both global and local features of the protein sequence. Furthermore, a Protein Feature Fusion (PFF) Block is engineered to augment the integration of multi-scale protein features derived from the abovementioned techniques. We then compare MDNN-DTA with other models on the same dataset, conducting a series of ablation experiments to assess the performance and efficacy of each component. The results highlight the advantages and effectiveness of the MDNN-DTA method.
PMID:40182923 | PMC:PMC11965683 | DOI:10.3389/fgene.2025.1527300
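MDNN-DTA's full pipeline (ESM features, the PFE and PFF blocks) goes beyond an abstract-level sketch, but the core dual-branch idea, a GCN over the drug and a 1-D CNN over the protein sequence whose pooled embeddings feed a regression head, can be shown minimally. All sizes, the single-sample setup, and the identity adjacency below are illustrative placeholders.

```python
import torch
import torch.nn as nn

class DTAModel(nn.Module):
    """Dual-branch sketch: GCN over the drug graph, 1-D CNN over the protein sequence."""
    def __init__(self, atom_dim=16, vocab=26, d=64):
        super().__init__()
        self.gcn_w = nn.Linear(atom_dim, d)               # one GCN layer: ReLU(A H W)
        self.aa_emb = nn.Embedding(vocab, 32)             # amino-acid embedding
        self.prot_cnn = nn.Conv1d(32, d, kernel_size=7, padding=3)
        self.head = nn.Sequential(nn.Linear(2 * d, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, atom_feats, adj, prot_tokens):
        z_drug = torch.relu(adj @ self.gcn_w(atom_feats)).mean(dim=0)  # pool atoms
        e = self.aa_emb(prot_tokens).T.unsqueeze(0)       # (1, 32, seq_len)
        z_prot = torch.relu(self.prot_cnn(e)).mean(dim=-1).squeeze(0)  # pool residues
        return self.head(torch.cat([z_drug, z_prot]))     # predicted affinity

model = DTAModel()
n_atoms = 10
adj = torch.eye(n_atoms)                                  # placeholder adjacency
affinity = model(torch.randn(n_atoms, 16), adj, torch.randint(0, 26, (120,)))
print(affinity.shape)                                     # torch.Size([1])
```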
PMPred-AE: a computational model for the detection and interpretation of pathological myopia based on artificial intelligence
Front Med (Lausanne). 2025 Mar 13;12:1529335. doi: 10.3389/fmed.2025.1529335. eCollection 2025.
ABSTRACT
INTRODUCTION: Pathological myopia (PM) is a serious visual impairment that may lead to irreversible visual damage or even blindness. Timely diagnosis and effective management of PM are of great significance. Given the increasing number of myopia cases worldwide, there is an urgent need to develop an automated, accurate, and highly interpretable PM diagnostic technology.
METHODS: We proposed a computational model called PMPred-AE based on EfficientNetV2-L with attention mechanism optimization. In addition, Gradient-weighted class activation mapping (Grad-CAM) technology was used to provide an intuitive and visual interpretation for the model's decision-making process.
RESULTS: The experimental results demonstrated that PMPred-AE achieved excellent performance in automatically detecting PM, with accuracies of 98.50%, 98.25%, and 97.25% in the training, validation, and test datasets, respectively. In addition, PMPred-AE can focus on specific areas of the PM image when making detection decisions.
DISCUSSION: The developed PMPred-AE model is capable of reliably providing accurate PM detection. In addition, Grad-CAM technology was used to provide an intuitive and visual interpretation of the model's decision-making process. This approach provides healthcare professionals with an effective tool for interpreting AI decisions.
PMID:40182849 | PMC:PMC11965940 | DOI:10.3389/fmed.2025.1529335
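Grad-CAM itself is model-agnostic and can be reproduced with two hooks on the last convolutional stage. The sketch below uses torchvision's EfficientNetV2-L with an untrained 2-class head as a stand-in for PMPred-AE; the layer choice and the head are assumptions, not the paper's configuration.

```python
import torch
from torchvision.models import efficientnet_v2_l

model = efficientnet_v2_l(num_classes=2).eval()   # stand-in: PM vs. normal head
feats, grads = {}, {}
layer = model.features[-1]                        # last convolutional stage
layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

x = torch.randn(1, 3, 224, 224)                   # a preprocessed fundus image
model(x)[0, 1].backward()                         # gradient of the "PM" logit

w = grads["g"].mean(dim=(2, 3), keepdim=True)     # GAP of gradients -> channel weights
cam = torch.relu((w * feats["a"]).sum(dim=1))     # weighted activation map
cam = cam / cam.max().clamp(min=1e-8)             # normalize for heatmap overlay
print(cam.shape)                                  # coarse (1, 7, 7) localization map
```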
Artificial intelligence optimizes the standardized diagnosis and treatment of chronic sinusitis
Front Physiol. 2025 Mar 13;16:1522090. doi: 10.3389/fphys.2025.1522090. eCollection 2025.
ABSTRACT
BACKGROUND: Standardised management of chronic sinusitis (CRS) is a challenging but vital area of research. Not only is accurate diagnosis and individualised treatment plans required, but post-treatment chronic disease management is also indispensable. With the development of artificial intelligence (AI), more "AI + medical" application models are emerging. Many AI-assisted systems have been applied to the diagnosis and treatment of CRS, providing valuable solutions for clinical practice.
OBJECTIVE: This study summarises the research progress of various AI-assisted systems applied to the clinical diagnosis and treatment of CRS, focusing on their role in imaging and pathological diagnosis and prognostic prediction and treatment.
METHODS: We used PubMed, Web of Science, and other Internet search engines with "artificial intelligence", "machine learning", and "chronic sinusitis" as keywords to conduct a literature search for studies from the last 7 years. We included literature on the application of AI to CRS diagnosis and treatment, excluded literature outside this scope, and categorized the included studies according to their clinical application to CRS diagnosis, treatment, and prognosis prediction. We provide an overview and summary of current advances in AI for optimizing the diagnosis and treatment of CRS, as well as the difficulties and challenges in promoting standardization of clinical diagnosis and treatment in this area.
RESULTS: Through applications in CRS imaging and pathology diagnosis, personalised medicine and prognosis prediction, AI can significantly reduce turnaround times, lower diagnostic costs and accurately predict disease outcomes. However, a number of challenges remain. These include a lack of AI product standards, standardised data, difficulties in collaboration between different healthcare providers, and the non-interpretability of AI systems. There may also be data privacy issues involved. Therefore, more research and improvements are needed to realise the full potential of AI in the diagnosis and treatment of CRS.
CONCLUSION: Our findings inform the clinical diagnosis and treatment of CRS and the development of AI-assisted clinical diagnosis and treatment systems. We provide recommendations for AI to drive standardisation of CRS diagnosis and treatment.
PMID:40182690 | PMC:PMC11966420 | DOI:10.3389/fphys.2025.1522090