Deep learning
Prediction of drug-target interactions based on substructure subsequences and cross-public attention mechanism
PLoS One. 2025 May 30;20(5):e0324146. doi: 10.1371/journal.pone.0324146. eCollection 2025.
ABSTRACT
Drug-target interactions (DTIs) play a critical role in drug discovery and repurposing. Deep learning-based methods for predicting DTIs are far more efficient than wet-lab experiments. Extracting original and substructural features from drugs and proteins is key to improving the accuracy of DTI prediction, and the integration of multi-feature information and the effective representation of interaction data also affect prediction precision. We therefore propose SSCPA-DTI, a drug-target interaction prediction model based on substructural subsequences and a cross-public attention mechanism. The model takes drug SMILES sequences and protein sequences as inputs and employs a multi-feature information mining module (MIMM) to extract original and substructural features of DTIs. Substructural information provides detailed insight into local molecular structure, while original features help the model understand the overall molecular architecture. A cross-public attention (CPA) module then first integrates the extracted original and substructural features and next extracts interaction information between the protein and the drug, addressing the limited accuracy and weak interpretability that arise from simple concatenation without interactive integration of feature information. Experiments on three public datasets demonstrate superior performance over baseline models.
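As a rough, hedged illustration of the cross-attention idea described in this abstract (not the authors' CPA module itself; every dimension and name below is a placeholder assumption), a minimal PyTorch sketch in which drug substructure tokens attend to protein residue tokens could look like this:

import torch
import torch.nn as nn

class CrossAttention(nn.Module):
    # Drug tokens query protein tokens; swapping the arguments gives the
    # symmetric protein-to-drug direction.
    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, drug_feats, prot_feats):
        fused, _ = self.attn(query=drug_feats, key=prot_feats, value=prot_feats)
        return fused  # interaction-aware drug representations

drug = torch.randn(2, 60, 128)   # (batch, drug substructure tokens, dim)
prot = torch.randn(2, 300, 128)  # (batch, protein residue tokens, dim)
print(CrossAttention()(drug, prot).shape)  # torch.Size([2, 60, 128])

Pooling the fused tokens from both attention directions would then feed the final interaction classifier.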
PMID:40445972 | DOI:10.1371/journal.pone.0324146
Deep learning reconstruction of free-breathing, diffusion-weighted imaging of the liver: A comparison with conventional free-breathing acquisition
PLoS One. 2025 May 30;20(5):e0320362. doi: 10.1371/journal.pone.0320362. eCollection 2025.
ABSTRACT
This study aimed to compare image quality and solid focal liver lesion (FLL) assessments between free-breathing diffusion-weighted imaging with deep learning reconstruction (FB-DL-DWI) and conventional free-breathing DWI (FB-C-DWI) in patients undergoing clinically indicated liver MRI. Our retrospective study included 199 patients who underwent 3-T liver MRI with both FB-DL-DWI and FB-C-DWI. DWI was performed using a single-shot, spin-echo, echo-planar, fat-suppressed technique during free breathing with matched parameters. Three radiologists independently evaluated subjective image quality for the two sequences. The apparent diffusion coefficient (ADC) was measured in 15 liver regions. Four radiologists analyzed 138 solid FLLs from 60 patients for the presence of diffusion restriction, lesion conspicuity, and sharpness. Among the 199 patients, 110 (55.3%) had underlying chronic liver disease (CLD). FB-DL-DWI was 43.0% faster than FB-C-DWI (119.4 ± 2.2 sec vs. 209.6 ± 3.7 sec). Furthermore, FB-DL-DWI scored higher than FB-C-DWI on all subjective image quality parameters (all P < 0.001), although FB-DL-DWI showed a stronger artificial sensation than FB-C-DWI (P < 0.001). In patients with CLD, FB-DL-DWI likewise showed better subjective image quality (all P < 0.001) than FB-C-DWI. ADC values ranged from 1.06-1.12 × 10⁻³ mm²/sec in FB-DL-DWI and 1.06-1.20 × 10⁻³ mm²/sec in FB-C-DWI. The 138 analyzed lesions comprised 116 malignant (61 hepatocellular carcinomas, 3 cholangiocarcinomas, 52 metastases) and 22 benign lesions. The four readers identified 88, 93, 93, and 105 diffusion-restricted FLLs on FB-DL-DWI and 84, 80, 98, and 95 on FB-C-DWI. FB-DL-DWI (75.9-90.5%) showed comparable or higher diffusion-restriction rates for malignant FLLs than FB-C-DWI (68.1-82.8%). FB-DL-DWI also provided higher lesion-edge sharpness and lesion conspicuity than FB-C-DWI. Overall, FB-DL-DWI provided better image quality, lesion sharpness, and conspicuity for solid FLLs, with a shorter acquisition time than FB-C-DWI. Therefore, FB-DL-DWI may replace FB-C-DWI as the preferred method for liver DWI.
PMID:40445963 | DOI:10.1371/journal.pone.0320362
EODA: A three-stage efficient outlier detection approach using Boruta-RF feature selection and enhanced KNN-based clustering algorithm
PLoS One. 2025 May 30;20(5):e0322738. doi: 10.1371/journal.pone.0322738. eCollection 2025.
ABSTRACT
Outlier detection is essential for identifying unusual patterns or observations that deviate significantly from a dataset's normal behavior. With the rapid growth of data science, anomalies and outliers have become more prevalent, and they can disrupt system modeling and parameter estimation, leading to inaccurate results. Deep learning-based outlier detection methods have recently gained significant attention, but their performance is often limited by challenges in parameter selection and nearest neighbor search. To overcome these limitations, we propose a three-stage Efficient Outlier Detection Approach (EODA) that not only detects outliers with high accuracy but also emphasizes dataset characteristics. In the first stage, we apply a feature selection algorithm based on the Boruta method and random forests, reducing data size by selecting the most relevant attributes against the highest Z-score among the shadow features. In the second stage, we improve the K-nearest-neighbors algorithm to identify nearest neighbors more accurately in the clustering phase. The third stage then efficiently identifies the most significant outliers within the clustered data. We evaluate EODA on eight UCI machine-learning repository datasets. The results demonstrate the effectiveness of the approach, which achieves a precision of 63.07%, recall of 82.49%, and F1-score of 64.53%, outperforming existing techniques in the field.
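As a hedged sketch of the stage-one shadow-feature idea (a simplified single pass; Boruta proper iterates this comparison with statistical tests, and the EODA implementation itself is not reproduced here), one could write:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def boruta_like_select(X, y, random_state=0):
    rng = np.random.default_rng(random_state)
    shadows = rng.permuted(X, axis=0)        # shuffle each column independently
    X_aug = np.hstack([X, shadows])          # real features + shadow copies
    rf = RandomForestClassifier(n_estimators=300, random_state=random_state)
    rf.fit(X_aug, y)
    imp = rf.feature_importances_
    n = X.shape[1]
    threshold = imp[n:].max()                # strongest shadow importance
    return np.where(imp[:n] > threshold)[0]  # features beating every shadow

X = np.random.rand(200, 10)
y = (X[:, 0] + X[:, 3] > 1).astype(int)      # toy target using features 0 and 3
print(boruta_like_select(X, y))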
PMID:40445940 | DOI:10.1371/journal.pone.0322738
XLLC-Net: A lightweight and explainable CNN for accurate lung cancer classification using histopathological images
PLoS One. 2025 May 30;20(5):e0322488. doi: 10.1371/journal.pone.0322488. eCollection 2025.
ABSTRACT
Lung cancer imaging plays a crucial role in early diagnosis and treatment, where machine learning and deep learning have significantly advanced the accuracy and efficiency of disease classification. This study introduces the Explainable and Lightweight Lung Cancer Net (XLLC-Net), a streamlined convolutional neural network designed for classifying lung cancer from histopathological images. Using the LC25000 dataset, which includes three lung cancer classes and two colon cancer classes, we focused solely on the three lung cancer classes for this study. XLLC-Net effectively discerns complex disease patterns within these classes. The model consists of four convolutional layers and contains merely 3 million parameters, considerably reducing its computational footprint compared to existing deep learning models. This compact architecture facilitates efficient training, completing each epoch in just 60 seconds. Remarkably, XLLC-Net achieves a classification accuracy of 99.62% ± 0.16%, with precision, recall, and F1 score of 99.33% ± 0.30%, 99.67% ± 0.30%, and 99.70% ± 0.30%, respectively. Furthermore, the integration of explainable AI techniques, such as saliency maps and Grad-CAM, enhances the interpretability of the model, offering clear visual insights into its decision-making process. Our results underscore the potential of lightweight DL models in medical imaging, providing high accuracy and rapid training while ensuring model transparency and reliability.
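For orientation, a four-convolutional-block classifier in the spirit described here might look as follows in PyTorch; the channel widths and input size are illustrative assumptions, not the published XLLC-Net configuration:

import torch
import torch.nn as nn

class SmallLungCNN(nn.Module):
    def __init__(self, n_classes: int = 3):
        super().__init__()
        chans = [3, 32, 64, 128, 256]
        blocks = []
        for c_in, c_out in zip(chans, chans[1:]):   # four conv blocks
            blocks += [nn.Conv2d(c_in, c_out, 3, padding=1),
                       nn.ReLU(inplace=True),
                       nn.MaxPool2d(2)]
        self.features = nn.Sequential(*blocks)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(256, n_classes))

    def forward(self, x):
        return self.head(self.features(x))

x = torch.randn(1, 3, 224, 224)     # one RGB histopathology tile
print(SmallLungCNN()(x).shape)      # torch.Size([1, 3])

Grad-CAM or saliency maps can then be computed on the last convolutional block to visualize which tissue regions drive each prediction.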
PMID:40445896 | DOI:10.1371/journal.pone.0322488
Integrating Motor Unit Activity With Deep Learning for Real-Time, Simultaneous and Proportional Wrist Angle and Grasp Force Estimation
IEEE Trans Biomed Eng. 2025 May 30;PP. doi: 10.1109/TBME.2025.3575252. Online ahead of print.
ABSTRACT
OBJECTIVE: Myoelectric prostheses offer great promise in enabling amputees to perform daily activities independently. However, existing neural interfaces generally cannot simultaneously and proportionally decode kinematics and kinetics in real time, nor can they directly interpret neural commands. We thus propose a novel framework that integrates motor unit activity with deep learning and demonstrate its efficiency in the real-time, simultaneous, and proportional estimation of wrist angles and grasp forces.
METHODS: This framework utilizes real-time high-density surface electromyography decomposition to identify motor neuron discharges, followed by neural drive computation integrated with a modular Long Short-Term Memory-based neural network. Ten subjects participated in the experiments involving wrist pronation/supination, flexion/extension, and abduction/adduction, with varying grasp force.
RESULTS: The proposed framework significantly outperformed five baseline methods, achieving an nRMSE of 13.6% and 11.1% and an R² of 73.2% and 76.8% for wrist angle and grasp force, respectively. We further characterized the spatial distribution and recruitment patterns of motor units during movement generation.
CONCLUSION: These findings highlight the feasibility of integrating neural drive insights with deep learning methods to improve simultaneous and proportional estimation performance.
SIGNIFICANCE: The proposed framework has the potential to enhance the independence and quality of life of prosthetic users by enabling them to perform a wider range of tasks with improved precision and control over both kinematics and kinetics.
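A minimal sketch of the recurrent-regression idea in the METHODS above, assuming precomputed per-window neural-drive features (the paper's modular LSTM network is more elaborate; all sizes are placeholders):

import torch
import torch.nn as nn

class DriveLSTM(nn.Module):
    def __init__(self, n_features: int = 16, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2, batch_first=True)
        self.angle_head = nn.Linear(hidden, 1)   # wrist angle
        self.force_head = nn.Linear(hidden, 1)   # grasp force

    def forward(self, x):
        out, _ = self.lstm(x)        # (batch, time, hidden)
        last = out[:, -1]            # most recent time step for real-time use
        return self.angle_head(last), self.force_head(last)

x = torch.randn(8, 40, 16)           # batch of 40-step feature windows
angle, force = DriveLSTM()(x)
print(angle.shape, force.shape)      # torch.Size([8, 1]) torch.Size([8, 1])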
PMID:40445821 | DOI:10.1109/TBME.2025.3575252
Phantom-Based Ultrasound-ECG Deep Learning Framework for Prospective Cardiac Computed Tomography
IEEE Trans Biomed Eng. 2025 May 30;PP. doi: 10.1109/TBME.2025.3575268. Online ahead of print.
ABSTRACT
OBJECTIVE: We present the first multimodal deep learning framework combining ultrasound (US) and electrocardiography (ECG) data to predict cardiac quiescent periods (QPs) for optimized computed tomography angiography (CTA) gating.
METHODS: The framework integrates a 3D convolutional neural network (CNN) for US data and an artificial neural network (ANN) for ECG data. A dynamic heart motion phantom, replicating diverse cardiac conditions, including arrhythmias, was used to validate the framework. Performance was assessed across varying QP lengths, cardiac segments, and motions to simulate real-world conditions.
RESULTS: The multimodal US-ECG 3D CNN-ANN framework demonstrated improved QP prediction accuracy over single-modality ECG-only gating, achieving 96.87% accuracy versus 85.56%, including scenarios involving arrhythmic conditions. Notably, the framework showed higher accuracy for longer QP durations (100-200 ms) than for shorter ones (<100 ms), while still outperforming single-modality methods, which often fail to detect shorter quiescent phases, especially in arrhythmic cases. Consistently outperforming single-modality approaches, it achieved reliable QP prediction across cardiac regions, including the whole phantom, the interventricular septum, and cardiac wall regions. Analysis of QP prediction accuracy across cardiac segments showed an average accuracy of 92% in clinically relevant echocardiographic views, highlighting the framework's robustness.
CONCLUSION: Combining US and ECG data using a multimodal framework improves QP prediction accuracy under variable cardiac motion, particularly in arrhythmic conditions.
SIGNIFICANCE: Since even small errors in cardiac CTA can result in non-diagnostic scans, the potential benefits of multimodal gating may improve diagnostic scan rates in patients with high and variable heart rates and arrhythmias.
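A hedged sketch of the late-fusion idea in the METHODS above: a 3D CNN branch for ultrasound clips concatenated with an ANN branch for ECG features. Every layer size here is an assumption rather than the paper's configuration:

import torch
import torch.nn as nn

class USECGFusion(nn.Module):
    def __init__(self):
        super().__init__()
        self.us = nn.Sequential(                      # 3D CNN branch (US)
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(8, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten())
        self.ecg = nn.Sequential(nn.Linear(32, 16), nn.ReLU())  # ANN branch
        self.head = nn.Linear(16 + 16, 1)             # quiescence score

    def forward(self, us_clip, ecg_feats):
        z = torch.cat([self.us(us_clip), self.ecg(ecg_feats)], dim=1)
        return torch.sigmoid(self.head(z))            # probability of a QP

us = torch.randn(2, 1, 16, 64, 64)    # (batch, channel, frames, H, W)
ecg = torch.randn(2, 32)              # per-beat ECG feature vector
print(USECGFusion()(us, ecg).shape)   # torch.Size([2, 1])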
PMID:40445820 | DOI:10.1109/TBME.2025.3575268
The value of artificial intelligence in PSMA PET: a pathway to improved efficiency and results
Q J Nucl Med Mol Imaging. 2025 May 30. doi: 10.23736/S1824-4785.25.03640-4. Online ahead of print.
ABSTRACT
INTRODUCTION: This systematic review investigates the potential of artificial intelligence (AI) in improving the accuracy and efficiency of prostate-specific membrane antigen positron emission tomography (PSMA PET) scans for detecting metastatic prostate cancer.
EVIDENCE ACQUISITION: A comprehensive literature search was conducted across Medline, Embase, and Web of Science, adhering to PRISMA guidelines. Key search terms included "artificial intelligence," "machine learning," "deep learning," "prostate cancer," and "PSMA PET." The PICO framework guided the selection of studies focusing on AI's application in evaluating PSMA PET scans for staging lymph node and distant metastasis in prostate cancer patients. Inclusion criteria prioritized original English-language articles published up to October 2024, excluding studies using non-PSMA radiotracers, those analyzing only the CT component of PSMA PET-CT, studies focusing solely on intra-prostatic lesions, and non-original research articles.
EVIDENCE SYNTHESIS: The review included 22 studies, with a mix of prospective and retrospective designs. AI algorithms employed included machine learning (ML), deep learning (DL), and convolutional neural networks (CNNs). The studies explored various applications of AI, including improving diagnostic accuracy, sensitivity, differentiation from benign lesions, standardization of reporting, and predicting treatment response. Results showed high sensitivity (62% to 97%) and accuracy (AUC up to 98%) in detecting metastatic disease, but also significant variability in positive predictive value (39.2% to 66.8%).
CONCLUSIONS: AI demonstrates significant promise in enhancing PSMA PET scan analysis for metastatic prostate cancer, offering improved efficiency and potentially better diagnostic accuracy. However, the variability in performance and the "black box" nature of some algorithms highlight the need for larger prospective studies, improved model interpretability, and the continued involvement of experienced nuclear medicine physicians in interpreting AI-assisted results. AI should be considered a valuable adjunct, not a replacement, for expert clinical judgment.
PMID:40444499 | DOI:10.23736/S1824-4785.25.03640-4
Deep learning-based applicator selection between Syed and T&O in high-dose-rate brachytherapy for locally advanced cervical cancer: a retrospective study
Phys Med Biol. 2025 May 29. doi: 10.1088/1361-6560/addea5. Online ahead of print.
ABSTRACT
OBJECTIVE: High-dose-rate (HDR) brachytherapy is integral to the standard of care for locally advanced cervical cancer (LACC). Currently, the selection of brachytherapy applicators relies on physicians' clinical experience, which can lead to variability in treatment quality and outcomes. This study presents a deep learning-based decision-support tool for selecting between interstitial Syed applicators and intracavitary tandem-and-ovoids (T&O) applicators.
APPROACH: The network architecture consists of six 3D convolution-pooling-ReLU blocks followed by a fully connected block. The network takes three input channels: a 3D contour mask covering the clinical target volume (CTV), organs at risk (OARs), and the central tandem, plus two 3D distance maps encoding the distances of CTV and OAR voxels from the tandem's central axis. The network outputs a probability score indicating the suitability of Syed applicators. Binary cross-entropy loss combined with L1 regularization was used for network training.
MAIN RESULTS: A retrospective study was performed on 184 LACC patients with 422 applicator insertions. The data were divided into three sets: Dataset-1 (163 patients, 372 insertions) for training and hyperparameter tuning, and Dataset-2 (17 patients, 36 insertions) and Dataset-3 (four complex cases, 14 insertions) for testing. Five-fold cross-validation was performed on Dataset-1, during which hyperparameters were tuned heuristically to optimize classification accuracy across folds; the highest average accuracy was 92.1 ± 3.8%. Using these hyperparameters, the final model was trained on the full Dataset-1 and evaluated on the two independent test sets, achieving 96.0% accuracy, 90.9% sensitivity, and 97.4% specificity.
SIGNIFICANCE: These results demonstrate the potential of our model as a quality assurance tool in LACC HDR brachytherapy, providing feedback on physicians' applicator choice and supporting continuous improvement in decision-making. Future work will focus on collecting more data for further validation and extending its application for prospective applicator selection.
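The training objective named in the APPROACH above is straightforward to write down; a hedged PyTorch sketch (the regularization weight is an assumption):

import torch
import torch.nn.functional as F

def bce_with_l1(logits, targets, model, l1_lambda=1e-5):
    # Binary cross-entropy on the Syed-suitability score (targets are 0/1
    # floats) plus an L1 penalty summed over all network weights.
    bce = F.binary_cross_entropy_with_logits(logits, targets)
    l1 = sum(p.abs().sum() for p in model.parameters())
    return bce + l1_lambda * l1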
PMID:40444332 | DOI:10.1088/1361-6560/addea5
QID2: An Image-Conditioned Diffusion Model for Q-space Up-sampling of DWI Data
Comput Diffus MRI. 2025;15171:119-131. doi: 10.1007/978-3-031-86920-4_11. Epub 2025 Apr 18.
ABSTRACT
We propose an image-conditioned diffusion model to estimate high angular resolution diffusion weighted imaging (DWI) from a low angular resolution acquisition. Our model, which we call QID2, takes as input a set of low angular resolution DWI data and uses this information to estimate the DWI data associated with a target gradient direction. We leverage a U-Net architecture with cross-attention to preserve the positional information of the reference images, further guiding the target image generation. We train and evaluate QID2 on single-shell DWI samples curated from the Human Connectome Project (HCP) dataset. Specifically, we sub-sample the HCP gradient directions to produce low angular resolution DWI data and train QID2 to reconstruct the missing high angular resolution samples. We compare QID2 with two state-of-the-art GAN models. Our results demonstrate that QID2 not only achieves higher-quality generated images but also consistently outperforms the baseline methods in downstream tensor estimation across multiple metrics and in generalizing to the downsampling scenario during testing. Taken together, this study highlights the potential of diffusion models, and QID2 in particular, for q-space up-sampling, thus offering a promising toolkit for clinical and research applications.
PMID:40444168 | PMC:PMC12122016 | DOI:10.1007/978-3-031-86920-4_11
TFKT V2: task-focused knowledge transfer from natural images for computed tomography perceptual image quality assessment
J Med Imaging (Bellingham). 2025 Sep;12(5):051805. doi: 10.1117/1.JMI.12.5.051805. Epub 2025 May 28.
ABSTRACT
PURPOSE: The accurate assessment of computed tomography (CT) image quality is crucial for ensuring diagnostic reliability while minimizing radiation dose. Radiologists' evaluations are time-consuming and labor-intensive. Existing automated approaches often require large CT datasets with predefined image quality assessment (IQA) scores, which often do not align well with clinical evaluations. We aim to develop a reference-free, automated method for CT IQA that closely reflects radiologists' evaluations, reducing the dependency on large annotated datasets.
APPROACH: We propose Task-Focused Knowledge Transfer (TFKT), a deep learning-based IQA method leveraging knowledge transfer from task-similar natural image datasets. TFKT incorporates a hybrid convolutional neural network-transformer model, enabling accurate quality predictions by learning from natural image distortions with human-annotated mean opinion scores. The model is pre-trained on natural image datasets and fine-tuned on low-dose computed tomography perceptual image quality assessment data to ensure task-specific adaptability.
RESULTS: Extensive evaluations demonstrate that the proposed TFKT method effectively predicts IQA scores aligned with radiologists' assessments on in-domain datasets and generalizes well to out-of-domain clinical pediatric CT exams. The model achieves robust performance without requiring high-dose reference images and can assess the quality of ∼30 CT image slices per second.
CONCLUSIONS: The proposed TFKT approach provides a scalable, accurate, and reference-free solution for CT IQA. The model bridges the gap between traditional and deep learning-based IQA, offering clinically relevant and computationally efficient assessments applicable to real-world clinical settings.
PMID:40444137 | PMC:PMC12116730 | DOI:10.1117/1.JMI.12.5.051805
Reduction of photobleaching effects in photoacoustic imaging using noise agnostic, platform-flexible deep-learning methods
J Biomed Opt. 2025 Dec;30(Suppl 3):S34102. doi: 10.1117/1.JBO.30.S3.S34102. Epub 2025 May 28.
ABSTRACT
SIGNIFICANCE: Molecular photoacoustic (PA) imaging with exogenous dyes faces a significant challenge: photobleaching of the dye can compromise tissue visualization, particularly in 3D imaging. Addressing this limitation could revolutionize the field by enabling safer, more reliable imaging and by improving real-time visualization, quantitative analysis, and clinical decision-making in molecular PA imaging applications such as image-guided surgery.
AIM: We tackle photobleaching in molecular PA imaging by introducing a platform-flexible deep learning framework that enhances the signal-to-noise ratio (SNR) of single-laser-pulse data, preserving contrast and signal integrity without requiring averaging of signals from multiple laser pulses.
APPROACH: The generative deep learning network was trained on an LED-illuminated PA image dataset and tested on acoustic-resolution PA microscopy images acquired with single-laser-pulse illumination. In vitro and ex vivo samples were first tested to demonstrate SNR improvement; a 3D-scanning experiment with an ICG-filled tube was then conducted to show the technique's ability to reduce the impact of photobleaching during PA imaging.
RESULTS: Our generative deep learning model outperformed traditional non-learning, filter-based algorithms and the U-Net deep learning network when tested on in vitro and ex vivo single-pulse-illuminated images, showing superior SNR (93.54 ± 6.07 and 92.77 ± 10.74, vs. 86.35 ± 3.97 and 84.52 ± 11.82 with U-Net, for kidney and tumor, respectively) and contrast-to-noise ratio (11.82 ± 4.42 and 9.9 ± 4.41, vs. 7.59 ± 0.82 and 6.82 ± 2.12 with U-Net, for kidney and tumor, respectively). The use of the cGAN with single-pulse rapid imaging has the potential to limit photobleaching (9.51 ± 3.69% with the cGAN vs. 35.14 ± 5.38% with prolonged laser exposure averaging 30 pulses), enabling accurate, quantitative imaging suitable for real-time implementation and improved clinical decision support.
CONCLUSIONS: We demonstrate the potential of a platform-flexible, generative deep learning-based approach to mitigate the effects of photobleaching in PA imaging by enhancing the SNR of single-pulse-illuminated data, thereby improving image quality and preserving contrast in real time.
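For reference, the SNR and CNR figures of merit quoted above are typically computed from signal and background regions of interest; definitions vary across papers, so this is one common convention rather than necessarily the authors':

import numpy as np

def snr(signal_roi: np.ndarray, background_roi: np.ndarray) -> float:
    # Mean signal over background noise (standard deviation).
    return signal_roi.mean() / background_roi.std()

def cnr(signal_roi: np.ndarray, background_roi: np.ndarray) -> float:
    # Mean signal-background difference over background noise.
    return (signal_roi.mean() - background_roi.mean()) / background_roi.std()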
PMID:40443946 | PMC:PMC12118878 | DOI:10.1117/1.JBO.30.S3.S34102
Mapping research on ICT addiction: a comprehensive review of Internet, smartphone, social media, and gaming addictions
Front Psychol. 2025 May 15;16:1578457. doi: 10.3389/fpsyg.2025.1578457. eCollection 2025.
ABSTRACT
INTRODUCTION: The use of information and communication technologies such as the Internet, smartphones, social media, and gaming has gained significant popularity in recent years. While the benefits are immense and ICTs have become essential in people's daily lives, the inappropriate use of these technologies has led to addiction, causing negative consequences in family, academic, and work environments.
METHODS: This study analyzes existing research related to ICT addiction (Internet, smartphone, social media, and gaming), reviewing relevant contributions. Historical trends, regions, relevance, factors, and instruments were analyzed to map out the existing research on ICT addiction.
RESULTS AND DISCUSSION: The findings revealed that although the number of relevant studies has grown in recent years, there is still a lack of attention on ICT addiction and its relationship with psychological factors, social factors, physical factors, phenomenological experiences, and treatment/prevention approaches. In this regard, psychology scholars should consider appropriate methods to raise awareness about ICT addiction and emphasize the need for an in-depth understanding of the meaning, context, and practices associated with Internet, smartphone, social media, and gaming addiction.
PMID:40443730 | PMC:PMC12120558 | DOI:10.3389/fpsyg.2025.1578457
Advances in Electrocardiogram-Based Artificial Intelligence Reveal Multisystem Biomarkers
J Clin Exp Cardiolog. 2025;16(2):935. Epub 2025 Mar 24.
ABSTRACT
As Artificial Intelligence (AI) plays an increasingly prominent role in society, its application in clinical cardiology is gaining traction by providing innovative diagnostic, prognostic, and therapeutic solutions. The electrocardiogram (ECG), a ubiquitous diagnostic tool in cardiology, has emerged as the leading data source for Deep Learning (DL) applications. A recent study from our group used an ECG-based DL model to identify cardiac wall motion abnormalities and outperformed expert human interpretation. Motivated by this work and that of many others, we aim to discuss advances, limitations, future directions, and equity considerations in DL models for ECG-based AI applications.
PMID:40443717 | PMC:PMC12121951
Dimensionality Reduction for Improving Out-of-Distribution Detection in Medical Image Segmentation
Uncertain Safe Util Mach Learn Med Imaging (2023). 2023 Oct;14291:147-156. doi: 10.1007/978-3-031-44336-7_15. Epub 2023 Oct 7.
ABSTRACT
Clinically-deployed deep learning-based segmentation models are known to fail on data outside of their training distributions. While clinicians review the segmentations, these models do tend to perform well in most instances, which could exacerbate automation bias. Therefore, it is critical to detect out-of-distribution images at inference to warn the clinicians that the model likely failed. This work applies the Mahalanobis distance post hoc to the bottleneck features of a Swin UNETR model that segments the liver on T1-weighted magnetic resonance imaging. By reducing the dimensions of the bottleneck features with principal component analysis, images the model failed on were detected with high performance and minimal computational load. Specifically, the proposed technique achieved 92% area under the receiver operating characteristic curve and 94% area under the precision-recall curve and can run in seconds on a central processing unit.
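The described pipeline is compact enough to sketch directly; a minimal version using scikit-learn (the PCA dimensionality and the decision threshold are assumptions left to validation data):

import numpy as np
from sklearn.decomposition import PCA

def fit_ood_detector(train_feats, n_components=32):
    # Reduce bottleneck features with PCA, then fit a Gaussian to them.
    pca = PCA(n_components=n_components).fit(train_feats)
    z = pca.transform(train_feats)
    mu = z.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(z, rowvar=False))
    return pca, mu, cov_inv

def mahalanobis_score(feat, pca, mu, cov_inv):
    d = (pca.transform(feat[None]) - mu)[0]
    return float(np.sqrt(d @ cov_inv @ d))   # larger = more likely OOD

At inference, an image whose score exceeds a threshold chosen on held-out in-distribution data would trigger a warning to the clinician.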
PMID:40443712 | PMC:PMC12120689 | DOI:10.1007/978-3-031-44336-7_15
Do Sharpness-Based Optimizers Improve Generalization in Medical Image Analysis?
IEEE Access. 2025;13:82972-82985. doi: 10.1109/ACCESS.2025.3568641. Epub 2025 May 9.
ABSTRACT
Effective clinical deployment of deep learning models in healthcare demands high generalization performance to ensure accurate diagnosis and treatment planning. In recent years, significant research has focused on improving the generalization of deep learning models by regularizing the sharpness of the loss landscape. Among the optimization approaches that explicitly minimize sharpness, Sharpness-Aware Minimization (SAM) has shown potential in enhancing generalization performance on general domain image datasets. This success has led to the development of several advanced sharpness-based algorithms aimed at addressing the limitations of SAM, such as Adaptive SAM, Surrogate-Gap SAM, Weighted SAM, and Curvature Regularized SAM. These sharpness-based optimizers have shown improvements in model generalization compared to conventional stochastic gradient descent optimizers and their variants on general domain image datasets, but they have not been thoroughly evaluated on medical images. This work provides a review of recent sharpness-based methods for improving the generalization of deep learning networks and evaluates the methods' performance on three medical image datasets, including breast ultrasound, chest X-ray, and colon histopathology images. Our findings indicate that the initial SAM method successfully enhances the generalization of various deep learning models. While Adaptive SAM improves generalization of convolutional neural networks, it fails to do so for vision transformers. Other sharpness-based optimizers, however, do not demonstrate consistent results. The results reveal that contrary to findings in the non-medical domain, SAM is the only recommended sharpness-based optimizer that consistently improves generalization in medical image analysis, and further research is necessary to refine the variants of SAM to enhance generalization performance in this field.
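For readers unfamiliar with SAM, one update consists of an ascent step to a sharpness-maximizing weight perturbation followed by a descent step from the perturbed point (Foret et al.); a hedged single-batch sketch, assuming every parameter receives a gradient:

import torch

def sam_step(model, loss_fn, x, y, base_opt, rho=0.05):
    base_opt.zero_grad()
    loss_fn(model(x), y).backward()          # gradient at current weights
    grads = [p.grad.detach().clone() for p in model.parameters()]
    norm = torch.sqrt(sum((g ** 2).sum() for g in grads)) + 1e-12
    eps = [rho * g / norm for g in grads]
    with torch.no_grad():                    # ascend to the local "worst case"
        for p, e in zip(model.parameters(), eps):
            p.add_(e)
    base_opt.zero_grad()
    loss_fn(model(x), y).backward()          # gradient at the perturbed point
    with torch.no_grad():                    # restore weights before updating
        for p, e in zip(model.parameters(), eps):
            p.sub_(e)
    base_opt.step()

Here base_opt would typically be SGD with momentum, and rho = 0.05 is the default from the original paper; adaptive variants such as ASAM rescale eps per parameter.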
PMID:40443707 | PMC:PMC12121992 | DOI:10.1109/ACCESS.2025.3568641
Uncertainty Quantification for Conditional Treatment Effect Estimation under Dynamic Treatment Regimes
Proc Mach Learn Res. 2024 Dec;259:248-266.
ABSTRACT
In medical decision-making, clinicians must choose between different time-varying treatment strategies. Counterfactual prediction via g-computation enables comparison of alternative outcome distributions under such treatment strategies. While deep learning can better model high-dimensional data with complex temporal dependencies, incorporating model uncertainty into predicted conditional counterfactual distributions remains challenging. We propose a principled approach to model uncertainty in deep learning implementations of g-computation, using approximate Bayesian posterior predictive distributions of counterfactual outcomes obtained via variational dropout and deep ensembles. We evaluate these methods by comparing their counterfactual predictive calibration and performance in decision-making tasks, using two simulated datasets from mechanistic models and a real-world sepsis dataset. Our findings suggest that the proposed uncertainty quantification approach improves both calibration and decision-making performance, particularly in minimizing risks of worst-case adverse clinical outcomes under alternative dynamic treatment regimes. To our knowledge, this is the first work to propose and compare multiple uncertainty quantification methods in machine learning models of g-computation for estimating conditional treatment effects under dynamic treatment regimes.
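The two uncertainty devices named above reduce to a few lines each; a hedged sketch (sampling counts and model handling are assumptions):

import torch

def mc_dropout_samples(model, x, n_samples=50):
    # Variational dropout at test time: keep dropout layers stochastic
    # and draw repeated predictions.
    model.train()
    with torch.no_grad():
        return torch.stack([model(x) for _ in range(n_samples)])

def ensemble_samples(models, x):
    # Deep ensemble: one prediction per independently trained model.
    with torch.no_grad():
        return torch.stack([m(x) for m in models])

The spread of the stacked counterfactual predictions then approximates the posterior predictive distribution used for calibration and decision-making.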
PMID:40443560 | PMC:PMC12121963
Mobile based deep CNN model for maize leaf disease detection and classification
Plant Methods. 2025 May 29;21(1):72. doi: 10.1186/s13007-025-01386-5.
ABSTRACT
Maize is the most produced crop in the world, exceeding wheat and rice. However, its yield is often reduced by various leaf diseases, and early identification of maize leaf disease through an easily accessible tool is needed to increase yields. Researchers have recently attempted to detect and classify maize leaf diseases using deep learning algorithms; however, to the best of our knowledge, nearly all studies have concentrated on offline models for detecting maize diseases. Such models are not easily accessible to individuals and do not provide immediate feedback or monitoring. In this study, we therefore developed a novel real-time, user-friendly mobile application for maize leaf disease detection and classification. The VGG16, AlexNet, and ResNet50 models were implemented and their performance on maize disease detection and classification compared. A total of 4188 images of the blight, common rust, grey leaf spot, and healthy classes were used to train each model. Data augmentation was applied to enlarge the dataset, which also helps reduce overfitting, and weighted cross-entropy loss was employed to mitigate class imbalance. After training, VGG16 achieved 95% testing accuracy, AlexNet 91%, and ResNet50 72%; VGG16 thus outperformed the other models. Consequently, we deployed the VGG16 model in a mobile application to provide a real-time disease detection and classification tool for farmers, extension officers, agribusiness managers, and policy-makers. The application will enhance early disease detection and decision-making and contribute to better crop management and food security.
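The weighted cross-entropy device mentioned above weights each class inversely to its frequency; a hedged sketch (the per-class counts here are placeholders summing to 4188, not the paper's actual split):

import torch
import torch.nn as nn

counts = torch.tensor([1100., 1300., 600., 1188.])  # images per class (assumed)
weights = counts.sum() / (len(counts) * counts)     # inverse-frequency weights
criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, 4)             # batch of 8 over 4 classes
labels = torch.randint(0, 4, (8,))
print(criterion(logits, labels))       # rare classes contribute more loss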
PMID:40442806 | DOI:10.1186/s13007-025-01386-5
Unified estimation of rice canopy leaf area index over multiple periods based on UAV multispectral imagery and deep learning
Plant Methods. 2025 May 30;21(1):73. doi: 10.1186/s13007-025-01398-1.
ABSTRACT
BACKGROUND: Rice is one of the world's major food crops, and monitoring its growth is of great significance for safeguarding food security and promoting sustainable agricultural development. Leaf area index (LAI) is a key indicator for assessing the growth condition and yield potential of rice, but traditional methods of obtaining LAI suffer from low efficiency and large errors. With the development of remote sensing technology, UAV-borne multispectral remote sensing combined with deep learning provides a new way to estimate rice LAI efficiently and accurately.
RESULTS: In this study, a multispectral camera mounted on a UAV was used to acquire rice canopy images, and rice LAI was estimated in a unified manner over multiple periods using multilayer perceptron (MLP) and convolutional neural network (CNN) models. The CNN model, fed feature-screened five-band reflectance images (490, 550, 670, 720, and 850 nm), exhibited high estimation accuracy across the growth stages. Compared with the traditional MLP model using multiple vegetation indices as inputs, the CNN model could better exploit the raw multispectral image data and effectively avoid vegetation-index saturation, improving accuracy by 4.89%, 5.76%, 10.96%, and 1.84% in the tillering, jointing, booting, and heading periods, respectively, and overall accuracy by 6.01%. Moreover, the accuracies of both models (MLP and CNN) changed noticeably before versus after variable screening; variable screening contributed substantially to improving rice LAI estimation.
CONCLUSIONS: UAV multispectral remote sensing combined with CNNs provides an efficient and accurate method for unified multi-period estimation of rice LAI, and the model's generalization and adaptability were further improved by rational variable screening and data enhancement. This study provides technical support for precision agriculture and a more accurate solution for rice growth monitoring. Future studies could explore additional feature extraction and variable screening methods and optimize the model structure to further improve accuracy and stability.
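A minimal sketch of a CNN regressor over five-band reflectance patches as described above (490/550/670/720/850 nm); the layer widths and patch size are illustrative assumptions:

import torch
import torch.nn as nn

class LAIRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(5, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 1))            # scalar LAI estimate

    def forward(self, x):
        return self.net(x)

patch = torch.randn(4, 5, 64, 64)      # (batch, spectral bands, H, W)
print(LAIRegressor()(patch).shape)     # torch.Size([4, 1])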
PMID:40442795 | DOI:10.1186/s13007-025-01398-1
Gaussian random fields as an abstract representation of patient metadata for multimodal medical image segmentation
Sci Rep. 2025 May 29;15(1):18810. doi: 10.1038/s41598-025-03393-x.
ABSTRACT
The growing rate of chronic wound occurrence, especially in patients with diabetes, has become a concerning trend. Chronic wounds are difficult and costly to treat and have become a serious burden on health care systems worldwide. Innovative deep learning methods for detecting and monitoring such wounds have the potential to reduce the impact on patients and clinicians. We present a novel multimodal segmentation method that introduces patient metadata into the training workflow, with the patient data expressed as Gaussian random fields. Our results indicate that the proposed method improved performance when utilising multiple models, each trained on a different metadata category. On the Diabetic Foot Ulcer Challenge 2022 test set, compared with the baseline results (intersection over union = 0.4670, Dice similarity coefficient = 0.5908), we demonstrate improvements of +0.0220 and +0.0229 in intersection over union and Dice similarity coefficient, respectively. This is the first study to focus on integrating patient data into a chronic wound segmentation workflow. Our results show significant performance gains when training individual models on specific metadata categories, followed by average merging of the prediction masks using distance transforms. All source code for this study is available at: https://github.com/mmu-dermatology-research/multimodal-grf.
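One plausible construction of such a metadata field is to seed white noise deterministically from the category and smooth it into a Gaussian random field supplied as an extra input channel; the paper's exact construction may differ (see the linked repository), so treat this as a hedged sketch:

import numpy as np
from scipy.ndimage import gaussian_filter

def metadata_grf(category_id: int, shape=(256, 256), sigma=16.0):
    rng = np.random.default_rng(category_id)      # deterministic per category
    field = gaussian_filter(rng.standard_normal(shape), sigma)
    return (field - field.mean()) / field.std()   # standardized extra channel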
PMID:40442267 | DOI:10.1038/s41598-025-03393-x
Hierarchical Information-guided robotic grasp detection
Sci Rep. 2025 May 29;15(1):18821. doi: 10.1038/s41598-025-03313-z.
ABSTRACT
With the advancement of deep learning, robotic grasping has seen widespread application across many fields, becoming a critical component in enhancing automation. Accurate and efficient grasping not only significantly boosts productivity but also ensures safety and reliability in complex, dynamic environments. However, current approaches, particularly those based on convolutional neural networks (CNNs), often neglect the hierarchical information inherent in the data and struggle in complex environments with abundant background clutter. Moreover, these methods fail to capture the long-range dependencies and non-local self-similarity that are critical for accurate grasp detection. To address these issues, we propose GraspFormer, a novel method for robotic grasp detection. GraspFormer features an encoder-decoder framework with a Grasp Transformer Block designed to model long-range dependencies while avoiding background interference. Our approach also introduces hierarchical information-guided self-attention (HIGSA) and an adaptive deep channel modulator (DCM) to enhance feature interactions and competition. Extensive experiments demonstrate that GraspFormer achieves performance comparable to state-of-the-art methods. The code is available at https://github.com/shine793/Hierarchical-Information-guided-Robotic-Grasp-Detection.
PMID:40442259 | DOI:10.1038/s41598-025-03313-z