Deep learning
Attention-based Vision Transformer Enables Early Detection of Radiotherapy-Induced Toxicity in Magnetic Resonance Images of a Preclinical Model
Technol Cancer Res Treat. 2025 Jan-Dec;24:15330338251333018. doi: 10.1177/15330338251333018. Epub 2025 Apr 4.
ABSTRACT
INTRODUCTION: Early identification of patients at risk of radiotherapy (RT)-induced toxicity is essential for developing personalized treatments and mitigation plans. Preclinical models with relevant endpoints are critical for systematic evaluation of normal tissue responses. This study aims to determine whether attention-based vision transformers can classify MR images of irradiated and control mice, potentially aiding early identification of individuals at risk of developing toxicity.
METHODS: C57BL/6J mice (n = 14) received 66 Gy of fractionated RT targeting the oral cavity, swallowing muscles, and salivary glands. A control group (n = 15) received no irradiation but was otherwise treated identically. T2-weighted MR images were obtained 3-5 days post-irradiation. Late toxicity, measured as saliva production in individual mice, was assessed at day 105 after treatment. A pre-trained vision transformer (ViT Base 16) was employed to classify the images into control and irradiated groups.
RESULTS: The ViT Base 16 model classified the MR images with an accuracy of 69%, with identical overall performance for control and irradiated animals. The model's predictions correlated significantly with late toxicity (r = 0.65, p < 0.01). One of the attention maps from the ViT model highlighted the irradiated regions of the animals.
CONCLUSIONS: Attention-based vision transformers applied to MRI have the potential to identify, at an early stage, individuals at risk of developing toxicity. This approach may enhance personalized treatment and follow-up strategies in head and neck cancer radiotherapy.
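The reported link between model predictions and late toxicity (r = 0.65) is a standard Pearson correlation. A minimal stdlib sketch, using hypothetical per-mouse values rather than the study's data:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical per-mouse toxicity scores from the classifier vs. day-105
# saliva readouts (higher saliva = less toxicity, hence negative r expected).
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.6]
saliva = [0.2, 0.3, 0.9, 0.4, 0.8, 0.5]
r = pearson_r(scores, saliva)
```

In the study itself, significance was then assessed with a p-value (p < 0.01); that step needs a t-distribution and is omitted here.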
PMID:40183426 | DOI:10.1177/15330338251333018
Brain tumor segmentation and detection in MRI using convolutional neural networks and VGG16
Cancer Biomark. 2025 Mar;42(3):18758592241311184. doi: 10.1177/18758592241311184. Epub 2025 Apr 4.
ABSTRACT
BACKGROUND: In this research, we explore the application of Convolutional Neural Networks (CNNs) to the development of an automated cancer detection system, particularly for MRI images. By leveraging deep learning and image processing techniques, we aim to build a system that can accurately detect and classify tumors in medical images. The system's performance depends on several stages: image enhancement, segmentation, data augmentation, feature extraction, and classification. Through these stages, CNNs can be trained to detect tumors in MRI images with high accuracy. Such an automated system can assist healthcare professionals in diagnosing cancer more quickly and accurately, improving patient outcomes; the integration of deep learning and image processing in medical diagnostics has the potential to make healthcare more efficient and accessible.
METHODS: We examine the failure of semantic segmentation by predicting the mean intersection over union (mIoU), a standard evaluation metric for segmentation tasks. mIoU measures the overlap between the predicted and ground-truth segmentation maps; a low mIoU indicates poor segmentation, suggesting the model has misclassified parts of the image. To improve robustness, we introduce a deep neural network that predicts the mIoU of a segmentation map without access to ground-truth data at test time. This allows the system to estimate how well the model is performing on a given image and to detect potential failure cases early. The method not only predicts the mIoU but also uses the expected mIoU value to detect failure events: if the predicted mIoU falls below a chosen threshold, the system flags a potential failure, prompting further investigation or triggering a safety mechanism, for example preventing an autonomous vehicle from acting on faulty segmentation. The system is also designed to handle imbalanced data, a common challenge in training deep learning models. In autonomous driving, objects such as pedestrians or cyclists appear far less frequently than vehicles or roads, which can bias the model toward the more frequent classes; by leveraging the expected mIoU, the method balances the influence of different object classes so that critical elements in the scene are not overlooked. This approach trains the model to be more accurate while incorporating failure prediction as an additional layer of safety. In summary, we introduce a deep learning model that predicts segmentation performance and detects failure events using the mIoU metric; the technique extends to other domains where segmentation plays a critical role, such as medical imaging or robotics.
RESULTS AND DISCUSSION: Brain tumor detection from MRI images is a critical task in medical image analysis that can significantly impact patient outcomes. By combining traditional image processing techniques with modern deep learning methods, this research creates an automated system that segments and identifies brain tumors with high accuracy and efficiency. Deep learning techniques, particularly CNNs, are highly effective in medical image analysis because they learn complex features from raw image data; automated segmentation offers faster processing, higher accuracy, and more consistent results than manual methods.
CONCLUSION: This research demonstrates the potential of deep learning, particularly CNNs, to automate brain tumor detection from MRI images. By combining traditional image processing with deep learning, we have developed an automated system that quickly and accurately segments tumors from MRI scans, which can assist healthcare professionals in diagnosing and treating brain tumors more efficiently and ultimately improve patient outcomes. The proposed pipeline also includes a CNN-based classification stage to improve accuracy and save computation time. The classification output is an image-level label of either a healthy or a cancerous brain. The CNN employs a number of feed-forward layers and is implemented in Python. The network is pre-trained on the ImageNet database, so only the final layer is retrained. Along with depth, width, and height feature information, the CNN also extracts raw pixel values. We then optimize a gradient descent-based loss function to achieve a high degree of precision, tracking training accuracy, validation accuracy, and validation loss separately. The training accuracy is 98.5%, and the validation accuracy is similarly high with low validation loss.
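The mIoU metric central to the Methods above has a simple definition. A minimal stdlib sketch over flat label lists; the failure threshold here is illustrative (the abstract does not specify one):

```python
def mean_iou(pred, truth, num_classes):
    """Mean intersection-over-union across classes present in pred or truth."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

MIOU_FAILURE_THRESHOLD = 0.5  # illustrative cutoff, not from the paper

def is_failure(pred, truth, num_classes):
    """Flag a segmentation as a potential failure event."""
    return mean_iou(pred, truth, num_classes) < MIOU_FAILURE_THRESHOLD
```

The paper's contribution is predicting this quantity without ground truth at test time; the sketch only shows the metric the predictor is trained to estimate.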
PMID:40183298 | DOI:10.1177/18758592241311184
Code-Free Deep Learning Glaucoma Detection on Color Fundus Images
Ophthalmol Sci. 2025 Jan 30;5(4):100721. doi: 10.1016/j.xops.2025.100721. eCollection 2025 Jul-Aug.
ABSTRACT
OBJECTIVE: Code-free deep learning (CFDL) allows clinicians with no coding experience to build their own artificial intelligence models. This study assesses the performance of CFDL in glaucoma detection from fundus images in comparison to expert-designed models.
DESIGN: Deep learning model development, testing, and validation.
SUBJECTS: A total of 101 442 labeled fundus images from the Rotterdam EyePACS Artificial Intelligence for Robust Glaucoma Screening (AIROGS) dataset were included.
METHODS: Ophthalmology trainees without coding experience designed a CFDL binary model using the Rotterdam EyePACS AIROGS dataset of fundus images (101 442 labeled images) to differentiate glaucoma from normal optic nerves. We compared our results with bespoke models from the literature. We then proceeded to externally validate our model using 2 datasets, the Retinal Fundus Glaucoma Challenge (REFUGE) and the Glaucoma grading from Multi-Modality imAges (GAMMA) at 0.1, 0.3, and 0.5 confidence thresholds.
MAIN OUTCOME MEASURES: Area under the precision-recall curve (AuPRC), sensitivity at 95% specificity (SE@95SP), accuracy, area under the receiver operating characteristic curve (AUC), and positive predictive value (PPV).
RESULTS: The CFDL model showed high performance metrics that were comparable to the bespoke deep learning models. Our single-label classification model had an AuPRC of 0.988, an SE@95SP of 95%, and an accuracy of 91% (compared with 85% SE@95SP for the top bespoke models). Using the REFUGE dataset for external validation, our model had an SE@95SP, AUC, PPV, and accuracy of 83%, 0.960, 73% to 94%, and 95% to 98%, respectively, at the 0.1, 0.3, and 0.5 confidence threshold cutoffs. Using the GAMMA dataset for external validation at the same confidence threshold cutoffs, our model had an SE@95SP, AUC, PPV, and accuracy of 98%, 0.994, 94% to 96%, and 94% to 97%, respectively.
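Sensitivity at 95% specificity (SE@95SP), the headline metric above, can be computed with a plain threshold sweep over classifier scores. A minimal stdlib sketch on hypothetical scores (not the AIROGS data):

```python
def sensitivity_at_specificity(scores, labels, target_sp=0.95):
    """Best sensitivity among thresholds achieving >= target_sp specificity.

    labels: 1 = glaucoma, 0 = normal; a case is called positive if its
    score is >= the threshold.
    """
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    positives = [s for s, l in zip(scores, labels) if l == 1]
    best = 0.0
    for thr in sorted(set(scores)):
        specificity = sum(1 for s in negatives if s < thr) / len(negatives)
        if specificity >= target_sp:
            sensitivity = sum(1 for s in positives if s >= thr) / len(positives)
            best = max(best, sensitivity)
    return best

# Hypothetical well-separated scores: two normals, two glaucoma cases
se95 = sensitivity_at_specificity([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
```

Operating points at fixed specificity matter for screening because false positives drive unnecessary referrals.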
CONCLUSION: The capacity of CFDL models to perform glaucoma screening using fundus images presents a compelling proof of concept, empowering clinicians to explore innovative model designs for broad glaucoma screening in the near future.
FINANCIAL DISCLOSURES: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
PMID:40182983 | PMC:PMC11964632 | DOI:10.1016/j.xops.2025.100721
The Role of Artificial Intelligence in Epiretinal Membrane Care: A Scoping Review
Ophthalmol Sci. 2024 Dec 20;5(4):100689. doi: 10.1016/j.xops.2024.100689. eCollection 2025 Jul-Aug.
ABSTRACT
TOPIC: In ophthalmology, artificial intelligence (AI) demonstrates potential in using ophthalmic imaging across diverse diseases, often matching ophthalmologists' performance. However, the range of machine learning models for epiretinal membrane (ERM) management, which differ in methodology, application, and performance, remains largely unsynthesized.
CLINICAL RELEVANCE: Epiretinal membrane management relies on clinical evaluation and imaging, with surgical intervention considered in cases of significant impairment. AI analysis of ophthalmic images and clinical features could enhance ERM detection, characterization, and prognostication, potentially improving clinical decision-making. This scoping review aims to evaluate the methodologies, applications, and reported performance of AI models in ERM diagnosis, characterization, and prognostication.
METHODS: A comprehensive literature search was conducted across 5 electronic databases (Ovid MEDLINE, EMBASE, Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews, and Web of Science Core Collection) from inception to November 14, 2024. Studies pertaining to AI algorithms in the context of ERM were included. The primary outcomes measured were the reported design, application in ERM management, and performance of each AI model.
RESULTS: Three hundred ninety articles were retrieved, with 33 studies meeting inclusion criteria. There were 30 studies (91%) reporting their training and validation methods. Altogether, 61 distinct AI models were included. OCT scans and fundus photographs were used in 26 (79%) and 7 (21%) papers, respectively. Supervised learning and both supervised and unsupervised learning were used in 32 (97%) and 1 (3%) studies, respectively. Twenty-seven studies (82%) developed or adapted AI models using images, whereas 5 (15%) had models using both images and clinical features, and 1 (3%) used preoperative and postoperative clinical features without ophthalmic images. Study objectives were categorized into 3 stages of ERM care. Twenty-three studies (70%) implemented AI for diagnosis (stage 1), 1 (3%) identified ERM characteristics (stage 2), and 6 (18%) predicted vision impairment after diagnosis or postoperative vision outcomes (stage 3). No articles studied treatment planning. Three studies (9%) used AI in stages 1 and 2. Of the 16 studies comparing AI performance to human graders (i.e., retinal specialists, general ophthalmologists, and trainees), 10 (63%) reported equivalent or higher performance.
CONCLUSION: Artificial intelligence-driven assessments of ophthalmic images and clinical features demonstrated high performance in detecting ERM, identifying its morphological properties, and predicting visual outcomes following ERM surgery. Future research might consider the validation of algorithms for clinical applications in personal treatment plan development, ideally to identify patients who might benefit most from surgery.
FINANCIAL DISCLOSURES: The author(s) have no proprietary or commercial interest in any materials discussed in this article.
PMID:40182981 | PMC:PMC11964620 | DOI:10.1016/j.xops.2024.100689
MDNN-DTA: a multimodal deep neural network for drug-target affinity prediction
Front Genet. 2025 Mar 20;16:1527300. doi: 10.3389/fgene.2025.1527300. eCollection 2025.
ABSTRACT
Determining drug-target affinity (DTA) is a pivotal step in drug discovery, where in silico methods can significantly improve efficiency and reduce costs. Artificial intelligence (AI), especially deep learning models, can automatically extract high-dimensional features from the biological sequences of drug molecules and target proteins. This technology demonstrates lower complexity in DTA prediction compared to traditional experimental methods, particularly when handling large-scale data. In this study, we introduce a multimodal deep neural network model for DTA prediction, referred to as MDNN-DTA. This model employs Graph Convolutional Networks (GCN) and Convolutional Neural Networks (CNN) to extract features from drug molecules and protein sequences, respectively. One notable strength of our method is its ability to accurately predict DTA directly from the sequences of the target proteins, obviating the need for protein 3D structures, which are frequently unavailable in drug discovery. To comprehensively extract features from the protein sequence, we leverage an ESM pre-trained model for extracting biochemical features and design a specific Protein Feature Extraction (PFE) block for capturing both global and local features of the protein sequence. Furthermore, a Protein Feature Fusion (PFF) block is engineered to augment the integration of multi-scale protein features derived from the aforementioned techniques. We then compare MDNN-DTA with other models on the same dataset, conducting a series of ablation experiments to assess the performance and efficacy of each component. The results highlight the advantages and effectiveness of the MDNN-DTA method.
PMID:40182923 | PMC:PMC11965683 | DOI:10.3389/fgene.2025.1527300
PMPred-AE: a computational model for the detection and interpretation of pathological myopia based on artificial intelligence
Front Med (Lausanne). 2025 Mar 13;12:1529335. doi: 10.3389/fmed.2025.1529335. eCollection 2025.
ABSTRACT
INTRODUCTION: Pathological myopia (PM) is a serious visual impairment that may lead to irreversible visual damage or even blindness. Timely diagnosis and effective management of PM are of great significance. Given the increasing number of myopia cases worldwide, there is an urgent need to develop an automated, accurate, and highly interpretable PM diagnostic technology.
METHODS: We proposed a computational model called PMPred-AE based on EfficientNetV2-L with attention mechanism optimization. In addition, Gradient-weighted class activation mapping (Grad-CAM) technology was used to provide an intuitive and visual interpretation for the model's decision-making process.
RESULTS: The experimental results demonstrated that PMPred-AE achieved excellent performance in automatically detecting PM, with accuracies of 98.50%, 98.25%, and 97.25% on the training, validation, and test datasets, respectively. In addition, PMPred-AE focuses on specific areas of the PM image when making detection decisions.
DISCUSSION: The developed PMPred-AE model reliably provides accurate PM detection, and the Grad-CAM technology supplies an intuitive, visual interpretation of its decision-making process, giving healthcare professionals an effective tool with an interpretable AI decision path.
PMID:40182849 | PMC:PMC11965940 | DOI:10.3389/fmed.2025.1529335
Artificial intelligence optimizes the standardized diagnosis and treatment of chronic sinusitis
Front Physiol. 2025 Mar 13;16:1522090. doi: 10.3389/fphys.2025.1522090. eCollection 2025.
ABSTRACT
BACKGROUND: Standardised management of chronic sinusitis (CRS) is a challenging but vital area of research. Not only is accurate diagnosis and individualised treatment plans required, but post-treatment chronic disease management is also indispensable. With the development of artificial intelligence (AI), more "AI + medical" application models are emerging. Many AI-assisted systems have been applied to the diagnosis and treatment of CRS, providing valuable solutions for clinical practice.
OBJECTIVE: This study summarises the research progress of various AI-assisted systems applied to the clinical diagnosis and treatment of CRS, focusing on their role in imaging and pathological diagnosis and prognostic prediction and treatment.
METHODS: We used PubMed, Web of Science, and other Internet search engines with "artificial intelligence," "machine learning," and "chronic sinusitis" as keywords to conduct a literature search for studies from the last 7 years. We included literature on AI applications to CRS diagnosis and treatment, excluded literature outside this scope, and categorized studies according to their clinical application to CRS diagnosis, treatment, and prognosis prediction. We provide an overview and summary of current advances in AI to optimize the diagnosis and treatment of CRS, as well as the difficulties and challenges in promoting standardization of clinical diagnosis and treatment in this area.
RESULTS: Through applications in CRS imaging and pathology diagnosis, personalised medicine and prognosis prediction, AI can significantly reduce turnaround times, lower diagnostic costs and accurately predict disease outcomes. However, a number of challenges remain. These include a lack of AI product standards, standardised data, difficulties in collaboration between different healthcare providers, and the non-interpretability of AI systems. There may also be data privacy issues involved. Therefore, more research and improvements are needed to realise the full potential of AI in the diagnosis and treatment of CRS.
CONCLUSION: Our findings inform the clinical diagnosis and treatment of CRS and the development of AI-assisted clinical diagnosis and treatment systems. We provide recommendations for AI to drive standardisation of CRS diagnosis and treatment.
PMID:40182690 | PMC:PMC11966420 | DOI:10.3389/fphys.2025.1522090
Artificial Intelligence (AI)-Based Computer-Assisted Detection and Diagnosis for Mammography: An Evidence-Based Review of Food and Drug Administration (FDA)-Cleared Tools for Screening Digital Breast Tomosynthesis (DBT)
AI Precis Oncol. 2024 Aug 19;1(4):195-206. doi: 10.1089/aipo.2024.0022. eCollection 2024 Aug.
ABSTRACT
In recent years, the emergence of new-generation deep learning-based artificial intelligence (AI) tools has reignited enthusiasm about the potential of computer-assisted detection (CADe) and diagnosis (CADx) for screening mammography. For screening mammography, digital breast tomosynthesis (DBT) combined with acquired digital 2D mammography or synthetic 2D mammography is widely used throughout the United States. As of this writing in July 2024, there are six Food and Drug Administration (FDA)-cleared AI-based CADe/x tools for DBT. These tools detect suspicious lesions on DBT and provide corresponding scores at the lesion and examination levels that reflect likelihood of malignancy. In this article, we review the evidence supporting the use of AI-based CADe/x for DBT. The published literature on this topic consists of multireader, multicase studies, retrospective analyses, and two "real-world" evaluations. These studies suggest that AI-based CADe/x could lead to improvements in sensitivity without compromising specificity and to improvements in efficiency. However, the overall published evidence is limited and includes only two small postimplementation clinical studies. Prospective studies and careful postimplementation clinical evaluation will be necessary to fully understand the impact of AI-based CADe/x on screening DBT outcomes.
PMID:40182614 | PMC:PMC11963389 | DOI:10.1089/aipo.2024.0022
Facing the challenges of autoimmune pancreatitis diagnosis: The answer from artificial intelligence
World J Gastroenterol. 2025 Mar 28;31(12):102950. doi: 10.3748/wjg.v31.i12.102950.
ABSTRACT
Current diagnosis of autoimmune pancreatitis (AIP) is challenging and often requires combining multiple dimensions. There is a need to explore new methods for diagnosing AIP. The development of artificial intelligence (AI) is evident, and it is believed to have potential in the clinical diagnosis of AIP. This article aims to list the current diagnostic difficulties of AIP, describe existing AI applications, and suggest directions for future AI use in AIP diagnosis.
PMID:40182594 | PMC:PMC11962844 | DOI:10.3748/wjg.v31.i12.102950
Automated inflammatory bowel disease detection using wearable bowel sound event spotting
Front Digit Health. 2025 Mar 13;7:1514757. doi: 10.3389/fdgth.2025.1514757. eCollection 2025.
ABSTRACT
INTRODUCTION: Inflammatory bowel disorders may result in abnormal Bowel Sound (BS) characteristics during auscultation. We employ pattern spotting to detect rare BS events in continuous abdominal recordings using a smart T-shirt with embedded miniaturised microphones. Subsequently, we investigate the clinical relevance of BS spotting in a classification task to distinguish patients diagnosed with inflammatory bowel disease (IBD) from healthy controls.
METHODS: Abdominal recordings were obtained from 24 patients with IBD with varying disease activity and 21 healthy controls across different digestive phases. In total, approximately 281 h of audio data were inspected by expert raters and thereof 136 h were manually annotated for BS events. A deep-learning-based audio pattern spotting algorithm was trained to retrieve BS events. Subsequently, features were extracted around detected BS events and a Gradient Boosting Classifier was trained to classify patients with IBD vs. healthy controls. We further explored classification window size, feature relevance, and the link between BS-based IBD classification performance and IBD activity.
RESULTS: Stratified group K-fold cross-validation experiments yielded a mean area under the receiver operating characteristic curve ≥0.83 regardless of whether BS were manually annotated or detected by the BS spotting algorithm.
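The area under the ROC curve reported above is equivalent to the probability that a randomly chosen positive case outranks a randomly chosen negative one (the Mann-Whitney U formulation). A minimal stdlib sketch, with hypothetical classifier scores:

```python
def auc(scores, labels):
    """ROC AUC as the fraction of positive/negative pairs ranked correctly.

    labels: 1 = IBD, 0 = healthy control; ties count as half a win.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical scores: mostly, but not perfectly, separated groups
example_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

An AUC >= 0.83, as reported, means a patient with IBD receives a higher score than a healthy control in at least 83% of such pairs.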
DISCUSSION: Automated BS retrieval and our BS event classification approach have the potential to support diagnosis and treatment of patients with IBD.
PMID:40182584 | PMC:PMC11965935 | DOI:10.3389/fdgth.2025.1514757
An enhanced lightweight model for apple leaf disease detection in complex orchard environments
Front Plant Sci. 2025 Mar 13;16:1545875. doi: 10.3389/fpls.2025.1545875. eCollection 2025.
ABSTRACT
Automated detection of apple leaf diseases is crucial for predicting and preventing losses and for enhancing apple yields. However, in complex natural environments, factors such as light variations, shading from branches and leaves, and overlapping disease spots often reduce the accuracy of apple disease detection. To address the challenges of detecting small-target diseases on apple leaves against complex backgrounds and the difficulty of mobile deployment, we propose an enhanced lightweight model, ELM-YOLOv8n. To mitigate the high consumption of computational resources in real-time deployment of existing models, we integrate the Fasternet Block into the C2f of the backbone network and neck network, effectively reducing the parameter count and the computational load of the model. To enhance the network's anti-interference ability in complex backgrounds and its capacity to differentiate between similar diseases, we incorporate an Efficient Multi-Scale Attention (EMA) module within the deep structure of the network for in-depth feature extraction. Additionally, we design a detail-enhanced shared convolutional scaling detection head (DESCS-DH) to enable the model to effectively capture edge information of diseases and address issues such as poor object detection performance across different scales. Finally, we replace the CIoU loss function with the NWD loss function, allowing the model to locate and identify small targets more accurately and further enhancing its robustness, thereby facilitating rapid and precise identification of apple leaf diseases. Experimental results demonstrate ELM-YOLOv8n's effectiveness, achieving an F1 score of 94.0% and an mAP50 of 96.7%, a significant improvement over YOLOv8n. Furthermore, the parameter count and computational load are reduced by 44.8% and 39.5%, respectively. The ELM-YOLOv8n model is better suited for deployment on mobile devices while maintaining high accuracy.
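The NWD loss adopted here in place of CIoU is typically the normalized Gaussian Wasserstein distance: each box is modeled as a 2-D Gaussian, and the closed-form 2-Wasserstein distance between the Gaussians is exponentially normalized. A sketch of one common formulation; the constant `c` is a dataset-dependent scale and the value below is illustrative, not the paper's:

```python
import math

def nwd(box1, box2, c=12.8):
    """Normalized Wasserstein similarity between (cx, cy, w, h) boxes.

    Each box is modeled as N((cx, cy), diag(w**2/4, h**2/4)); the squared
    2-Wasserstein distance between two such Gaussians has a closed form.
    Returns a value in (0, 1], 1.0 for identical boxes.
    """
    cx1, cy1, w1, h1 = box1
    cx2, cy2, w2, h2 = box2
    w2_sq = ((cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
             + ((w1 - w2) / 2) ** 2 + ((h1 - h2) / 2) ** 2)
    return math.exp(-math.sqrt(w2_sq) / c)

def nwd_loss(box1, box2, c=12.8):
    """Loss form: 0 for identical boxes, approaching 1 for distant ones."""
    return 1.0 - nwd(box1, box2, c)
```

Unlike IoU, this similarity stays smooth and nonzero for non-overlapping boxes, which is why it helps with very small targets.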
PMID:40182549 | PMC:PMC11965912 | DOI:10.3389/fpls.2025.1545875
CTDA: an accurate and efficient cherry tomato detection algorithm in complex environments
Front Plant Sci. 2025 Mar 13;16:1492110. doi: 10.3389/fpls.2025.1492110. eCollection 2025.
ABSTRACT
INTRODUCTION: Under natural harvesting conditions for cherry tomatoes, robotic harvesting vision faces challenges such as lighting variation, overlap, and occlusion among various environmental factors. To ensure accurate and efficient detection of cherry tomatoes in complex environments, this study proposes a precise, real-time, and robust target detection algorithm, the CTDA model, to support robotic harvesting operations in unstructured environments.
METHODS: The model, based on YOLOv8, introduces a lightweight downsampling method to restructure the backbone network, incorporating adaptive weights and receptive field spatial characteristics to ensure that low-dimensional small target features are not completely lost. By using softpool to replace maxpool in SPPF, a new SPPFS is constructed, achieving efficient feature utilization and richer multi-scale feature fusion. Additionally, by incorporating a dynamic head driven by the attention mechanism, the recognition precision of cherry tomatoes in complex scenarios is enhanced through more effective feature capture across different scales.
RESULTS: CTDA demonstrates good adaptability and robustness in complex scenarios. Its detection accuracy reaches 94.3%, with recall and average precision of 91.5% and 95.3%, respectively, a mAP@0.5:0.95 of 76.5%, and a detection speed of 154.1 frames per second. Compared to YOLOv8, it improves mAP by 2.9% while maintaining detection speed, with a model size of 6.7M.
DISCUSSION: Experimental results validate the effectiveness of the CTDA model in cherry tomato detection under complex environments. While improving detection accuracy, the model also enhances adaptability to lighting variations, occlusion, and dense small target scenarios, and can be deployed on edge devices for rapid detection, providing strong support for automated cherry tomato picking.
PMID:40182545 | PMC:PMC11965914 | DOI:10.3389/fpls.2025.1492110
Breast cancer histopathology image classification using transformer with discrete wavelet transform
Med Eng Phys. 2025 Apr;138:104317. doi: 10.1016/j.medengphy.2025.104317. Epub 2025 Feb 26.
ABSTRACT
Early diagnosis of breast cancer using pathological images is essential to effective treatment. With the development of deep learning techniques, breast cancer histopathology image classification methods based on neural networks have developed rapidly. However, these methods usually capture features in the spatial domain and rarely consider frequency feature distributions, which limits classification performance to some extent. This paper proposes a novel breast cancer histopathology image classification network, called DWNAT-Net, which introduces the Discrete Wavelet Transform (DWT) into the Neighborhood Attention Transformer (NAT). DWT decomposes inputs into different frequency bands through iterative filtering and downsampling, extracting frequency information while retaining spatial information. NAT utilizes Neighborhood Attention (NA) to confine the attention computation to a local neighborhood around each token, enabling efficient modeling of local dependencies. The proposed method was evaluated on the BreakHis and BACH datasets, yielding impressive image-level recognition accuracy rates: 99.66% on BreakHis and 91.25% on BACH, demonstrating competitive performance compared to state-of-the-art methods.
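The DWT that DWNAT-Net feeds into its attention blocks splits an input into frequency bands. The simplest instance is a one-level 1-D Haar transform (in 2-D it is applied separably over rows and then columns to produce the LL, LH, HL, and HH sub-bands); a minimal stdlib sketch:

```python
import math

def haar_dwt_1level(row):
    """One level of the 1-D Haar DWT for an even-length sequence.

    Returns (approximation, detail): low-pass averages and high-pass
    differences of adjacent pairs, scaled by 1/sqrt(2) so that total
    energy (sum of squares) is preserved.
    """
    s = math.sqrt(2)
    approx = [(row[i] + row[i + 1]) / s for i in range(0, len(row), 2)]
    detail = [(row[i] - row[i + 1]) / s for i in range(0, len(row), 2)]
    return approx, detail

# A flat signal has zero detail (no high-frequency content)
a, d = haar_dwt_1level([1.0, 1.0, 2.0, 2.0])
```

Iterating the transform on the approximation coefficients yields the coarser frequency bands the abstract refers to.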
PMID:40180530 | DOI:10.1016/j.medengphy.2025.104317
Multi-scale feature fusion model for real-time Blood glucose monitoring and hyperglycemia prediction based on wearable devices
Med Eng Phys. 2025 Apr;138:104312. doi: 10.1016/j.medengphy.2025.104312. Epub 2025 Mar 1.
ABSTRACT
Accurate monitoring of blood glucose levels and the prediction of hyperglycemia are critical for the management of diabetes and the enhancement of medical efficiency. The primary challenge lies in uncovering the correlations among physiological information, nutritional intake, and other features, and addressing the issue of scale disparity among these features, in addition to considering the impact of individual variances on the model's accuracy. This paper introduces a universal, wearable device-assisted, multi-scale feature fusion model for real-time blood glucose monitoring and hyperglycemia prediction. It aims to more effectively capture the local correlations between diverse features and their inherent temporal relationships, overcoming the challenges of physiological data redundancy at large time scales and the incompleteness of nutritional intake data at smaller time scales. Furthermore, we have devised a personalized tuner strategy to enhance the model's accuracy and stability by continuously collecting personal data from users of the wearable devices to fine-tune the generic model, thereby accommodating individual differences and providing patients with more precise health management services. The model's performance, assessed using public datasets, indicates that the real-time monitoring error in terms of Mean Squared Error (MSE) is 0.22 mmol/L, with a prediction accuracy for hyperglycemia occurrences of 96.75%. The implementation of the personalized tuner strategy yielded an average improvement rate of 1.96% on individual user datasets. This study on blood glucose monitoring and hyperglycemia prediction, facilitated by wearable devices, assists users in better managing their blood sugar levels and holds significant clinical application prospects.
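As a point of caution when reading the error figure above: an MSE strictly has squared units (mmol²/L²), so a value quoted in mmol/L reads like a root-mean-squared error. A minimal sketch of both, on hypothetical glucose readings:

```python
def mse(pred, truth):
    """Mean squared error; units are the square of the input units."""
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred)

def rmse(pred, truth):
    """Root-mean-squared error; same units as the inputs (e.g. mmol/L)."""
    return mse(pred, truth) ** 0.5

# Hypothetical predicted vs. reference glucose values in mmol/L
predicted = [5.1, 6.8, 7.9, 5.4]
reference = [5.0, 7.0, 8.0, 5.6]
err_mse, err_rmse = mse(predicted, reference), rmse(predicted, reference)
```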
PMID:40180525 | DOI:10.1016/j.medengphy.2025.104312
Using Explainable Machine Learning to Predict the Irritation and Corrosivity of Chemicals on Eyes and Skin
Toxicol Lett. 2025 Apr 1:S0378-4274(25)00057-8. doi: 10.1016/j.toxlet.2025.03.008. Online ahead of print.
ABSTRACT
Contact with specific chemicals often results in corrosive and irritative responses in the eyes and skin, playing a pivotal role in assessing the potential hazards of personal care products, cosmetics, and industrial chemicals to human health. While traditional animal testing can provide valuable information, its high costs, ethical controversies, and significant demand for animals limit its extensive use, particularly during preliminary screening stages. To address these issues, we adopted a computational modeling approach, integrating 3,316 experimental data points on eye irritation and 3,080 data points on skin irritation, to develop various machine learning and deep learning models. Under the evaluation of the external validation set, the best-performing models for the two tasks achieved balanced accuracies (BAC) of 73.0% and 75.1%, respectively. Furthermore, interpretability analyses were conducted at the dataset level, molecular level, and atomic level to provide insights into the prediction outcomes. Analysis of substructure frequencies identified structural alert fragments within the datasets. This information serves as a reference for identifying potentially irritating chemicals. Additionally, a user-friendly visualization interface was developed, enabling non-specialists to easily predict eye and skin irritation potential. In summary, our study provides a new avenue for the assessment of irritancy potential in chemicals used in pesticides, cosmetics, and ophthalmic drugs.
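The balanced accuracy (BAC) reported above is the mean of per-class recalls, which avoids rewarding a model that only predicts the majority class. A minimal implementation:

```python
def balanced_accuracy(y_true, y_pred):
    """Balanced accuracy: average of per-class recalls. Unlike plain
    accuracy, a trivial majority-class predictor scores only 1/k for
    k classes."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        idx = [i for i, t in enumerate(y_true) if t == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)

# toy labels: 1 = irritant, 0 = non-irritant
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
print(balanced_accuracy(y_true, y_pred))  # 0.625
```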
PMID:40180199 | DOI:10.1016/j.toxlet.2025.03.008
Multi-Class Brain Malignant Tumor Diagnosis in Magnetic Resonance Imaging Using Convolutional Neural Networks
Brain Res Bull. 2025 Apr 1:111329. doi: 10.1016/j.brainresbull.2025.111329. Online ahead of print.
ABSTRACT
To reduce the clinical misdiagnosis rate of glioblastoma (GBM), primary central nervous system lymphoma (PCNSL), and brain metastases (BM), which are common malignant brain tumors with similar radiological features, we propose a new CNN-based model, FoTNet. The model integrates a frequency-based channel attention layer and Focal Loss to address the class imbalance issue caused by the limited data available for PCNSL. A multi-center MRI dataset was constructed by collecting and integrating data from Zhejiang University School of Medicine's Sir Run Run Shaw Hospital, along with public datasets from UPENN and TCGA. The dataset includes T1-weighted contrast-enhanced (T1-CE) MRI images from 58 GBM, 82 PCNSL, and 269 BM cases, which were divided into training and testing sets in a 5:2 ratio. FoTNet achieved a classification accuracy of 92.5% and an average AUC of 0.9754 on the test set, significantly outperforming existing machine learning and deep learning methods in distinguishing between GBM, PCNSL, and BM. Through multiple validations, FoTNet has proven to be an effective and robust tool for accurately classifying these brain tumors, providing strong support for preoperative diagnosis and assisting clinicians in making more informed treatment decisions.
PMID:40180191 | DOI:10.1016/j.brainresbull.2025.111329
An enhanced CNN-Bi-transformer based framework for detection of neurological illnesses through neurocardiac data fusion
Sci Rep. 2025 Apr 3;15(1):11379. doi: 10.1038/s41598-025-96052-0.
ABSTRACT
Classical diagnostic approaches frequently rely on self-reported symptoms or clinician observations, which makes mental health conditions difficult to examine given their subjective and complicated nature. In this work, we offer an innovative methodology for predicting disorders such as epilepsy, sleep disorders, bipolar disorder, eating disorders, and depression using a multimodal deep learning framework built on neurocardiac data fusion. The proposed framework combines MEG, EEG, and ECG signals to create a more comprehensive picture of brain and cardiac function in affected individuals. The multimodal deep learning approach uses an integrated CNN-Bi-Transformer, CardioNeuroFusionNet, which can process multiple types of inputs simultaneously, allowing for the fusion of various modalities and improving the predictive representation. The framework was tested on data from the Deep BCI Scalp Database and further validated on the Kymata Atlas dataset to assess its generalizability. The model achieved promising results, with high accuracy (98.54%) and sensitivity (97.77%) in predicting neurological and psychiatric conditions. Neurocardiac data fusion was found to provide additional insight into the relationship between brain and cardiac function in neurological conditions, which could lead to more accurate diagnosis and personalized treatment options. The suggested method overcomes the shortcomings of earlier studies, which tended to concentrate on single-modality data, lacked thorough neurocardiac data fusion, and relied on less advanced machine learning algorithms. Comprehensive experimental findings, showing an average accuracy improvement of 2.72%, demonstrate that the proposed approach outperforms other cutting-edge AI techniques and generalizes effectively across diverse datasets.
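The fusion of modality-specific representations can be illustrated with a small sketch: each modality is encoded into an embedding, and the embeddings are combined with softmax attention weights before a shared classifier head. The scoring scheme below is purely illustrative; the paper learns these weights inside CardioNeuroFusionNet.

```python
import numpy as np

def fuse_modalities(embeddings):
    """Attention-weighted fusion of per-modality embeddings: score each
    modality, softmax the scores, and take the weighted sum. A toy
    analogue of fusing EEG/MEG/ECG streams."""
    E = np.stack(embeddings)                   # (n_modalities, d)
    scores = E.mean(axis=1)                    # one scalar score per modality
    w = np.exp(scores) / np.exp(scores).sum()  # softmax attention weights
    return w @ E                               # (d,) fused representation

eeg = np.array([0.2, 0.8, 0.1, 0.5])
ecg = np.array([0.9, 0.1, 0.4, 0.3])
meg = np.array([0.3, 0.6, 0.2, 0.7])
fused = fuse_modalities([eeg, ecg, meg])
print(fused.shape)  # (4,)
```

Because the weights form a convex combination, each fused coordinate stays within the range spanned by the modalities, which keeps the fused representation on the same scale as its inputs.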
PMID:40181122 | DOI:10.1038/s41598-025-96052-0
Efficient fault diagnosis in rolling bearings using a lightweight hybrid model
Sci Rep. 2025 Apr 3;15(1):11514. doi: 10.1038/s41598-025-96285-z.
ABSTRACT
To address the issue of low efficiency in feature extraction and model training when traditional deep learning methods handle long time-series data, this paper proposes a Time-Series Lightweight Transformer (TSL-Transformer) model. According to the data characteristics of bearing fault diagnosis tasks, the model makes lightweight improvements to the traditional Transformer model, and focuses on adjusting the encoder module (core feature extraction module), introducing multi-head attention mechanism and feedforward neural network to efficiently extract complex features of vibration signals. Considering the rich temporal features present in vibration signals, a Long Short-Term Memory (LSTM) module is introduced in parallel to the encoder module of the improved lightweight Transformer model. This enhancement further strengthens the model's ability to capture temporal features, thereby improving diagnostic accuracy. Experimental results demonstrate that the proposed TSL-Transformer model achieves a fault diagnosis accuracy of 99.2% on the CWRU dataset. Through dimensionality reduction and visualization analysis using the t-SNE method, the effectiveness of different network structures within the proposed TSL-Transformer model is elucidated.
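The encoder module the TSL-Transformer adapts is built on scaled dot-product attention; multi-head attention runs this operation in parallel over several learned projections (projections omitted here for brevity). A minimal self-attention sketch over a toy vibration-signal sequence:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V -- the core
    operation of a Transformer encoder layer."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 4))   # 8 time steps of a signal, dim 4
out = scaled_dot_product_attention(x, x, x)  # self-attention
print(out.shape)  # (8, 4)
```

In the paper's model, the output of such an encoder is combined with a parallel LSTM branch so that both attention-based and recurrent temporal features feed the diagnosis head.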
PMID:40181056 | DOI:10.1038/s41598-025-96285-z
Difficulty aware programming knowledge tracing via large language models
Sci Rep. 2025 Apr 3;15(1):11475. doi: 10.1038/s41598-025-96540-3.
ABSTRACT
Knowledge Tracing (KT) assesses students' mastery of specific knowledge concepts and predicts their problem-solving abilities by analyzing their interactions with intelligent tutoring systems. Although recent years have seen significant improvements in tracing accuracy with the introduction of deep learning and graph neural network techniques, existing research has not sufficiently examined the impact of difficulty on knowledge state. The text understanding difficulty and knowledge concept difficulty of programming problems are crucial for students' responses; accurately assessing these two types of difficulty and applying them to knowledge state prediction is therefore a key challenge. To address this challenge, we propose Difficulty-aware Programming Knowledge Tracing via Large Language Models (DPKT) to extract the text understanding difficulty and knowledge concept difficulty of programming problems. Specifically, we analyze the relationship between knowledge concept difficulty and text understanding difficulty using an attention mechanism, allowing for dynamic updates to students' knowledge states. The model combines an update gate mechanism with a graph attention network, significantly improving the assessment accuracy of programming problem difficulty and the model's ability to track the spatiotemporal evolution of knowledge states. Experimental results demonstrate that the model performs excellently across datasets in various programming languages, validating its application value in programming education. The model provides an innovative solution for programming knowledge tracing and offers educators a powerful tool to promote personalized learning.
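The update-gate mechanism mentioned above can be sketched as a GRU-style gated blend of the previous knowledge state with evidence from a new interaction. This is an illustrative analogue only: the weight matrices are random stand-ins, and the actual DPKT model couples the gate with a graph attention network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def update_knowledge_state(h, x, Wz, Uz, Wh, Uh):
    """GRU-style update: a gate z decides how much of the previous
    knowledge state h to replace with a candidate state computed from
    the new interaction features x."""
    z = sigmoid(Wz @ x + Uz @ h)       # update gate: how much to revise
    h_cand = np.tanh(Wh @ x + Uh @ h)  # candidate new state
    return (1 - z) * h + z * h_cand    # gated interpolation

rng = np.random.default_rng(2)
d = 4
h = rng.standard_normal(d)   # previous knowledge state
x = rng.standard_normal(d)   # features of the new response
Wz, Uz, Wh, Uh = (rng.standard_normal((d, d)) * 0.1 for _ in range(4))
h_new = update_knowledge_state(h, x, Wz, Uz, Wh, Uh)
print(h_new.shape)  # (4,)
```

When the gate z is near zero the student's state is left almost untouched; when it is near one, the new interaction largely overwrites it.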
PMID:40181055 | DOI:10.1038/s41598-025-96540-3
An interpretable deep learning model for the accurate prediction of mean fragmentation size in blasting operations
Sci Rep. 2025 Apr 3;15(1):11515. doi: 10.1038/s41598-025-96005-7.
ABSTRACT
Fragmentation size is an important indicator for evaluating blasting effectiveness. To address the limitations of conventional blasting fragmentation size prediction methods in terms of prediction accuracy and applicability, this study proposes an NRBO-CNN-LSSVM model for predicting mean fragmentation size, which integrates Convolutional Neural Networks (CNN), Least Squares Support Vector Machines (LSSVM), and the Newton-Raphson Optimizer (NRBO). The study is based on a database containing 105 samples derived from both previous research and field collection. Additionally, several machine learning prediction models, including CNN-LSSVM, CNN, LSSVM, Support Vector Machine (SVM), and Support Vector Regression (SVR), are developed for comparative analysis. The results showed that the NRBO-CNN-LSSVM model achieved remarkable prediction accuracy on the training dataset, with a coefficient of determination (R2) as high as 0.9717 and a root mean square error (RMSE) as low as 0.0285. On the test set, the model maintained high prediction accuracy, with an R2 value of 0.9105 and an RMSE of 0.0403. SHapley Additive exPlanations (SHAP) analysis revealed that the modulus of elasticity (E) was a key variable influencing the prediction of mean fragmentation size. Partial Dependence Plots (PDP) analysis further disclosed a significant positive correlation between the modulus of elasticity (E) and mean fragmentation size. In contrast, a distinct negative correlation was observed between the powder factor (Pf) and mean fragmentation size. To enhance the convenience of the model in practical applications, we developed an interactive Graphical User Interface (GUI), allowing users to input relevant variables and obtain instant prediction results.
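The PDP analysis described above has a simple recipe: fix one input feature at each value of a grid, average the model's predictions over the dataset, and plot the averages against the grid. A sketch with a toy surrogate model (the linear model and sample count here are illustrative, not the study's trained NRBO-CNN-LSSVM):

```python
import numpy as np

def partial_dependence(model, X, feature, grid):
    """One-feature partial dependence: for each grid value v, set the
    chosen column of every sample to v and average the predictions."""
    pd_vals = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature] = v
        pd_vals.append(model(Xv).mean())
    return np.array(pd_vals)

# toy surrogate: prediction rises with feature 0, falls with feature 1
toy_model = lambda X: 0.5 * X[:, 0] - 0.3 * X[:, 1]
rng = np.random.default_rng(3)
X = rng.random((105, 2))
grid = np.linspace(0, 1, 5)
pd0 = partial_dependence(toy_model, X, 0, grid)
print(np.all(np.diff(pd0) > 0))  # True: monotone positive dependence
```

A rising PDP curve is how the positive correlation between modulus of elasticity and mean fragmentation size would show up; a falling curve corresponds to the negative correlation reported for powder factor.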
PMID:40181054 | DOI:10.1038/s41598-025-96005-7