Deep learning
Interobserver ground-truth variability limits performance of automated glioblastoma segmentation on [18F]FET PET
EJNMMI Phys. 2025 Jun 6;12(1):54. doi: 10.1186/s40658-025-00767-y.
ABSTRACT
BACKGROUND: Positron emission tomography (PET) with the tracer O-(2-[18F]fluoroethyl)-L-tyrosine ([18F]FET) is of growing importance in the management of glioblastoma for the estimation of tumor extent and the extraction of diagnostic and prognostic parameters. Robust and accurate glioblastoma segmentation methods are essential to maximize the benefits of this imaging modality. Given the importance of setting the foreground threshold during manual tumor delineation, this study investigates the added value of incorporating such prior knowledge to guide the automated segmentation and improve performance. Two segmentation networks were trained based on the nnU-Net guidelines: one with the [18F]FET PET image as sole input, and one with an additional input channel for the threshold map. For the latter, we investigate the benefit of manually obtained thresholds and explore automated prediction and generation of such maps. A fully automated pipeline was constructed by selecting the best performing threshold prediction approach and cascading this with the tumor segmentation model.
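For illustration, a minimal PyTorch sketch (not the authors' code; all shapes and layer sizes are assumptions) of how a threshold map can be supplied to a segmentation network as a second input channel:

```python
import torch
import torch.nn as nn

# Minimal sketch: a segmentation network that accepts the [18F]FET PET volume
# and a threshold map as two input channels, in the spirit of the paper's
# two-channel nnU-Net configuration.
class TwoChannelSegNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(2, 16, kernel_size=3, padding=1),  # 2 channels: PET + threshold map
            nn.ReLU(inplace=True),
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.head = nn.Conv3d(32, 1, kernel_size=1)  # binary tumor-mask logits

    def forward(self, pet, threshold_map):
        x = torch.cat([pet, threshold_map], dim=1)  # stack along the channel axis
        return self.head(self.encoder(x))

# Example shapes: one patient volume of 96^3 voxels.
pet = torch.randn(1, 1, 96, 96, 96)
thr = torch.randn(1, 1, 96, 96, 96)  # e.g. voxelwise foreground-threshold values
logits = TwoChannelSegNet()(pet, thr)
print(logits.shape)  # torch.Size([1, 1, 96, 96, 96])
```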
RESULTS: The proposed two-channel network shows increased performance with guidance of threshold maps originating from the same reader whose ground-truth tumor label the prediction is compared to (DSC = 0.901). When threshold maps were generated by a different reader, performance reverted to levels comparable to the one-channel network and inter-reader variability. The proposed full pipeline achieves results on par with current state of the art (DSC = 0.807).
CONCLUSIONS: Incorporating a threshold map can significantly improve tumor segmentation performance when it aligns well with the ground-truth label. However, the current inability to reliably reproduce these maps (whether manually or automatically) or the ground-truth tumor labels restricts the achievable accuracy for automated glioblastoma segmentation on [18F]FET PET, highlighting the need for more consistent definitions of such ground-truth delineations.
PMID:40478497 | DOI:10.1186/s40658-025-00767-y
Hypothalamus and intracranial volume segmentation at the group level by use of a Gradio-CNN framework
Int J Comput Assist Radiol Surg. 2025 Jun 6. doi: 10.1007/s11548-025-03438-6. Online ahead of print.
ABSTRACT
PURPOSE: This study aimed to develop and evaluate a graphical user interface (GUI) for the automated segmentation of the hypothalamus and intracranial volume (ICV) in brain MRI scans. The interface was designed to facilitate efficient and accurate segmentation for research applications, with a focus on accessibility and ease of use for end-users.
METHODS: We developed a web-based GUI using the Gradio library, integrating deep learning-based segmentation models trained on annotated brain MRI scans. The model utilizes a U-Net architecture to delineate the hypothalamus and ICV. The GUI allows users to upload high-resolution MRI scans, visualize the segmentation results, calculate hypothalamic volume and ICV, and manually correct individual segmentation results. To ensure widespread accessibility, we deployed the interface using ngrok, allowing users to access the tool via a shared link. As an example of the universality of the approach, the tool was applied to a group of 90 patients with Parkinson's disease (PD) and 39 controls.
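As a hedged illustration of the workflow described above, the following sketch wires a hypothetical segment_hypothalamus() function into a Gradio interface; the function body, labels, and voxel size are placeholders, not the authors' tool:

```python
import gradio as gr
import numpy as np

# Placeholder for a U-Net inference call on the uploaded MRI volume; here we
# fake a voxel count so the sketch stays self-contained and runnable.
def segment_hypothalamus(nifti_file):
    n_voxels = int(np.random.randint(800, 1200))
    voxel_volume_mm3 = 1.0  # assumed 1 mm isotropic resolution
    return f"Estimated hypothalamic volume: {n_voxels * voxel_volume_mm3:.0f} mm^3"

demo = gr.Interface(
    fn=segment_hypothalamus,
    inputs=gr.File(label="High-resolution T1 MRI (NIfTI)"),
    outputs=gr.Textbox(label="Segmentation result"),
    title="Hypothalamus segmentation (sketch)",
)

if __name__ == "__main__":
    demo.launch()  # the authors exposed the running app externally via ngrok
```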
RESULTS: The GUI demonstrated high usability and efficiency in segmenting the hypothalamus and the ICV, with no significant difference in normalized hypothalamic volume observed between PD patients and controls, consistent with previously published findings. The average processing time per patient volume was 18 s for the hypothalamus and 44 s for the ICV segmentation on a 6 GB NVIDIA GeForce GTX 1060 GPU. The ngrok-based deployment allowed for seamless access across different devices and operating systems, with an average connection time of less than 5 s.
CONCLUSION: The developed GUI provides a powerful and accessible tool for applications in neuroimaging. The combination of the intuitive interface, accurate deep learning-based segmentation, and easy deployment via ngrok addresses the need for user-friendly tools in brain MRI analysis. This approach has the potential to streamline workflows in neuroimaging research.
PMID:40478471 | DOI:10.1007/s11548-025-03438-6
Artificial Intelligence Empowers Solid-State Batteries for Material Screening and Performance Evaluation
Nanomicro Lett. 2025 Jun 6;17(1):287. doi: 10.1007/s40820-025-01797-y.
ABSTRACT
Solid-state batteries are widely recognized as the next-generation energy storage devices with high specific energy, high safety, and high environmental adaptability. However, the research and development of solid-state batteries are resource-intensive and time-consuming due to their complex chemical environment, rendering performance prediction arduous and delaying large-scale industrialization. Artificial intelligence serves as an accelerator for solid-state battery development by enabling efficient material screening and performance prediction. This review systematically examines how the latest machine learning (ML) algorithms can be used to mine extensive material databases and accelerate the discovery of high-performance cathode, anode, and electrolyte materials suitable for solid-state batteries. Furthermore, the use of ML technology to accurately estimate and predict key performance indicators in the solid-state battery management system is discussed, including state of charge, state of health, remaining useful life, and battery capacity. Finally, we summarize the main challenges encountered in current research, such as data quality issues and poor code portability, and propose possible solutions and development paths. These will provide clear guidance for future research and technological iteration.
PMID:40478305 | DOI:10.1007/s40820-025-01797-y
Deep Learning Reveals Liver MRI Features Associated With PNPLA3 I148M in Steatotic Liver Disease
Liver Int. 2025 Jul;45(7):e70164. doi: 10.1111/liv.70164.
ABSTRACT
BACKGROUND: Steatotic liver disease (SLD) is the most common liver disease worldwide, affecting 30% of the global population. It is strongly associated with the interplay of genetic and lifestyle-related risk factors. The genetic variant accounting for the largest fraction of SLD heritability is PNPLA3 I148M, which is carried by 23% of the western population and increases the risk of SLD two- to three-fold. However, identification of variant carriers is not part of routine clinical care, which prevents patients from receiving personalised care.
METHODS: We analysed MRI images and common genetic variants in PNPLA3, TM6SF2, MTARC1, HSD17B13 and GCKR from a cohort of 45 603 individuals from the UK Biobank. Proton density fat fraction (PDFF) maps were generated using a water-fat separation toolbox applied to the magnitude and phase MRI data. The liver region was segmented using a U-Net model trained on 600 manually segmented ground-truth images. The resulting liver masks and PDFF maps were subsequently used to calculate liver PDFF values. Individuals with SLD (PDFF ≥ 5%) and without SLD (PDFF < 5%) were selected as the study cohort and used to train and test a Vision Transformer classification model with five-fold cross-validation. We aimed to differentiate individuals who are homozygous for the PNPLA3 I148M variant from non-carriers, as evaluated by the area under the receiver operating characteristic curve (AUROC). To ensure a clear genetic contrast, all heterozygous individuals were excluded. To interpret our model, we generated attention maps that highlight the regions most predictive of the outcomes.
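A minimal sketch (toy data; not the study's code) of the liver PDFF computation and the 5% SLD cutoff described above:

```python
import numpy as np

# Derive a liver PDFF value from a PDFF map and a binary liver mask, then
# apply the 5% cutoff used to define the study cohort. The map, mask, and
# resolution below are synthetic stand-ins.
def liver_pdff(pdff_map: np.ndarray, liver_mask: np.ndarray) -> float:
    return float(pdff_map[liver_mask > 0].mean())

pdff_map = np.random.uniform(0, 20, size=(32, 224, 224))  # toy PDFF map in %
liver_mask = np.zeros_like(pdff_map)
liver_mask[10:20, 60:160, 60:160] = 1                     # toy liver segmentation

pdff = liver_pdff(pdff_map, liver_mask)
has_sld = pdff >= 5.0  # PDFF >= 5% defines steatotic liver disease
print(f"liver PDFF = {pdff:.1f}% -> SLD: {has_sld}")
```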
RESULTS: Homozygosity for the PNPLA3 I148M variant demonstrated the best predictive performance among five variants with AUROC of 0.68 (95% CI: 0.64-0.73) in SLD patients and 0.57 (95% CI: 0.52-0.61) in non-SLD patients. The AUROCs for the other SNPs ranged from 0.54 to 0.57 in SLD patients and from 0.52 to 0.54 in non-SLD patients. The predictive performance was generally higher in SLD patients compared to non-SLD patients. Attention maps for PNPLA3 I148M carriers showed that fat deposition in regions adjacent to the hepatic vessels, near the liver hilum, plays an important role in predicting the presence of the I148M variant.
CONCLUSION: Our study marks novel progress in the non-invasive detection of homozygosity for PNPLA3 I148M through the application of deep learning models on MRI images. Our findings suggest that PNPLA3 I148M might affect the liver fat distribution and could be used to predict the presence of PNPLA3 variants in patients with fatty liver. The findings of this research have the potential to be integrated into standard clinical practice, particularly when combined with clinical and biochemical data from other modalities to increase accuracy, enabling easier identification of at-risk individuals and facilitating the development of tailored interventions for PNPLA3 I148M-associated liver disease.
PMID:40478199 | DOI:10.1111/liv.70164
Ultrasound measurement of relative tongue size and its correlation with tongue mobility for healthy individuals
JASA Express Lett. 2025 Jun 1;5(6):065201. doi: 10.1121/10.0036838.
ABSTRACT
The size of an individual's tongue relative to the oral cavity is associated with articulation speed [Feng, Lu, Zheng, Chi, and Honda, in Proceedings of the 10th Biennial Asia Pacific Conference on Speech, Language, and Hearing (2017), pp. 17-19] and may affect speech clarity. This study introduces an ultrasound-based method for measuring relative tongue size, termed ultrasound-based relative tongue size (uRTS), as a cost-effective alternative to the magnetic resonance imaging (MRI) based method. Using deep learning to extract the tongue contour, uRTS was calculated from tongue and oropharyngeal cavity sizes in the midsagittal plane. Results from ten speakers showed a strong correlation between uRTS and MRI-based measurements (r = 0.87) and a negative correlation with tongue movement speed (r = -0.73), indicating uRTS is a useful index for assessing tongue size.
PMID:40478168 | DOI:10.1121/10.0036838
Using fused Contourlet transform and neural features to spot COVID19 infections in CT scan images
Intell Syst Appl. 2023 Feb;17:200182. doi: 10.1016/j.iswa.2023.200182. Epub 2023 Jan 13.
ABSTRACT
The World Health Organization (WHO) claims that COVID-19 is the pandemic disease of the 21st century. The COVID-19 disease is caused by a strain of coronavirus that has led to the infection and death of millions of people and will continue to do so unless we find mechanisms that enable healthcare providers to detect infections accurately and as early as possible. CT scan images are reliable tools that physicians typically use to diagnose this lung infection. To that end, and like many other research studies in the computing field, we present here a new approach for automating the process of identifying COVID-19 infections in CT scans using machine learning. This approach uses hybrid fast fuzzy c-means for COVID-19 CT scan image segmentation. Then, the Contourlet transform and CNN feature extraction approaches are used to extract features individually from segmented CT scan images and combine them into one feature vector. For feature selection, we experimented with three techniques, namely Principal Component Analysis (PCA), Minimum Redundancy Maximum Relevance (MRMR), and Binary Differential Evolution (BDE), and found that the latter gave the best results. For classification, we used several neural network models (AlexNet, ResNet50, GoogleNet, VGG16, VGG19) and found that an ensemble classifier worked better. An extensive set of experiments was conducted on standard public datasets. The results suggest that our methodology gives better performance than other existing approaches, with an accuracy of 99.98%.
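The following sketch illustrates the general fuse-then-select-then-classify pattern with toy features; PCA stands in for the paper's binary differential evolution step, and all data are synthetic:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Fuse two feature sets extracted per segmented CT image into one vector,
# reduce/select features, then classify. Contourlet and CNN features are
# simulated here with random arrays.
rng = np.random.default_rng(0)
contourlet_feats = rng.normal(size=(200, 64))   # toy handcrafted features
cnn_feats = rng.normal(size=(200, 256))         # toy deep features
labels = rng.integers(0, 2, size=200)           # 1 = COVID-19, 0 = normal

fused = np.concatenate([contourlet_feats, cnn_feats], axis=1)  # feature fusion
reduced = PCA(n_components=32).fit_transform(fused)            # reduction step
print(cross_val_score(SVC(), reduced, labels, cv=5).mean())    # toy accuracy
```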
PMID:40478143 | PMC:PMC9837210 | DOI:10.1016/j.iswa.2023.200182
Evaluation of artificial intelligence techniques in disease diagnosis and prediction
Discov Artif Intell. 2023;3(1):5. doi: 10.1007/s44163-023-00049-5. Epub 2023 Jan 30.
ABSTRACT
A broad range of medical diagnoses is based on analyzing disease images obtained through high-tech digital devices. The application of artificial intelligence (AI) in the assessment of medical images has enabled accurate evaluations to be performed automatically, which in turn has reduced the workload of physicians, decreased diagnostic errors and times, and improved performance in the prediction and detection of various diseases. AI techniques based on medical image processing are an essential area of research that uses advanced computer algorithms for prediction, diagnosis, and treatment planning, with a remarkable impact on decision-making procedures. Machine learning (ML) and deep learning (DL), as advanced AI techniques, are the two main subfields applied in the healthcare system to diagnose diseases, discover medications, and identify patient risk factors. The advancement of electronic medical records and big data technologies in recent years has accompanied the success of ML and DL algorithms. ML includes neural networks and fuzzy logic algorithms with various applications in automating forecasting and diagnosis processes. DL is an ML technique that, unlike classical neural network algorithms, does not rely on expert feature extraction. DL algorithms with high-performance computation give promising results in medical image analysis tasks such as fusion, segmentation, registration, and classification. Support Vector Machines (SVMs) as an ML method and Convolutional Neural Networks (CNNs) as a DL method are the most widely used techniques for analyzing and diagnosing diseases. This review study aims to cover recent AI techniques in diagnosing and predicting numerous diseases, such as cancers and heart, lung, skin, genetic, and neural disorders, which can perform more precisely than specialists and without human error. AI's existing challenges and limitations in the medical area are also discussed and highlighted.
PMID:40478140 | PMC:PMC9885935 | DOI:10.1007/s44163-023-00049-5
Deep viewing for the identification of Covid-19 infection status from chest X-Ray image using CNN based architecture
Intell Syst Appl. 2022 Nov;16:200130. doi: 10.1016/j.iswa.2022.200130. Epub 2022 Oct 6.
ABSTRACT
In recent years, coronavirus (Covid-19) has evolved into one of the world's leading life-threatening severe viral illnesses. A self-executing automated system might be a better option to stop Covid-19 from spreading, owing to its quick diagnostic capability. Many studies have already investigated various deep learning techniques, which have a significant impact on the quick and precise early detection of Covid-19. Most of the existing techniques, though, have not been trained and tested on a significant amount of data. In this paper, we propose a deep learning-enabled convolutional neural network (CNN) to automatically diagnose Covid-19 from chest x-rays. To train and test our model, 10,293 x-rays, including 2,875 x-rays of Covid-19, were collected as a dataset. The dataset consists of three groups of chest x-rays: Covid-19, pneumonia, and normal patients. The proposed approach achieved 98.5% accuracy, 98.9% specificity, 99.2% sensitivity, 99.2% precision, and a 98.3% F1-score. Distinguishing Covid-19 patients from pneumonia patients using chest x-rays, particularly by human eyes, is crucial, since both diseases have nearly identical characteristics. To address this issue, we also categorized Covid-19 and pneumonia using x-rays, achieving a 99.60% accuracy rate. Our findings show that the proposed model might aid clinicians and researchers in rapidly detecting Covid-19 patients, hence facilitating their treatment.
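A minimal PyTorch sketch of a small CNN for the three-class chest x-ray problem described above; the architecture and shapes are illustrative assumptions, not the paper's model:

```python
import torch
import torch.nn as nn

# A small convolutional classifier for the three chest x-ray classes in the
# abstract: Covid-19, pneumonia, and normal.
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 56 * 56, 3),  # 3 logits: Covid-19 / pneumonia / normal
)

xray = torch.randn(4, 1, 224, 224)  # toy batch of grayscale chest x-rays
loss = nn.CrossEntropyLoss()(model(xray), torch.tensor([0, 1, 2, 0]))
loss.backward()                     # gradients for one toy training step
print(float(loss))
```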
PMID:40478040 | PMC:PMC9536212 | DOI:10.1016/j.iswa.2022.200130
A multi-modal bone suppression, lung segmentation, and classification approach for accurate COVID-19 detection using chest radiographs
Intell Syst Appl. 2022 Nov;16:200148. doi: 10.1016/j.iswa.2022.200148. Epub 2022 Nov 7.
ABSTRACT
The high transmission rate of COVID-19 and the lack of quick, robust, and intelligent systems for its detection have become a point of concern for the public, governments, and health experts worldwide. The study of radiological images is one of the fastest ways to comprehend the infectious spread and diagnose a patient. However, it is difficult to differentiate COVID-19 from other pneumonic infections. The purpose of this research is to provide an automatic, precise, reliable, robust, and intelligent assisting system, 'Covid Scanner', for mass screening of COVID-19, non-COVID viral pneumonia, and bacterial pneumonia from healthy chest radiographs. To train the proposed system, the authors of this research prepared a novel dataset called "COVID-Pneumonia CXR". The system is a coherent integration of bone suppression, lung segmentation, and the proposed classifier, 'EXP-Net'. The system reported an AUC of 96.58% on the validation dataset and 96.48% on the testing dataset comprising chest radiographs. The results from the ablation study prove the efficacy and generalizability of the proposed integrated pipeline of models. To prove the system's reliability, the feature heatmaps visualized in the lung region were validated by radiology experts. Moreover, a comparison with the state-of-the-art models and existing approaches shows that the proposed system finds a clearer demarcation between the highly similar chest radiographs of COVID-19 and non-COVID viral pneumonia. The copyright of "Covid Scanner" is protected with registration number SW-13625/2020. The code for the models used in this research is publicly available at: https://github.com/Ankit-Misra/multi_modal_covid_detection/.
PMID:40478003 | PMC:PMC9639387 | DOI:10.1016/j.iswa.2022.200148
Automatic engagement estimation in smart education/learning settings: a systematic review of engagement definitions, datasets, and methods
Smart Learn Environ. 2022;9(1):31. doi: 10.1186/s40561-022-00212-y. Epub 2022 Nov 12.
ABSTRACT
BACKGROUND: Recognizing learners' engagement during learning processes is important for providing personalized pedagogical support and preventing dropouts. As learning processes shift from traditional offline classrooms to distance learning, methods for automatically identifying engagement levels should be developed.
OBJECTIVE: This article aims to present a literature review of recent developments in automatic engagement estimation, including engagement definitions, datasets, and machine learning-based methods for automatic estimation. The information, figures, and tables presented in this review aim to provide new researchers with insight into automatic engagement estimation to enhance smart learning with automatic engagement recognition methods.
METHODS: A literature search was carried out using Scopus, Mendeley references, the IEEE Xplore digital library, and ScienceDirect following the four phases of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA): identification, screening, eligibility, and inclusion. The selected studies included research articles published between 2010 and 2022 that focused on three research questions (RQs) related to the engagement definitions, datasets, and methods used in the literature. The article selection excluded books, magazines, news articles, and posters.
RESULTS: Forty-seven articles were selected to address the RQs and discuss engagement definitions, datasets, and methods. First, we introduce a clear taxonomy that defines engagement according to different types and the components used to measure it. Guided by this taxonomy, we reviewed the engagement types defined in the selected articles, with emotional engagement (n = 40; 65.57%) measured by affective cues appearing most often (n = 38; 57.58%). Then, we reviewed engagement and engagement-related datasets in the literature, with most studies assessing engagement with external observations (n = 20; 43.48%) and self-reported measures (n = 9; 19.57%). Finally, we summarized machine learning (ML)-based methods, including deep learning, used in the literature.
CONCLUSIONS: This review examines engagement definitions, datasets and ML-based methods from forty-seven selected articles. A taxonomy and three tables are presented to address three RQs and provide researchers in this field with guidance on enhancing smart learning with automatic engagement recognition. However, several key challenges remain, including cognitive and personalized engagement and ML issues that may affect real-world implementations.
PMID:40477985 | PMC:PMC9660128 | DOI:10.1186/s40561-022-00212-y
Modeling CAPRI Targets of Round 55 by Combining AlphaFold and Docking
Proteins. 2025 Jun 6. doi: 10.1002/prot.26853. Online ahead of print.
ABSTRACT
In recent years, the field of structural biology has seen remarkable advancements, particularly in the modeling of protein tertiary and quaternary structures. The AlphaFold deep learning approach revolutionized protein structure prediction by achieving near-experimental accuracy on many targets. This paper presents a detailed account of the structural modeling of oligomeric targets in Round 55 of CAPRI by combining deep learning-based predictions (the AlphaFold2 multimer pipeline) with traditional docking techniques in a hybrid approach to protein-protein docking. To complement the AlphaFold models generated for the given oligomeric state of the targets, we built docking predictions by combining models generated for lower-oligomeric states (dimers for trimeric targets, and trimers or dimers for tetrameric targets). In addition, we used a template-based docking procedure applied to AlphaFold-predicted structures of the monomers. We analyzed the clustering of the generated AlphaFold models, the confidence in the prediction of intra- and inter-chain residue-residue contacts, and the correlation of the stability of the AlphaFold predictions with the quality of the submitted models.
PMID:40476317 | DOI:10.1002/prot.26853
Performance of ChatGPT-3.5 and ChatGPT-4 in Solving Questions Based on Core Concepts in Cardiovascular Physiology
Cureus. 2025 May 6;17(5):e83552. doi: 10.7759/cureus.83552. eCollection 2025 May.
ABSTRACT
BACKGROUND: Medical students often struggle to apply previously learned concepts to new situations, such as in cardiovascular physiology. ChatGPT, an AI chatbot trained through deep learning, can analyze basic problems and produce human-like language in various subjects. Multiple-choice questions (MCQs) are given to students by many medical schools before exams, but due to time constraints, instructors frequently lack the resources necessary to adequately explain the practice questions. Even when given, the explanations might not give students sufficient information to grasp the concepts completely. This study aimed to examine ChatGPT's ability to solve various reasoning problems based on the core concepts of cardiovascular physiology.
MATERIALS AND METHODS: Multiple-choice questions were presented manually to both chatbots (ChatGPT-4 and ChatGPT-3.5), and the answers generated were compared with the faculty-led answer key using various statistical tests.
RESULTS: The accuracy rates of ChatGPT-4 and ChatGPT-3.5 were 83.33% and 60%, respectively, a statistically significant difference. Compared to ChatGPT-3.5, ChatGPT-4's explanations of the responses were substantially more appropriate. ChatGPT-4 performed better than ChatGPT-3.5 in certain core concept areas, such as mass balance (75% vs. 50%), scientific reasoning (60% vs. 40%), and homeostasis (100% vs. 66.67%).
CONCLUSION: When it came to responding to concept-based questions about cardiovascular physiology, ChatGPT-4 outperformed ChatGPT-3.5. However, to ensure accuracy, faculty members should review the generated explanations; thus, the growing application of generative AI in the form of virtual-assisted learning approaches in medical education needs to be carefully considered.
PMID:40476113 | PMC:PMC12138729 | DOI:10.7759/cureus.83552
Dental practitioners versus artificial intelligence software in assessing alveolar bone loss using intraoral radiographs
J Taibah Univ Med Sci. 2025 May 9;20(3):272-279. doi: 10.1016/j.jtumed.2025.04.001. eCollection 2025 Jun.
ABSTRACT
OBJECTIVES: Integrating artificial intelligence (AI) in the dental field can potentially enhance the efficiency of dental care. However, few studies have investigated whether AI software can achieve results comparable to those obtained by dental practitioners (general practitioners (GPs) and specialists) when assessing alveolar bone loss in a clinical setting. Thus, this study compared the performance of AI in assessing periodontal bone loss with those of GPs and specialists.
METHODS: This comparative cross-sectional study evaluated the performance of dental practitioners and AI software in assessing alveolar bone loss. Radiographs were randomly selected to ensure representative samples. Dental practitioners independently evaluated the radiographs, and the AI software "Second Opinion Software" was tested using the same set of radiographs evaluated by the dental practitioners. The results produced by the AI software were then compared with the baseline values to measure their accuracy and allow direct comparison with the performance of human specialists.
RESULTS: The survey received 149 responses; each respondent answered 10 questions comparing the measurements made by the AI software and dental practitioners when assessing the amount of bone loss radiographically. The mean estimates of the participants had a moderate positive correlation with the radiographic measurements (rho = 0.547, p < 0.001) and a weaker but still significant correlation with the AI measurements (rho = 0.365, p < 0.001). The AI measurements had a stronger positive correlation with the radiographic measurements (rho = 0.712, p < 0.001) than with the estimates of the dental practitioners.
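For reference, rank correlations like those reported above can be computed with a Spearman test; the data below are synthetic:

```python
import numpy as np
from scipy.stats import spearmanr

# Toy stand-ins for radiographic bone-loss measurements and AI estimates;
# spearmanr returns the rank correlation (rho) and its p-value.
rng = np.random.default_rng(1)
radiographic = rng.uniform(1, 8, size=149)                 # mm of bone loss
ai_estimate = radiographic + rng.normal(0, 1.5, size=149)  # correlated AI readout

rho, p = spearmanr(radiographic, ai_estimate)
print(f"rho = {rho:.3f}, p = {p:.2e}")
```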
CONCLUSION: This study highlights the capacity of AI software to enhance the accuracy and efficiency of radiograph-based evaluations of alveolar bone loss. Dental practitioners are vital for the clinical experience but AI technology provides a consistent and replicable methodology. Future collaborations between AI experts, researchers, and practitioners could potentially optimize patient care.
PMID:40476084 | PMC:PMC12136790 | DOI:10.1016/j.jtumed.2025.04.001
Prediction of the space group and cell volume by training a convolutional neural network with primitive 'ideal' diffraction profiles and its application to 'real' experimental data
J Appl Crystallogr. 2025 Apr 25;58(Pt 3):718-730. doi: 10.1107/S1600576725002419. eCollection 2025 Jun 1.
ABSTRACT
This study describes a deep learning approach to predict the space group and unit-cell volume of inorganic crystals from their powder X-ray diffraction profiles. Using an inorganic crystallographic database, convolutional neural network (CNN) models were successfully constructed with the δ-function-like 'ideal' X-ray diffraction profiles derived solely from the intrinsic properties of the crystal structure, which are dependent on neither the incident X-ray wavelength nor the line shape of the profiles. We examined how the statistical metrics (e.g. the prediction accuracy, precision and recall) are influenced by the ensemble averaging technique and the multi-task learning approach; six CNN models were created from an identical data set for the former, and the space group classification was coupled with the unit-cell volume prediction in a CNN architecture for the latter. The CNN models trained in the 'ideal' world were tested with 'real' X-ray profiles for eleven materials such as TiO2, LiNiO2 and LiMnO2. While the models mostly fared well in the 'real' world, the cases at odds were scrutinized to elucidate the causes of the mismatch. Specifically for Li2MnO3, detailed crystallographic considerations revealed that the mismatch can stem from the state of the specific material and/or from the quality of the experimental data, and not from the CNN models. The present study demonstrates that we can obviate the need for emulating experimental diffraction profiles in training CNN models to elicit structural information, thereby focusing efforts on further improvements.
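A minimal PyTorch sketch of the multi-task idea described above (space-group classification coupled with unit-cell volume regression); the backbone and sizes are illustrative assumptions, not the authors' CNN:

```python
import torch
import torch.nn as nn

# A 1D CNN over a powder-diffraction profile with two heads, coupling
# space-group classification with unit-cell volume regression.
class MultiTaskXRD(nn.Module):
    def __init__(self, n_space_groups=230, profile_len=4096):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv1d(1, 32, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(32, 64, 7, stride=2, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.space_group_head = nn.Linear(64, n_space_groups)  # classification
        self.volume_head = nn.Linear(64, 1)                    # regression

    def forward(self, profile):
        z = self.backbone(profile)
        return self.space_group_head(z), self.volume_head(z)

profile = torch.randn(8, 1, 4096)  # toy batch of 'ideal' diffraction profiles
sg_logits, volume = MultiTaskXRD()(profile)
print(sg_logits.shape, volume.shape)  # (8, 230) and (8, 1)
```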
PMID:40475932 | PMC:PMC12135985 | DOI:10.1107/S1600576725002419
YOLO-ODD: an improved YOLOv8s model for onion foliar disease detection
Front Plant Sci. 2025 May 22;16:1551794. doi: 10.3389/fpls.2025.1551794. eCollection 2025.
ABSTRACT
Onion crops are affected by many diseases at different stages of growth, resulting in significant yield loss. The early detection of diseases helps in the timely adoption of management practices, thereby reducing yield losses. However, the manual identification of plant diseases requires considerable effort and is prone to mistakes. Thus, adopting cutting-edge technologies such as machine learning (ML) and deep learning (DL) can help overcome these difficulties by enabling the early detection of plant diseases. This study presents a cross-layer integration of the YOLOv8 architecture for the detection of onion leaf diseases, viz. anthracnose, Stemphylium blight, purple blotch (PB), and twister disease. The experimental results demonstrate that the customized YOLOv8 model, YOLO-ODD, integrated with CBAM and DTAH attention modules, outperforms the YOLOv5 and YOLOv8 base models in most disease categories, particularly in detecting anthracnose, purple blotch, and twister disease. The proposed model achieved the highest overall accuracy (77.30%), precision (81.50%), and recall (72.10%); this YOLOv8-based deep learning approach can thus detect and classify major onion foliar diseases while optimizing for accuracy, real-time application, and adaptability in diverse field conditions.
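For orientation, training a YOLOv8s base model uses the public Ultralytics API as below; the dataset YAML and image names are hypothetical placeholders, and reproducing YOLO-ODD's CBAM/DTAH attention modules would additionally require a customized model definition:

```python
from ultralytics import YOLO

# Start from the pretrained YOLOv8s checkpoint as the base model.
model = YOLO("yolov8s.pt")

# Fine-tune on an onion-disease detection dataset (hypothetical YAML listing
# classes such as anthracnose, Stemphylium blight, purple blotch, twister).
model.train(
    data="onion_diseases.yaml",
    epochs=100,
    imgsz=640,
)

# Run detection on a field image and inspect the predicted lesion boxes.
results = model.predict("field_image.jpg", conf=0.25)
print(results[0].boxes)
```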
PMID:40475906 | PMC:PMC12137250 | DOI:10.3389/fpls.2025.1551794
Artificial intelligence driven mental health diagnosis based on physiological signals
MethodsX. 2025 May 7;14:103358. doi: 10.1016/j.mex.2025.103358. eCollection 2025 Jun.
ABSTRACT
Mental health disorders like stress, anxiety, and depression are increasing rapidly these days, and their diagnosis is a matter of concern in this era. A cost-effective and efficient method needs to be implemented for detection. With this aim, stress is monitored in this work with the help of physiological signals. This work uses a machine learning approach to distinguish between stressed and non-stressed subjects, aiming to automatically detect stressful situations in humans using physiological data collected during anxiety-inducing scenarios. Diagnosing stress at an early stage can help minimize the risk of stress-related issues and enhance the overall well-being of the patient. Traditional methods for diagnosing stress are based on patient reporting, an approach with clear limitations. This research therefore aims to develop a stress-assessment model with a machine learning approach.
• Stress and anxiety have become a prevalent issue affecting individuals' well-being and productivity. Early detection of these conditions is crucial for timely intervention and the prevention of associated health complications. This paper presents a machine learning model for stress diagnosis.
• The dataset consists of recordings obtained from individuals under different stress levels. The physiological signals used in this project are ECG, EMG, HR, RESP, foot GSR, and hand GSR. Machine learning algorithms, such as decision trees and kernel support vector machines, are employed for the classification tasks. Additionally, a deep learning framework based on feed-forward artificial neural networks is introduced for comparative analysis.
• The study evaluates the accuracies of both binary (stressed vs. non-stressed) and three-class (relaxed vs. baseline vs. stressed) classification. Results demonstrate promising accuracies, with machine learning techniques achieving up to 91.87% for binary and 66.68% for three-class classification. This paper highlights the potential of machine learning methods to accurately detect mental disorders, offering insights for the development of effective detection and management tools.
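A minimal sketch of the binary stressed/non-stressed classification with the two classifier families named above; the features and labels are synthetic stand-ins for ECG/EMG/HR/RESP/GSR-derived features:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic data: 6 physiological features per time window, toy binary label.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 6))
y = (X[:, 0] + X[:, 3] > 0).astype(int)   # toy 'stressed' label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
for clf in (DecisionTreeClassifier(), SVC(kernel="rbf")):
    acc = clf.fit(X_tr, y_tr).score(X_te, y_te)
    print(type(clf).__name__, f"accuracy = {acc:.2f}")
```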
PMID:40475896 | PMC:PMC12139511 | DOI:10.1016/j.mex.2025.103358
Bridging language gaps: The role of NLP and speech recognition in oral English instruction
MethodsX. 2025 May 7;14:103359. doi: 10.1016/j.mex.2025.103359. eCollection 2025 Jun.
ABSTRACT
Natural Language Processing (NLP) and speech recognition have transformed language learning by providing interactive, real-time feedback, enhancing oral English proficiency. These technologies facilitate personalized and adaptive learning, making pronunciation and fluency improvement more efficient. Traditional methods lack real-time speech assessment and individualized feedback, limiting learners' progress, and existing speech recognition models struggle with diverse accents, variations in speaking styles, and computational efficiency, reducing their effectiveness in real-world applications. This study utilizes three datasets (a custom dataset of 882 English teachers, the CMU ARCTIC corpus, and LibriSpeech Clean) to ensure generalizability and robustness. The methodology integrates hidden Markov models for speech recognition, NLP-based text analysis, and computer vision-based lip movement detection to create an adaptive multimodal learning system. The novelty of this study lies in its real-time Bayesian feedback mechanism and multimodal integration of audio, visual, and textual data, enabling dynamic and personalized oral instruction. Performance is evaluated using recognition accuracy, processing speed, and statistical significance testing. The continuous HMM model achieves up to 97.5% accuracy and significantly outperforms existing models such as MLP-LSTM and GPT-3.5-turbo (p < 0.05) across all datasets.
• Developed a multimodal system that combines speech, text, and visual data to enhance real-time oral English learning.
• Collected and annotated a diverse dataset of English speech recordings from teachers across various accents and speaking styles.
• Designed an adaptive feedback framework to provide learners with immediate, personalized insights into their pronunciation and fluency.
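A minimal sketch of fitting a continuous Gaussian HMM, the model family reported above, using the hmmlearn library on toy acoustic features (real inputs would be frame-level features extracted from learners' speech):

```python
import numpy as np
from hmmlearn import hmm

# Toy 13-dimensional acoustic feature frames for three 100-frame utterances.
rng = np.random.default_rng(0)
frames = rng.normal(size=(300, 13))
lengths = [100, 100, 100]

# Continuous (Gaussian-emission) HMM with diagonal covariances.
model = hmm.GaussianHMM(n_components=5, covariance_type="diag", n_iter=20)
model.fit(frames, lengths)

score = model.score(frames[:100])  # log-likelihood of one utterance
print(f"log-likelihood = {score:.1f}")
```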
PMID:40475890 | PMC:PMC12139008 | DOI:10.1016/j.mex.2025.103359
Enhancing patient-specific deep learning based segmentation for abdominal magnetic resonance imaging-guided radiation therapy: A framework conditioned on prior segmentation
Phys Imaging Radiat Oncol. 2025 Apr 17;34:100766. doi: 10.1016/j.phro.2025.100766. eCollection 2025 Apr.
ABSTRACT
BACKGROUND AND PURPOSE: Conventionally, the contours annotated during magnetic resonance-guided radiation therapy (MRgRT) planning are manually corrected during the RT fractions, which is a time-consuming task. Deep learning-based segmentation can be helpful, but the available patient-specific approaches require training at least one model per patient, which is computationally expensive. In this work, we introduced a novel framework that integrates fraction MR volumes and planning segmentation maps to generate robust fraction MR segmentations without the need for patient-specific retraining.
MATERIALS AND METHODS: The dataset included 69 patients (222 fraction MRs in total) treated with MRgRT for abdominal cancers with a 0.35 T MR-Linac, and annotations for eight clinically relevant abdominal structures (aorta, bowel, duodenum, left kidney, right kidney, liver, spinal canal and stomach). In the framework, we implemented two alternative models capable of generating patient-specific segmentations using the planning segmentation as prior information. The first one is a 3D UNet with dual-channel input (i.e. fraction MR and planning segmentation map) and the second one is a modified 3D UNet with double encoder for the same two inputs.
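A minimal PyTorch sketch (shapes and layers are illustrative assumptions, not the paper's network) of the second variant, where separate encoders process the fraction MR and the planning segmentation map before feature fusion:

```python
import torch
import torch.nn as nn

# Two encoders, one per input, whose features are concatenated before the
# segmentation head predicting the eight abdominal structures.
class DoubleEncoderSeg(nn.Module):
    def __init__(self, n_structures=8):
        super().__init__()
        def encoder():
            return nn.Sequential(nn.Conv3d(1, 16, 3, padding=1), nn.ReLU())
        self.enc_mr = encoder()      # encodes the fraction MR volume
        self.enc_prior = encoder()   # encodes the planning segmentation map
        self.head = nn.Conv3d(32, n_structures, 1)

    def forward(self, fraction_mr, planning_seg):
        z = torch.cat([self.enc_mr(fraction_mr),
                       self.enc_prior(planning_seg)], dim=1)
        return self.head(z)

mr = torch.randn(1, 1, 64, 64, 64)
prior = torch.randn(1, 1, 64, 64, 64)  # prior contours from the planning session
print(DoubleEncoderSeg()(mr, prior).shape)  # torch.Size([1, 8, 64, 64, 64])
```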
RESULTS: On average, the two models with prior anatomical information outperformed the conventional population-based 3D UNet, with an increase in Dice similarity coefficient of more than 4%. In particular, the dual-channel input 3D UNet outperformed the one with the double encoder, especially when the alignment between the two input channels was satisfactory.
CONCLUSION: The proposed workflow was able to generate accurate patient-specific segmentations while avoiding training one model per patient and allowing for a seamless integration into clinical practice.
PMID:40475848 | PMC:PMC12138391 | DOI:10.1016/j.phro.2025.100766
Leveraging network uncertainty to identify regions in rectal cancer clinical target volume auto-segmentations likely requiring manual edits
Phys Imaging Radiat Oncol. 2025 May 8;34:100771. doi: 10.1016/j.phro.2025.100771. eCollection 2025 Apr.
ABSTRACT
BACKGROUND AND PURPOSE: While Deep Learning (DL) auto-segmentation has the potential to improve segmentation efficiency in the radiotherapy workflow, manual adjustments of the predictions are still required. Network uncertainty quantification has been proposed as a quality assurance tool to ensure an efficient segmentation workflow. However, the interpretation is often complicated due to various sources of uncertainty interacting non-trivially. In this work, we compared network predictions with both independent manual segmentations and manual corrections of the predictions. We assume that manual corrections only address clinically relevant errors and are therefore associated with lower aleatoric uncertainty due to less inter-observer variability. We expect the remaining epistemic uncertainty to be a better predictor of segmentation corrections.
MATERIALS AND METHODS: We considered DL auto-segmentations of the mesorectum clinical target volume. Uncertainty maps of nnU-Net outputs were generated using Monte Carlo dropout. On a global level, we investigated the correlation between mean network uncertainty and network segmentation performance. On a local level, we compared the uncertainty envelope width with the length of the error from both independent contours and corrected predictions. The uncertainty envelope widths were used to classify the error lengths as above or below a predefined threshold.
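A minimal PyTorch sketch of Monte Carlo dropout as described above: dropout is kept stochastic at inference, and the spread over repeated forward passes gives a voxelwise uncertainty map (the tiny model here is an illustrative stand-in for nnU-Net):

```python
import torch
import torch.nn as nn

# A small segmentation model with a dropout layer kept active at test time.
model = nn.Sequential(
    nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Dropout3d(p=0.2),                    # source of stochasticity
    nn.Conv3d(16, 1, 1), nn.Sigmoid(),
)
model.train()  # keep dropout stochastic during inference

volume = torch.randn(1, 1, 48, 48, 48)
with torch.no_grad():
    samples = torch.stack([model(volume) for _ in range(20)])  # 20 MC passes

mean_seg = samples.mean(dim=0)    # consensus segmentation probability
uncertainty = samples.std(dim=0)  # voxelwise uncertainty map
print(uncertainty.max().item())
```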
RESULTS: We achieved an AUC above 0.9 in identifying regions manually corrected with edits larger than 8 mm, while the AUC for inconsistencies with the independent contours was significantly lower at approximately 0.7.
CONCLUSIONS: Our results validate the hypothesis that epistemic uncertainty estimates are a valuable tool to capture regions likely requiring clinically relevant edits.
PMID:40475847 | PMC:PMC12140033 | DOI:10.1016/j.phro.2025.100771
AttentionAML: An Attention-based Deep Learning Framework for Accurate Molecular Categorization of Acute Myeloid Leukemia
bioRxiv [Preprint]. 2025 May 22:2025.05.20.655179. doi: 10.1101/2025.05.20.655179.
ABSTRACT
Acute myeloid leukemia (AML) is an aggressive hematopoietic malignancy defined by aberrant clonal expansion of abnormal myeloid progenitor cells. Characterized by morphological, molecular, and genetic alterations, AML encompasses multiple distinct subtypes that exhibit subtype-specific treatment responses and prognoses, underscoring the critical need for accurately identifying AML subtypes for effective clinical management and tailored therapeutic approaches. Traditional wet-lab approaches to identifying AML subtypes, such as immunophenotyping, cytogenetic analysis, morphological analysis, or molecular profiling, are labor-intensive, costly, and time-consuming. To address these challenges, we propose AttentionAML, a novel attention-based deep learning framework for accurately categorizing AML subtypes based on transcriptomic profiling alone. Benchmarking tests based on 1,661 AML patients suggested that AttentionAML outperformed state-of-the-art methods across all evaluated metrics (accuracy: 0.96, precision: 0.96, recall: 0.96, F1-score: 0.96, and Matthews correlation coefficient: 0.96). Furthermore, we also demonstrated the superiority of AttentionAML over conventional approaches in terms of AML patient clustering visualization and subtype-specific gene marker characterization. We believe AttentionAML will bring remarkable positive impacts to downstream AML risk stratification and personalized treatment design. To enhance its impact, a user-friendly Python package implementing AttentionAML is publicly available at https://github.com/wan-mlab/AttentionAML.
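For reference, the reported evaluation metrics, including the Matthews correlation coefficient, can be computed with scikit-learn; the labels below are synthetic:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, f1_score, matthews_corrcoef,
                             precision_score, recall_score)

# Toy multi-class predictions: 5 hypothetical AML subtypes, ~90% correct.
rng = np.random.default_rng(7)
y_true = rng.integers(0, 5, size=300)
y_pred = np.where(rng.random(300) < 0.9, y_true,
                  rng.integers(0, 5, size=300))

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred, average="macro"))
print("recall   :", recall_score(y_true, y_pred, average="macro"))
print("F1-score :", f1_score(y_true, y_pred, average="macro"))
print("MCC      :", matthews_corrcoef(y_true, y_pred))
```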
PMID:40475602 | PMC:PMC12139891 | DOI:10.1101/2025.05.20.655179