Deep learning
A data-driven approach to turmeric disease detection: Dataset for plant condition classification
Data Brief. 2025 Feb 27;59:111435. doi: 10.1016/j.dib.2025.111435. eCollection 2025 Apr.
ABSTRACT
Turmeric (Curcuma longa) is an economically and medicinally important crop. However, it often suffers from diseases such as rhizome disease, leaf blotch, and dry leaf conditions. Controlling these diseases requires early and accurate diagnosis to reduce losses and help farmers adopt sustainable farming methods. Conventional diagnosis involves visual examination of symptoms, which is laborious, subjective, and impractical over large areas. This paper presents a new dataset consisting of 1037 original and 4628 augmented images of turmeric plants representing five classes: healthy leaf, dry leaf, leaf blotch, rhizome disease roots, and rhizome healthy roots. The dataset was pre-processed to enhance its applicability to deep learning by resizing, cleaning, and augmenting the data through flipping, rotation, and brightness adjustment. Turmeric plant disease classification was conducted using the Inception-v3 model, attaining an accuracy of 97.36% with data augmentation, compared to 95.71% without. Key performance metrics, including precision, recall, and F1-score, establish the efficacy and robustness of the model. This work demonstrates the potential of AI-aided solutions for precision farming and sustainable crop production in agricultural disease management. The publicly available dataset and the reported results are expected to attract further research interest in AI-driven agriculture.
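A minimal sketch of an augmentation and Inception-v3 fine-tuning pipeline of the kind described, using PyTorch/torchvision. The folder layout, class names, and hyperparameters are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Flipping, rotation, and brightness adjustment, as in the paper's pipeline.
train_tf = transforms.Compose([
    transforms.Resize((299, 299)),          # Inception-v3 expects 299x299 inputs
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(30),
    transforms.ColorJitter(brightness=0.3),
    transforms.ToTensor(),
])

# Hypothetical folder layout: one subdirectory per class
# (healthy_leaf, dry_leaf, leaf_blotch, rhizome_disease_roots, rhizome_healthy_roots).
train_ds = datasets.ImageFolder("turmeric/train", transform=train_tf)
loader = torch.utils.data.DataLoader(train_ds, batch_size=32, shuffle=True)

model = models.inception_v3(weights=models.Inception_V3_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 5)                       # five turmeric classes
model.AuxLogits.fc = nn.Linear(model.AuxLogits.fc.in_features, 5)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:
    optimizer.zero_grad()
    out, aux = model(images)   # Inception-v3 returns auxiliary logits in train mode
    loss = criterion(out, labels) + 0.4 * criterion(aux, labels)
    loss.backward()
    optimizer.step()
```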
PMID:40144898 | PMC:PMC11937664 | DOI:10.1016/j.dib.2025.111435
Feasibility study of single-image super-resolution scanning system based on deep learning for pathological diagnosis of oral epithelial dysplasia
Front Med (Lausanne). 2025 Mar 12;12:1550512. doi: 10.3389/fmed.2025.1550512. eCollection 2025.
ABSTRACT
This study aimed to evaluate the feasibility of applying deep learning combined with a super-resolution scanner for the digital scanning and diagnosis of oral epithelial dysplasia (OED) slides. A super-resolution digital slide scanning system based on deep learning was built and trained using 40 pathological slides of oral epithelial tissue. Two hundred slides with definite OED diagnoses were scanned into digital slides by the DS30R and Nikon scanners, and the scanner parameters were recorded for comparison. Taking diagnosis under a microscope as the gold standard, the sensitivity and specificity of OED pathological feature recognition by the same pathologist reading images from the different scanners were evaluated. Furthermore, the consistency of whole-slide diagnoses obtained by pathologists using the various digital scanning imaging systems was assessed, in order to evaluate the feasibility of the deep learning-based super-resolution digital slide-scanning system for the pathological diagnosis of OED. The DS30R scanner processes an entire slide in a single layer within 0.25 min, occupying 0.35 GB of storage; the Nikon scanner requires 15 min and 0.5 GB. Following model training, the system enhanced the clarity of imaged pathological sections of oral epithelial tissue. Both the DS30R and Nikon scanners demonstrated high sensitivity and specificity for detecting structural features in OED pathological images; however, the DS30R excelled at identifying certain cellular features. Agreement in whole-slide diagnostic conclusions by the same pathologist across imaging systems was exceptionally high, with kappa values of 0.969 for DS30R versus optical microscope and 0.979 for DS30R versus Nikon versus optical microscope. The deep learning-based super-resolution imaging system performs well: it preserves the diagnostic information of OED and addresses the shortcomings of existing digital scanners, such as slow imaging speed, large data volumes, and challenges in rapid transmission and sharing. Such high-quality super-resolution images lay a solid foundation for the future adoption of artificial intelligence (AI) technology and will aid AI in the accurate diagnosis of oral potentially malignant diseases.
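For reference, the reported kappa values come from an agreement analysis like the following minimal sketch using scikit-learn; the per-slide label arrays are illustrative placeholders, not study data.

```python
from sklearn.metrics import cohen_kappa_score

# 1 = OED present, 0 = OED absent, one entry per slide (hypothetical values).
dx_microscope = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
dx_ds30r      = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]

kappa = cohen_kappa_score(dx_microscope, dx_ds30r)
print(f"Cohen's kappa: {kappa:.3f}")  # values near 1 indicate near-perfect agreement
```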
PMID:40144879 | PMC:PMC11936936 | DOI:10.3389/fmed.2025.1550512
Multi-objective RGB-D fusion network for non-destructive strawberry trait assessment
Front Plant Sci. 2025 Mar 12;16:1564301. doi: 10.3389/fpls.2025.1564301. eCollection 2025.
ABSTRACT
Growing consumer demand for high-quality strawberries has highlighted the need for accurate, efficient, and non-destructive methods to assess key postharvest quality traits, such as weight, size uniformity, and quantity. This study proposes a multi-objective learning algorithm that leverages RGB-D multimodal information to estimate these quality metrics. The algorithm uses a fusion expert network architecture that maximizes the use of multimodal features while preserving the distinct details of each modality. Additionally, a novel Heritable Loss function is implemented to reduce redundancy and enhance model performance. Experimental results show that the coefficient of determination (R²) values for weight, size uniformity, and quantity are 0.94, 0.90, and 0.95, respectively. Ablation studies demonstrate the advantage of the architecture in multimodal, multi-task prediction accuracy. Compared to single-modality models, non-fusion branch networks, and attention-enhanced fusion models, our approach achieves enhanced performance across multi-task learning scenarios, providing more precise data for trait assessment and precision strawberry applications.
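A minimal sketch of the multi-objective setup: a shared encoder over RGB-D input with one regression head per trait. This illustrates the multi-task idea only; the paper's fusion expert architecture and Heritable Loss are not reproduced here, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn

class RGBDMultiTask(nn.Module):
    def __init__(self):
        super().__init__()
        # 4 input channels: RGB + depth, early fusion for brevity.
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.heads = nn.ModuleDict({
            t: nn.Linear(64, 1) for t in ("weight", "uniformity", "quantity")
        })

    def forward(self, x):
        z = self.encoder(x)
        return {t: head(z).squeeze(-1) for t, head in self.heads.items()}

model = RGBDMultiTask()
x = torch.randn(8, 4, 128, 128)          # batch of RGB-D crops
preds = model(x)
targets = {t: torch.randn(8) for t in preds}
loss = sum(nn.functional.mse_loss(preds[t], targets[t]) for t in preds)
```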
PMID:40144753 | PMC:PMC11937088 | DOI:10.3389/fpls.2025.1564301
YO-AFD: an improved YOLOv8-based deep learning approach for rapid and accurate apple flower detection
Front Plant Sci. 2025 Mar 12;16:1541266. doi: 10.3389/fpls.2025.1541266. eCollection 2025.
ABSTRACT
The timely and accurate detection of apple flowers is crucial for assessing the growth status of fruit trees, predicting peak blooming dates, and estimating apple yields early. However, challenges such as variable lighting conditions, complex growth environments, occlusion, flower clustering, and significant morphological variation impede precise detection. To overcome these challenges, YO-AFD, an improved apple flower detection method based on YOLOv8, is proposed. First, to enable adaptive focus on features across different scales, a new attention module, ISAT, which integrates the Inverted Residual Mobile Block (IRMB) with the Spatial and Channel Synergistic Attention (SCSA) module, was designed. This module was then incorporated into the C2f module within the network's neck, forming the C2f-IS module, to enhance the model's ability to extract critical features and fuse features across scales. Additionally, to balance attention between simple and challenging targets, a regression loss function based on Focaler Intersection over Union (FIoU) was used. Experimental results showed that the YO-AFD model accurately detected both simple and challenging apple flowers, including small, occluded, and morphologically diverse flowers. The model achieved an F1 score of 88.6%, mAP50 of 94.1%, and mAP50-95 of 55.3%, with a model size of 6.5 MB and an average detection speed of 5.3 ms per image. YO-AFD outperformed five comparative models, demonstrating its effectiveness and accuracy in real-time apple flower detection. With its lightweight design and high accuracy, the method offers a promising solution for developing portable apple flower detection systems.
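A sketch of a Focaler-IoU style regression loss, following the published Focaler-IoU idea of linearly remapping IoU over an interval [d, u] to re-weight easy versus hard samples. The interval values and the plain axis-aligned IoU computation are illustrative; the paper combines this idea with its own bounding-box loss inside YOLOv8.

```python
import torch

def box_iou(pred, target):
    """IoU for boxes given as (x1, y1, x2, y2) tensors of shape (N, 4)."""
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    return inter / (area_p + area_t - inter + 1e-7)

def focaler_iou_loss(pred, target, d=0.0, u=0.95):
    iou = box_iou(pred, target)
    # Linear re-mapping: IoU below d counts as 0, above u as 1.
    iou_focaler = ((iou - d) / (u - d)).clamp(0.0, 1.0)
    return (1.0 - iou_focaler).mean()
```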
PMID:40144752 | PMC:PMC11936985 | DOI:10.3389/fpls.2025.1541266
Development and validation of a deep learning-based automated computed tomography image segmentation and diagnostic model for infectious hydronephrosis: a retrospective multicentre cohort study
EClinicalMedicine. 2025 Mar 13;82:103146. doi: 10.1016/j.eclinm.2025.103146. eCollection 2025 Apr.
ABSTRACT
BACKGROUND: Accurately diagnosing whether hydronephrosis is complicated by infection is crucial for guiding appropriate clinical treatment. This study aimed to develop a fully automated segmentation and non-invasive diagnostic model for infectious hydronephrosis (IH) using CT images and a deep learning algorithm.
METHODS: A retrospective analysis of clinical information and annotated cross-sectional CT images from patients diagnosed with hydronephrosis between June 2, 2019 and June 30, 2024 at Sun Yat-Sen Memorial Hospital (SYSMH), Heyuan People's Hospital (HPH), and Ganzhou People's Hospital (GPH) was performed. Data on cases of hydronephrosis were extracted from each hospital's medical record system. The SYSMH cohort was randomly divided into the SYSMH training set (n = 279) and the SYSMH validation set (n = 93) in a 3:1 ratio; the HPH and GPH cohorts served as external validation sets. A hydronephrosis segmentation model (HRSM) was developed using an improved U-Net algorithm, and its segmentation accuracy was evaluated by the Dice Similarity Coefficient (DSC). A 3D convolutional neural network was used to establish an IH risk score (IHRS) based on the segmented images. Independent clinical risk factors for IH were screened by logistic regression. An IH diagnostic model (IHDM) incorporating the IHRS and clinical data was then developed using five machine learning algorithms (random forest, k-nearest neighbours, decision tree, logistic regression, and support vector machine). The diagnostic performance of the IHDM was assessed by the receiver operating characteristic (ROC) curve.
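As a reference for the segmentation metric, a minimal sketch of the Dice Similarity Coefficient on binary masks follows; the arrays passed in are placeholders, not study data.

```python
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-7) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|) for binary segmentation masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return (2.0 * inter + eps) / (pred.sum() + truth.sum() + eps)

# Hypothetical 2D masks: 1 = hydronephrosis region, 0 = background.
print(dice(np.eye(4), np.eye(4)))  # identical masks -> DSC of 1.0
```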
FINDINGS: The study initially included 1464 potentially eligible cases, of which 864 were deemed qualified after preliminary examination. Ultimately, 615 patients with hydronephrosis (363 female, 252 male; 5876 annotated cross-sectional CT images) were included: 372 from SYSMH, 123 from HPH, and 120 from GPH. Based on bacterial culture results from percutaneous nephrostomy drainage, 291 cases were classified as IH and 324 as non-IH. The DSC for the HRSM in the internal and two external validation cohorts was 0.922 (95% CI: 0.895, 0.949), 0.906 (95% CI: 0.869, 0.943), and 0.883 (95% CI: 0.857, 0.909), respectively, indicating high segmentation accuracy. The IHRS achieved a prediction accuracy of 78.5% (95% CI: 78.1%-78.9%) in the internal validation set. The IHDM developed using the support vector machine (SVM), combining blood neutrophil count, history of fever within one week, and the IHRS, performed best, demonstrating areas under the ROC curve of 0.919 (95% CI: 0.859-0.980), 0.902 (95% CI: 0.849-0.955), and 0.863 (95% CI: 0.800-0.926) in the three cohorts, respectively.
INTERPRETATION: The automated HRSM demonstrated excellent segmentation performance for hydronephrosis, while the non-invasive IHDM provided significant diagnostic efficacy, facilitating infection assessment in patients with hydronephrosis. However, more diverse real-world multicenter validation studies are needed to verify the robustness of the model before it can be incorporated into clinical practice.
FUNDING: The Key-Area Research and Development Program of Guangdong Province, and the National Natural Science Foundation of China.
PMID:40144691 | PMC:PMC11938262 | DOI:10.1016/j.eclinm.2025.103146
Integration of longitudinal load-bearing tissue MRI radiomics and neural network to predict knee osteoarthritis incidence
J Orthop Translat. 2025 Mar 15;51:187-197. doi: 10.1016/j.jot.2025.01.007. eCollection 2025 Mar.
ABSTRACT
BACKGROUND: Load-bearing structural degradation is crucial in knee osteoarthritis (KOA) progression, yet few prediction models use load-bearing tissue radiomics to predict radiographic (structural) KOA incidence.
PURPOSE: We aimed to develop and test a Load-Bearing Tissue plus Clinical variable Radiomic Model (LBTC-RM) to predict radiographic KOA incidence.
STUDY DESIGN: Risk prediction study.
METHODS: A total of 700 knees without radiographic KOA at baseline were included from the Osteoarthritis Initiative cohort, and 2164 knee MRIs acquired during 4 years of follow-up were selected. The LBTC-RM, which integrated MRI features of the meniscus, femur, tibia, and femorotibial cartilage with clinical variables, was developed in the total development cohort (n = 1082; 542 cases vs. 540 controls) using a neural network algorithm. The final predictive model was tested in the total test cohort (n = 1082; 534 cases vs. 548 controls), which integrated data from five visits: baseline (n = 353; 191 cases vs. 162 controls), 3 years prior to KOA (n = 46; 19 cases vs. 27 controls), 2 years prior to KOA (n = 143; 77 cases vs. 66 controls), 1 year prior to KOA (n = 220; 105 cases vs. 115 controls), and at KOA incidence (n = 320; 156 cases vs. 164 controls).
RESULTS: In the total test cohort, the LBTC-RM predicted KOA incidence with an AUC (95% CI) of 0.85 (0.82-0.87). With LBTC-RM assistance, the specificity, sensitivity, and accuracy of resident physicians for KOA prediction improved from 50%, 60%, and 55% to 72%, 73%, and 72%, respectively. The LBTC-RM output indicated an increased KOA risk (OR: 20.6; 95% CI: 13.8-30.6; p < .001). Radiomic scores of load-bearing tissue raised KOA risk (ORs: 1.02-1.9) from 4 years prior to KOA, whereas the 3-dimensional feature score of the medial meniscus decreased the OR (0.99) of KOA incidence at the visit when KOA was confirmed. The 2-dimensional feature score of the medial meniscus increased the ORs (1.1-1.2) of the KOA symptom score from 2 years prior to KOA.
CONCLUSIONS: We provide radiomic features of load-bearing tissue to improve KOA risk assessment and incidence prediction. The model has potential clinical applicability in predicting KOA incidence early, enabling physicians to identify high-risk patients before significant radiographic evidence appears. This can facilitate timely interventions and personalized management strategies, improving patient outcomes.
THE TRANSLATIONAL POTENTIAL OF THIS ARTICLE: This study presents a novel approach integrating longitudinal MRI-based radiomics and clinical variables to predict knee osteoarthritis (KOA) incidence using machine learning. By leveraging deep learning for auto-segmentation and machine learning for predictive modeling, this research provides a more interpretable and clinically applicable method for early KOA detection. The introduction of a Radiomics Score System enhances the potential for radiomics as a virtual image-based biopsy tool, facilitating non-invasive, personalized risk assessment for KOA patients. The findings support the translation of advanced imaging and AI-driven predictive models into clinical practice, aiding early diagnosis, personalized treatment planning, and risk stratification for KOA progression. This model has the potential to be integrated into routine musculoskeletal imaging workflows, optimizing early intervention strategies and resource allocation for high-risk populations. Future validation across diverse cohorts will further enhance its clinical utility and generalizability.
PMID:40144553 | PMC:PMC11937290 | DOI:10.1016/j.jot.2025.01.007
A multi-modal deep learning solution for precise pneumonia diagnosis: the PneumoFusion-Net model
Front Physiol. 2025 Mar 12;16:1512835. doi: 10.3389/fphys.2025.1512835. eCollection 2025.
ABSTRACT
BACKGROUND: Pneumonia is considered one of the most important causes of morbidity and mortality worldwide. Bacterial and viral pneumonia share many similar clinical features, making diagnosis challenging. Traditional diagnostic methods rely mainly on radiological imaging and require considerable clinical experience, which can make them inefficient and inconsistent. Deep learning for pneumonia classification across multiple modalities, especially the integration of multiple data types, has not been well explored.
METHODS: This study introduces PneumoFusion-Net, a deep learning-based multimodal framework that incorporates CT images, clinical text, numerical lab test results, and radiology reports for improved diagnosis. In the experiments, a dataset of 10,095 pneumonia CT images with associated clinical data was used; most of it served for training and validation, with the remainder reserved as a held-out test set. Five-fold cross-validation was used to evaluate the model, with metrics including accuracy and F1-score.
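A minimal sketch of late fusion across the kinds of modalities listed (CT image features, pre-encoded text, and numerical lab values). The branch sizes and the concatenation strategy are illustrative assumptions about a generic multimodal classifier, not PneumoFusion-Net itself.

```python
import torch
import torch.nn as nn

class MultiModalPneumonia(nn.Module):
    def __init__(self, text_dim=768, lab_dim=20, n_classes=2):
        super().__init__()
        self.image_branch = nn.Sequential(   # tiny CNN stand-in for the CT encoder
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.text_branch = nn.Linear(text_dim, 64)   # pre-computed text embeddings
        self.lab_branch = nn.Linear(lab_dim, 32)     # normalized lab values
        self.classifier = nn.Sequential(
            nn.Linear(32 + 64 + 32, 64), nn.ReLU(), nn.Linear(64, n_classes),
        )

    def forward(self, ct, text_emb, labs):
        fused = torch.cat([
            self.image_branch(ct),
            torch.relu(self.text_branch(text_emb)),
            torch.relu(self.lab_branch(labs)),
        ], dim=1)
        return self.classifier(fused)

logits = MultiModalPneumonia()(torch.randn(4, 1, 224, 224),
                               torch.randn(4, 768), torch.randn(4, 20))
```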
RESULTS: PneumoFusion-Net achieved 98.96% classification accuracy with a 98% F1-score on the held-out test set and proved highly effective in distinguishing bacterial from viral pneumonia, reducing misdiagnosis and improving consistency across datasets from multiple patients.
CONCLUSION: PneumoFusion-Net offers an effective and efficient approach to pneumonia classification by integrating diverse data sources, resulting in high diagnostic accuracy. Its potential for clinical integration could significantly reduce the burden of pneumonia diagnosis by providing radiologists and clinicians with a robust, automated diagnostic tool.
PMID:40144549 | PMC:PMC11937601 | DOI:10.3389/fphys.2025.1512835
Multimodal diagnosis of Alzheimer's disease based on resting-state electroencephalography and structural magnetic resonance imaging
Front Physiol. 2025 Mar 12;16:1515881. doi: 10.3389/fphys.2025.1515881. eCollection 2025.
ABSTRACT
Multimodal diagnostic methods for Alzheimer's disease (AD) have demonstrated remarkable performance. However, electroencephalography (EEG) has been included in relatively few such multimodal studies. Moreover, most multimodal AD studies use convolutional neural networks (CNNs) to extract features from each modality and perform fusion classification; this approach often lacks cross-modal collaboration and fails to effectively enhance feature representation. To address this issue and explore the collaborative relationships among modalities, this paper proposes a multimodal AD diagnosis model based on resting-state EEG and structural magnetic resonance imaging (sMRI). Specifically, this work designs dedicated feature extraction models for the EEG and sMRI modalities to enhance modality-specific feature extraction. Additionally, a multimodal joint attention mechanism (MJA) is developed to address the problem of independent modalities: the MJA promotes cooperation between the two modalities, thereby enhancing the representation ability of the multimodal fusion. Furthermore, a random forest classifier is introduced to strengthen classification. The proposed model achieves a diagnostic accuracy of 94.7%. To the authors' knowledge, this is the first exploration of combining deep learning with multimodal EEG for AD diagnosis; the work also aims to strengthen the role of EEG in multimodal AD research, a promising direction for future advances in AD diagnosis.
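A minimal sketch of a joint attention step between EEG and sMRI feature vectors, in the spirit of a multimodal joint attention module; the exact mechanism shown here (symmetric cross-attention over one-token sequences) is an assumption, not the paper's MJA design. The fused features could then be fed to a random forest, as in the paper.

```python
import torch
import torch.nn as nn

class JointAttentionFusion(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)

    def forward(self, eeg_feat, mri_feat):
        # Treat each modality as a one-token sequence; each attends to the other.
        eeg, mri = eeg_feat.unsqueeze(1), mri_feat.unsqueeze(1)
        eeg_enriched, _ = self.attn(eeg, mri, mri)
        mri_enriched, _ = self.attn(mri, eeg, eeg)
        return torch.cat([eeg_enriched, mri_enriched], dim=-1).squeeze(1)

fused = JointAttentionFusion()(torch.randn(8, 128), torch.randn(8, 128))
# fused.detach().numpy() could be passed to sklearn's RandomForestClassifier.
```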
PMID:40144547 | PMC:PMC11937600 | DOI:10.3389/fphys.2025.1515881
Review of applications of deep learning in veterinary diagnostics and animal health
Front Vet Sci. 2025 Mar 12;12:1511522. doi: 10.3389/fvets.2025.1511522. eCollection 2025.
ABSTRACT
Deep learning (DL), a subfield of artificial intelligence (AI), involves the development of algorithms and models that simulate the problem-solving capabilities of the human mind. Sophisticated AI technology has garnered significant attention in recent years in veterinary medicine. This review provides a comprehensive overview of research leveraging DL for diagnostic purposes in veterinary medicine. Our systematic review approach followed PRISMA guidelines, focusing on the intersection of DL and veterinary medicine, and identified 422 relevant research articles. After screening titles and abstracts, we narrowed our selection to 39 primary research articles directly applying DL to animal disease detection or management, excluding non-primary research, reviews, and unrelated AI studies. Key findings highlight an increase in the use of DL models across various diagnostic areas from 2013 to 2024, including radiography (33% of studies), cytology (33%), health record analysis (8%), MRI (8%), environmental data analysis (5%), photo/video imaging (5%), and ultrasound (5%). Over the past decade, radiographic imaging has emerged as the most impactful. Various studies have demonstrated notable success in classifying primary thoracic lesions and cardiac disease from radiographs using DL models benchmarked against specialist veterinarians. Moreover, the technology has proven adept at recognising, counting, and classifying cell types in microscope slide images, demonstrating its versatility across veterinary diagnostic modalities. While DL shows promise in veterinary diagnostics, several challenges remain, ranging from the need for large and diverse datasets to potential interpretability issues and the importance of consulting experts throughout model development to ensure validity. A thorough understanding of these considerations for the design and implementation of DL in veterinary medicine is imperative for driving future research and development in the field. In addition, the potential future impacts of DL on veterinary diagnostics are discussed, exploring avenues for further refinement and expansion of DL applications in veterinary medicine, ultimately contributing to higher standards of care and improved health outcomes for animals as the technology continues to evolve.
PMID:40144529 | PMC:PMC11938132 | DOI:10.3389/fvets.2025.1511522
SympCoughNet: symptom assisted audio-based COVID-19 detection
Front Digit Health. 2025 Mar 12;7:1551298. doi: 10.3389/fdgth.2025.1551298. eCollection 2025.
ABSTRACT
COVID-19 remains a significant global public health challenge. While nucleic acid tests, antigen tests, and CT imaging provide high accuracy, they face inefficiencies and limited accessibility, making rapid and convenient testing difficult. Recent studies have explored COVID-19 detection using acoustic health signals, such as cough and breathing sounds. However, most existing approaches focus solely on audio classification, often leading to suboptimal accuracy while neglecting valuable prior information, such as clinical symptoms. To address this limitation, we propose SympCoughNet, a deep learning-based COVID-19 audio classification network that integrates cough sounds with clinical symptom data. Our model employs symptom-encoded channel weighting to enhance feature processing, making it more attentive to symptom information. We also conducted an ablation study to assess the impact of symptom integration by removing the symptom-attention mechanism and instead using symptoms as classification labels within a CNN-based architecture. We trained and evaluated SympCoughNet on the UK COVID-19 Vocal Audio Dataset. Our model demonstrated significant performance improvements over traditional audio-only approaches, achieving 89.30% accuracy, 94.74% AUROC, and 91.62% PR on the test set. The results confirm that incorporating symptom data enhances COVID-19 detection performance. Additionally, we found that incorrect symptom inputs could influence predictions. Our ablation study validated that even when symptoms are treated as classification labels, the network can still effectively leverage cough audio to infer symptom-related information.
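A minimal sketch of symptom-encoded channel weighting: a binary symptom vector is embedded and used to rescale the channels of the cough-audio feature map, similar in spirit to squeeze-and-excitation gating. All dimensions and the gating form are illustrative assumptions, not SympCoughNet's actual design.

```python
import torch
import torch.nn as nn

class SymptomChannelGate(nn.Module):
    def __init__(self, n_symptoms=10, channels=64):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(n_symptoms, channels),
            nn.Sigmoid(),                     # per-channel weights in (0, 1)
        )

    def forward(self, audio_feat, symptoms):
        # audio_feat: (B, C, T, F) spectrogram features; symptoms: (B, n_symptoms)
        w = self.gate(symptoms).unsqueeze(-1).unsqueeze(-1)   # (B, C, 1, 1)
        return audio_feat * w                 # symptom-conditioned channel weighting

gated = SymptomChannelGate()(torch.randn(4, 64, 32, 32), torch.rand(4, 10).round())
```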
PMID:40144457 | PMC:PMC11936986 | DOI:10.3389/fdgth.2025.1551298
Comparative Evaluation of Deep Learning Models for Diagnosis of Helminth Infections
J Pers Med. 2025 Mar 20;15(3):121. doi: 10.3390/jpm15030121.
ABSTRACT
(1) Background: Helminth infections are a widespread global health concern, with ascariasis and taeniasis representing two of the most prevalent infestations. Traditional diagnostic methods, such as egg-based microscopy, are fraught with challenges, including subjectivity and low throughput, often leading to misdiagnosis. This study evaluates the efficacy of advanced deep learning models in accurately classifying Ascaris lumbricoides and Taenia saginata eggs from microscopic images, proposing a technologically enhanced approach for diagnostics in clinical settings. (2) Methods: Three state-of-the-art deep learning models, ConvNeXt Tiny, EfficientNet V2 S, and MobileNet V3 S, were considered. A diverse dataset comprising images of Ascaris, Taenia, and uninfected eggs was used to train and validate these models in multiclass experiments. (3) Results: All models demonstrated high classification accuracy, with ConvNeXt Tiny achieving an F1-score of 98.6%, followed by MobileNet V3 S at 98.2% and EfficientNet V2 S at 97.5%. These results demonstrate the potential of deep learning to streamline and improve the diagnostic process for helminthic infections. The application of deep learning models such as ConvNeXt Tiny, EfficientNet V2 S, and MobileNet V3 S shows promise for efficient and accurate helminth egg classification, potentially enhancing the diagnostic workflow significantly. (4) Conclusions: The study demonstrates the feasibility of leveraging advanced computational techniques in parasitology and points towards a future where rapid, objective, and reliable diagnostics are standard.
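A minimal sketch of adapting torchvision's ConvNeXt Tiny to the three-class egg classification task (Ascaris, Taenia, uninfected); the training protocol and evaluation details are assumptions, not the study's exact setup.

```python
import torch.nn as nn
from torchvision import models
from sklearn.metrics import f1_score

model = models.convnext_tiny(weights=models.ConvNeXt_Tiny_Weights.DEFAULT)
# torchvision's ConvNeXt head is Sequential(LayerNorm2d, Flatten, Linear);
# swap the final Linear layer for a three-class output.
model.classifier[2] = nn.Linear(model.classifier[2].in_features, 3)

# After fine-tuning, a macro F1 over held-out predictions would be computed as:
# f1 = f1_score(y_true, y_pred, average="macro")
```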
PMID:40137437 | DOI:10.3390/jpm15030121
Explainable Siamese Neural Networks for Detection of High Fall Risk Older Adults in the Community Based on Gait Analysis
J Funct Morphol Kinesiol. 2025 Feb 22;10(1):73. doi: 10.3390/jfmk10010073.
ABSTRACT
BACKGROUND/OBJECTIVES: Falls among older adults represent a significant public health concern, often leading to diminished quality of life and to serious injuries that escalate healthcare costs and may even prove fatal. Accurate fall risk prediction is therefore crucial for implementing timely preventive measures. However, to date, there is no definitive metric for identifying individuals at high risk of falling. To address this, the present study proposes a novel approach that transforms biomechanical time-series data, derived from gait analysis, into visual representations to facilitate the application of deep learning (DL) methods for fall risk assessment.
METHODS: By leveraging convolutional neural networks (CNNs) and Siamese neural networks (SNNs), the proposed framework effectively addresses the challenges of limited datasets and delivers robust predictive capabilities.
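A minimal sketch of the Siamese idea for small datasets: a shared CNN encoder embeds pairs of image-encoded gait samples, and a contrastive loss pulls same-class pairs together while pushing different-class pairs past a margin. The architecture and margin are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 16),
        )

    def forward(self, a, b):
        # The same weights encode both inputs of the pair.
        return self.net(a), self.net(b)

def contrastive_loss(za, zb, same_class, margin=1.0):
    d = F.pairwise_distance(za, zb)
    # Pull same-class pairs together, push different-class pairs past the margin.
    return (same_class * d.pow(2)
            + (1 - same_class) * (margin - d).clamp(min=0).pow(2)).mean()

za, zb = SiameseEncoder()(torch.randn(8, 1, 64, 64), torch.randn(8, 1, 64, 64))
loss = contrastive_loss(za, zb, torch.randint(0, 2, (8,)).float())
```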
RESULTS: Through the extraction of distinctive gait-related features and the generation of class-discriminative activation maps using Grad-CAM, the random forest (RF) machine learning (ML) model not only achieves commendable accuracy (83.29%) but also enhances explainability.
CONCLUSIONS: Ultimately, this study underscores the potential of advanced computational tools and machine learning algorithms to improve fall risk prediction, reduce healthcare burdens, and promote greater independence and well-being among older adults.
PMID:40137325 | DOI:10.3390/jfmk10010073
Machine Learning for Human Activity Recognition: State-of-the-Art Techniques and Emerging Trends
J Imaging. 2025 Mar 20;11(3):91. doi: 10.3390/jimaging11030091.
ABSTRACT
Human activity recognition (HAR) has emerged as a transformative field with widespread applications, leveraging diverse sensor modalities to accurately identify and classify human activities. This paper provides a comprehensive review of HAR techniques, focusing on the integration of sensor-based, vision-based, and hybrid methodologies. It explores the strengths and limitations of commonly used modalities, such as RGB images/videos, depth sensors, motion capture systems, wearable devices, and emerging technologies like radar and Wi-Fi channel state information. The review also discusses traditional machine learning approaches, including supervised and unsupervised learning, alongside cutting-edge advancements in deep learning, such as convolutional and recurrent neural networks, attention mechanisms, and reinforcement learning frameworks. Despite significant progress, HAR still faces critical challenges, including handling environmental variability, ensuring model interpretability, and achieving high recognition accuracy in complex, real-world scenarios. Future research directions emphasise the need for improved multimodal sensor fusion, adaptive and personalised models, and the integration of edge computing for real-time analysis. Additionally, addressing ethical considerations, such as privacy and algorithmic fairness, remains a priority as HAR systems become more pervasive. This study highlights the evolving landscape of HAR and outlines strategies for future advancements that can enhance the reliability and applicability of HAR technologies in diverse domains.
PMID:40137203 | DOI:10.3390/jimaging11030091
Recovering Image Quality in Low-Dose Pediatric Renal Scintigraphy Using Deep Learning
J Imaging. 2025 Mar 19;11(3):88. doi: 10.3390/jimaging11030088.
ABSTRACT
The objective of this study is to propose an advanced image enhancement strategy to address the challenge of reducing radiation doses in pediatric renal scintigraphy. Data from a public dynamic renal scintigraphy database were used. Four denoising neural networks (DnCNN, UDnCNN, DUDnCNN, and AttnGAN) were evaluated on the noisier images. To evaluate the quality of the noise reduction with minimal detail loss, the kidney signal-to-noise ratio (SNR) and multiscale structural similarity (MS-SSIM) were used. Although all the networks reduced noise, UDnCNN achieved the best balance between SNR and MS-SSIM, leading to the most notable improvements in image quality. In clinical practice, 100% of the acquired data are summed to produce the final image; to simulate dose reduction, we summed only 50% of the frames, simulating a proportional decrease in radiation. The proposed deep learning approach showed that half of the acquired frames can yield results comparable to those of the complete dataset, suggesting that it is feasible to reduce patients' exposure to radiation. This study demonstrates that the evaluated neural networks can markedly improve renal scintigraphic image quality, facilitating high-quality imaging at lower radiation doses, which will considerably benefit the pediatric population.
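A minimal DnCNN-style residual denoiser sketch: the network predicts the noise component and subtracts it from the input frame. The depth and width are illustrative, not the configurations evaluated in the study.

```python
import torch
import torch.nn as nn

class DnCNNLite(nn.Module):
    def __init__(self, depth=5, width=32):
        super().__init__()
        layers = [nn.Conv2d(1, width, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(width, width, 3, padding=1),
                       nn.BatchNorm2d(width), nn.ReLU(inplace=True)]
        layers.append(nn.Conv2d(width, 1, 3, padding=1))
        self.body = nn.Sequential(*layers)

    def forward(self, noisy):
        # Residual learning: the body estimates the noise, which is subtracted.
        return noisy - self.body(noisy)

denoised = DnCNNLite()(torch.randn(1, 1, 128, 128))
```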
PMID:40137200 | DOI:10.3390/jimaging11030088
Automatic Segmentation of Plants and Weeds in Wide-Band Multispectral Imaging (WMI)
J Imaging. 2025 Mar 18;11(3):85. doi: 10.3390/jimaging11030085.
ABSTRACT
Semantic segmentation is a crucial area of research within computer vision and deep learning, aimed at assigning a specific label to each pixel in an image. The segmentation of crops, plants, and weeds has significantly advanced the application of deep learning in precision agriculture, leading to sophisticated architectures based on convolutional neural networks (CNNs). This study proposes a segmentation algorithm for identifying plants and weeds in wide-band multispectral images. In the first part of the algorithm, the PIF-Net model is used for feature extraction and fusion; the resulting feature map is then employed to enhance an optimized U-Net model for semantic segmentation within a wide-band system. Our investigation focuses on scenes from the CAVIAR multispectral image dataset. The proposed algorithm effectively captures complex details while regulating the learning process, achieving an overall accuracy of 98.2%. The results demonstrate that our approach to semantic segmentation and the differentiation between plants and weeds yields accurate and compelling outcomes.
PMID:40137197 | DOI:10.3390/jimaging11030085
Deep Learning-Based Semantic Segmentation for Objective Colonoscopy Quality Assessment
J Imaging. 2025 Mar 18;11(3):84. doi: 10.3390/jimaging11030084.
ABSTRACT
Background: This study aims to objectively evaluate the overall quality of colonoscopies using a specially trained deep learning-based semantic segmentation neural network, a modern and valuable approach for analysing colonoscopy frames. Methods: We collected thousands of colonoscopy frames extracted from a set of video colonoscopy files. A color-based image processing method was used to extract color features from specific regions of each frame, namely, the intestinal mucosa, residues, artifacts, and lumen. With these features, we automatically annotated all the frames and then selected the best of them to train a semantic segmentation network. The trained network was used to classify the four region types in a separate set of test colonoscopy frames and to extract pixel statistics relevant to quality evaluation. The test colonoscopies were also evaluated by colonoscopy experts using the Boston Bowel Preparation Scale (BBPS). Results: The deep learning semantic segmentation method classified the four key regions in colonoscopy frames well and produced pixel statistics that are useful for objective quality assessment. The Spearman correlation results were as follows: BBPS vs. pixel scores, 0.69; BBPS vs. mucosa pixel percentage, 0.63; BBPS vs. residue pixel percentage, -0.47; BBPS vs. artifact pixel percentage, -0.65. The agreement analysis using Cohen's kappa yielded a value of 0.28. The colonoscopy evaluation based on the extracted pixel statistics showed a fair level of compatibility with the experts' evaluations. Conclusions: Our proposed deep learning semantic segmentation approach is a promising tool for evaluating the overall quality of colonoscopies and goes beyond the BBPS in assessing colonoscopy quality. In particular, while the Boston scale focuses solely on the amount of residual content, our method can identify and quantify the percentages of colonic mucosa, residues, and artifacts, providing a more comprehensive and objective evaluation.
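A minimal sketch of the reported agreement statistics: Spearman correlation between BBPS scores and pixel-based scores, and Cohen's kappa between categorical quality ratings. All arrays are hypothetical placeholders, not study data.

```python
from scipy.stats import spearmanr
from sklearn.metrics import cohen_kappa_score

bbps        = [9, 6, 7, 3, 8, 5, 2, 9]          # expert Boston scale totals
pixel_score = [88, 61, 70, 35, 79, 52, 30, 91]  # automated pixel-based scores

rho, p = spearmanr(bbps, pixel_score)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")

# Agreement between discretized expert and automated quality categories.
expert_cat = [2, 1, 2, 0, 2, 1, 0, 2]
auto_cat   = [2, 1, 1, 0, 2, 1, 1, 2]
print(f"Cohen's kappa = {cohen_kappa_score(expert_cat, auto_cat):.2f}")
```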
PMID:40137196 | DOI:10.3390/jimaging11030084
GM-CBAM-ResNet: A Lightweight Deep Learning Network for Diagnosis of COVID-19
J Imaging. 2025 Mar 3;11(3):76. doi: 10.3390/jimaging11030076.
ABSTRACT
COVID-19 can cause acute infectious disease of the respiratory system and may also lead to heart damage, seriously threatening human health. Electrocardiograms (ECGs) are low cost, non-invasive, and radiation free, and are widely used for evaluating heart health. In this work, a lightweight deep learning network named GM-CBAM-ResNet is proposed for diagnosing COVID-19 from ECG images. GM-CBAM-ResNet is constructed by replacing the convolution module with the Ghost module (GM) and adding the convolutional block attention module (CBAM) in the residual module of ResNet. To reveal the superiority of GM-CBAM-ResNet, three other variants (ResNet, GM-ResNet, and CBAM-ResNet) are also analyzed in terms of model performance, complexity, and interpretability. Model performance is evaluated using the open 'ECG Images dataset of Cardiac and COVID-19 Patients'. Complexity is reflected by comparing the numbers of model parameters, and interpretability is analyzed using Gradient-weighted Class Activation Mapping (Grad-CAM). Parameter statistics indicate that, relative to ResNet19, GM-CBAM-ResNet19 reduces the number of model parameters by 45.4%. Experimental results show that, at lower model complexity, GM-CBAM-ResNet19 improves diagnostic accuracy by approximately 5% compared with ResNet19. Additionally, the interpretability analysis shows that CBAM suppresses the interference of grid backgrounds, ensuring higher diagnostic accuracy at lower model complexity. This work provides a lightweight solution for the rapid and accurate diagnosis of COVID-19 from ECG images, which holds significant practical deployment value.
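A minimal sketch of a Ghost module, the parameter-saving building block named above: a small number of ordinary convolutions produce "intrinsic" feature maps, and cheap depthwise operations generate the remaining "ghost" maps, roughly halving parameters versus a plain convolution. Sizes and the ratio are illustrative.

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    def __init__(self, in_ch, out_ch, ratio=2):
        super().__init__()
        primary = out_ch // ratio
        self.primary_conv = nn.Sequential(      # ordinary conv -> intrinsic maps
            nn.Conv2d(in_ch, primary, 3, padding=1, bias=False),
            nn.BatchNorm2d(primary), nn.ReLU(inplace=True),
        )
        self.cheap_op = nn.Sequential(          # depthwise conv -> ghost maps
            nn.Conv2d(primary, out_ch - primary, 3, padding=1,
                      groups=primary, bias=False),
            nn.BatchNorm2d(out_ch - primary), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        y = self.primary_conv(x)
        return torch.cat([y, self.cheap_op(y)], dim=1)

out = GhostModule(32, 64)(torch.randn(1, 32, 56, 56))  # -> (1, 64, 56, 56)
```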
PMID:40137188 | DOI:10.3390/jimaging11030076
Concealed Weapon Detection Using Thermal Cameras
J Imaging. 2025 Feb 26;11(3):72. doi: 10.3390/jimaging11030072.
ABSTRACT
In an era of ever-increasing security concerns, the need for advanced technology to detect visible and concealed weapons has become critical. This paper introduces a novel two-stage method for concealed handgun detection, leveraging thermal imaging and deep learning and offering a potential real-world solution for law enforcement and surveillance applications. The approach first detects potential firearms at the frame level and subsequently verifies their association with a detected person, significantly reducing false positives and false negatives. Alarms are triggered only under specific conditions to ensure accurate and reliable detection, with precautionary alerts raised if a firearm is identified but no person is detected. Key contributions include a lightweight algorithm optimized for low-end embedded devices, making it suitable for wearable and mobile applications, and the creation of a tailored thermal dataset for controlled concealment scenarios. The system is implemented on a chest-worn Android smartphone with a miniature thermal camera, enabling hands-free operation. Experimental results validate the method's effectiveness, achieving an mAP@50-95 of 64.52% on our dataset and improving on state-of-the-art methods. By reducing false negatives and improving reliability, this study offers a scalable, practical solution for security applications.
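A minimal sketch of the two-stage alarm logic described above: a frame-level firearm detection is escalated to a full alarm only when the firearm box is associated with a detected person; otherwise a precautionary alert is raised. The detection dictionaries and the overlap-based association rule are hypothetical stand-ins for the paper's actual criteria.

```python
def boxes_overlap(a, b):
    """True if two (x1, y1, x2, y2) boxes intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def alarm_decision(detections):
    guns = [d["box"] for d in detections if d["label"] == "firearm"]
    people = [d["box"] for d in detections if d["label"] == "person"]
    if not guns:
        return "no_alarm"
    if any(boxes_overlap(g, p) for g in guns for p in people):
        return "alarm"                 # firearm associated with a detected person
    return "precautionary_alert"       # firearm seen, but no person detected

print(alarm_decision([
    {"label": "person", "box": (10, 10, 60, 120)},
    {"label": "firearm", "box": (40, 70, 55, 90)},
]))  # -> "alarm"
```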
PMID:40137184 | DOI:10.3390/jimaging11030072
A Comparative Study of Network-Based Machine Learning Approaches for Binary Classification in Metabolomics
Metabolites. 2025 Mar 3;15(3):174. doi: 10.3390/metabo15030174.
ABSTRACT
Background/Objectives: Metabolomics has recently emerged as a key tool in the biological sciences, offering insights into metabolic pathways and processes. Over the last decade, network-based machine learning approaches have gained significant popularity and application across various fields. While several studies have utilized metabolomics profiles for sample classification, many network-based machine learning approaches remain unexplored for metabolomic-based classification tasks. This study aims to compare the performance of various network-based machine learning approaches, including recently developed methods, in metabolomics-based classification. Methods: A standard data preprocessing procedure was applied to 17 metabolomic datasets, and Bayesian neural network (BNN), convolutional neural network (CNN), feedforward neural network (FNN), Kolmogorov-Arnold network (KAN), and spiking neural network (SNN) were evaluated on each dataset. The datasets varied widely in size, mass spectrometry method, and response variable. Results: With respect to AUC on test data, BNN, CNN, FNN, KAN, and SNN were the top-performing models in 4, 1, 5, 3, and 4 of the 17 datasets, respectively. Regarding F1-score, the top-performing models were BNN (3 datasets), CNN (3 datasets), FNN (4 datasets), KAN (4 datasets), and SNN (3 datasets). For accuracy, BNN, CNN, FNN, KAN, and SNN performed best in 4, 1, 4, 4, and 4 datasets, respectively. Conclusions: No network-based modeling approach consistently outperformed others across the metrics of AUC, F1-score, or accuracy. Our results indicate that while no single network-based modeling approach is superior for metabolomics-based classification tasks, BNN, KAN, and SNN may be underappreciated and underutilized relative to the more commonly used CNN and FNN.
PMID:40137139 | DOI:10.3390/metabo15030174
Prediction of Water Chemical Oxygen Demand with Multi-Scale One-Dimensional Convolutional Neural Network Fusion and Ultraviolet-Visible Spectroscopy
Biomimetics (Basel). 2025 Mar 20;10(3):191. doi: 10.3390/biomimetics10030191.
ABSTRACT
Chemical oxygen demand (COD) is a critical parameter for assessing the level of organic pollution in water, and accurate COD detection is essential for effective environmental monitoring and water quality assessment. Ultraviolet-visible (UV-Vis) spectroscopy has become a widely applied method for COD detection because it is convenient and requires no chemical reagents; this non-destructive, reagent-free approach offers a rapid and reliable means of analyzing water. Recently, deep learning has emerged as a powerful tool for automating spectral feature extraction and improving COD prediction accuracy. In this paper, we propose a novel multi-scale one-dimensional convolutional neural network (MS-1D-CNN) fusion model designed specifically for spectral feature extraction and COD prediction. The model inputs raw UV-Vis spectra into three parallel sub-1D-CNNs, which process the data independently. The outputs of the final convolution and pooling layers of each sub-CNN are then fused into a single layer, capturing a rich set of spectral features; this fused output is passed through a Flatten layer followed by fully connected layers to predict the COD value. Experimental results demonstrate the effectiveness of the proposed method, which was compared with three traditional methods and three deep learning methods on the same dataset. The MS-1D-CNN model showed a significant improvement in COD prediction accuracy, highlighting its potential for more reliable and efficient water quality monitoring.
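A minimal sketch of the multi-scale fusion idea just described: three parallel 1D convolutional branches with different kernel sizes process the raw spectrum, their pooled outputs are concatenated, flattened, and passed to fully connected layers for COD regression. Branch widths, kernel sizes, and the spectrum length are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MS1DCNN(nn.Module):
    def __init__(self, n_wavelengths=256):
        super().__init__()
        def branch(k):
            # One sub-1D-CNN; the kernel size k sets its receptive scale.
            return nn.Sequential(
                nn.Conv1d(1, 16, kernel_size=k, padding=k // 2), nn.ReLU(),
                nn.MaxPool1d(2),
                nn.Conv1d(16, 32, kernel_size=k, padding=k // 2), nn.ReLU(),
                nn.AdaptiveAvgPool1d(4),
            )
        self.branches = nn.ModuleList([branch(k) for k in (3, 7, 15)])
        self.head = nn.Sequential(
            nn.Flatten(), nn.Linear(3 * 32 * 4, 64), nn.ReLU(), nn.Linear(64, 1),
        )

    def forward(self, spectrum):                  # spectrum: (B, 1, n_wavelengths)
        fused = torch.cat([b(spectrum) for b in self.branches], dim=1)
        return self.head(fused).squeeze(-1)       # predicted COD value per sample

cod = MS1DCNN()(torch.randn(4, 1, 256))
```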
PMID:40136845 | DOI:10.3390/biomimetics10030191