Deep learning

A comparative study of explainability methods for whole slide classification of lymph node metastases using vision transformers

Tue, 2025-04-15 06:00

PLOS Digit Health. 2025 Apr 15;4(4):e0000792. doi: 10.1371/journal.pdig.0000792. eCollection 2025 Apr.

ABSTRACT

Recent advancements in deep learning have shown promise in enhancing the performance of medical image analysis. In pathology, automated whole slide imaging has transformed clinical workflows by streamlining routine tasks and supporting diagnosis and prognosis. However, the lack of transparency of deep learning models, often described as black boxes, poses a significant barrier to their clinical adoption. This study evaluates various explainability methods for Vision Transformers, assessing their effectiveness in explaining the rationale behind their classification predictions on histopathological images. Using a Vision Transformer trained on the publicly available CAMELYON16 dataset, comprising 399 whole slide images of lymph node metastases from patients with breast cancer, we conducted a comparative analysis of a diverse range of state-of-the-art techniques for generating explanations through heatmaps, including Attention Rollout, Integrated Gradients, RISE, and ViT-Shapley. Our findings reveal that Attention Rollout and Integrated Gradients are prone to artifacts, while RISE and particularly ViT-Shapley generate more reliable and interpretable heatmaps. ViT-Shapley also demonstrated faster runtime and superior performance in insertion and deletion metrics. These results suggest that integrating ViT-Shapley-based heatmaps into pathology reports could enhance trust and scalability in clinical workflows, facilitating the adoption of explainable artificial intelligence in pathology.
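
Of the four heatmap techniques compared above, Attention Rollout is the simplest to reproduce. The sketch below is a minimal, illustrative implementation of the standard rollout procedure (multiplying head-averaged attention matrices layer by layer while accounting for residual connections); it is not the authors' code, and the random attention matrices stand in for a real ViT's outputs.

    import numpy as np

    def attention_rollout(attentions):
        """Attention Rollout: propagate attention through the layers by matrix
        multiplication, adding the identity to account for residual connections.

        attentions: list of (tokens, tokens) arrays, one per layer, already
        averaged over heads. Row 0 of the result (the CLS token) gives a
        relevance score for every input patch.
        """
        rollout = np.eye(attentions[0].shape[0])
        for A in attentions:
            A_res = 0.5 * A + 0.5 * np.eye(A.shape[0])         # residual connection
            A_res = A_res / A_res.sum(axis=-1, keepdims=True)   # re-normalize rows
            rollout = A_res @ rollout
        return rollout

    # Toy usage: 12 layers, 1 CLS token + 196 patches with random attention.
    rng = np.random.default_rng(0)
    atts = [rng.random((197, 197)) for _ in range(12)]
    atts = [a / a.sum(axis=-1, keepdims=True) for a in atts]
    patch_relevance = attention_rollout(atts)[0, 1:]  # one score per patch
    print(patch_relevance.shape)  # (196,)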

PMID:40233316 | DOI:10.1371/journal.pdig.0000792

Categories: Literature Watch

Can Super Resolution via Deep Learning Improve Classification Accuracy in Dental

Tue, 2025-04-15 06:00

Dentomaxillofac Radiol. 2025 Apr 15:twaf029. doi: 10.1093/dmfr/twaf029. Online ahead of print.

ABSTRACT

OBJECTIVES: Deep Learning-driven Super Resolution (SR) aims to enhance the quality and resolution of images, offering potential benefits in dental imaging. Although extensive research has focused on deep learning-based dental classification tasks, the impact of applying super-resolution techniques on classification remains underexplored. This study seeks to address this gap by evaluating and comparing the performance of deep learning classification models on dental images with and without super-resolution enhancement.

METHODS: An open-source dental image dataset was utilized to investigate the impact of SR on image classification performance. SR was applied using two models with scaling ratios of 2 and 4, while classification was performed by four deep learning models. Performance was evaluated with well-accepted metrics, including SSIM, PSNR, accuracy, recall, precision, and F1-score. The effect of SR on classification performance was interpreted through two different approaches.
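
For reference, SSIM and PSNR can be computed directly with scikit-image; the snippet below is a small illustrative sketch using synthetic arrays in place of actual radiographs, not the study's evaluation code.

    import numpy as np
    from skimage.metrics import peak_signal_noise_ratio, structural_similarity

    def sr_quality(reference, sr_output):
        """Compare an SR-reconstructed image against its full-resolution
        reference using the two image-quality metrics named above."""
        reference = reference.astype(np.float64)
        sr_output = sr_output.astype(np.float64)
        data_range = reference.max() - reference.min()
        psnr = peak_signal_noise_ratio(reference, sr_output, data_range=data_range)
        ssim = structural_similarity(reference, sr_output, data_range=data_range)
        return psnr, ssim

    # Toy example with synthetic 8-bit grayscale arrays standing in for radiographs.
    rng = np.random.default_rng(1)
    ref = rng.integers(0, 256, size=(256, 256)).astype(np.uint8)
    degraded = np.clip(ref + rng.normal(0, 5, ref.shape), 0, 255).astype(np.uint8)
    print(sr_quality(ref, degraded))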

RESULTS: The two SR models yielded average SSIM and PSNR values of 0.904 and 36.71 when increasing resolution at the two scaling ratios. The average accuracy and F1-score for classification models trained and tested on SR-generated images were 0.859 and 0.873. In the first of the two comparison approaches, accuracy increased in at least half of the cases (8 out of 16) when different models and scaling ratios were considered, while in the second approach, SR showed significantly higher performance in almost all cases (12 out of 16).

CONCLUSION: This study demonstrated that the classification with SR-generated images significantly improved outcomes.

ADVANCES IN KNOWLEDGE: For the first time, the classification performance of dental radiographs with improved resolution by SR has been investigated. Significant performance improvement was observed compared to the case without SR.

PMID:40233244 | DOI:10.1093/dmfr/twaf029

Categories: Literature Watch

Viral escape-inspired framework for structure-guided dual bait protein biosensor design

Tue, 2025-04-15 06:00

PLoS Comput Biol. 2025 Apr 15;21(4):e1012964. doi: 10.1371/journal.pcbi.1012964. Online ahead of print.

ABSTRACT

A generalizable computational platform, CTRL-V (Computational TRacking of Likely Variants), is introduced to design selective-binding (dual bait) biosensor proteins. The iteratively evolving receptor binding domain (RBD) of the SARS-CoV-2 spike protein is treated as a model dual bait biosensor, since it has evolved to distinguish between and selectively bind human entry receptors while avoiding neutralizing antibodies. The spike RBD prioritizes mutations that reduce antibody binding while enhancing or retaining binding to the ACE2 receptor. Through iterative design cycles, CTRL-V was shown to pinpoint 20% of the 39 reported SARS-CoV-2 point mutations across 30 circulating, infective strains as responsible for immune escape from the commercial antibody LY-CoV1404. CTRL-V successfully identifies ~70% (five out of seven) of the single point mutations (371F, 373P, 440K, 445H, 456L) in the latest circulating KP.2 variant and offers detailed structural insights into the escape mechanism. While other data-driven viral escape variant predictor tools have shown promise in predicting potential future viral variants, they require massive amounts of data to bypass the need for the physics of explicit biochemical interactions; consequently, they cannot be generalized to other protein design applications. The publicly available viral escape data were leveraged as in vivo anchors to streamline a computational workflow that can be generalized to dual bait biosensor design tasks, as exemplified by identifying key mutational loci in Raf kinase that enable it to selectively bind Ras and Rap1a GTP. We demonstrate three versions of CTRL-V, which use a combination of integer optimization, stochastic sampling by PyRosetta, and deep learning-based ProteinMPNN for structure-guided biosensor design.
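
The selection principle described here, favoring mutations that weaken antibody binding while preserving receptor binding, can be sketched as a simple dual-objective ranking. The toy example below uses illustrative mutation names and random binding-energy changes; in CTRL-V such scores would come from structure-based evaluation with PyRosetta and ProteinMPNN, not random numbers.

    import numpy as np

    # Illustrative mutation names; binding-energy changes are random stand-ins
    # (positive ddG = binding weakened).
    mutations = ["K417N", "L452R", "T478K", "E484K", "N501Y"]
    rng = np.random.default_rng(0)
    ddg_antibody = rng.normal(0.0, 1.0, len(mutations))  # effect on antibody binding
    ddg_receptor = rng.normal(0.0, 1.0, len(mutations))  # effect on entry-receptor binding

    # Favor mutations that weaken antibody binding (large positive ddG) while
    # leaving receptor binding nearly unchanged (small |ddG|).
    escape_score = ddg_antibody - np.abs(ddg_receptor)
    for mut, score in sorted(zip(mutations, escape_score), key=lambda x: -x[1]):
        print(f"{mut}: {score:+.2f}")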

PMID:40233103 | DOI:10.1371/journal.pcbi.1012964

Categories: Literature Watch

ProtNote: a multimodal method for protein-function annotation

Tue, 2025-04-15 06:00

Bioinformatics. 2025 Apr 15:btaf170. doi: 10.1093/bioinformatics/btaf170. Online ahead of print.

ABSTRACT

MOTIVATION: Understanding the protein sequence-function relationship is essential for advancing protein biology and engineering. However, less than 1% of known protein sequences have human-verified functions. While deep learning methods have demonstrated promise for protein function prediction, current models are limited to predicting only those functions on which they were trained.

RESULTS: Here, we introduce ProtNote, a multimodal deep learning model that leverages free-form text to enable both supervised and zero-shot protein function prediction. ProtNote not only maintains near state-of-the-art performance for annotations in its training set, but also generalizes to unseen and novel functions in zero-shot test settings. ProtNote demonstrates superior performance in prediction of novel GO annotations and EC numbers compared to baseline models by capturing nuanced sequence-function relationships that unlock a range of biological use cases inaccessible to prior models. We envision that ProtNote will enhance protein function discovery by enabling scientists to use free text inputs without restriction to predefined labels - a necessary capability for navigating the dynamic landscape of protein biology.

AVAILABILITY AND IMPLEMENTATION: The code is available on GitHub: https://github.com/microsoft/protnote; model weights, datasets, and evaluation metrics are provided via Zenodo: https://zenodo.org/records/13897920.

SUPPLEMENTARY INFORMATION: Supplementary Information is available at Bioinformatics online.

PMID:40233101 | DOI:10.1093/bioinformatics/btaf170

Categories: Literature Watch

Protocol for deep-learning-driven cell type label transfer in single-cell RNA sequencing data

Tue, 2025-04-15 06:00

STAR Protoc. 2025 Apr 14;6(2):103768. doi: 10.1016/j.xpro.2025.103768. Online ahead of print.

ABSTRACT

Here, we present a protocol for using SIMS (scalable, interpretable machine learning for single cell) to transfer cell type labels in single-cell RNA sequencing data. This protocol outlines data preparation, model training with labeled data or inference using pretrained models, and methods for visualizing, downloading, and interpreting predictions. We provide stepwise instructions for accessing SIMS through the application programming interface (API), GitHub Codespaces, and a web application. For complete details on the use and execution of this protocol, please refer to Gonzalez-Ferrer et al.1.
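
As a conceptual illustration of the label-transfer task this protocol automates (and not the SIMS API itself, which should be used as documented in the protocol), a generic reference-to-query transfer can be sketched with scikit-learn on synthetic expression matrices:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Synthetic stand-ins: a labeled reference (cells x genes) and an unlabeled query.
    rng = np.random.default_rng(0)
    reference_X = rng.poisson(1.0, size=(500, 200)).astype(float)
    reference_y = rng.choice(["T cell", "B cell", "NK cell"], size=500)
    query_X = rng.poisson(1.0, size=(100, 200)).astype(float)

    # Log-transform, scale, fit on the reference, then transfer labels to the query.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(np.log1p(reference_X), reference_y)
    query_labels = model.predict(np.log1p(query_X))
    query_confidence = model.predict_proba(np.log1p(query_X)).max(axis=1)
    print(query_labels[:5], query_confidence[:5].round(2))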

PMID:40232935 | DOI:10.1016/j.xpro.2025.103768

Categories: Literature Watch

Heterogeneous Mutual Knowledge Distillation for Wearable Human Activity Recognition

Tue, 2025-04-15 06:00

IEEE Trans Neural Netw Learn Syst. 2025 Apr 15;PP. doi: 10.1109/TNNLS.2025.3556317. Online ahead of print.

ABSTRACT

Recently, numerous deep learning algorithms have addressed wearable human activity recognition (HAR), but they often struggle with efficient knowledge transfer to lightweight models for mobile devices. Knowledge distillation (KD) is a popular technique for model compression, transferring knowledge from a complex teacher to a compact student. Most existing KD algorithms consider homogeneous architectures, which hinders performance in heterogeneous setups; this is an under-explored area in wearable HAR. To bridge this gap, we propose a heterogeneous mutual KD (HMKD) framework for wearable HAR. HMKD establishes mutual learning within the intermediate and output layers of both the teacher and student models. To accommodate substantial structural differences between teacher and student, we employ a weighted ensemble feature approach to merge the features from their intermediate layers, enhancing knowledge exchange within them. Experimental results on the HAPT, WISDM, and UCI_HAR datasets show that HMKD outperforms ten state-of-the-art KD algorithms in terms of classification accuracy. Notably, with ResNetLSTMaN as the teacher and an MLP as the student, HMKD increases the MLP's F1 score by 9.19% on the HAPT dataset.
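
A minimal sketch of the output-layer mutual distillation described above is given below in PyTorch; it is a generic two-way KL formulation under assumed temperature and weighting hyperparameters, not the exact HMKD loss (which also exchanges intermediate-layer features).

    import torch
    import torch.nn.functional as F

    def mutual_kd_loss(logits_teacher, logits_student, targets, T=4.0, alpha=0.5):
        """Output-layer mutual distillation: each network is trained on the ground
        truth plus a KL term pulling it toward the other network's softened
        predictions. T and alpha are assumed hyperparameters."""
        ce_t = F.cross_entropy(logits_teacher, targets)
        ce_s = F.cross_entropy(logits_student, targets)
        kl_t = F.kl_div(F.log_softmax(logits_teacher / T, dim=1),
                        F.softmax(logits_student.detach() / T, dim=1),
                        reduction="batchmean") * T * T
        kl_s = F.kl_div(F.log_softmax(logits_student / T, dim=1),
                        F.softmax(logits_teacher.detach() / T, dim=1),
                        reduction="batchmean") * T * T
        return (1 - alpha) * ce_t + alpha * kl_t, (1 - alpha) * ce_s + alpha * kl_s

    # Toy batch: 8 sensor windows, 6 activity classes.
    logits_t, logits_s = torch.randn(8, 6), torch.randn(8, 6)
    y = torch.randint(0, 6, (8,))
    print(mutual_kd_loss(logits_t, logits_s, y))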

PMID:40232930 | DOI:10.1109/TNNLS.2025.3556317

Categories: Literature Watch

FLINT: Learning-based Flow Estimation and Temporal Interpolation for Scientific Ensemble Visualization

Tue, 2025-04-15 06:00

IEEE Trans Vis Comput Graph. 2025 Apr 15;PP. doi: 10.1109/TVCG.2025.3561091. Online ahead of print.

ABSTRACT

We present FLINT (learning-based FLow estimation and temporal INTerpolation), a novel deep learning-based approach to estimate flow fields for 2D+time and 3D+time scientific ensemble data. FLINT can flexibly handle different types of scenarios with (1) a flow field being partially available for some members (e.g., omitted due to space constraints) or (2) no flow field being available at all (e.g., because it could not be acquired during an experiment). The design of our architecture allows us to flexibly cater to both cases simply by adapting our modular loss functions, effectively treating the different scenarios as flow-supervised and flow-unsupervised problems, respectively (with respect to the presence or absence of ground-truth flow). To the best of our knowledge, FLINT is the first approach to perform flow estimation from scientific ensembles, generating a corresponding flow field for each discrete timestep, even in the absence of original flow information. Additionally, FLINT produces high-quality temporal interpolants between scalar fields. FLINT employs several neural blocks, each featuring several convolutional and deconvolutional layers. We demonstrate performance and accuracy for different usage scenarios with scientific ensembles from both simulations and experiments.
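
To make the connection between an estimated flow field and a temporal interpolant concrete, the sketch below backward-warps a 2D scalar field along a displacement field with SciPy; this is a generic flow-based interpolation illustration, not FLINT's learned architecture.

    import numpy as np
    from scipy.ndimage import map_coordinates

    def warp_interpolate(field_t0, flow, alpha):
        """Backward-warp a 2D scalar field along a per-pixel displacement field
        to approximate the field at timestep t0 + alpha.

        field_t0: (H, W) scalar field at the earlier timestep.
        flow:     (2, H, W) displacement in (row, col) pixels per timestep.
        alpha:    fraction of a timestep in [0, 1].
        """
        H, W = field_t0.shape
        rows, cols = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
        coords = np.stack([rows - alpha * flow[0], cols - alpha * flow[1]])
        return map_coordinates(field_t0, coords, order=1, mode="nearest")

    # Toy example: a Gaussian blob advected to the right by 8 pixels per timestep.
    H = W = 64
    y, x = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    field = np.exp(-((y - 32) ** 2 + (x - 20) ** 2) / 50.0)
    flow = np.stack([np.zeros((H, W)), np.full((H, W), 8.0)])
    halfway = warp_interpolate(field, flow, alpha=0.5)
    print(halfway.shape, round(float(halfway.max()), 3))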

PMID:40232923 | DOI:10.1109/TVCG.2025.3561091

Categories: Literature Watch

VibTac: A High-Resolution High-Bandwidth Tactile Sensing Finger for Multi-Modal Perception in Robotic Manipulation

Tue, 2025-04-15 06:00

IEEE Trans Haptics. 2025 Apr 15;PP. doi: 10.1109/TOH.2025.3561049. Online ahead of print.

ABSTRACT

Tactile sensing is pivotal for enhancing robot manipulation abilities by providing crucial feedback for localized information. However, existing sensors often lack the necessary resolution and bandwidth required for intricate tasks. To address this gap, we introduce VibTac, a novel multi-modal tactile sensing finger designed to offer high-resolution and high-bandwidth tactile sensing simultaneously. VibTac seamlessly integrates vision-based and vibration-based tactile sensing modes to achieve high-resolution and high-bandwidth tactile sensing respectively, leveraging a streamlined human-inspired design for versatility in tasks. This paper outlines the key design elements of VibTac and its fabrication methods, highlighting the significance of the Elastomer Gel Pad (EGP) in its sensing mechanism. The sensor's multi-modal performance is validated through 3D reconstruction and spectral analysis to discern tactile stimuli effectively. In experimental trials, VibTac demonstrates its efficacy by achieving over 90% accuracy in insertion tasks involving objects emitting distinct sounds, such as ethernet connectors. Leveraging vision-based tactile sensing for object localization and employing a deep learning model for "click" sound classification, VibTac showcases its robustness in real-world scenarios. Video of the sensor working can be accessed at https://youtu.be/kmKIUlXGroo.

PMID:40232917 | DOI:10.1109/TOH.2025.3561049

Categories: Literature Watch

Learning to Learn Transferable Generative Attack for Person Re-Identification

Tue, 2025-04-15 06:00

IEEE Trans Image Process. 2025 Apr 15;PP. doi: 10.1109/TIP.2025.3558434. Online ahead of print.

ABSTRACT

Deep learning-based person re-identification (re-id) models are widely employed in surveillance systems and inevitably inherit the vulnerability of deep networks to adversarial attacks. Existing attacks merely consider cross-dataset and cross-model transferability, ignoring the cross-test capability to perturb models trained in different domains. To rigorously examine the robustness of real-world re-id models, the Meta Transferable Generative Attack (MTGA) method is proposed, which adopts meta-learning optimization to encourage the generative attacker to produce highly transferable adversarial examples by learning from comprehensively simulated transfer-based cross-model&dataset&test black-box meta attack tasks. Specifically, cross-model&dataset black-box attack tasks are first mimicked by selecting different re-id models and datasets for the meta-train and meta-test attack processes. As different models may focus on different feature regions, the Perturbation Random Erasing module is further devised to prevent the attacker from learning to corrupt only model-specific features. To help the attacker acquire cross-test transferability, the Normalization Mix strategy is introduced to imitate diverse feature embedding spaces by mixing multi-domain statistics of target models. Extensive experiments show the superiority of MTGA: in cross-model&dataset and cross-model&dataset&test attacks, it outperforms the SOTA methods by 20.0% and 11.3% in mean mAP drop rate, respectively. The source codes are available at https://github.com/yuanbianGit/MTGA.
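
The Perturbation Random Erasing idea, zeroing a random region of the generated perturbation so the attacker cannot rely on model-specific regions, can be sketched in a few lines of PyTorch; this is a simplified illustration of the concept, not the MTGA implementation.

    import torch

    def perturbation_random_erasing(delta, erase_ratio=0.3):
        """Zero out one random rectangle per sample of an adversarial
        perturbation so the generator cannot rely on corrupting only
        model-specific feature regions. erase_ratio is an assumed setting.

        delta: (B, C, H, W) perturbation tensor.
        """
        B, C, H, W = delta.shape
        eh, ew = int(H * erase_ratio), int(W * erase_ratio)
        out = delta.clone()
        for b in range(B):
            top = torch.randint(0, H - eh + 1, (1,)).item()
            left = torch.randint(0, W - ew + 1, (1,)).item()
            out[b, :, top:top + eh, left:left + ew] = 0.0
        return out

    # Toy usage on a batch of re-id-sized perturbations.
    delta = torch.randn(4, 3, 256, 128)
    print(perturbation_random_erasing(delta).shape)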

PMID:40232916 | DOI:10.1109/TIP.2025.3558434

Categories: Literature Watch

Automated pulmonary nodule classification from low-dose CT images using ERBNet: an ensemble learning approach

Tue, 2025-04-15 06:00

Med Biol Eng Comput. 2025 Apr 15. doi: 10.1007/s11517-025-03358-2. Online ahead of print.

ABSTRACT

The aim of this study was to develop a deep learning method for analyzing CT images with varying doses and qualities, aiming to categorize lung lesions into nodules and non-nodules. This study utilized the Lung Nodule Analysis 2016 challenge dataset. Low-dose CT (LDCT) images at the 10%, 20%, 40%, and 60% dose levels were generated from the full-dose CT (FDCT) images. Five different 3D convolutional networks were developed to classify lung nodules from LDCT and reference FDCT images. The models were evaluated using 400 nodule and 400 non-nodule samples. An ensemble model was also developed to achieve generalizability across dose levels. The model achieved an accuracy of 97.0% for nodule classification on FDCT images. However, it exhibited relatively poor performance (60% accuracy) on LDCT images, indicating that dedicated models should be developed for each low-dose level. Dedicated models for handling LDCT led to dramatic increases in the accuracy of nodule classification. The dedicated low-dose models achieved nodule classification accuracies of 90.0%, 91.1%, 92.7%, and 93.8% at the 10%, 20%, 40%, and 60% dose levels, respectively. The accuracy of the deep learning models decreased gradually, by almost 7%, as the dose level dropped from 100% to 10%. However, the ensemble model achieved an accuracy of 95.0% when tested on a combination of various dose levels. We presented an ensemble 3D CNN classifier for lesion classification, utilizing both LDCT and FDCT images. This model is able to analyze a combination of CT images with different dose levels and image qualities.
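
One straightforward way to combine dose-specific classifiers into a single generalizable predictor is soft voting over their class probabilities. The sketch below illustrates that idea on synthetic probabilities; the paper's ERBNet ensemble may combine its members differently.

    import numpy as np

    def ensemble_predict(prob_by_model):
        """Soft voting: average the class probabilities of several dose-specific
        classifiers into a single nodule / non-nodule decision.

        prob_by_model: (n_models, n_samples, 2) array of class probabilities.
        """
        mean_prob = prob_by_model.mean(axis=0)
        return mean_prob.argmax(axis=1), mean_prob

    # Toy example: five dose-specific models scoring ten candidate lesions.
    rng = np.random.default_rng(0)
    probs = rng.dirichlet([1.0, 1.0], size=(5, 10))  # rows sum to 1
    labels, mean_prob = ensemble_predict(probs)
    print(labels)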

PMID:40232605 | DOI:10.1007/s11517-025-03358-2

Categories: Literature Watch

Multi-objective deep learning for lung cancer detection in CT images: enhancements in tumor classification, localization, and diagnostic efficiency

Tue, 2025-04-15 06:00

Discov Oncol. 2025 Apr 15;16(1):529. doi: 10.1007/s12672-025-02314-8.

ABSTRACT

OBJECTIVE: This study aims to develop and evaluate an advanced deep learning framework for the detection, classification, and localization of lung tumors in computed tomography (CT) scan images.

MATERIALS AND METHODS: The research utilized a dataset of 1608 CT scan images, including 623 cancerous and 985 non-cancerous cases, all carefully labeled for accurate tumor detection, classification (benign or malignant), and localization. The preprocessing involved optimizing window settings, adjusting slice thickness, and applying advanced data augmentation techniques to enhance the model's robustness and generalizability. The proposed model incorporated innovative components such as transformer-based attention layers, adaptive anchor-free mechanisms, and an improved feature pyramid network. These features enabled the model to efficiently handle detection, classification, and localization tasks. The dataset was split into 70% for training, 15% for validation, and 15% for testing. A multi-task loss function was used to balance the three objectives and optimize the model's performance. Evaluation metrics included mean average precision (mAP), intersection over union (IoU), accuracy, precision, and recall.
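
Among the evaluation metrics listed, IoU is the simplest to state exactly; a small sketch for axis-aligned boxes follows (illustrative only, with made-up box coordinates).

    def box_iou(box_a, box_b):
        """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2),
        one of the localization metrics listed above."""
        ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
        ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        union = area_a + area_b - inter
        return inter / union if union > 0 else 0.0

    # Predicted vs. ground-truth tumor boxes (pixel coordinates, made up).
    print(round(box_iou((30, 40, 90, 100), (35, 45, 95, 105)), 3))  # ~0.725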

RESULTS: The proposed model demonstrated outstanding performance, achieving a mAP of 96.26%, IoU of 95.76%, precision of 98.11%, and recall of 98.83% on the test dataset. It outperformed existing models, including You Only Look Once (YOLO)v9 and YOLOv10, with YOLOv10 achieving a mAP of 95.23% and YOLOv9 achieving 95.70%. The proposed model showed faster convergence, better stability, and superior detection capabilities, particularly in localizing smaller tumors. Its multi-task learning framework significantly improved diagnostic accuracy and operational efficiency.

CONCLUSION: The proposed model offers a robust and scalable solution for lung cancer detection, providing real-time inference, multi-task learning, and high accuracy. It holds significant potential for clinical integration to improve diagnostic outcomes and patient care.

PMID:40232589 | DOI:10.1007/s12672-025-02314-8

Categories: Literature Watch

Development and application of deep learning-based diagnostics for pathologic diagnosis of gastric endoscopic submucosal dissection specimens

Tue, 2025-04-15 06:00

Gastric Cancer. 2025 Apr 15. doi: 10.1007/s10120-025-01612-y. Online ahead of print.

ABSTRACT

BACKGROUND: Accurate diagnosis of ESD specimens is crucial for managing early gastric cancer. Identifying tumor areas in serially sectioned ESD specimens requires experience and is time-consuming. This study aimed to develop and evaluate a deep learning model for diagnosing ESD specimens.

METHODS: Whole-slide images of 366 ESD specimens of adenocarcinoma were analyzed, with 2257 annotated regions of interest (tumor and muscularis mucosa) and 83,839 patch images. The development set was divided into training and internal validation sets. Tissue segmentation performance was evaluated using the internal validation set. A detection algorithm for tumor and submucosal invasion at the whole-slide image level was developed, and its performance was evaluated using a test set.

RESULTS: The model achieved Dice coefficients of 0.85 and 0.79 for segmentation of tumor and muscularis mucosa, respectively. In the test set, the diagnostic performance of tumor detection, measured by the AUROC, was 0.995, with a specificity of 1.000 and a sensitivity of 0.947. For detecting submucosal invasion, the model achieved an AUROC of 0.981, with a specificity of 0.956 and a sensitivity of 0.907. Pathologists' performance in diagnosing ESD specimens was evaluated with and without assistance from the deep learning model, and the model significantly reduced the mean diagnosis time (747 s without assistance vs. 478 s with assistance, P < 0.001).
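
The Dice coefficient reported for the segmentation task compares predicted and ground-truth masks as follows; the sketch uses random synthetic masks, not the study's data.

    import numpy as np

    def dice_coefficient(pred_mask, gt_mask, eps=1e-7):
        """Dice coefficient between predicted and ground-truth binary masks,
        the overlap metric reported for tumor and muscularis mucosa."""
        pred, gt = pred_mask.astype(bool), gt_mask.astype(bool)
        intersection = np.logical_and(pred, gt).sum()
        return (2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)

    # Toy example on random masks (a slightly over-segmented prediction).
    rng = np.random.default_rng(0)
    gt = rng.random((128, 128)) > 0.7
    pred = np.logical_or(gt, rng.random((128, 128)) > 0.95)
    print(round(float(dice_coefficient(pred, gt)), 3))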

CONCLUSION: The deep learning model demonstrated satisfactory performance in tissue segmentation and high accuracy in detecting tumors and submucosal invasion. This model can potentially serve as a screening tool in the histopathological diagnosis of ESD specimens.

PMID:40232558 | DOI:10.1007/s10120-025-01612-y

Categories: Literature Watch

Transformer-based skeletal muscle deep-learning model for survival prediction in gastric cancer patients after curative resection

Tue, 2025-04-15 06:00

Gastric Cancer. 2025 Apr 15. doi: 10.1007/s10120-025-01614-w. Online ahead of print.

ABSTRACT

BACKGROUND: We developed and evaluated a skeletal muscle deep-learning (SMDL) model using skeletal muscle computed tomography (CT) imaging to predict the survival of patients with gastric cancer (GC).

METHODS: This multicenter retrospective study included patients who underwent curative resection of GC between April 2008 and December 2020. Preoperative CT images at the third lumbar vertebra were used to develop a Transformer-based SMDL model for predicting recurrence-free survival (RFS) and disease-specific survival (DSS). The predictive performance of the SMDL model was assessed using the area under the curve (AUC) and benchmarked against both alternative artificial intelligence models and conventional body composition parameters. The association between the model score and survival was assessed using Cox regression analysis. An integrated model combining SMDL signature with clinical variables was constructed, and its discrimination and fairness were evaluated.

RESULTS: A total of 1242, 311, and 94 patients were assigned to the training, internal, and external validation cohorts, respectively. The Transformer-based SMDL model yielded AUCs of 0.791-0.943 for predicting RFS and DSS across all three cohorts and significantly outperformed other models and body composition parameters. The model score was a strong independent prognostic factor for survival. Incorporating the SMDL signature into the clinical model resulted in better prognostic prediction performance. The false-negative and false-positive rates of the integrated model were similar across sex and age subgroups, indicating robust fairness.

CONCLUSIONS: The Transformer-based SMDL model could accurately predict survival of GC and identify patients at high risk of recurrence or death, thereby assisting clinical decision-making.

PMID:40232557 | DOI:10.1007/s10120-025-01614-w

Categories: Literature Watch

Multi-viewpoint tampering detection for integral imaging

Tue, 2025-04-15 06:00

Opt Lett. 2025 Apr 15;50(8):2642-2645. doi: 10.1364/OL.557452.

ABSTRACT

Current camera-array-based integral imaging lacks tampering protection, making images vulnerable to falsification, and it requires high computational costs. This Letter proposes an alternative 3D integral imaging scheme that ensures a clear light field display while enabling tampering detection and self-recovery. Pixel mapping and deep learning co-extract depth and angular data pixel-wise, regulating the region of interest of the 3D light field for initial verification. Multi-viewpoint recovery information is embedded to reconstruct a complete elemental image array. When tampering occurs, the altered region can be identified and double-recovered. Experiments demonstrate remarkable parallax effects and effective tampering detection with recovery from multiple perspectives.

PMID:40232459 | DOI:10.1364/OL.557452

Categories: Literature Watch

Focusing properties and deep learning-based efficient tuning of symmetric butterfly beams

Tue, 2025-04-15 06:00

Opt Lett. 2025 Apr 15;50(8):2558-2561. doi: 10.1364/OL.557170.

ABSTRACT

In this Letter, we report what we believe to be a new type of abruptly autofocusing beams, termed symmetric butterfly Gaussian beams (SBGBs). The proposed beams appear to have a high degree of tunability for their focal position, focal length, focal intensity, and propagation trajectory. In addition, we propose a deep learning-based model for quick and accurate predictions of the propagation properties of SBGBs, achieving an average relative error of no more than 2.1% while being 8000 times faster than split-Fourier transform algorithms. This work may open a new platform for optical manipulation, optical communication, and biomedical applications.

PMID:40232438 | DOI:10.1364/OL.557170

Categories: Literature Watch

Advancing endometriosis detection in daily practice: a deep learning-enhanced multi-sequence MRI analytical model

Tue, 2025-04-15 06:00

Abdom Radiol (NY). 2025 Apr 15. doi: 10.1007/s00261-025-04942-8. Online ahead of print.

ABSTRACT

BACKGROUND AND PURPOSE: Endometriosis affects 5-10% of women of reproductive age. Despite its prevalence, diagnosing endometriosis through imaging remains challenging. Advances in deep learning (DL) are revolutionizing the diagnosis and management of complex medical conditions. This study aims to evaluate DL tools in enhancing the accuracy of multi-sequence MRI-based detection of endometriosis.

METHOD: We gathered a patient cohort from our institutional database, composed of patients with pathologically confirmed endometriosis from 2015 to 2024. We created an age-matched control group that underwent a similar MR protocol without an endometriosis diagnosis. We used sagittal fat-saturated T1-weighted (T1W FS) pre- and post-contrast and T2-weighted (T2W) MRIs. Our dataset was split at the patient level, allocating 12.5% for testing and conducting seven-fold cross-validation on the remainder. Seven abdominal radiologists with experience in endometriosis MRI and complex surgical planning and one women's imaging fellow with specific training in endometriosis MRI reviewed a random selection of images and documented their endometriosis detection.

RESULTS: 395 and 356 patients were included in the case and control groups respectively. The final 3D-DenseNet-121 classifier model demonstrated robust performance. Our findings indicated the most accurate predictions were obtained using T2W, T1W FS pre-, and post-contrast images. Using an ensemble technique on the test set resulted in an F1 Score of 0.881, AUROCC of 0.911, sensitivity of 0.976, and specificity of 0.720. Radiologists achieved 84.48% and 87.93% sensitivity without and with AI assistance in detecting endometriosis. The agreement among radiologists in predicting labels for endometriosis was measured as a Fleiss' kappa of 0.5718 without AI assistance and 0.6839 with AI assistance.
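
Fleiss' kappa, used above to quantify inter-reader agreement, can be computed with statsmodels; the sketch below uses synthetic reader labels in place of the study's annotations.

    import numpy as np
    from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

    # Synthetic reader labels: one row per case, one column per radiologist,
    # 1 = endometriosis present, 0 = absent (illustration only).
    rng = np.random.default_rng(0)
    truth = rng.integers(0, 2, size=50)
    readers = np.stack([(truth ^ (rng.random(50) < 0.15)).astype(int)
                        for _ in range(8)], axis=1)  # 8 readers, ~15% disagreement

    table, _ = aggregate_raters(readers)  # counts of raters per category per case
    print(round(fleiss_kappa(table, method="fleiss"), 3))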

CONCLUSION: This study introduced the first DL model to use multi-sequence MRI on a large cohort, showing results equivalent to human detection by trained readers in identifying endometriosis.

PMID:40232413 | DOI:10.1007/s00261-025-04942-8

Categories: Literature Watch

Enhanced detection of autism spectrum disorder through neuroimaging data using stack classifier ensembled with modified VGG-19

Tue, 2025-04-15 06:00

Acta Radiol. 2025 Apr 15:2841851251333974. doi: 10.1177/02841851251333974. Online ahead of print.

ABSTRACT

BACKGROUND: Autism spectrum disorder (ASD) is a neurodevelopmental disease marked by a variety of repetitive behaviors and social communication difficulties.

PURPOSE: To develop a generalizable machine learning (ML) classifier that can accurately and effectively predict ASD in children.

MATERIAL AND METHODS: This paper makes use of neuroimaging data from the Autism Brain Imaging Data Exchange (ABIDE I and II) datasets through a combination of structural and functional magnetic resonance imaging data. Several ML models, such as Support Vector Machines (SVM), CatBoost, random forest (RF), and stack classifiers, were tested to demonstrate which model performs best in ASD classification when used alongside a deep convolutional neural network.

RESULTS: The stack classifier performed the best among the models, with the highest accuracy of 81.68%, sensitivity of 85.08%, and specificity of 79.13% for ABIDE I, and 81.34%, 83.61%, and 82.21% for ABIDE II, showing its superior ability to identify complex patterns in neuroimaging data. SVM performed poorly across all metrics, showing its limitations in dealing with high-dimensional neuroimaging data.

CONCLUSION: The results show that the application of ML models, especially ensemble approaches like the stack classifier, holds significant promise in improving the accuracy with which ASD is detected using neuroimaging, and thus shows their potential for use in clinical applications and early intervention strategies.
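
A stacking ensemble of the kind that performed best here can be assembled with scikit-learn; the sketch below uses synthetic features and a gradient-boosting model as a stand-in for CatBoost, so it illustrates the approach rather than reproducing the study's pipeline.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import (GradientBoostingClassifier,
                                  RandomForestClassifier, StackingClassifier)
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    # Synthetic stand-in for CNN-derived neuroimaging features (not ABIDE data).
    X, y = make_classification(n_samples=600, n_features=128, n_informative=30,
                               random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    # Stack an SVM, a random forest, and gradient boosting (standing in for
    # CatBoost) under a logistic-regression meta-learner.
    stack = StackingClassifier(
        estimators=[("svm", SVC(probability=True)),
                    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
                    ("gb", GradientBoostingClassifier(random_state=0))],
        final_estimator=LogisticRegression(max_iter=1000),
    )
    stack.fit(X_tr, y_tr)
    print(f"held-out accuracy: {stack.score(X_te, y_te):.3f}")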

PMID:40232228 | DOI:10.1177/02841851251333974

Categories: Literature Watch

Smart Grain Storage Solution: Integrated Deep Learning Framework for Grain Storage Monitoring and Risk Alert

Tue, 2025-04-15 06:00

Foods. 2025 Mar 18;14(6):1024. doi: 10.3390/foods14061024.

ABSTRACT

In order to overcome the notable limitations of current methods for monitoring grain storage states, particularly in the early warning of potential risks and the analysis of the spatial distribution of grain temperatures within the granary, this study proposes a multi-model fusion approach based on a deep learning framework for grain storage state monitoring and risk alert. This approach combines two advanced three-dimensional deep learning models, a grain storage state classification model based on 3D DenseNet and a temperature field prediction model based on 3DCNN-LSTM. First, the grain storage state classification model based on 3D DenseNet efficiently extracts features from three-dimensional grain temperature data to achieve the accurate classification of storage states. Second, the temperature prediction model based on 3DCNN-LSTM incorporates historical grain temperature and absolute water potential data to precisely predict the dynamic changes in the granary's temperature field. Finally, the grain temperature prediction results are input into the 3D DenseNet to provide early warnings for potential condensation and mildew risks within the grain pile. Comparative experiments with multiple baseline models show that the 3D DenseNet model achieves an accuracy of 97.38% in the grain storage state classification task, significantly outperforming other models. The 3DCNN-LSTM model shows high prediction accuracy in temperature forecasting, with MAE of 0.24 °C and RMSE of 0.28 °C. Furthermore, in potential risk alert experiments, the model effectively captures the temperature trend in the grain storage environment and provides early warnings, particularly for mildew and condensation risks, demonstrating the potential of this method for grain storage safety monitoring and risk alerting. This study provides a smart grain storage solution which contributes to ensuring food safety and enhancing the efficiency of grain storage management.

PMID:40232114 | DOI:10.3390/foods14061024

Categories: Literature Watch

The Fermentation Degree Prediction Model for Tieguanyin Oolong Tea Based on Visual and Sensing Technologies

Tue, 2025-04-15 06:00

Foods. 2025 Mar 13;14(6):983. doi: 10.3390/foods14060983.

ABSTRACT

The fermentation of oolong tea is a critical process that determines its quality and flavor. Current fermentation control relies on tea makers' sensory experience, which is labor-intensive and time-consuming. In this study, using Tieguanyin oolong tea as the research object, features including the tea water loss rate, aroma, image color, and texture were obtained using weight sensors, a tin oxide-type gas sensor, and a visual acquisition system. Support vector regression (SVR), random forest (RF) machine learning, and long short-term memory (LSTM) deep learning algorithms were employed to establish models for assessing the fermentation degree based on single features and on fused multi-source features. The results showed that in the test set of the fermentation degree models based on single features, the mean absolute error (MAE) ranged from 4.537 to 6.732, the root mean square error (RMSE) ranged from 5.980 to 9.416, and the coefficient of determination (R2) values varied between 0.898 and 0.959. In contrast, the data fusion models demonstrated superior performance, with the MAE reduced to 2.232-2.783, the RMSE reduced to 2.693-3.969, and R2 increased to 0.982-0.991, confirming that feature fusion enhanced characterization accuracy. Finally, the Sparrow Search Algorithm (SSA) was applied to optimize the data fusion models. After optimization, the models exhibited an MAE ranging from 1.703 to 2.078, an RMSE from 2.258 to 3.230, and R2 values between 0.988 and 0.994 on the test set. The application of the SSA further enhanced model accuracy, with the Fusion-SSA-LSTM model demonstrating the best performance. The research results enable online real-time monitoring of the fermentation degree of Tieguanyin oolong tea, which contributes to the automated production of Tieguanyin oolong tea.
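
The multi-source fusion used here amounts to concatenating features from the different sensors before regression. The sketch below illustrates that step with a random forest regressor on synthetic sensor features and reports the same metrics (MAE, RMSE, R2); it is not the study's model and omits the SSA optimization.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
    from sklearn.model_selection import train_test_split

    # Synthetic stand-ins for the three sensor modalities described above.
    rng = np.random.default_rng(0)
    n = 300
    water_loss = rng.random((n, 1))      # weight-sensor feature
    aroma = rng.random((n, 4))           # gas-sensor features
    image_feats = rng.random((n, 8))     # image color / texture features
    degree = (60 * water_loss[:, 0] + 20 * aroma[:, 0]
              + 20 * image_feats[:, 0] + rng.normal(0, 2, n))

    # Multi-source fusion as simple feature concatenation before regression.
    X = np.hstack([water_loss, aroma, image_feats])
    X_tr, X_te, y_tr, y_te = train_test_split(X, degree, test_size=0.2, random_state=0)
    model = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
    pred = model.predict(X_te)
    print(f"MAE={mean_absolute_error(y_te, pred):.2f}  "
          f"RMSE={np.sqrt(mean_squared_error(y_te, pred)):.2f}  "
          f"R2={r2_score(y_te, pred):.3f}")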

PMID:40231982 | DOI:10.3390/foods14060983

Categories: Literature Watch

Unlocking chickpea flour potential: AI-powered prediction for quality assessment and compositional characterisation

Tue, 2025-04-15 06:00

Curr Res Food Sci. 2025 Mar 21;10:101030. doi: 10.1016/j.crfs.2025.101030. eCollection 2025.

ABSTRACT

The growing demand for sustainable, nutritious, and environmentally friendly food sources has positioned chickpea flour as a vital component in the global shift to plant-based diets. However, the inherent variability in the composition of chickpea flour, influenced by genetic diversity, environmental conditions, and processing techniques, poses significant challenges to standardisation and quality control. This study explores the integration of deep learning models with near-infrared (NIR) spectroscopy to improve the accuracy and efficiency of chickpea flour quality assessment. Using a dataset comprising 136 chickpea varieties, the research evaluates several state-of-the-art deep learning models, including Convolutional Neural Networks (CNNs), Vision Transformers (ViTs), and Graph Convolutional Networks (GCNs), and compares the most effective model, a CNN, against the traditional Partial Least Squares Regression (PLSR) method. The results demonstrate that CNN-based models outperform PLSR, providing more accurate predictions for key quality attributes such as protein content, starch, soluble sugars, insoluble fibres, total lipids, and moisture levels. The study highlights the potential of AI-enhanced NIR spectroscopy to revolutionise quality assessment in the food industry by offering a non-destructive, rapid, and reliable method for analysing chickpea flour. Despite the challenges posed by the limited dataset, the deep learning models exhibit capabilities suggesting that further advances would allow industrial applicability. This research paves the way for broader applications of AI-driven quality control in food production, contributing to the development of more consistent and high-quality plant-based food products.
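
The PLSR baseline that the CNN is compared against is available in scikit-learn; the sketch below fits it on synthetic NIR-like spectra as an illustration of the conventional approach, not on the study's chickpea data.

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.metrics import r2_score
    from sklearn.model_selection import train_test_split

    # Synthetic stand-ins for NIR spectra (absorbance per wavelength) and protein content.
    rng = np.random.default_rng(0)
    spectra = rng.random((136, 700))     # 136 samples x 700 wavelengths
    protein = 20 + 15 * spectra[:, 100:110].mean(axis=1) + rng.normal(0, 0.3, 136)

    X_tr, X_te, y_tr, y_te = train_test_split(spectra, protein, test_size=0.2,
                                              random_state=0)
    pls = PLSRegression(n_components=10).fit(X_tr, y_tr)
    print(f"PLSR R2 on held-out spectra: {r2_score(y_te, pls.predict(X_te).ravel()):.3f}")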

PMID:40231315 | PMC:PMC11995126 | DOI:10.1016/j.crfs.2025.101030

Categories: Literature Watch
