Deep learning

Deep Learning on Misaligned Dual-Energy Chest X-ray Images Using Paired Cycle-Consistent Generative Adversarial Networks

Mon, 2025-05-05 06:00

J Imaging Inform Med. 2025 May 5. doi: 10.1007/s10278-025-01508-4. Online ahead of print.

ABSTRACT

Dual-energy subtraction (DES) chest X-ray images (CXRs) are often affected by motion artifacts resulting from patients' voluntary or involuntary movements, even in clinical settings. Additionally, the mediastinum and upper abdominal regions in low-energy (LE) CXRs are susceptible to signal insufficiency due to inadequate input photon numbers. Current image processing techniques for removing motion artifacts and statistical noise from DES-CXRs are insufficient, and potential algorithms for these tasks remain largely unexplored. We propose a framework based on paired cycle-consistent generative adversarial networks to effectively remove motion artifacts and statistical noise from DES-CXRs. The proposed method incorporates ensemble discriminators, differentiable augmentation, anti-aliased convolution layers, and a basic 8-layer U-Net generator. It was trained and tested on a clinical image dataset comprising 600 examinations of individuals who underwent dual-energy chest X-ray imaging for diagnostic purposes, using a sixfold cross-validation approach. It demonstrated a marked improvement in motion artifact suppression: the full width at 10% maximum improved from 0.216 ± 0.0720 to 0.200 ± 0.0783 for left-lung regions of interest including the cardiac region. Furthermore, it outperformed a previously reported method in terms of a peak signal-to-noise ratio of 50.7 ± 3.68 and a structural similarity index of 0.997 ± 0.0152 for LE images, and a Fréchet inception distance of 85.0 ± 3.52 for bone-suppressed DES images. The proposed method significantly outperforms existing techniques for removing motion artifacts and statistical noise and shows strong potential for clinical applications in chest X-ray imaging.
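The full-width metric above quantifies edge spread (and hence motion-artifact blur) across an anatomical boundary. The paper's exact ROI and profile definitions are not given here, so the following is only an illustrative sketch of how full width at a given fraction of maximum might be computed for a 1-D profile; the function name and linear-interpolation scheme are assumptions.

```python
import numpy as np

def fwxm(profile, fraction=0.10):
    """Full width of a 1-D profile at `fraction` of its maximum,
    in samples, with linear interpolation at the threshold crossings."""
    profile = np.asarray(profile, dtype=float)
    thr = fraction * profile.max()
    idx = np.flatnonzero(profile >= thr)
    left, right = idx[0], idx[-1]

    def crossing(i_in, i_out):
        # interpolate between an above-threshold and a below-threshold sample
        y_in, y_out = profile[i_in], profile[i_out]
        return i_in + (i_out - i_in) * (y_in - thr) / (y_in - y_out)

    lx = crossing(left, left - 1) if left > 0 else float(left)
    rx = crossing(right, right + 1) if right < profile.size - 1 else float(right)
    return rx - lx

# A sharper edge profile (less motion blur) gives a smaller width.
x = np.linspace(-3, 3, 61)
sharp = np.exp(-x**2 / (2 * 0.3**2))
blurred = np.exp(-x**2 / (2 * 0.8**2))
```

Under this reading, a drop in the metric (as reported, 0.216 to 0.200) corresponds to a narrower, sharper boundary profile after artifact removal.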

PMID:40325327 | DOI:10.1007/s10278-025-01508-4

Categories: Literature Watch

A Deep Learning-Based Framework for Automatic Determination of Developmental Dysplasia of the Hip from Graf Angles

Mon, 2025-05-05 06:00

J Imaging Inform Med. 2025 May 5. doi: 10.1007/s10278-025-01518-2. Online ahead of print.

ABSTRACT

Developmental dysplasia of the hip (DDH) is a common neonatal condition that necessitates early diagnosis to ensure effective treatment. The traditional Graf method, while widely used for evaluating infant hips via ultrasound, is limited by operator dependency and measurement variability. This research proposes a framework using a deep learning network, morphological operations, and a local-maxima method to diagnose DDH in newborns from ultrasound images. The method utilizes DeepLabv3+ for image segmentation, evaluating multiple backbone architectures (ResNet50, InceptionResNetV2, MobileNetV2, and Xception) to accurately identify the region of interest. The local-maxima method was used to determine the extremum points of the lines defining the Graf angles. Denoising filters, including mean, median, and Wiener, are applied to determine local-maxima points accurately. The evaluation comprises two stages: first, assessing the performance of the DeepLabv3+ backbones in producing masks for Graf angle determination, and second, comparing the angles obtained through the proposed framework with those determined by expert radiologists. Comparative analysis demonstrates that MobileNetV2 (94.64 accuracy, 86.99 Cohen's kappa, 94.31 F-score) surpasses the other models in segmentation accuracy and measurement reliability. This conclusion is backed by key performance metrics such as accuracy, IoU, PSNR, F-score, SSIM, and Cohen's kappa, as well as by intraclass correlation coefficient and Bland-Altman analyses. The proposed framework shows considerable promise in automating hip ultrasound analysis for DDH diagnosis, minimizing operator dependency while enhancing measurement consistency.
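The denoise-then-peak-pick step described above can be sketched minimally as follows, here with a mean filter (the paper also evaluates median and Wiener filters); the data, window size, and function names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def mean_filter_1d(signal, k=3):
    """Odd-window moving-average filter with edge padding."""
    pad = k // 2
    padded = np.pad(np.asarray(signal, float), pad, mode="edge")
    return np.lib.stride_tricks.sliding_window_view(padded, k).mean(axis=1)

def local_maxima(signal):
    """Indices strictly greater than both immediate neighbours."""
    s = np.asarray(signal, float)
    return np.flatnonzero((s[1:-1] > s[:-2]) & (s[1:-1] > s[2:])) + 1

# A noisy boundary profile: spurious spikes plus one true apex at index 6.
noisy = np.array([0., 2., 0., 1., 2., 8., 9., 8., 2., 0., 1., 0.])
peaks = local_maxima(mean_filter_1d(noisy, k=3))
```

Without denoising, the raw profile yields three candidate extremum points; after filtering, only the true apex survives, which is why accurate denoising matters for reliable Graf-angle endpoints.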

PMID:40325325 | DOI:10.1007/s10278-025-01518-2

Categories: Literature Watch

From Image to Diagnosis: Convolutional Neural Networks in Tongue Lesions

Mon, 2025-05-05 06:00

J Imaging Inform Med. 2025 May 5. doi: 10.1007/s10278-025-01507-5. Online ahead of print.

ABSTRACT

Clinical examination of the tongue is essential for diagnosing systemic and local diseases. However, traditional diagnostic methods rely on subjective evaluation. Artificial intelligence, particularly convolutional neural networks, has shown promise in enhancing diagnostic accuracy in medical imaging. This study aimed to classify common tongue lesions using convolutional neural networks, improving diagnostic precision in routine dental examinations. A dataset of 1038 tongue images was analyzed, categorized into six classes: healthy, coated, fissured, hairy, geographic, and median rhomboid glossitis. The ResNet18 model was employed for binary classification, and ResNet50 for three-class classification. Preprocessing techniques, including image resizing and augmentation, were applied to optimize model performance. Performance was assessed using accuracy, precision, recall, and F1-score. The ResNet18 model achieved 100% accuracy in distinguishing healthy from hairy tongue lesions and demonstrated high performance in binary classification tasks. The ResNet50 model reached 96% accuracy for healthy-coated-hairy classification but faced challenges with other lesion groups. CNN-based models provide an effective, non-invasive tool for classifying tongue lesions, with ResNet18 excelling in binary classification. The findings suggest that artificial intelligence integration in maxillofacial radiology can enhance diagnostic reliability in routine dental practice. Further studies with larger datasets and real-time clinical applications are recommended to refine artificial intelligence-driven diagnostic tools.

PMID:40325324 | DOI:10.1007/s10278-025-01507-5

Categories: Literature Watch

An online 11 kV distribution system insulator defect detection approach with modified YOLOv11 and MobileNetV3

Mon, 2025-05-05 06:00

Sci Rep. 2025 May 5;15(1):15691. doi: 10.1038/s41598-025-99756-5.

ABSTRACT

With the advent of smart distribution grids, detection of defects in insulators by unmanned aerial vehicles as part of a distribution automation system (DAS) has attracted widespread attention. Detecting these defects is essential to avoid shortened service life of distribution lines, serious power losses, and cascading power outages under extreme conditions. The intricate backgrounds, limited image datasets, and small-scale objects make the detection problem more complex. Owing to the exponential advancement of deep learning, deep learning-based insulator defect detection is gradually gaining a foothold in the research domain. This paper presents a novel approach for detecting insulator defects in an 11 kV distribution system using a modified version of You Only Look Once (YOLOv11) and the MobileNetV3 model. Data augmentation was applied as part of the preprocessing phase to train the proposed model. The model's performance was compared with earlier versions of YOLO and other existing methods to demonstrate its effectiveness. Additionally, multiple case studies were conducted to validate the method's robustness and reliability for insulator defect detection. This paper incorporates a modified YOLOv11 architecture using the constituent C3k2, SPPF, and C2PSA algorithmic blocks, mounted with a MobileNetV3 classifier to enable a lightweight framework for DAS-based devices. Studies involving various real-life scenarios show the efficacy and applicability of the proposed algorithmic pipeline.

PMID:40325205 | DOI:10.1038/s41598-025-99756-5

Categories: Literature Watch

Innovative framework for fault detection and system resilience in hydropower operations using digital twins and deep learning

Mon, 2025-05-05 06:00

Sci Rep. 2025 May 5;15(1):15669. doi: 10.1038/s41598-025-98235-1.

ABSTRACT

Hydropower systems face significant challenges in load control and fault detection due to their complex operational dynamics. This study presents an innovative framework combining Digital Twin technology with Deep Learning to enhance fault detection, optimize operations, and improve system resilience. We developed a hybrid approach integrating a Digital Twin model of the hydropower system with advanced Deep Learning algorithms for real-time monitoring and predictive analysis. The proposed framework was evaluated through extensive simulations in a MATLAB environment, where it demonstrated remarkable improvements in system performance. The integration of Digital Twins allowed for precise real-time modeling of system behavior, while Deep Learning algorithms effectively identified and predicted faults. Our results show that the proposed method achieved a 12.14% reduction in fault detection time compared to traditional methods. Furthermore, the optimization of operational parameters led to an 8.97% increase in overall system efficiency and a 5.49% decrease in maintenance costs. In terms of fault detection accuracy, the Deep Learning-enhanced Digital Twin system achieved a 72% accuracy rate, significantly higher than the 65% accuracy observed with conventional techniques. The improved model not only enhanced fault detection but also contributed to an 8.03% reduction in energy loss and a 14.07% increase in power generation reliability. Overall, this research demonstrates that the integration of Digital Twins and Deep Learning provides a powerful, data-driven approach to optimizing hydropower systems. The proposed method offers substantial benefits in terms of operational efficiency, fault detection accuracy, and cost savings, positioning it as a significant advancement in the field.

PMID:40325162 | DOI:10.1038/s41598-025-98235-1

Categories: Literature Watch

Optimizing non-small cell lung cancer detection with convolutional neural networks and differential augmentation

Mon, 2025-05-05 06:00

Sci Rep. 2025 May 5;15(1):15640. doi: 10.1038/s41598-025-98731-4.

ABSTRACT

Lung cancer remains one of the leading causes of cancer-related deaths worldwide, with early detection being critical to improving patient outcomes. Recent advancements in deep learning have shown promise in enhancing diagnostic accuracy, particularly through the use of Convolutional Neural Networks (CNNs). This study proposes the integration of Differential Augmentation (DA) with CNNs to address the critical challenge of memory overfitting, a limitation that hampers the generalization of models to unseen data. By introducing targeted augmentation strategies, such as adjustments in hue, brightness, saturation, and contrast, the CNN + DA model diversifies training data and enhances its robustness. The research utilized multiple datasets, including the IQ-OTH/NCCD dataset, to evaluate the proposed model against existing state-of-the-art methods. Hyperparameter tuning was performed using Random Search to optimize parameters, further improving performance. The results revealed that the CNN + DA model achieved an accuracy of 98.78%, outperforming advanced models like DenseNet, ResNet, and EfficientNetB0, as well as hybrid approaches including ensemble models. Additionally, statistical analyses, including Tukey's HSD post-hoc tests, confirmed the significance of the model's superior performance. These findings suggest that the CNN + DA model effectively addresses the limitations of prior works by reducing overfitting and ensuring reliable generalization across diverse datasets. The study concludes that the novel CNN + DA architecture provides a robust, accurate, and computationally efficient framework for lung cancer detection, positioning it as a valuable tool for clinical applications and paving the way for future research in medical image diagnostics.
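The augmentation strategy described above perturbs photometric properties to diversify training data. A minimal NumPy sketch of randomized brightness, contrast, and saturation adjustments follows; the ranges and function names are assumptions, hue is omitted for brevity, and this plain version also omits the differentiability that lets DA-style augmentation pass gradients through to the model.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img, rng):
    """Randomized photometric augmentation of a float RGB image in
    [0, 1] with shape (H, W, 3): brightness, contrast, saturation."""
    out = img + rng.uniform(-0.2, 0.2)                 # brightness shift
    mean = out.mean()
    out = (out - mean) * rng.uniform(0.8, 1.2) + mean  # contrast scale
    gray = out.mean(axis=-1, keepdims=True)
    out = gray + (out - gray) * rng.uniform(0.8, 1.2)  # saturation scale
    return np.clip(out, 0.0, 1.0)

img = rng.random((8, 8, 3))   # toy image standing in for a CT slice patch
aug = augment(img, rng)
```

Each call draws fresh random factors, so the network never sees exactly the same image twice, which is the mechanism the abstract credits with reducing memory overfitting.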

PMID:40325128 | DOI:10.1038/s41598-025-98731-4

Categories: Literature Watch

Behavior recognition technology based on deep learning used in pediatric behavioral audiometry

Mon, 2025-05-05 06:00

Sci Rep. 2025 May 5;15(1):15648. doi: 10.1038/s41598-025-97519-w.

ABSTRACT

This study aims to explore the feasibility and accuracy of deep learning-based pediatric behavioral audiometry. The research provides a dedicated pediatric posture detection dataset, which contains a large number of video clips from children's behavioral hearing tests, encompassing various typical hearing test actions. A detection platform based on this dataset is also constructed, named the intelligent diagnostic model of pediatric hearing based on an optimized transformer (DoT); further, an estimation model of patient skeletal keypoints based on an optimized transformer (POTR) was proposed to estimate human skeleton points. On this basis, the DoT approach was used to perform posture recognition on videos of children undergoing behavioral hearing tests, thus enabling an automated hearing testing process. Through this platform, children's movements can be monitored and analyzed in real time, allowing for the assessment of their hearing levels. Moreover, the study establishes decision rules based on specific actions, combining professional knowledge and experience in audiology to evaluate children's hearing levels based on their movement status. First, we gathered image and video data related to posture during conditioned play audiometry to test the hearing of 120 children aged 2.5 to 6 years. Next, we built and optimized a deep learning model suitable for pediatric posture recognition. Finally, in the deployment and application phase, we deployed the trained pediatric posture recognition model into real-world application environments. We found that for children aged 2.5-4 years, the sensitivity of manual behavioral audiometry (0.900) was not as high as that of AI behavioral audiometry (0.929), but the specificity of manual behavioral audiometry (0.824) and its Area Under the Curve (AUC) (0.901) were higher than those of AI behavioral audiometry. For children aged 4-6 years, the sensitivity (0.943), specificity (0.947), and AUC (0.924) of manual behavioral audiometry were higher than those of AI behavioral audiometry. The application of these rules facilitates objective assessment and diagnosis of children's hearing, providing essential foundations for early screening and treatment of children with hearing disorders. Trial Registration: Chinese Clinical Trial Registry, registration number ChiCTR2100050416.

PMID:40325123 | DOI:10.1038/s41598-025-97519-w

Categories: Literature Watch

ProtoASNet: Comprehensive evaluation and enhanced performance with uncertainty estimation for aortic stenosis classification in echocardiography

Mon, 2025-05-05 06:00

Med Image Anal. 2025 Apr 24;103:103600. doi: 10.1016/j.media.2025.103600. Online ahead of print.

ABSTRACT

Aortic stenosis (AS) is a prevalent heart valve disease that requires accurate and timely diagnosis for effective treatment. Current methods for automated AS severity classification rely on black-box deep learning techniques, which suffer from a low level of trustworthiness and hinder clinical adoption. To tackle this challenge, we propose ProtoASNet, a prototype-based neural network designed to classify the severity of AS from B-mode echocardiography videos. ProtoASNet bases its predictions exclusively on the similarity scores between the input and a set of learned spatio-temporal prototypes, ensuring inherent interpretability. Users can directly visualize the similarity between the input and each prototype, as well as the weighted sum of similarities. This approach provides clinically relevant evidence for each prediction, as the prototypes typically highlight markers such as calcification and restricted movement of aortic valve leaflets. Moreover, ProtoASNet utilizes abstention loss to estimate aleatoric uncertainty by defining a set of prototypes that capture ambiguity and insufficient information in the observed data. This feature augments prototype-based models with the ability to explain when they may fail. We evaluate ProtoASNet on a private dataset and the publicly available TMED-2 dataset. It surpasses existing state-of-the-art methods, achieving a balanced accuracy of 80.0% on our private dataset and 79.7% on the TMED-2 dataset, respectively. By discarding cases flagged as uncertain, ProtoASNet achieves an improved balanced accuracy of 82.4% on our private dataset. Furthermore, by offering interpretability and an uncertainty measure for each prediction, ProtoASNet improves transparency and facilitates the interactive usage of deep networks in aiding clinical decision-making. Our source code is available at: https://github.com/hooman007/ProtoASNet.
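ProtoASNet bases each prediction on similarity scores between the input and learned prototypes, with dedicated prototypes capturing ambiguity for abstention. A toy inference-time sketch of that idea follows; the real model uses spatio-temporal video prototypes and a learned abstention loss, so all shapes, names, and the threshold here are illustrative assumptions.

```python
import numpy as np

def prototype_predict(embedding, prototypes, weights, abstain_idx, tau=0.3):
    """Classify via cosine similarity to prototypes; return None (abstain)
    when similarity to an abstention prototype exceeds `tau`.
    embedding: (d,), prototypes: (p, d), weights: (p, n_classes)."""
    e = embedding / np.linalg.norm(embedding)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sims = p @ e                   # similarity to each prototype
    if sims[abstain_idx].max() > tau:
        return None, sims          # flagged as uncertain
    return int(np.argmax(sims @ weights)), sims

# Two class prototypes plus one abstention prototype (index 2).
protos = np.array([[1., 0.], [0., 1.], [-1., -1.]])
W = np.array([[1., 0.], [0., 1.], [0., 0.]])  # abstention carries no class weight
label, _ = prototype_predict(np.array([1.0, 0.1]), protos, W, [2])
unsure, _ = prototype_predict(np.array([-1.0, -1.0]), protos, W, [2])
```

Because the class score is an explicit weighted sum of similarities, each prediction can be explained by pointing at the prototypes that contributed most, which is the interpretability property the abstract describes.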

PMID:40324320 | DOI:10.1016/j.media.2025.103600

Categories: Literature Watch

Forecasting climate change effects on Saline Lakes through advanced remote sensing and deep learning

Mon, 2025-05-05 06:00

Sci Total Environ. 2025 May 4;980:179582. doi: 10.1016/j.scitotenv.2025.179582. Online ahead of print.

ABSTRACT

Given the vital role of saline lakes in supporting ecosystems in arid regions, this study analyzes their long-term changes by assessing their characteristics and spectral reflectance properties. Alongside evaluating the physical and chemical variations of these lakes, the research integrates climate change modeling to predict future shifts in their features and assess ecological impacts on surrounding environments. By employing a Super-Resolution Generative Adversarial Network (SRGAN) and Multiresolution Segmentation (MRS), this approach enhances satellite image resolution and enables more precise differentiation of key lake components, such as salt deposits, salinity levels, and moisture fluctuations. The results show that increasing image resolution with SRGAN and using these images as input data for image classification models improves the identification of physical characteristics and the prediction of chemical properties of lakes in greater detail. The proposed method, based on Cellular Automata (CA)-Markov modeling of albedo and infrared wave reflectance, predicts a roughly 15% increase in the salinity of the studied lakes by 2050, driven by rising temperatures, intensified evaporation, and declining moisture levels. Finally, the results of climate change predictions based on the Long Short-Term Memory (LSTM) algorithm, with high accuracy (R2 > 0.9), indicate increasing temperatures and evaporation in the coming years. Consequently, these rising temperatures will elevate salinity, drying, and albedo intensity in Chaka, Tuz, and Razzaza Lakes over the coming decades. This is supported by RCP8.5 scenarios, which project significant increases by 2100 that lead to greater evaporation and salinity. These changes have profound implications for surrounding ecosystems, particularly by affecting plant communities and accelerating desertification around these saline lakes.

PMID:40324314 | DOI:10.1016/j.scitotenv.2025.179582

Categories: Literature Watch

Current Technological Advances in Dysphagia Screening: Systematic Scoping Review

Mon, 2025-05-05 06:00

J Med Internet Res. 2025 May 5;27:e65551. doi: 10.2196/65551.

ABSTRACT

BACKGROUND: Dysphagia affects more than half of older adults with dementia and is associated with a 10-fold increase in mortality. The development of accessible, objective, and reliable screening tools is crucial for early detection and management.

OBJECTIVE: This systematic scoping review aimed to (1) examine the current state of the art in artificial intelligence (AI) and sensor-based technologies for dysphagia screening, (2) evaluate the performance of these AI-based screening tools, and (3) assess the methodological quality and rigor of studies on AI-based dysphagia screening tools.

METHODS: We conducted a systematic literature search across CINAHL, Embase, PubMed, and Web of Science from inception to July 4, 2024, following the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) framework. In total, 2 independent researchers conducted the search, screening, and data extraction. Eligibility criteria included original studies using sensor-based instruments with AI to identify individuals with dysphagia or unsafe swallow events. We excluded studies on pediatric, infant, or postextubation dysphagia, as well as those using non-sensor-based assessments or diagnostic tools. We used a modified Quality Assessment of Diagnostic Accuracy Studies-2 tool to assess methodological quality, adding a "model" domain for AI-specific evaluation. Data were synthesized narratively.

RESULTS: This review included 24 studies involving 2979 participants (1717 with dysphagia and 1262 controls). In total, 75% (18/24) of the studies focused solely on per-individual classification rather than per-swallow event classification. Acoustic (13/24, 54%) and vibratory (9/24, 38%) signals were the primary modality sources. In total, 25% (6/24) of the studies used multimodal approaches, whereas 75% (18/24) used a single modality. Support vector machine was the most common AI model (15/24, 62%), with deep learning approaches emerging in recent years (3/24, 12%). Performance varied widely: accuracy ranged from 71.2% to 99%, area under the receiver operating characteristic curve ranged from 0.77 to 0.977, and sensitivity ranged from 63.6% to 100%. Multimodal systems generally outperformed unimodal systems. The methodological quality assessment revealed a risk of bias, particularly in patient selection (unclear in 18/24, 75% of the studies), index test (unclear in 23/24, 96% of the studies), and modeling (high risk in 13/24, 54% of the studies). Notably, no studies conducted external validation or domain adaptation testing, raising concerns about real-world applicability.

CONCLUSIONS: This review provides a comprehensive overview of technological advancements in AI and sensor-based dysphagia screening. While these developments show promise for continuous long-term tele-swallowing assessments, significant methodological limitations were identified. Future studies can explore how each modality can target specific anatomical regions and manifestations of dysphagia. This detailed understanding of how different modalities address various aspects of dysphagia can significantly benefit multimodal systems, enabling them to better handle the multifaceted nature of dysphagia conditions.

PMID:40324167 | DOI:10.2196/65551

Categories: Literature Watch

Training, Validating, and Testing Machine Learning Prediction Models for Endometrial Cancer Recurrence

Mon, 2025-05-05 06:00

JCO Precis Oncol. 2025 May;9:e2400859. doi: 10.1200/PO-24-00859. Epub 2025 May 5.

ABSTRACT

PURPOSE: Endometrial cancer (EC) is the most common gynecologic cancer in the United States with rising incidence and mortality. Despite optimal treatment, 15%-20% of all patients will recur. To better select patients for adjuvant therapy, it is important to accurately predict patients at risk for recurrence. Our objective was to train, validate, and test models of EC recurrence using lasso regression and other machine learning (ML) and deep learning (DL) analytics in a large, comprehensive data set.

METHODS: Data from patients with EC were downloaded from the Oncology Research Information Exchange Network database and stratified into low-risk (International Federation of Gynecology and Obstetrics [FIGO] grade 1 or 2, stage I; N = 329), high-risk (FIGO grade 3 or stage II, III, or IV; N = 324), and nonendometrioid histology (N = 239) groups. Clinical, pathologic, genomic, and genetic data were used for the analysis. Genomic data included microRNA, long noncoding RNA, isoform, and pseudogene expression. Genetic variation included single-nucleotide variation (SNV) and copy-number variation (CNV). In the discovery phase, we selected variables informative for recurrence (P < .05) using univariate analyses of variance. Then, we trained, validated, and tested multivariate models using the selected variables and lasso regression, MATLAB (ML), and TensorFlow (DL).
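The discovery-phase filter above scores each candidate variable with a univariate analysis of variance before multivariate modeling. A minimal sketch of the one-way ANOVA F statistic used that way follows; the data and threshold are illustrative, and in practice the F statistic would be converted to a p-value and thresholded at .05.

```python
import numpy as np

def anova_f(values, groups):
    """One-way ANOVA F statistic for a single feature across outcome groups."""
    values, groups = np.asarray(values, float), np.asarray(groups)
    overall = values.mean()
    parts = [values[groups == g] for g in np.unique(groups)]
    between = sum(p.size * (p.mean() - overall) ** 2 for p in parts) / (len(parts) - 1)
    within = sum(((p - p.mean()) ** 2).sum() for p in parts) / (values.size - len(parts))
    return between / within

rng = np.random.default_rng(0)
groups = np.array([0] * 30 + [1] * 30)  # recurrence vs. no recurrence (toy)
informative = np.concatenate([rng.normal(0, 1, 30), rng.normal(2, 1, 30)])
noise = rng.normal(0, 1, 60)
```

A feature whose distribution differs between recurrence groups scores a far larger F than pure noise, so ranking by F (or its p-value) selects the informative variables that then feed lasso and the ML/DL models.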

RESULTS: Recurrence clinic models for low-risk, high-risk, and high-risk nonendometrioid histology had AUCs of 56%, 70%, and 65%, respectively. For training, we selected models with AUC >80%: five for the low-risk group, 20 models for the high-risk group, and 20 for the nonendometrioid group. The two best low-risk models included clinical data and CNVs. For the high-risk group, three of the five best-performing models included pseudogene expression. For the nonendometrioid group, pseudogene expression and SNV were overrepresented in the best models.

CONCLUSION: Prediction models of EC recurrence built with ML and DL analytics had better performance than models with clinical and pathologic data alone. Prospective validation is required to determine clinical utility.

PMID:40324114 | DOI:10.1200/PO-24-00859

Categories: Literature Watch

TCN-QV: an attention-based deep learning method for long sequence time-series forecasting of gold prices

Mon, 2025-05-05 06:00

PLoS One. 2025 May 5;20(5):e0319776. doi: 10.1371/journal.pone.0319776. eCollection 2025.

ABSTRACT

Accurate prediction of gold prices is crucial for investment decision-making and national risk management. The time series data of gold prices exhibits random fluctuations, non-linear characteristics, and high volatility, making prediction extremely challenging. Various methods, from classical statistics to machine learning techniques like Random Forests, Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN), have achieved high accuracy, but they also have inherent limitations. To address these issues, a model that combines Temporal Convolutional Networks (TCN) with Query (Q) and Keys (K) attention mechanisms (TCN-QV) is proposed to enhance the accuracy of gold price predictions. The model begins by employing stacked dilated causal convolution layers within the TCN framework to effectively extract temporal features from the sequence data. Subsequently, an attention mechanism is introduced to enable adaptive weight distribution according to the information features. Finally, the predicted results are generated through a dense layer. This method is used to predict the time series data of gold prices in Shanghai. The optimized model demonstrates a substantial improvement in Mean Absolute Error (MAE) compared to the baseline model, achieving reductions of approximately 5.47% in the least favorable case and up to 33.69% in the most favorable scenario across four experimental datasets. Additionally, the model is tested across different time steps and shows satisfactory performance in long sequence predictions. To validate the necessity of the model components, this paper conducts ablation experiments to confirm the significance of each segment.
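The stacked dilated causal convolutions at the heart of a TCN let the receptive field grow exponentially with depth while each output depends only on past inputs. A minimal single-layer sketch follows; the kernel values and zero-padding convention are illustrative assumptions.

```python
import numpy as np

def dilated_causal_conv(x, kernel, dilation):
    """1-D causal convolution: output at t uses only x[t], x[t-d], x[t-2d], ...
    (left-padded with zeros so output length equals input length)."""
    k = len(kernel)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), np.asarray(x, float)])
    return np.array([
        sum(kernel[j] * xp[pad + t - j * dilation] for j in range(k))
        for t in range(len(x))
    ])

x = np.arange(1.0, 9.0)                          # [1, 2, ..., 8]
y = dilated_causal_conv(x, [1.0, 1.0], 2)        # y[t] = x[t] + x[t-2]

# Causality check: changing a future input must not alter past outputs.
x_future = x.copy()
x_future[-1] = 100.0
y_future = dilated_causal_conv(x_future, [1.0, 1.0], 2)
```

Stacking such layers with dilations 1, 2, 4, ... covers long price histories cheaply; the paper then applies its Q/K attention mechanism over the extracted features to weight them adaptively.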

PMID:40324013 | DOI:10.1371/journal.pone.0319776

Categories: Literature Watch

Semisupervised adaptive learning models for IDH1 mutation status prediction

Mon, 2025-05-05 06:00

PLoS One. 2025 May 5;20(5):e0321404. doi: 10.1371/journal.pone.0321404. eCollection 2025.

ABSTRACT

The mutation status of isocitrate dehydrogenase 1 (IDH1) in glioma is critical information for diagnosis, treatment, and prognosis. Accurately determining such information from MRI data has emerged as a significant research challenge in recent years. Existing techniques for this problem often suffer from various limitations, such as data waste and instability. To address these issues, we present a semisupervised adaptive deep learning model based on radiomics and rough sets for predicting the mutation status of IDH1 from MRI data. Firstly, our model uses a rough set algorithm to remove redundant medical image features extracted by radiomics, while adding pseudo-labels for unlabeled data via statistical t-tests to mitigate the common issue of insufficient datasets in medical imaging analysis. Then, it applies a Sand Cat Swarm Optimization (SCSO) algorithm to optimize the weight of the pseudo-labeled data. Finally, our model adopts U-Net and a CRNN to construct UCNet, a semisupervised classification model for classifying IDH1 mutation status. To validate our models, we use a preoperative MRI dataset of 316 glioma patients to evaluate performance. Our study suggests that the prediction accuracy for glioma IDH1 mutation status reaches 95.63%. Our experimental results suggest that the approach can effectively improve the utilization of glioma imaging data and the accuracy of intelligent diagnosis of glioma IDH1 mutation status.
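The pseudo-labeling step assigns provisional labels to unlabeled samples whose radiomic features are statistically consistent with one class. One plausible reading of that idea, sketched with a per-feature t-like score against each class's labeled distribution; the threshold, data, and function names are assumptions, not the paper's exact test.

```python
import numpy as np

def pseudo_label(x, class_feats, max_t=2.0):
    """Pseudo-label sample x with the class whose labeled feature
    distribution it is closest to (mean |t|-score), or None if no
    class is statistically close enough."""
    scores = {}
    for c, F in class_feats.items():
        mu, sd = F.mean(axis=0), F.std(axis=0, ddof=1) + 1e-8
        scores[c] = float(np.mean(np.abs((x - mu) / sd)))
    best = min(scores, key=scores.get)
    return (best if scores[best] < max_t else None), scores

rng = np.random.default_rng(1)
feats = {                                   # toy labeled radiomic features
    "wild-type": rng.normal(0, 1, (20, 3)),
    "mutant": rng.normal(5, 1, (20, 3)),
}
label, _ = pseudo_label(np.array([0.2, -0.1, 0.3]), feats)
outlier, _ = pseudo_label(np.array([100., 100., 100.]), feats)
```

Samples that fit neither class are left unlabeled rather than forced into a group, which is what makes the extra data usable without amplifying noise; the SCSO step then down-weights the less reliable pseudo-labels.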

PMID:40323991 | DOI:10.1371/journal.pone.0321404

Categories: Literature Watch

Improving fine-grained food classification using deep residual learning and selective state space models

Mon, 2025-05-05 06:00

PLoS One. 2025 May 5;20(5):e0322695. doi: 10.1371/journal.pone.0322695. eCollection 2025.

ABSTRACT

BACKGROUND: Food classification is the foundation for developing food vision tasks and plays a key role in the burgeoning field of computational nutrition. Because food requires fine-grained classification, Convolutional Neural Network (CNN) backbones need additional structural design, whereas Vision Transformers (ViTs), built on self-attention modules, have increased computational complexity.

METHODS: We propose a ResVMamba model and validate its performance on a complex food dataset. Unlike previous fine-grained classification models that rely heavily on attention mechanisms or hierarchical feature extraction, our method leverages a novel residual learning strategy within a state-space framework to improve representation learning. This approach enables the model to efficiently capture both global and local dependencies, surpassing the computational efficiency of Vision Transformers (ViTs) while maintaining high accuracy. We introduce CNFOOD-241, an academically underexplored food dataset, and compare it with other food databases.

RESULTS: The proposed ResVMamba surpasses current state-of-the-art (SOTA) models, achieving a Top-1 classification accuracy of 81.70% and a Top-5 accuracy of 96.83%. Our findings show that the proposed methodology establishes a new benchmark for SOTA performance in food recognition on the CNFOOD-241 dataset.
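Top-1 and Top-5 accuracy count a prediction as correct when the true class is, respectively, the single highest-scoring class or among the five highest. A minimal sketch of the metric (the toy scores are illustrative):

```python
import numpy as np

def topk_accuracy(logits, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring
    classes. logits: (n, n_classes), labels: (n,)."""
    topk = np.argsort(-logits, axis=1)[:, :k]
    return (topk == labels[:, None]).any(axis=1).mean()

logits = np.array([
    [0.1, 0.7, 0.2],   # predicts class 1
    [0.5, 0.3, 0.2],   # predicts class 0
    [0.2, 0.3, 0.5],   # predicts class 2
])
labels = np.array([1, 2, 2])
```

With 241 visually similar food classes, the gap between Top-1 (81.70%) and Top-5 (96.83%) indicates that the correct class is almost always among the model's top few candidates even when it is not ranked first.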

CONCLUSIONS: We pioneer the integration of a residual learning framework within the VMamba model to concurrently harness both global and local state features. The code can be obtained on GitHub: https://github.com/ChiShengChen/ResVMamba.

PMID:40323945 | DOI:10.1371/journal.pone.0322695

Categories: Literature Watch

Contactless Estimation of Respiratory Frequency Using 3D-CNN on Thermal Images

Mon, 2025-05-05 06:00

IEEE J Biomed Health Inform. 2025 May 5;PP. doi: 10.1109/JBHI.2025.3567141. Online ahead of print.

ABSTRACT

Monitoring physiological parameters such as respiratory rate (f_R) is essential for diagnosing and managing various pathological conditions. Thermal imaging offers a promising contactless alternative to traditional methods, which often rely on partially invasive sensors or obtrusive wearable systems. However, existing approaches for f_R estimation from thermal signals typically require extensive pre-processing and manual or semi-automatic region-of-interest (ROI) tracking, limiting their practical applicability. This study proposes a deep learning-based method for estimating f_R directly from thermal videos, eliminating the need for complex pre-processing and ROI tracking. A 3D Convolutional Neural Network (3D-CNN) is developed to operate on raw thermal video data. To address challenges related to small datasets, the model is trained using data augmentation and transfer learning from synthetic datasets. Experimental results demonstrate that the proposed approach achieves a validation R² score of approximately 0.61 on both pre-processed and raw thermal videos. By simplifying the workflow, this method holds promise for enhancing the feasibility of thermal imaging in real-world applications, such as remote healthcare and driver monitoring in automotive applications.
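The reported validation score is the coefficient of determination between predicted and ground-truth respiratory rates. A minimal sketch of that metric with toy values (the data are illustrative, not from the study):

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

y_true = np.array([12., 15., 18., 14., 16.])   # breaths/min, toy ground truth
y_pred = np.array([12.5, 14.0, 17.5, 14.5, 16.5])
```

A score of 1 means perfect prediction and 0 means no better than predicting the mean rate, so the reported ~0.61 indicates the model explains a substantial fraction of the variance from raw thermal video alone.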

PMID:40323749 | DOI:10.1109/JBHI.2025.3567141

Categories: Literature Watch

An End-to-End Deep Learning Generative Framework for Refinable Shape Matching and Generation

Mon, 2025-05-05 06:00

IEEE Trans Med Imaging. 2025 May 5;PP. doi: 10.1109/TMI.2025.3562756. Online ahead of print.

ABSTRACT

Generative modelling for shapes is a prerequisite for In-Silico Clinical Trials (ISCTs), which aim to cost-effectively validate medical device interventions using synthetic anatomical shapes, often represented as 3D surface meshes. However, constructing AI models to generate shapes closely resembling the real mesh samples is challenging due to variable vertex counts, connectivities, and the lack of dense vertex-wise correspondences across the training data. Employing graph representations for meshes, we develop a novel unsupervised geometric deep-learning model to establish refinable shape correspondences in a latent space, construct a population-derived atlas and generate realistic synthetic shapes. We additionally extend our proposed base model to a joint shape generative-clustering multi-atlas framework to incorporate further variability and preserve more details in the generated shapes. Experimental results using liver and left-ventricular models demonstrate the approach's applicability to computational medicine, highlighting its suitability for ISCTs through a comparative analysis.

PMID:40323742 | DOI:10.1109/TMI.2025.3562756

Categories: Literature Watch

The Application Status of Radiomics-Based Machine Learning in Intrahepatic Cholangiocarcinoma: Systematic Review and Meta-Analysis

Mon, 2025-05-05 06:00

J Med Internet Res. 2025 May 5;27:e69906. doi: 10.2196/69906.

ABSTRACT

BACKGROUND: Over the past few years, radiomics for the detection of intrahepatic cholangiocarcinoma (ICC) has been extensively studied. However, systematic evidence is lacking in the use of radiomics in this domain, which hinders its further development.

OBJECTIVE: To address this gap, our study delved into the status quo and application value of radiomics in ICC and aimed to offer evidence-based support to promote its systematic application in this field.

METHODS: PubMed, Web of Science, Cochrane Library, and Embase were comprehensively retrieved to determine relevant original studies. The study quality was appraised through the Radiomics Quality Score. In addition, subgroup analyses were undertaken according to datasets (training and validation sets), imaging sources, and model types.

RESULTS: Fifty-eight studies encompassing 12,903 patients were eligible, with an average Radiomics Quality Score of 9.21. Radiomics-based machine learning (ML) was mainly used to diagnose ICC (n=30), microvascular invasion (n=8), gene mutations (n=5), perineural invasion (PNI; n=2), lymph node (LN) positivity (n=2), and tertiary lymphoid structures (TLSs; n=2), and predict overall survival (n=6) and recurrence (n=9). The C-index, sensitivity (SEN), and specificity (SPC) of the ML model developed using clinical features (CFs) for ICC detection were 0.762 (95% CI 0.728-0.796), 0.72 (95% CI 0.66-0.77), and 0.72 (95% CI 0.66-0.78), respectively, in the validation dataset. In contrast, the C-index, SEN, and SPC of the radiomics-based ML model for detecting ICC were 0.853 (95% CI 0.824-0.882), 0.80 (95% CI 0.73-0.85), and 0.88 (95% CI 0.83-0.92), respectively. The C-index, SEN, and SPC of the ML model constructed using both radiomics and CFs for diagnosing ICC were 0.912 (95% CI 0.889-0.935), 0.77 (95% CI 0.72-0.81), and 0.90 (95% CI 0.86-0.92), respectively. The deep learning-based model that integrated both radiomics and CFs yielded a notably higher C-index of 0.924 (95% CI 0.863-0.984) in the task of detecting ICC. Additional analyses showed that radiomics demonstrated promising accuracy in predicting overall survival and recurrence, as well as in diagnosing microvascular invasion, gene mutations, PNI, LN positivity, and TLSs.
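The sensitivity (SEN) and specificity (SPC) figures above follow directly from confusion-matrix counts. A small pure-Python sketch with made-up labels, for illustration only:

```python
def sensitivity_specificity(y_true, y_pred):
    """SEN = TP / (TP + FN); SPC = TN / (TN + FP),
    with label 1 = disease-positive, 0 = negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

sen, spc = sensitivity_specificity([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

Reporting both together, as the pooled analysis does, guards against a model that trades one for the other (e.g. predicting everyone positive gives SEN = 1 but SPC = 0).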

CONCLUSIONS: Radiomics-based ML demonstrates excellent accuracy in the clinical diagnosis of ICC. However, studies involving specific tasks, such as diagnosing PNI and TLSs, are still scarce. The limited research on deep learning has hindered both further analysis and the development of subgroup analyses across various models. Furthermore, challenges such as data heterogeneity and interpretability caused by segmentation and imaging parameter variations require further optimization and refinement. Future research should delve into the application of radiomics to enhance its clinical use. Its integration into clinical practice holds great promise for improving decision-making, boosting diagnostic and treatment accuracy, minimizing unnecessary tests, and optimizing health care resource usage.

PMID:40323647 | DOI:10.2196/69906

Categories: Literature Watch

Predicting Postoperative Prognosis in Pediatric Malignant Tumor With MRI Radiomics and Deep Learning Models: A Retrospective Study

Mon, 2025-05-05 06:00

J Craniofac Surg. 2025 May 5. doi: 10.1097/SCS.0000000000011466. Online ahead of print.

ABSTRACT

OBJECTIVE: The aim of this study is to develop a multimodal machine learning model that integrates magnetic resonance imaging (MRI) radiomics, deep learning features, and clinical indexes to predict the 3-year postoperative disease-free survival (DFS) in pediatric patients with malignant tumors.

METHODS: A cohort of 260 pediatric patients with brain tumors who underwent R0 resection (aged ≤ 14 y) was retrospectively included in the study. Preoperative T1-enhanced MRI images and clinical data were collected. Image preprocessing involved N4 bias field correction and Z-score standardization, with tumor areas manually delineated using 3D Slicer. A total of 1130 radiomics features (Pyradiomics) and 511 deep learning features (3D ResNet-18) were extracted. After dimensionality reduction via Lasso regression, six machine learning models (e.g., SVM, RF, LightGBM) were developed, incorporating selected clinical indexes such as tumor diameter, GCS score, and nutritional status. Bayesian optimization was applied to adjust model parameters. The evaluation metrics included AUC, sensitivity, and specificity.
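The Z-score standardization step in the preprocessing pipeline can be sketched as follows. This is a generic NumPy illustration of the transform, not the study's actual code, and the synthetic volume is made up:

```python
import numpy as np

def zscore_standardize(image, eps=1e-8):
    """Rescale voxel intensities to zero mean and unit variance,
    a common step before radiomics / deep feature extraction."""
    mu, sigma = image.mean(), image.std()
    return (image - mu) / (sigma + eps)

# synthetic 8x8x8 "MRI volume" with arbitrary intensity scale
volume = np.random.default_rng(0).normal(loc=300.0, scale=50.0, size=(8, 8, 8))
standardized = zscore_standardize(volume)
```

Standardizing per image removes scanner- and protocol-dependent intensity offsets, so features extracted downstream compare like with like across patients.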

RESULTS: The fusion model (LightGBM) achieved an AUC of 0.859 and an accuracy of 85.2% in the validation set. When combined with clinical indexes, the final model's AUC improved to 0.909. Radiomics features, such as texture heterogeneity, and clinical indexes, including tumor diameter ≥ 5 cm and preoperative low albumin, significantly contributed to prognosis prediction.

CONCLUSION: The multimodal model demonstrated effective prediction of the 3-year postoperative DFS in pediatric brain tumors, offering a scientific foundation for personalized treatment.

PMID:40323639 | DOI:10.1097/SCS.0000000000011466

Categories: Literature Watch

Heart volume on health checkup CT scans inversely correlates with pulse rate: data-driven analysis using deep-learning segmentation

Mon, 2025-05-05 06:00

Jpn J Radiol. 2025 May 5. doi: 10.1007/s11604-025-01772-y. Online ahead of print.

ABSTRACT

PURPOSE: This study aims to elucidate the correlation between heart volume on computed tomography (CT) and various health checkup examination data in the general population. It also examines the utility of a deep-learning segmentation tool in the data-driven analysis of CT big data.

MATERIALS AND METHODS: Health checkup examination data and CT images acquired in 2013 and 2018 were retrospectively analyzed. We first quantified heart volume using a public deep-learning model, TotalSegmentator. Segmentation accuracy was evaluated with the Dice score on 30 randomly chosen images, using a radiologist's annotations as the reference. Then, Spearman's partial correlation was calculated for 58 numerical items, and the analysis of covariance was performed for 13 categorical items, adjusting for the effect of gender, medication, height, weight, abdominal circumference, and age. The variables found to be significant proceeded to longitudinal analysis.
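The Dice score used to validate the TotalSegmentator output is a simple overlap measure between two binary masks. A minimal NumPy sketch, for illustration only (the tiny masks are made up):

```python
import numpy as np

def dice_score(pred, ref):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for binary masks;
    1.0 means perfect overlap, 0.0 means none."""
    pred, ref = np.asarray(pred, bool), np.asarray(ref, bool)
    inter = np.logical_and(pred, ref).sum()
    denom = pred.sum() + ref.sum()
    return 2.0 * inter / denom if denom else 1.0

pred = np.array([[1, 1, 0], [0, 1, 0]])  # model's heart mask
ref  = np.array([[1, 0, 0], [0, 1, 1]])  # radiologist's annotation
score = dice_score(pred, ref)
```

In practice the same computation runs over full 3D volumes; spot-checking 30 cases against expert annotations, as done here, is a common sanity check before trusting a public model on local data.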

RESULTS: In the dataset, 7993 records were eligible for cross-sectional analysis and 1306 individuals were eligible for longitudinal analysis. Pulse rate was most strongly inversely correlated with the heart volume (Spearman's correlation coefficients ranging from - 0.29 to - 0.33). A 10 bpm increase in pulse rate was correlated with roughly a 0.5 percentage point decrease in the cardiothoracic ratio. Hemoglobin, hematocrit, total protein, albumin, and cholinesterase also showed weak inverse correlation. Five-year longitudinal analysis corroborated these findings.
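The Spearman coefficients above are Pearson correlations computed on ranks. A minimal NumPy sketch, without tie handling or the partial-correlation adjustment used in the study (illustrative only; the example values are made up):

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman's rank correlation (no tie correction):
    rank both variables, then take the Pearson correlation of the ranks."""
    def rank(v):
        order = np.argsort(v)
        r = np.empty(len(v))
        r[order] = np.arange(len(v))
        return r
    rx, ry = rank(np.asarray(x)), rank(np.asarray(y))
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx * ry).sum() / np.sqrt((rx ** 2).sum() * (ry ** 2).sum()))

# e.g. pulse rate (bpm) vs. heart volume: a monotone decrease gives rho = -1
rho = spearman_rho([60, 70, 80, 90], [820, 790, 760, 700])
```

Because it works on ranks, Spearman's rho captures any monotone relationship, not just a linear one, which suits heterogeneous checkup variables.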

CONCLUSIONS: We found that pulse rate was the strongest covariate of the heart volume on CT, rather than other cardiovascular-related variables such as blood pressure. The study also demonstrated the feasibility and utility of the artificial intelligence-assisted data-driven research on CT big data.

PMID:40323526 | DOI:10.1007/s11604-025-01772-y

Categories: Literature Watch

YOLOv11n for precision agriculture: lightweight and efficient detection of guava defects across diverse conditions

Mon, 2025-05-05 06:00

J Sci Food Agric. 2025 May 5. doi: 10.1002/jsfa.14331. Online ahead of print.

ABSTRACT

BACKGROUND: Automated fruit defect detection plays a critical role in improving postharvest quality assessment and supporting decision-making in agricultural supply chains. Guava defect detection presents specific challenges because of diverse disease types, varying maturity levels and inconsistent environmental conditions. Although existing you only look once (YOLO)-based models have shown promise in agricultural detection tasks, they often face limitations in balancing detection accuracy, inference speed and computational efficiency, particularly in resource-constrained settings. This study addresses this gap by evaluating four YOLO models (YOLOv8s, YOLOv5s, YOLOv9s and YOLOv11n) for detecting defective guava fruits across five diseases (scab, canker, chilling injury, mechanical damage and rot), three maturity levels (mature, half-mature and immature) and healthy fruits.

RESULTS: Diverse datasets facilitated robust training and evaluation. YOLOv11n achieved the highest mAP50-95 (98.0%), with a bounding box loss of 0.0565, a classification loss of 0.2787, an inference time of 3.9 milliseconds, and a detection speed of 255 FPS. YOLOv5s had the highest precision (94.9%), while YOLOv9s excelled in recall (96.2%). YOLOv8s offered a balanced performance across metrics. YOLOv11n outperformed all models with a lightweight architecture (2.6 million parameters) and low computational cost (6.3 giga floating-point operations, GFLOPs), making it suitable for resource-constrained applications.
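The mAP50-95 metric above averages precision over IoU thresholds from 0.50 to 0.95. The underlying intersection-over-union for a pair of boxes can be sketched as follows (a generic illustration, not the YOLO implementation; the boxes are made up):

```python
def box_iou(a, b):
    """Intersection-over-union for axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# two overlapping guava-defect detections sharing one unit square
overlap = box_iou((0, 0, 2, 2), (1, 1, 3, 3))
```

A detection counts as correct only when its IoU with a ground-truth box clears the threshold, so mAP50-95 rewards tight localization, not just coarse hits.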

CONCLUSION: These results highlight YOLOv11n's potential for agricultural applications, such as automated defect detection and quality control, which require high accuracy and real-time performance across diverse conditions. This analysis provides insights into deploying YOLO models for agricultural quality assessment to enhance the efficiency and reliability of postharvest management. © 2025 Society of Chemical Industry.

PMID:40322977 | DOI:10.1002/jsfa.14331

Categories: Literature Watch