Deep learning
Sleep Apnea Detection Using EEG: A Systematic Review of Datasets, Methods, Challenges, and Future Directions
Ann Biomed Eng. 2025 Feb 12. doi: 10.1007/s10439-025-03691-5. Online ahead of print.
ABSTRACT
PURPOSE: Sleep Apnea (SA) affects an estimated 936 million adults globally, posing a significant public health concern. The gold standard for diagnosing SA, polysomnography, is costly and uncomfortable. Electroencephalogram (EEG)-based SA detection is promising due to its ability to capture distinctive sleep stage-related characteristics across different sub-band frequencies. This study aims to review and analyze research from the past decade on the potential of EEG signals in SA detection and classification focusing on various deep learning and machine learning techniques, including signal decomposition, feature extraction, feature selection, and classification methodologies.
METHOD: A systematic literature review following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) and PICO guidelines was conducted across 5 databases for publications from January 2010 to December 2024.
RESULTS: The review involved screening a total of 402 papers, with 63 selected for in-depth analysis to provide valuable insights into the application of EEG signals for SA detection. The findings underscore the potential of EEG-based methods in improving SA diagnosis.
CONCLUSION: This study provides valuable insights, showcasing significant advancements while identifying key areas for further exploration, thereby laying a strong foundation for future research in EEG-based SA detection.
PMID:39939549 | DOI:10.1007/s10439-025-03691-5
Automated grading of oleaster fruit using deep learning
Sci Rep. 2025 Feb 12;15(1):5206. doi: 10.1038/s41598-025-89358-6.
ABSTRACT
The agriculture sector is crucial to many economies, particularly in developing regions, with post-harvest technology emerging as a key growth area. The oleaster, valued for its nutritional and medicinal properties, has traditionally been graded manually based on color and appearance. As global demand rises, there is a growing need for efficient automated grading methods. Therefore, this study aimed to develop a real-time machine vision system for classifying oleaster fruit at various grading velocities. Initially, in the offline phase, a dataset containing video frames of four different quality classes of oleaster, categorized based on the Iranian national standard, was acquired at different linear conveyor belt velocities (ranging from 4.82 to 21.51 cm/s). The Mask R-CNN algorithm was used to segment the extracted frames to obtain the position and boundary of the samples. Experimental results indicated that, with a 100% detection rate and an average instance segmentation accuracy error ranging from 4.17 to 5.79%, the Mask R-CNN algorithm is capable of accurately segmenting all classes of oleaster at all the examined grading velocity levels. The results of the fivefold cross validation indicated that the general YOLOv8x and YOLOv8n models, created using the dataset obtained from all conveyor belt velocity levels, have a similarly reliable classification performance. Therefore, given its simpler architecture and lower processing time requirements, the YOLOv8n model was used to evaluate the grading system in real-time mode. The overall classification accuracy of this model was 92%, with a sensitivity range of 87.10-94.89% for distinguishing different classes of oleaster at a grading velocity of 21.51 cm/s. The results of this study demonstrate the effectiveness of deep learning-based models in developing grading machines for the oleaster fruit.
PMID:39939355 | DOI:10.1038/s41598-025-89358-6
Stroke Management and Analysis Risk Tool (SMART): An interpretable clinical application for diabetes-related stroke prediction
Nutr Metab Cardiovasc Dis. 2024 Dec 29:103841. doi: 10.1016/j.numecd.2024.103841. Online ahead of print.
ABSTRACT
BACKGROUND AND AIMS: The growing global burden of diabetes and stroke poses a significant public health challenge. This study aims to analyze factors and create an interpretable stroke prediction model for diabetic patients.
METHODS AND RESULTS: Data from 20,014 patients were collected from the Affiliated Drum Tower Hospital, Medical School of Nanjing University, between 2021 and 2022. After missing values were handled, feature engineering was performed using LASSO, SVM-RFE, and multi-factor regression techniques. The dataset was split 8:2 for training and testing, with the Synthetic Minority Oversampling Technique (SMOTE) used to balance classes. Various machine learning and deep learning techniques, such as Random Forest (RF) and deep neural networks (DNN), were used for model training. SHAP and a dedicated website demonstrated the interpretability and practicality of the model. This study identified 11 factors influencing stroke incidence, with the RF and DNN algorithms achieving AUC values of 0.95 and 0.91, respectively. The Stroke Management and Analysis Risk Tool (SMART) was developed for clinical use.
PRIMARY ENDPOINT: The predictive performance of SMART in assessing stroke risk in diabetic patients was evaluated using AUC.
SECONDARY ENDPOINTS: Classification performance (precision, recall, F1-score), interpretability via SHAP values, and clinical utility, with emphasis on the user interface, were evaluated. EHR data were analyzed statistically using univariate and multivariate methods, and the model was validated on a separate test set.
CONCLUSIONS: An interpretable stroke-predictive model was created for patients with diabetes. The model suggests that standard clinical and laboratory parameters can predict stroke risk in individuals with diabetes.
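The primary endpoint above is AUC. As a minimal sketch of what that number measures (this is not the study's code; the function name and the toy data are illustrative assumptions), AUC can be computed as the fraction of (positive, negative) patient pairs in which the model scores the positive case higher, counting ties as half:

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 outcomes (1 = event, e.g. stroke).
    scores: iterable of predicted risks, one per label.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive case ranked above negative case
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 of the 4 positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The quadratic pair loop is chosen for readability; a rank-based formula gives the same value faster on large test sets.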
PMID:39939252 | DOI:10.1016/j.numecd.2024.103841
Use of deep learning-accelerated T2 TSE for prostate MRI: Comparison with and without hyoscine butylbromide admission
Magn Reson Imaging. 2025 Feb 10:110358. doi: 10.1016/j.mri.2025.110358. Online ahead of print.
ABSTRACT
OBJECTIVE: To investigate whether a deep learning (DL)-accelerated T2-weighted turbo spin echo (TSE) sequence (T2DL) for prostate MRI requires hyoscine butylbromide (HBB) administration to achieve high image quality.
METHODS: One hundred twenty consecutive patients divided into four groups (30 per group) were included in this study. All patients received a T2DL (version 2022/23) and a conventional T2 TSE (cT2) sequence on an implemented 3 T scanner and software system. Groups A and B received cT2 with HBB and T2DL without HBB, with a field of view (FOV) of 130 mm in group A and 160 mm in group B. Groups C and D received both sequences with a FOV of 160 mm, group C with HBB and group D without HBB. Two radiologists independently evaluated all imaging datasets in a blinded reading regarding motion, sharpness, noise, and diagnostic confidence. Furthermore, we analyzed quantitative parameters by calculating edge rise distance (ERD), signal-to-noise ratio (SNR), and contrast-to-noise ratio (CNR). The Friedman test was used for group comparisons.
RESULTS: Baseline characteristics showed no significant differences between groups A-D. After HBB, cT2 showed fewer motion artifacts, greater sharpness, and higher diagnostic confidence than T2DL, though DL sequences had significantly lower noise (p < 0.01). Quantitative analysis revealed higher SNR and CNR for T2DL sequences (p < 0.01), while edge rise distance (ERD) remained similar. Inter-reader agreement was good to excellent, with ICCs ranging from 0.84 to 0.93. T2DL acquisition time was significantly lower than for cT2.
CONCLUSIONS: In our study, cT2 sequences with HBB showed superior image quality and diagnostic confidence, while the T2DL sequence offers promising potential for reducing MRI acquisition times and performed better in quantitative measures such as SNR and CNR. Additional studies are required to evaluate further adjusted and developed DL applications for prostate MRI on upcoming scanner generations and to assess tumor detection rates.
PMID:39938669 | DOI:10.1016/j.mri.2025.110358
Estimating the treatment effects of multiple drug combinations on multiple outcomes in hypertension
Cell Rep Med. 2025 Feb 5:101947. doi: 10.1016/j.xcrm.2025.101947. Online ahead of print.
ABSTRACT
Hypertension management is complex due to the need for multiple drug combinations and consideration of diverse outcomes. Traditional treatment effect estimation methods struggle to address this complexity, as they typically focus on binary treatments and binary outcomes. To overcome these challenges, we introduce a framework that accommodates multiple drug combinations and multiple outcomes (METO). METO uses multi-treatment encoding to handle drug combinations and sequences, distinguishing between effectiveness and safety outcomes by learning the outcome type during prediction. To mitigate confounding bias, METO employs an inverse probability weighting method for multiple treatments, assigning balance weights based on propensity scores. Evaluated on real-world data, METO achieves significant performance improvements over existing methods, with an average improvement of 6.4% in influence function-based precision of estimating heterogeneous effects. A case study demonstrates METO's ability to identify personalized antihypertensive treatments that optimize efficacy and minimize safety risks, highlighting its potential for improving hypertension treatment strategies.
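The abstract mentions inverse probability weighting for multiple treatments based on propensity scores. The sketch below illustrates the generic technique only, not METO's actual implementation: each patient is weighted by the inverse of the estimated probability of the treatment actually received, optionally stabilized by the marginal treatment frequency. All function names and the toy numbers are hypothetical, and the propensity scores are assumed to come from some separately fitted multi-class model.

```python
def ipw_weights(treatments, propensities, stabilized=True):
    """Inverse probability weights for K mutually exclusive treatments.

    treatments:   list of int, index of the treatment each unit received.
    propensities: list of lists; propensities[i][k] is the estimated
                  probability that unit i receives treatment k.
    stabilized:   if True, multiply by the marginal frequency of the
                  received treatment to keep weights near 1.
    """
    n = len(treatments)
    k = len(propensities[0])
    marginal = [sum(1 for t in treatments if t == j) / n for j in range(k)]
    weights = []
    for t, p in zip(treatments, propensities):
        w = 1.0 / p[t]
        if stabilized:
            w *= marginal[t]
        weights.append(w)
    return weights


def weighted_mean_outcome(outcomes, treatments, weights, arm):
    """Weighted mean outcome among units that received treatment `arm`."""
    num = sum(w * y for y, t, w in zip(outcomes, treatments, weights) if t == arm)
    den = sum(w for t, w in zip(treatments, weights) if t == arm)
    return num / den


# Hypothetical toy data: 4 patients, 3 drug combinations.
received = [0, 1, 2, 0]
scores = [[0.5, 0.25, 0.25], [0.2, 0.5, 0.3], [0.25, 0.25, 0.5], [0.8, 0.1, 0.1]]
print(ipw_weights(received, scores, stabilized=False))  # [2.0, 2.0, 2.0, 1.25]
```

Patients who received a treatment that was unlikely for them get larger weights, which is what rebalances the treatment groups with respect to the measured confounders.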
PMID:39938524 | DOI:10.1016/j.xcrm.2025.101947
Segment Anything for Microscopy
Nat Methods. 2025 Feb 12. doi: 10.1038/s41592-024-02580-4. Online ahead of print.
ABSTRACT
Accurate segmentation of objects in microscopy images remains a bottleneck for many researchers despite the number of tools developed for this purpose. Here, we present Segment Anything for Microscopy (μSAM), a tool for segmentation and tracking in multidimensional microscopy data. It is based on Segment Anything, a vision foundation model for image segmentation. We extend it by fine-tuning generalist models for light and electron microscopy that clearly improve segmentation quality for a wide range of imaging conditions. We also implement interactive and automatic segmentation in a napari plugin that can speed up diverse segmentation tasks and provides a unified solution for microscopy annotation across different microscopy modalities. Our work constitutes the application of vision foundation models in microscopy, laying the groundwork for solving image analysis tasks in this domain with a small set of powerful deep learning models.
PMID:39939717 | DOI:10.1038/s41592-024-02580-4
Author Correction: An automated deep learning pipeline for EMVI classification and response prediction of rectal cancer using baseline MRI: a multi-centre study
NPJ Precis Oncol. 2025 Feb 12;9(1):45. doi: 10.1038/s41698-025-00827-7.
NO ABSTRACT
PMID:39939705 | DOI:10.1038/s41698-025-00827-7
Universal attention guided adversarial defense using feature pyramid and non-local mechanisms
Sci Rep. 2025 Feb 12;15(1):5237. doi: 10.1038/s41598-025-89267-8.
ABSTRACT
Deep Neural Networks (DNNs) have been shown to be vulnerable to adversarial examples, significantly hindering the development of deep learning technologies in high-security domains. A key challenge is that current defense methods often lack universality, as they are effective only against certain types of adversarial attacks. This study addresses this challenge by focusing on analyzing adversarial examples through changes in model attention, and classifying attack algorithms into attention-shifting and attention-attenuation categories. Our main novelty lies in proposing two defense modules: the Feature Pyramid-based Attention Space-guided (FPAS) module to counter attention-shifting attacks, and the Attention-based Non-Local (ANL) module to mitigate attention-attenuation attacks. These modules enhance the model's defense capability with minimal intrusion into the original model. By integrating FPAS and ANL into the Wide-ResNet model within a boosting framework, we demonstrate their synergistic defense capability. Even when adversarial examples are embedded with patches, our models showed significant improvements over the baseline, enhancing the average defense rate by 5.47% and 7.74%, respectively. Extensive experiments confirm that this universal defense strategy offers comprehensive protection against adversarial attacks at a lower implementation cost compared to current mainstream defense methods, and is also adaptable for integration with existing defense strategies to further enhance adversarial robustness.
PMID:39939692 | DOI:10.1038/s41598-025-89267-8
Deep learning-based prediction of possibility for immediate implant placement using panoramic radiography
Sci Rep. 2025 Feb 12;15(1):5202. doi: 10.1038/s41598-025-89219-2.
ABSTRACT
In this study, we investigated whether deep learning-based prediction of immediate implant placement is possible. Panoramic radiographs of 201 patients with 874 teeth scheduled for extraction (Group 1: 440 teeth for which immediate implant placement after extraction was difficult; Group 2: 434 teeth for which immediate implant placement after extraction was possible) were evaluated for the training and testing of a deep learning model. DenseNet121, ResNet18, ResNet101, ResNeXt101, InceptionNetV3, and InceptionResNetV2 were used. Each model was trained using preprocessed dental data, and the dataset was divided into training, validation, and test sets to evaluate model performance. For each model, the sensitivity, precision, accuracy, balanced accuracy, and F1-score were all greater than 0.90. The results of this study confirm that deep learning-based prediction of the possibility of immediate implant placement is feasible with fairly high accuracy.
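The metrics named above all derive from a binary confusion matrix. The following sketch shows the standard definitions; the function name and the counts in the example are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (each class and each predicted
    label occurs at least once).
    """
    sensitivity = tp / (tp + fn)                 # recall for the positive class
    specificity = tn / (tn + fp)                 # recall for the negative class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    balanced_accuracy = (sensitivity + specificity) / 2
    f1 = 2 * tp / (2 * tp + fp + fn)             # harmonic mean of precision and recall
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "f1": f1,
    }


# Hypothetical counts for a two-group test set.
for name, value in classification_metrics(tp=90, fp=5, tn=95, fn=10).items():
    print(f"{name}: {value:.3f}")
```

With two groups of nearly equal size, as in this study (440 vs. 434 teeth), accuracy and balanced accuracy are expected to be close; they diverge when the classes are imbalanced.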
PMID:39939654 | DOI:10.1038/s41598-025-89219-2
Pre- and post- COVID-19 trends related to dementia caregiving on Twitter
Sci Rep. 2025 Feb 12;15(1):5173. doi: 10.1038/s41598-024-82405-8.
ABSTRACT
With the advent of new media, more people are turning to social media to share thoughts and emotions related to personal life experiences. We examined salient concerns of dementia caregivers on Twitter pre- and post-pandemic, aiming to shed light on how to better support and engage dementia caregivers after the COVID-19 pandemic. English tweets related to "dementia" and "caregiver" were extracted between 1st January 2013 and 31st December 2022. A supervised deep learning model (Bidirectional Encoder Representations from Transformers, BERT) was trained to select tweets describing individuals' experiences related to dementia caregiving. An unsupervised deep learning approach (BERT-based topic modelling) was applied to identify topics from selected tweets, with each topic further grouped into themes manually using thematic analysis. A total of 44,527 tweets were analysed and stratified using the emergence of the COVID-19 pandemic as a threshold. Three themes were derived: challenges of caregiving in dementia, strategies to inspire caregivers, and dementia-related stigmatization. Over time, there was a rising trend of tweets relating to dementia caregiving. Post-pandemic, challenges of caregiving remained the most discussed topic, with a notable increase in tweets related to dementia-related stigmatization (p < 0.001), especially in North America and other continents (and less so in Europe). The findings uncover a worrying trend of growing dementia-related stigmatization among caregivers, manifested by caregivers internalizing publicly held stigma and projecting negative stereotypes externally as a means to devalue others. The challenges faced by caregivers also remained a significant concern, highlighting the need for continued support and resources for caregivers even post-pandemic.
PMID:39939632 | DOI:10.1038/s41598-024-82405-8
Blockchain-integrated IoT device for advanced inspection of casting defects
Sci Rep. 2025 Feb 12;15(1):5300. doi: 10.1038/s41598-025-86777-3.
ABSTRACT
The quality control of investment casting remains a critical challenge due to defect detection, real-time processing, and data traceability inefficiencies. This study presents an innovative Blockchain-integrated IoT system for advanced inspection of casting defects, combining a ResNet-based deep learning model for defect detection and dimensional measurement with Blockchain technology to ensure data integrity and traceability. The system demonstrated a significant improvement in defect detection accuracy, achieving an F1-score of 0.94, alongside high data integrity (0.99) and traceability (0.98) metrics. Additionally, it processes each casting in an average of 2.3 s, supporting a throughput of 26 castings per minute. By addressing critical challenges in smart manufacturing, this approach enhances operational efficiency, regulatory compliance, and user confidence. While scalability and energy efficiency remain areas for improvement, the proposed method provides a transformative solution for Industry 4.0, fostering transparency and reliability in manufacturing processes.
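The reported throughput follows directly from the per-casting processing time, assuming castings are processed one after another with no idle time between them (an assumption, since the abstract does not describe the scheduling). A short check, with a hypothetical function name:

```python
def throughput_per_minute(seconds_per_item):
    """Items processed per minute under strictly sequential processing."""
    return 60.0 / seconds_per_item


rate = throughput_per_minute(2.3)
print(round(rate, 2))  # 26.09
print(int(rate))       # 26 complete castings per minute
```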
PMID:39939622 | DOI:10.1038/s41598-025-86777-3
Comment on "An examination of daily CO(2) emissions prediction through a comparative analysis of machine learning, deep learning, and statistical models"
Environ Sci Pollut Res Int. 2025 Feb 13. doi: 10.1007/s11356-025-36087-y. Online ahead of print.
NO ABSTRACT
PMID:39939571 | DOI:10.1007/s11356-025-36087-y
A Deep-Learning Approach for Vocal Fold Pose Estimation in Videoendoscopy
J Imaging Inform Med. 2025 Feb 12. doi: 10.1007/s10278-025-01431-8. Online ahead of print.
ABSTRACT
Accurate vocal fold (VF) pose estimation is crucial for diagnosing larynx diseases that can eventually lead to VF paralysis. Videoendoscopic examination is used to assess VF motility, usually by estimating the change in the anterior glottic angle (AGA). This is a subjective and time-consuming procedure requiring extensive expertise. This research proposes a deep learning framework to estimate VF pose from laryngoscopy frames acquired in actual clinical practice. The framework performs heatmap regression relying on three anatomically relevant keypoints as a prior for AGA computation, which is estimated from the coordinates of the predicted points. The proposed framework was assessed using a newly collected dataset of 471 laryngoscopy frames from 124 patients, 28 of whom had cancer. The framework was tested in various configurations and compared with other state-of-the-art approaches (direct keypoint regression and glottal segmentation) for both pose estimation and AGA evaluation. The proposed framework obtained the lowest root mean square error (RMSE) computed on all the keypoints (5.09, 6.56, and 6.40 pixels, respectively) among all the models tested for VF pose estimation. For AGA evaluation, heatmap regression also reached the lowest mean absolute error (MAE) (5.87°). Results show that relying on keypoint heatmap regression enables VF pose estimation with a small error, overcoming drawbacks of state-of-the-art algorithms, especially in challenging images such as those with pathologic subjects, noise, or occlusion.
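The abstract states that the AGA is estimated from the coordinates of three predicted keypoints. One plausible reading, sketched below under stated assumptions, treats one keypoint as the vertex of the angle and the other two as points on each vocal fold; which anatomical landmark plays which role is an assumption here, not something the abstract specifies. The per-keypoint RMSE helper is likewise a generic definition, not the authors' evaluation code.

```python
import math


def angle_at_vertex(vertex, p1, p2):
    """Angle in degrees between rays vertex->p1 and vertex->p2 (0 to 180)."""
    v1 = (p1[0] - vertex[0], p1[1] - vertex[1])
    v2 = (p2[0] - vertex[0], p2[1] - vertex[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 is numerically stabler than acos for very small or very wide angles.
    return math.degrees(math.atan2(abs(cross), dot))


def keypoint_rmse(predicted, reference):
    """RMSE of Euclidean pixel distances for one keypoint across frames."""
    squared = [
        (px - rx) ** 2 + (py - ry) ** 2
        for (px, py), (rx, ry) in zip(predicted, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical pixel coordinates: vertex plus one point on each fold.
print(angle_at_vertex((0, 0), (1, 0), (1, 1)))
print(keypoint_rmse([(3, 4)], [(0, 0)]))
```

Comparing angles computed from predicted and annotated keypoints frame by frame then gives an angular error that can be averaged into an MAE.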
PMID:39939476 | DOI:10.1007/s10278-025-01431-8
Coordinating multiple mental faculties during learning
Sci Rep. 2025 Feb 13;15(1):5319. doi: 10.1038/s41598-025-89732-4.
ABSTRACT
Complex behavior is supported by the coordination of multiple brain regions. How do brain regions coordinate absent a homunculus? We propose coordination is achieved by a controller-peripheral architecture in which peripherals (e.g., the ventral visual stream) aim to supply needed inputs to their controllers (e.g., the hippocampus and prefrontal cortex) while expending minimal resources. We developed a formal model within this framework to address how multiple brain regions coordinate to support rapid learning from a few example images. The model captured how higher-level activity in the controller shaped lower-level visual representations, affecting their precision and sparsity in a manner that paralleled brain measures. In particular, the peripheral encoded visual information to the extent needed to support the smooth operation of the controller. Alternative models optimized by gradient descent irrespective of architectural constraints could not account for human behavior or brain responses, and, typical of standard deep learning approaches, were unstable trial-by-trial learners. While previous work offered accounts of specific faculties, such as perception, attention, and learning, the controller-peripheral approach is a step toward addressing next generation questions concerning how multiple faculties coordinate.
PMID:39939457 | DOI:10.1038/s41598-025-89732-4
A multicenter diagnostic study of thyroid nodule with Hashimoto's thyroiditis enabled by Hashimoto's thyroiditis nodule-artificial intelligence model
Eur Radiol. 2025 Feb 13. doi: 10.1007/s00330-025-11422-6. Online ahead of print.
ABSTRACT
OBJECTIVE: This study aimed to develop a Hashimoto's thyroiditis nodule-artificial intelligence (HTN-AI) model to optimize the diagnosis of thyroid nodules with Hashimoto's thyroiditis (HT), for which diagnostic efficiency and accuracy remain challenging.
DESIGN AND METHODS: This study included 5709 patients from 10 hospitals between January 2014 and March 2024. Among them, 5053 thyroid nodules were divided into training and testing sets in a 9:1 ratio. Then, we tested the model on an external dataset (n = 432). Finally, we prospectively recruited 224 patients with dynamic ultrasound videos acquired and employed the HTN-AI model to identify nodules from the dynamic ultrasound videos. Radiologists of varying seniority performed the categorization of thyroid nodules as benign and malignant, both with and without the assistance of the HTN-AI model, and their diagnostic performances were compared.
RESULTS: For the external testing set, the HTN-AI model achieved a Dice similarity coefficient (DSC) of 0.91, outperforming several other common convolutional neural network (CNN) models. Specifically, the DSCs of the HTN-AI model were similar for thyroid nodule patients with and without HT (0.91 ± 0.06 and 0.91 ± 0.09, respectively). Moreover, when the HTN-AI model was used to assist diagnosis, the diagnostic performance of radiologists improved. The diagnostic areas under the receiver operating characteristic curve (AUCs) of the junior radiologists increased from 0.59, 0.59, and 0.57 to 0.68, 0.65, and 0.65.
CONCLUSIONS: This research demonstrates that the HTN-AI model has excellent performance in identifying thyroid nodules associated with HT and can assist radiologists with more accurate and efficient diagnoses of thyroid nodules.
KEY POINTS: Question: The study developed an HTN-AI model aimed at assisting in the diagnosis of thyroid nodules in patients with HT. Findings: The HTN-AI model achieved great performance with a Dice similarity coefficient (DSC) of 0.91, and consistent performance across patients with and without HT. Clinical relevance: The HTN-AI model enhances the accuracy and efficiency of thyroid nodule diagnosis, particularly in patients with HT. By assisting radiologists at varying experience levels, this model supports improved decision-making in the management of thyroid nodules.
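The Dice similarity coefficient reported above measures the overlap between a predicted segmentation mask and a reference mask: twice the number of shared foreground pixels divided by the total foreground pixels in both masks. A minimal sketch on binary masks stored as nested lists follows; the function name, the toy masks, and the choice to return 1.0 when both masks are empty are illustrative assumptions.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two same-shaped binary masks.

    Masks are nested lists of 0/1 values. Returns 1.0 when both masks are
    empty, which is one common convention (an assumption here).
    """
    flat_a = [v for row in mask_a for v in row]
    flat_b = [v for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same shape")
    intersection = sum(1 for a, b in zip(flat_a, flat_b) if a and b)
    total = sum(1 for a in flat_a if a) + sum(1 for b in flat_b if b)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 2x2 masks: prediction covers 2 pixels, reference covers 1, overlap is 1.
predicted = [[1, 1], [0, 0]]
reference = [[1, 0], [0, 0]]
print(dice_coefficient(predicted, reference))  # 2*1 / (2+1)
```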
PMID:39939425 | DOI:10.1007/s00330-025-11422-6
Long duration multi-channel surface electromyographic signals during walking at natural pace: Data acquisition and analysis
PLoS One. 2025 Feb 12;20(2):e0318560. doi: 10.1371/journal.pone.0318560. eCollection 2025.
ABSTRACT
Variability of myoelectric activity during walking is the result of the human capability to adapt to both intrinsic and extrinsic perturbations. Surface electromyographic (sEMG) signals lasting at least several minutes (instead of seconds) are needed to comprehensively analyze this variability. The current study introduces a dataset of long-lasting sEMG signals recorded during walking sessions of 31 healthy subjects, aged between 20 and 30 years, conducted at the Movement Analysis Lab of Università Politecnica delle Marche, Ancona, Italy. The sEMG signals were captured from ten distinct lower-limb muscles (five per leg), including gastrocnemius lateralis (GL), tibialis anterior (TA), rectus femoris (RF), hamstrings (Ham), and vastus lateralis (VL). Synchronized electrogoniometric and foot-floor-contact signals are also supplied to enable the spatial/temporal analysis of the sEMG signals. In the experimental procedure, subjects walked barefoot on level ground for approximately 5 minutes at their natural speed and pace, following an eight-shaped path featuring linear diagonal segments, curves, accelerations, and decelerations. An advanced analysis of the sEMG signals was performed to test the reliability and usability of the current dataset. The considerable duration of the signals makes this dataset particularly useful for studies where a significant volume of data is crucial, such as machine/deep learning approaches, investigations examining the variability of muscle recruitment during physiological walking, validations of the reliability of novel sEMG-based algorithms, and assembly of reference datasets for pathological condition characterization.
PMID:39937870 | DOI:10.1371/journal.pone.0318560
A novel deep learning-based framework with particle swarm optimisation for intrusion detection in computer networks
PLoS One. 2025 Feb 12;20(2):e0316253. doi: 10.1371/journal.pone.0316253. eCollection 2025.
ABSTRACT
Intrusion detection plays a significant role in the provision of information security. The most critical element is the ability to precisely identify different types of intrusions into the network. However, the detection of intrusions poses an important challenge, as many new types of intrusion are now generated by cyber-attackers every day. A robust system is still elusive, despite the various strategies that have been proposed in recent years. Hence, a novel deep learning-based architecture for detecting intrusions into a computer network is proposed in this paper. The aim is to construct a hybrid system that enhances the efficiency and accuracy of intrusion detection. The main contribution of our work is a novel deep learning-based hybrid architecture in which particle swarm optimisation (PSO) is used for hyperparameter optimisation and three well-known pre-trained network models are combined in an optimised way. The suggested method involves six key stages: data gathering, pre-processing, deep neural network (DNN) architecture design, optimisation of hyperparameters, training, and evaluation of the trained DNN. To verify the superiority of the suggested method over alternative state-of-the-art schemes, it was evaluated on the KDDCUP'99, NSL-KDD and UNSW-NB15 datasets. Our empirical findings show that the proposed model successfully and correctly classifies different types of attacks with 82.44%, 90.42% and 93.55% accuracy values obtained on the UNSW-NB15, NSL-KDD and KDDCUP'99 datasets, respectively, and outperforms alternative schemes in the literature.
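To illustrate how particle swarm optimisation can search a hyperparameter space, the sketch below implements a basic global-best PSO and applies it to a toy objective. It is not the paper's configuration: the inertia and acceleration coefficients, the two hyperparameters (log10 learning rate and dropout), and the stand-in "validation loss" are all hypothetical. In practice the objective would train a network and return its validation error.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Basic global-best particle swarm optimisation over a box-bounded space."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                # each particle's best position
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]    # swarm-wide best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def toy_validation_loss(params):
    """Hypothetical stand-in for a validation loss, minimal at lr=1e-3, dropout=0.3."""
    log_lr, dropout = params
    return (log_lr + 3.0) ** 2 + (dropout - 0.3) ** 2


best_params, best_loss = pso_minimize(toy_validation_loss, [(-5.0, -1.0), (0.0, 0.8)])
print(best_params, best_loss)
```

Because each objective evaluation would correspond to a full training run, the number of particles and iterations is usually kept much smaller in real hyperparameter searches than in this toy setting.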
PMID:39937819 | DOI:10.1371/journal.pone.0316253
Unlocking the power of AI for phenotyping fruit morphology in Arabidopsis
Gigascience. 2025 Jan 6;14:giae123. doi: 10.1093/gigascience/giae123.
ABSTRACT
Deep learning can revolutionise high-throughput image-based phenotyping by automating the measurement of complex traits, a task that is often labour-intensive, time-consuming, and prone to human error. However, its precision and adaptability in accurately phenotyping organ-level traits, such as fruit morphology, remain to be fully evaluated. Establishing the links between phenotypic and genotypic variation is essential for uncovering the genetic basis of traits and can also provide an orthogonal test of pipeline effectiveness. In this study, we assess the efficacy of deep learning for measuring variation in fruit morphology in Arabidopsis using images from a multiparent advanced generation intercross (MAGIC) mapping family. We trained an instance segmentation model and developed a pipeline to phenotype Arabidopsis fruit morphology, based on the model outputs. Our model achieved strong performance with an average precision of 88.0% for detection and 55.9% for segmentation. Quantitative trait locus analysis of the derived phenotypic metrics of the MAGIC population identified significant loci associated with fruit morphology. This analysis, based on automated phenotyping of 332,194 individual fruits, underscores the capability of deep learning as a robust tool for phenotyping large populations. Our pipeline for quantifying pod morphological traits is scalable and provides high-quality phenotype data, facilitating genetic analysis and gene discovery, as well as advancing crop breeding research.
PMID:39937596 | DOI:10.1093/gigascience/giae123
Deep learning-based spatio-temporal fusion for high-fidelity ultra-high-speed X-ray radiography
J Synchrotron Radiat. 2025 Mar 1. doi: 10.1107/S1600577525000323. Online ahead of print.
ABSTRACT
Full-field ultra-high-speed (UHS) X-ray imaging experiments have been well established to characterize various processes and phenomena. However, the potential of UHS experiments through the joint acquisition of X-ray videos with distinct configurations has not been fully exploited. In this paper, we investigate the use of a deep learning-based spatio-temporal fusion (STF) framework to fuse two complementary sequences of X-ray images and reconstruct the target image sequence with high spatial resolution, high frame rate and high fidelity. We applied a transfer learning strategy to train the model and compared the peak signal-to-noise ratio (PSNR), average absolute difference (AAD) and structural similarity (SSIM) of the proposed framework on two independent X-ray data sets with those obtained from a baseline deep learning model, a Bayesian fusion framework and the bicubic interpolation method. The proposed framework outperformed the other methods with various configurations of the input frame separations and image noise levels. With three subsequent images from the low-resolution (LR) sequence of a four times lower spatial resolution and another two images from the high-resolution (HR) sequence of a 20 times lower frame rate, the proposed approach achieved average PSNRs of 37.57 dB and 35.15 dB, respectively. When coupled with the appropriate combination of high-speed cameras, the proposed approach will enhance the performance and therefore the scientific value of UHS X-ray imaging experiments.
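Two of the image-quality metrics named above have simple closed forms. PSNR is 10·log10(MAX² / MSE), where MSE is the mean squared pixel difference and MAX is the largest possible pixel value; AAD is the mean absolute pixel difference. The sketch below uses nested lists and assumes 8-bit images (MAX = 255); the abstract does not state the intensity range used in the study, and the function names are hypothetical.

```python
import math


def _flatten_pair(reference, test):
    """Flatten two same-shaped 2-D images (nested lists) into paired values."""
    ref = [v for row in reference for v in row]
    tst = [v for row in test for v in row]
    if len(ref) != len(tst):
        raise ValueError("images must have the same shape")
    return list(zip(ref, tst))


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    pairs = _flatten_pair(reference, test)
    mse = sum((r - t) ** 2 for r, t in pairs) / len(pairs)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


def average_absolute_difference(reference, test):
    """Mean absolute pixel difference between two images."""
    pairs = _flatten_pair(reference, test)
    return sum(abs(r - t) for r, t in pairs) / len(pairs)


# Hypothetical 2x2 8-bit frames differing by 5 grey levels in one pixel.
ground_truth = [[100, 120], [140, 160]]
reconstructed = [[100, 120], [140, 165]]
print(psnr(ground_truth, reconstructed))
print(average_absolute_difference(ground_truth, reconstructed))
```

Higher PSNR and lower AAD both indicate a reconstruction closer to the reference; SSIM, the third metric in the abstract, additionally compares local structure and is not covered by this sketch.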
PMID:39937516 | DOI:10.1107/S1600577525000323
Forensic dental age estimation with deep learning: a modified xception model for panoramic X-Ray images
Forensic Sci Med Pathol. 2025 Feb 12. doi: 10.1007/s12024-025-00962-4. Online ahead of print.
ABSTRACT
PURPOSE: This study aimed to develop an improved method for forensic age estimation using deep learning models applied to orthopantomography (OPG) images, focusing on distinguishing individuals under 12 years old from those aged 12 and above.
METHODS: A dataset of 1941 pediatric patients aged between five and 15 years was collected from two radiology departments. The primary research question addressed the identification of the most effective deep learning model for this task. Various deep learning models including Xception, ResNet, ShuffleNet, InceptionV3, DarkNet, NasNet, DenseNet, EfficientNet, MobileNet, ResNet18, GoogleNet, SqueezeNet, and AlexNet were evaluated using traditional metrics like Classification Accuracy (CA), Sensitivity (SE), Specificity (SP), Kappa (K), Area Under the Curve (AUC), alongside a novel Polygon Area Metric (PAM) designed to handle imbalanced datasets common in forensic applications.
RESULTS: The "Forensic Xception" model, derived from Xception, outperformed the others, achieving a PAM score of 0.8828. This model demonstrated superior performance in accurately classifying individuals' age groups, with high CA, SE, SP, K, AUC, and F1 Score. Notably, the introduction of the PAM metric provided a comprehensive evaluation of classifier performance.
CONCLUSION: This study represents a significant advancement in forensic age estimation from OPG images, emphasizing the potential of deep learning models, particularly the "Forensic Xception" model, in accurately classifying individuals based on age, especially in legal contexts. This research suggests a promising avenue for further advancements in forensic dental age estimation, with future studies encouraged to explore additional datasets, refine models, and address ethical and legal considerations.
PMID:39937388 | DOI:10.1007/s12024-025-00962-4