Deep learning
Predictive Modeling of Anticancer Drug Sensitivity Using REFINED CNN
Methods Mol Biol. 2025;2932:259-271. doi: 10.1007/978-1-0716-4566-6_14.
ABSTRACT
Over the past decade, convolutional neural networks (CNNs) have revolutionized predictive modeling of data containing spatial correlations, specifically excelling at image analysis tasks due to their embedded feature extraction and improved generalization. However, outside of image or sequence data, datasets typically lack the structural correlation needed to exploit the benefits of CNN modeling. This is especially true regarding anticancer drug sensitivity prediction tasks, as the data used is often tabular without any embedded information in the ordering or locations of the features when utilizing data other than DNA or RNA sequences. This chapter provides a computational procedure, REpresentation of Features as Images with NEighborhood Dependencies (REFINED), that maps high-dimensional feature vectors into compact 2D images suitable for CNN-based deep learning. The pairing of REFINED mappings with CNNs enables enhanced predictive performance through reduced model parameterization and improved embedded feature extraction as compared to fully connected alternatives utilizing the high-dimensional feature vectors.
PMID:40779115 | DOI:10.1007/978-1-0716-4566-6_14
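The abstract describes mapping high-dimensional feature vectors into 2D images via neighborhood dependencies. As a rough illustration of the idea only (not the authors' exact REFINED algorithm), the sketch below places correlated features near each other on a small grid using plain MDS and a greedy cell assignment; all function names and sizes are invented for the example.

    # Hedged sketch of a REFINED-style feature-to-image mapping.
    # Assumption: plain MDS plus greedy grid snapping stands in for the
    # paper's actual mapping procedure.
    import numpy as np
    from sklearn.manifold import MDS

    def features_to_image_coords(X, img_size=8, seed=0):
        """X: (n_samples, n_features). Returns one (row, col) cell per feature."""
        assert X.shape[1] <= img_size * img_size
        corr = np.corrcoef(X.T)                      # feature-by-feature correlation
        dist = 1.0 - np.abs(corr)                    # correlated features end up close
        xy = MDS(n_components=2, dissimilarity="precomputed",
                 random_state=seed).fit_transform(dist)
        xy = (xy - xy.min(0)) / (np.ptp(xy, axis=0) + 1e-9)   # normalize to [0, 1]
        cells = np.array([(r, c) for r in range(img_size) for c in range(img_size)],
                         dtype=float)
        cells_scaled = cells / (img_size - 1)
        coords, free = [], set(range(len(cells)))
        for f in range(xy.shape[0]):                 # greedily snap to nearest free cell
            order = np.argsort(((cells_scaled - xy[f]) ** 2).sum(1))
            idx = next(i for i in order if i in free)
            free.remove(idx)
            coords.append(tuple(cells[idx].astype(int)))
        return coords

    def vector_to_image(x, coords, img_size=8):
        img = np.zeros((img_size, img_size), dtype=np.float32)
        for value, (r, c) in zip(x, coords):
            img[r, c] = value
        return img

    # Usage: compute coords once on training data, render every sample with
    # vector_to_image, then train a small 2D CNN on the resulting images.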
Single-Molecule SERS Detection of Phosphorylation in Serine and Tyrosine Using Deep Learning-Assisted Plasmonic Nanopore
J Phys Chem Lett. 2025 Aug 8:8418-8426. doi: 10.1021/acs.jpclett.5c01753. Online ahead of print.
ABSTRACT
Single-molecule detection of post-translational modifications (PTMs) such as phosphorylation plays a crucial role in early diagnosis of diseases and therapeutics development. Although single-molecule surface-enhanced Raman spectroscopy (SM-SERS) detection of PTMs has been demonstrated, data analysis and detection accuracies were hindered by interference from citrate signals and the lack of reference databases. Previous reports required complete coverage of the nanoparticle surface by analyte molecules to replace citrates, hampering the detection limit. Here, we developed a high-accuracy SM-SERS approach by combining a plasmonic particle-in-pore sensor to collect SM-SERS spectra of phosphorylation at serine and tyrosine, k-means-based clustering for citrate signal removal, and a one-dimensional convolutional neural network (1D-CNN) for phosphorylation identification. Significantly, we collected SM-SERS data with submonolayer analyte coverage of the particle surface and discriminated phosphorylation in serine and tyrosine with over 95% and 97% accuracy, respectively. Finally, the 1D-CNN features were interpreted by one-dimensional gradient feature weights and SM-SERS peak occurrence frequencies.
PMID:40778942 | DOI:10.1021/acs.jpclett.5c01753
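A minimal sketch of the kind of 1D-CNN classifier the abstract mentions, assuming nothing about the authors' actual architecture: spectrum length, layer sizes, and class count are illustrative placeholders. The k-means citrate-removal step would run beforehand, e.g. with sklearn.cluster.KMeans on the raw spectra.

    # Hedged 1D-CNN sketch for classifying SERS spectra (PyTorch).
    import torch
    import torch.nn as nn

    class Spectrum1DCNN(nn.Module):
        def __init__(self, n_points=1000, n_classes=2):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv1d(1, 16, kernel_size=9, padding=4), nn.ReLU(), nn.MaxPool1d(4),
                nn.Conv1d(16, 32, kernel_size=9, padding=4), nn.ReLU(), nn.MaxPool1d(4),
            )
            self.head = nn.Sequential(
                nn.Flatten(),
                nn.Linear(32 * (n_points // 16), 64), nn.ReLU(),
                nn.Linear(64, n_classes),
            )

        def forward(self, x):                      # x: (batch, 1, n_points)
            return self.head(self.features(x))

    model = Spectrum1DCNN()
    logits = model(torch.randn(8, 1, 1000))        # 8 spectra -> (8, 2) class logits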
General Purpose Deep Learning Attenuation Correction Improves Diagnostic Accuracy of SPECT MPI: A Multicenter Study
JACC Cardiovasc Imaging. 2025 Aug 1:S1936-878X(25)00331-6. doi: 10.1016/j.jcmg.2025.06.010. Online ahead of print.
ABSTRACT
BACKGROUND: Single-photon emission computed tomography (SPECT) myocardial perfusion imaging (MPI) uses computed tomography (CT)-based attenuation correction (AC) to improve diagnostic accuracy. Deep learning (DL) has the potential to generate synthetic AC images, as an alternative to CT-based AC.
OBJECTIVES: This study evaluated whether DL-generated synthetic SPECT images could enhance accuracy of conventional SPECT MPI.
METHODS: Study investigators developed a DL model in a multicenter cohort of 4,894 patients from 4 sites to generate simulated SPECT AC images (DeepAC). The model was externally validated in 746 patients from 72 sites in a clinical trial (A Phase 3 Multicenter Study to Assess PET Imaging of Flurpiridaz F 18 Injection in Patients With CAD; NCT01347710) and in 320 patients from another external site. In the first external cohort, the study assessed the diagnostic accuracy of total perfusion deficit (TPD) for obstructive coronary artery disease (CAD), defined as left main coronary artery stenosis ≥50% or stenosis ≥70% in other vessels. In the second external cohort, the study completed change analysis and compared quantitative scores for AC, DeepAC, and nonattenuation correction (NC) with clinical scores.
RESULTS: In the first external cohort (mean age, 63 ± 9.5 years; 69.0% male), 206 patients (27.6%) had obstructive CAD. The area under the receiver-operating characteristic curve (AUC) of DeepAC TPD (0.77; 95% CI: 0.73-0.81) was higher than the NC TPD (AUC: 0.73; 95% CI: 0.69-0.77; P < 0.001). In the second external cohort, DeepAC quantitative scores had closer agreement with actual AC scores compared with NC.
CONCLUSIONS: In a multicenter external cohort, DeepAC improved prediction performance for obstructive CAD. This approach could enhance diagnostic accuracy in facilities using conventional SPECT systems without requiring additional equipment, imaging time, or radiation exposure.
PMID:40778900 | DOI:10.1016/j.jcmg.2025.06.010
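The abstract does not specify DeepAC's architecture; as a hedged sketch of the general task it performs, the encoder-decoder below maps a non-attenuation-corrected slice to a synthetic AC-like slice. All layer sizes and the NC2AC name are assumptions.

    # Illustrative image-to-image network for NC -> synthetic AC translation.
    import torch
    import torch.nn as nn

    class NC2AC(nn.Module):
        def __init__(self):
            super().__init__()
            self.encode = nn.Sequential(
                nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            )
            self.decode = nn.Sequential(
                nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
            )

        def forward(self, nc):           # nc: (batch, 1, H, W), H and W divisible by 4
            return self.decode(self.encode(nc))

    net = NC2AC()
    synthetic_ac = net(torch.randn(2, 1, 64, 64))  # trained e.g. with L1 loss vs. CT-AC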
Source localization in shallow ocean using a deep learning approach with range-dependent sound speed profile modeling
JASA Express Lett. 2025 Aug 1;5(8):086001. doi: 10.1121/10.0038764.
ABSTRACT
Model-based deep learning approaches provide an alternative scheme to address the problem of the shortage of training data. However, performance degradation caused by sound speed profile (SSP) mismatch remains a critical challenge, particularly in shallow-water environments influenced by internal waves. In this paper, a simple range-dependent SSP model is integrated into the deep learning approach for source localization. The network trained on simulated data generated with the range-dependent SSP model performs well on validation data and generalizes to experimental test data after transfer learning with limited experimental samples.
PMID:40778845 | DOI:10.1121/10.0038764
Deep Learning-Enhanced CTA for Noninvasive Prediction of First Variceal Haemorrhage in Cirrhosis: A Multi-Centre Study
Liver Int. 2025 Sep;45(9):e70274. doi: 10.1111/liv.70274.
ABSTRACT
BACKGROUND AND AIMS: The first variceal haemorrhage (FVH) is a life-threatening complication of liver cirrhosis that requires timely intervention; however, noninvasive tools for accurately predicting FVH remain limited. This study aimed to develop noninvasive, deep learning-enhanced computed tomographic angiography (CTA) models for early and accurate FVH prediction.
METHODS: This multi-centre retrospective study included 184 cirrhotic patients (FVH: n = 107, non-FVH: n = 77) enrolled from December 2014 to May 2022. Patients were randomly divided (7:3) into training and validation cohorts. CTA and clinical data were collected and analysed. A novel Vision Transformer (ViT) network, combined with reinforcement learning (RL), was applied to CTA images to predict FVH and was compared with convolutional neural networks (CNNs). Models were evaluated using the area under the receiver operating characteristic curve (AUC) and decision curve analysis (DCA), and feature importance was determined from model coefficients and gradients.
RESULTS: The ViT + RL* model demonstrated superior diagnostic performance, achieving an AUC of 0.985 (95% CI, 0.955-1.0) in the validation cohort and 0.956 (95% CI, 0.919-0.988) in the training cohort, outperforming traditional CNNs. DCA and the area under the curve confirmed the enhanced clinical utility of the ViT + RL* model compared to CNNs; the ViT + RL* model highlighted critical regions in the liver, spleen, oesophageal lumen, and abdominal vessels. Meanwhile, clinical data identified creatinine and prothrombin time as potential predictive factors, with moderate predictive performance.
CONCLUSIONS: The novel deep learning-enhanced CTA models offer a robust, non-invasive method for predicting FVH, with the ViT + RL* model demonstrating excellent efficacy, thus providing a valuable tool for early risk stratification in cirrhotic patients.
PMID:40778828 | DOI:10.1111/liv.70274
Deep Learning for Hyperpolarized NMR of Intrinsically Disordered Proteins Without Resolution Loss: Access to Short-Lived Intermediates
Chemistry. 2025 Aug 8:e02067. doi: 10.1002/chem.202502067. Online ahead of print.
ABSTRACT
The inherently low sensitivity of solution-state Nuclear Magnetic Resonance (NMR) has long limited its ability to characterize transient biomolecular states at atomic resolution. While dissolution dynamic nuclear polarization (dDNP) offers a compensating signal enhancement, its broader use has been hindered by rapid polarization decay, causing severe spectral distortion. Here, we introduce HyperW-Decon, an approach that enables high-sensitivity, high-resolution NMR of biomolecules in solution. HyperW-Decon combines two key aspects: (i) the use of hyperpolarized water (HyperW) to transfer polarization to proteins through rapid proton exchange; and (ii) a theory-driven, machine learning (ML)-based deconvolution method that corrects polarization-induced artifacts without requiring external reference signals. This approach is based on a first-principles understanding of dDNP line shapes and delivers a scalable solution to spectral distortion. Applied to intrinsically disordered proteins (IDPs) involved in biomineralization, HyperW-Decon reveals previously inaccessible, short-lived ion-peptide encounter complexes with residue resolution.
PMID:40778633 | DOI:10.1002/chem.202502067
ChewNet: A multimodal dataset for in vivo and in vitro beef and plant-based burger patty boluses with images, texture, and force profiles
Data Brief. 2025 Jul 16;62:111890. doi: 10.1016/j.dib.2025.111890. eCollection 2025 Oct.
ABSTRACT
This dataset presents a comprehensive multimodal collection of data acquired from the chewing of beef and plant-based burger patties by both human participants (Invivo) and a biomimicking robotic chewing device (Invitro). The primary objective of the data collection was to characterize how food bolus properties change with the number of robotic chewing cycles as the human swallowing threshold is approached, which will facilitate the development of deep learning models capable of predicting mechanical and textural properties of chewed food boluses from images. In the in vivo experiments, expectorated bolus samples were collected from three healthy adult male participants, who chewed food samples until just before swallowing. The chewed boluses were then imaged using a 12 MP camera and a flatbed scanner, followed by Texture Profile Analysis (TPA) to measure texture parameters. The dataset comprises two main folders: Invivo and Invitro. The Invivo data comprises high-resolution images and corresponding TPA metrics at the near-swallowing stage. In the in vitro experiments, a 3-degree-of-freedom linkage chewing robot (ChewBot) with a soft robotic oral cavity was used to simulate human mastication. The robot performed controlled mastication using different molar trajectories that varied in lateral shearing effect. Food samples were chewed for up to 40 chewing cycles, with artificial saliva introduced at 10% of the food's weight. For each experimental condition, the dataset includes real-time images captured immediately after each robotic chewing cycle, force profiles recorded at 100 ms intervals during chewing, and TPA metrics of the resulting bolus after every 5 chewing cycles. This dataset has significant reuse potential across several fields. In food science, it can support studies on the mechanical breakdown of meat and meat alternatives, aiding the reformulation of plant-based foods to better mimic desirable animal-based textures. In health sciences, it supports rehabilitation by aiding personalized diet design for individuals with jaw disorders or dysphagia and guiding texture-appropriate menu options for patients. In robotics and artificial mastication, it informs the development of chewing systems. It also enables machine learning applications for predicting food texture from images, allowing automated, non-invasive analysis.
PMID:40778379 | PMC:PMC12329220 | DOI:10.1016/j.dib.2025.111890
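One intended reuse of the dataset is predicting texture from bolus images. A minimal sketch of that setup, with an invented model name and an assumed four-metric TPA output:

    # Hedged sketch: regress TPA metrics (e.g., hardness, cohesiveness)
    # from a bolus photo. Output size and input resolution are assumptions.
    import torch
    import torch.nn as nn

    class BolusTextureRegressor(nn.Module):
        def __init__(self, n_tpa_metrics=4):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            self.head = nn.Linear(32, n_tpa_metrics)

        def forward(self, img):                    # img: (batch, 3, H, W) bolus photo
            return self.head(self.backbone(img))

    model = BolusTextureRegressor()
    tpa_pred = model(torch.randn(2, 3, 224, 224))  # train with MSE vs. measured TPA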
Applications of Computer Vision for Infectious Keratitis: A Systematic Review
Ophthalmol Sci. 2025 Jun 19;5(6):100861. doi: 10.1016/j.xops.2025.100861. eCollection 2025 Nov-Dec.
ABSTRACT
CLINICAL RELEVANCE: Corneal ulcers cause preventable blindness in >2 million individuals annually, primarily affecting low- and middle-income countries. Prompt and accurate pathogen identification is essential for targeted antimicrobial treatment, yet current diagnostic methods are costly and slow and require specialized expertise, limiting accessibility.
METHODS: We systematically reviewed literature published from 2017 to 2024, identifying 37 studies that developed or validated artificial intelligence (AI) models for pathogen detection and related classification tasks in infectious keratitis. The studies were analyzed for model types, input modalities, datasets, ground truth determination methods, and validation practices.
RESULTS: Artificial intelligence models demonstrated promising accuracy in pathogen detection using image interpretation techniques. Common limitations included limited generalizability, lack of diverse datasets, absence of multilabeled classification methods, and variability in ground truth standards. Most studies relied on single-center retrospective datasets, limiting applicability in routine clinical practice.
CONCLUSIONS: Artificial intelligence shows significant potential to improve pathogen detection in infectious keratitis, enhancing both diagnostic accuracy and accessibility globally. Future research should address identified limitations by increasing dataset diversity, adopting multilabel classification, implementing prospective and multicenter validations, and standardizing ground truth definitions.
FINANCIAL DISCLOSURES: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
PMID:40778364 | PMC:PMC12329105 | DOI:10.1016/j.xops.2025.100861
Automated Segmentation of Subretinal Fluid from OCT: A Vision Transformer Approach with Cross-Validation
Ophthalmol Sci. 2025 Jun 16;5(6):100852. doi: 10.1016/j.xops.2025.100852. eCollection 2025 Nov-Dec.
ABSTRACT
PURPOSE: We present an algorithm to segment subretinal fluid (SRF) on individual B-scan slices in patients with rhegmatogenous retinal detachment (RRD). Particular attention is paid to robustness, with a fivefold cross-validation approach and a hold-out test set.
DESIGN: Retrospective, cross-sectional study.
PARTICIPANTS: A total of 3819 B-scan slices across 98 time points from 45 patients were used in this study.
METHODS: Subretinal fluid was segmented on all scans. A base SegFormer model, pretrained on 4 massive data sets, was further trained on raw B-scans from the retinal OCT fluid challenge data set of 4532 slices: an open data set of intraretinal fluid, SRF, and pigment epithelium detachment. When adequate performance was reached, transfer learning was used to train the model on our in-house data set, to segment SRF by generating a pixel-wise mask of presence/absence of SRF. A fivefold cross-validation approach was used, with an additional hold-out test set. All folds were first trained and cross-validated and then additionally tested on the hold-out set. Mean (averaged across images) and total (summed across all pixels, irrespective of image) Dice coefficients were calculated for each fold.
MAIN OUTCOME MEASURES: Subretinal fluid volume after surgical intervention for RRD.
RESULTS: The average total Dice coefficient across the validation folds was 0.92, the average mean Dice coefficient was 0.82, and the median Dice was 0.92. For the test set, the average total Dice coefficient was 0.94, the average mean Dice coefficient was 0.82, and the median Dice was 0.92. The model showed strong interfold consistency on the hold-out set, with a standard deviation of only 0.03.
CONCLUSIONS: The SegFormer model for SRF segmentation demonstrates a strong ability to segment SRF. This result holds up to cross-validation and hold-out testing, across all folds. The model is available open-source online.
FINANCIAL DISCLOSURES: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
PMID:40778358 | PMC:PMC12329092 | DOI:10.1016/j.xops.2025.100852
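The abstract reports both "mean" and "total" Dice. A small sketch of the distinction, assuming boolean per-pixel masks: mean Dice averages per-image scores, while total Dice pools all pixels before computing a single score.

    import numpy as np

    def dice(pred, truth, eps=1e-9):
        inter = np.logical_and(pred, truth).sum()
        return (2.0 * inter + eps) / (pred.sum() + truth.sum() + eps)

    def mean_and_total_dice(preds, truths):
        """preds, truths: lists of boolean masks, one pair per B-scan."""
        per_image = [dice(p, t) for p, t in zip(preds, truths)]
        pooled = dice(np.concatenate([p.ravel() for p in preds]),
                      np.concatenate([t.ravel() for t in truths]))
        return float(np.mean(per_image)), float(pooled)

    # Slices with large fluid pockets dominate the total Dice, while slices
    # with little or no fluid pull the mean Dice down -- one way a total Dice
    # of 0.94 can coexist with a mean Dice of 0.82, as reported above.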
A novel deep learning model based on multimodal contrast-enhanced ultrasound dynamic video for predicting occult lymph node metastasis in papillary thyroid carcinoma
Front Endocrinol (Lausanne). 2025 Jul 24;16:1634875. doi: 10.3389/fendo.2025.1634875. eCollection 2025.
ABSTRACT
OBJECTIVE: This study aimed to evaluate the value of a multimodal deep learning video model based on 2D ultrasound and contrast-enhanced ultrasound (CEUS) dynamic video for the preoperative prediction of occult lymph node metastasis (OLNM) in papillary thyroid carcinoma (PTC) patients.
METHODS: A retrospective analysis was conducted on 396 clinically lymph node-negative PTC cases with ultrasound images collected between January and September 2023. Five representative deep learning architectures were pre-trained to construct deep learning static image models (DL_image), CEUS dynamic video models (DL_CEUSvideo), and combined models (DL_combined). The area under the receiver operating characteristic curve (AUC) was used to evaluate model performance, with comparisons made using the DeLong test. A P-value of less than 0.05 was considered statistically significant.
RESULTS: The DL_CEUSvideo, DL_image, and DL_combined models were successfully developed and evaluated. Their AUC values were 0.826 (95% CI: 0.771-0.881), 0.759 (95% CI: 0.690-0.828), and 0.926 (95% CI: 0.891-0.962) in the training set, and 0.701 (95% CI: 0.589-0.813), 0.624 (95% CI: 0.502-0.745), and 0.734 (95% CI: 0.627-0.842) in the test set, respectively. Sensitivity, specificity, and accuracy for the DL_CEUSvideo, DL_image, and DL_combined models were 0.836, 0.671, 0.704; 0.673, 0.716, 0.707; and 0.818, 0.902, 0.886 in the training set, and 0.556, 0.775, 0.724; 0.556, 0.674, 0.647; and 0.704, 0.663, 0.672 in the test set, respectively.
CONCLUSION: These results demonstrated that the multimodal deep learning dynamic video model could preoperatively predict OLNM in PTC patients. The DL_CEUSvideo model outperformed the DL_image model, while the DL_combined model significantly enhanced sensitivity without compromising specificity.
PMID:40778281 | PMC:PMC12329689 | DOI:10.3389/fendo.2025.1634875
Automated detection of diabetic retinopathy lesions in ultra-widefield fundus images using an attention-augmented YOLOv8 framework
Front Cell Dev Biol. 2025 Jul 24;13:1608580. doi: 10.3389/fcell.2025.1608580. eCollection 2025.
ABSTRACT
OBJECTIVE: To enhance the automatic detection precision of diabetic retinopathy (DR) lesions, this study introduces an improved YOLOv8 model specifically designed for the precise identification of DR lesions.
METHOD: This study integrated two attention mechanisms, convolutional exponential moving average (convEMA) and the convolutional simple attention module (convSimAM), into the backbone of the YOLOv8 model. A dataset of 3,388 ultra-widefield (UWF) fundus images obtained from patients with DR, each with a resolution of 2,600 × 2,048 pixels, was utilized for both training and testing. The performances of three models, YOLOv8, YOLOv8+convEMA, and YOLOv8+convSimAM, were systematically compared.
RESULTS: A comparative analysis of the three models revealed that the original YOLOv8 model suffers from missed detections, achieving a precision of 0.815 for hemorrhage spot detection. YOLOv8+convEMA improved hemorrhage detection precision to 0.906, while YOLOv8+convSimAM achieved the highest value of 0.910, demonstrating the enhanced sensitivity of spatial attention. The proposed model also maintained comparable precision in detecting hard exudates while improving recall to 0.804, and it demonstrated the best performance in detecting cotton wool spots and the epiretinal membrane. Overall, the proposed method provides a fine-tuned model specialized in subtle lesion detection, offering an improved solution for DR lesion assessment.
CONCLUSION: In this study, we proposed two attention-augmented YOLOv8 models, YOLOv8+convEMA and YOLOv8+convSimAM, for the automated detection of DR lesions in UWF fundus images. Both models outperformed the baseline YOLOv8 in detection precision, average precision, and recall. Among them, YOLOv8+convSimAM achieved the most balanced and accurate results across multiple lesion types, demonstrating an enhanced capability to detect small, low-contrast, and structurally complex features. These findings support the effectiveness of lightweight attention mechanisms in optimizing deep learning models for high-precision DR lesion detection.
PMID:40778265 | PMC:PMC12328430 | DOI:10.3389/fcell.2025.1608580
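convSimAM presumably builds on the parameter-free SimAM block; the standard SimAM formulation is sketched below. How it is spliced into YOLOv8's backbone is not specified in the abstract.

    import torch
    import torch.nn as nn

    class SimAM(nn.Module):
        """Parameter-free spatial attention (Yang et al., 2021), the standard
        building block behind the convSimAM variant described above."""
        def __init__(self, e_lambda=1e-4):
            super().__init__()
            self.e_lambda = e_lambda

        def forward(self, x):                        # x: (batch, C, H, W)
            b, c, h, w = x.shape
            n = h * w - 1
            d = (x - x.mean(dim=(2, 3), keepdim=True)) ** 2
            v = d.sum(dim=(2, 3), keepdim=True) / n  # per-channel variance
            energy = d / (4 * (v + self.e_lambda)) + 0.5
            return x * torch.sigmoid(energy)         # lower energy -> higher weight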
Advancing Spine Fracture Detection: The Role of Artificial Intelligence in Clinical Practice
Korean J Neurotrauma. 2025 Jul 18;21(3):172-182. doi: 10.13004/kjnt.2025.21.e22. eCollection 2025 Jul.
ABSTRACT
Vertebral fractures are prevalent skeletal injuries commonly associated with osteoporosis, trauma, and degenerative diseases. Early and accurate diagnosis is crucial to prevent complications such as chronic pain and progressive spinal deformities. In recent years, artificial intelligence (AI) has emerged as a powerful tool in medical imaging to support automatic detection and classification of vertebral fractures. This review provides an overview of AI-based approaches for spinal fracture diagnosis and summarizes recent advances in deep learning (DL) and machine learning (ML) models. The performance of AI models, mainly evaluated by sensitivity, specificity, and accuracy metrics, varies with imaging modality and dataset size, with computed tomography-based models demonstrating superior diagnostic accuracy. In addition, AI-assisted workflows have been shown to improve diagnostic efficiency, reducing the time required for fracture detection. Despite these advances, challenges remain, such as dataset variability, the need for large-scale annotated datasets, and standardization of evaluation metrics. Future research should focus on improving model generalization, integrating multimodal imaging data, and validating AI applications in real-world clinical settings to further improve vertebral fracture diagnosis and patient management.
PMID:40778250 | PMC:PMC12325887 | DOI:10.13004/kjnt.2025.21.e22
Cascaded Multimodal Deep Learning in the Differential Diagnosis, Progression Prediction, and Staging of Alzheimer's and Frontotemporal Dementia
medRxiv [Preprint]. 2025 Jul 21:2024.09.23.24314186. doi: 10.1101/2024.09.23.24314186.
ABSTRACT
Dementia is a complex condition whose multifaceted nature poses significant challenges in the diagnosis, prognosis, and treatment of patients. Despite the availability of large open-source data fueling a wealth of promising research, effective translation of preclinical findings to clinical practice remains difficult. This barrier is largely due to the complexity of unstructured and disparate preclinical and clinical data, which traditional analytical methods struggle to handle. Novel analytical techniques involving deep learning (DL), however, are gaining significant traction in this regard. Here, we investigated the potential of a cascaded multimodal DL-based system (TelDem), assessing its ability to integrate and analyze a large, heterogeneous dataset (n=7,159 patients) across three clinically relevant use cases. Using a Cascaded Multi-Modal Mixing Transformer (CMT), we assessed TelDem's validity and explainability (via a Cross-Modal Fusion Norm, CMFN) in (i) differential diagnosis between healthy individuals, Alzheimer's disease (AD), and three subtypes of frontotemporal lobar degeneration; (ii) disease staging from healthy cognition to mild cognitive impairment (MCI) and AD; and (iii) predicting progression from MCI to AD. Our findings show that the CMT enhances diagnostic and prognostic accuracy when incorporating multimodal data compared to unimodal modeling, and that cerebrospinal fluid (CSF) biomarkers play a key role in accurate model decision making. These results reinforce the power of DL technology in tapping deeper into existing data, accelerating preclinical dementia research by utilizing clinically relevant information to disentangle complex dementia pathophysiology.
PMID:40778154 | PMC:PMC12330412 | DOI:10.1101/2024.09.23.24314186
Ultra low-power, wearable, accelerated shallow-learning fall detection for elderly at-risk persons
Smart Health (Amst). 2024 Sep;33:100498. doi: 10.1016/j.smhl.2024.100498. Epub 2024 Jun 5.
ABSTRACT
This work focuses on the development and manufacturing of a wireless, wearable, low-power fall detection sensor (FDS) designed to predict and detect falls in elderly at-risk individuals. Unintentional falls are a significant risk in this demographic, often resulting from diminished physical capabilities such as reduced hand grip strength and complications from conditions like arthritis, vertigo, and neuromuscular issues. To address this, we utilize advanced low-power field-programmable gate arrays (FPGAs) to implement a fixed-function neural network capable of categorizing activities of daily life (ADLs), including the detection of falls. The system employs a Convolutional Neural Network (CNN) model, trained and validated using the Caffe deep learning framework with data collected from human subjects experiments, and integrates an ST Microelectronics LSM6DSOX inertial measurement unit (IMU) sensor with an embedded ultra-low-power Lattice iCE40UP FPGA, which samples and stores joint acceleration and orientation rate. Additionally, we have acquired and published a dataset of 3D accelerometer and gyroscope measurements from predefined ADLs and falls, collected from volunteer human subjects. This approach aims to enhance the safety and well-being of older adults by providing timely and accurate fall detection and prediction. In this paper, we present a compact CNN core for accelerating convolutional operations on a machine learning model, suitable for deployment on an ultra-low-power FPGA.
PMID:40777999 | PMC:PMC12327353 | DOI:10.1016/j.smhl.2024.100498
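A hedged sketch of the kind of compact CNN that suits a fixed-function FPGA core: a few small convolutions over 6-channel IMU windows (3-axis accelerometer plus 3-axis gyroscope). Window length, channel counts, and class count are assumptions, not the paper's model.

    import torch
    import torch.nn as nn

    class FallNet(nn.Module):
        def __init__(self, n_classes=5, window=128):   # e.g. 4 ADL classes + fall
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv1d(6, 8, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(4),
                nn.Conv1d(8, 16, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(4),
                nn.Flatten(),
                nn.Linear(16 * (window // 16), n_classes),
            )

        def forward(self, x):                          # x: (batch, 6, window)
            return self.net(x)

    model = FallNet()
    logits = model(torch.randn(1, 6, 128))             # one IMU window -> class scores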
Automatic segmentation of chest X-ray images via deep-improved various U-Net techniques
Digit Health. 2025 Aug 6;11:20552076251366855. doi: 10.1177/20552076251366855. eCollection 2025 Jan-Dec.
ABSTRACT
OBJECTIVES: Accurate segmentation of medical images is vital for effective disease diagnosis and treatment planning. This is especially important in resource-constrained environments. This study aimed to evaluate the performance of various U-Net-based deep learning architectures for chest X-ray (CXR) segmentation and identify the most effective model in terms of both accuracy and computational efficiency.
METHODS: We assessed the segmentation performance of eight U-Net variants: U-Net7, U-Net9, U-Net11, U-Net13, U-Net16, U-Net32, U-Net64, and U-Net128. The evaluation was conducted using a publicly available CXR dataset categorized into normal, COVID-19, and viral pneumonia classes. Each image was paired with a corresponding segmentation mask. Image preprocessing involved resizing, noise filtering, and normalization to standardize input quality. All models were trained under identical experimental conditions to ensure a fair comparison. Performance was evaluated using two key metrics: Intersection over Union (IoU) and Dice Coefficient (DC). Additionally, computational efficiency was measured by comparing the total number of trainable parameters and the training time for each model.
RESULTS: U-Net9 achieved the highest performance among all tested models. It recorded a DC of 0.98 and an IoU of 0.96, outperforming both shallower and deeper U-Net architectures. Models with increased depth or filter width, such as U-Net128, showed diminishing returns in accuracy. These models also incurred significantly higher computational costs. In contrast, U-Net16 and U-Net32 demonstrated reduced segmentation accuracy compared to U-Net9. Overall, U-Net9 provided the optimal balance between precision and computational efficiency for CXR segmentation tasks.
CONCLUSION: The U-Net9 architecture offers a superior solution for CXR image segmentation. It combines high segmentation accuracy with computational practicality, making it suitable for real-world applications. Its implementation can support radiologists by enabling faster and more reliable diagnoses. This can lead to improved clinical decision-making and reduced diagnostic delays. Future work will focus on integrating U-Net9 with multimodal imaging data, such as combining CXR with computerized tomography or MRI scans. Additionally, exploration of advanced architectures, including attention mechanisms and hybrid models, is planned to further enhance segmentation performance.
PMID:40777837 | PMC:PMC12329272 | DOI:10.1177/20552076251366855
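The exact meaning of each "U-NetN" label is not spelled out in the abstract; assuming the variants differ in depth and base filter width, one template can generate them all, as in this sketch.

    # Hedged sketch: a U-Net generated from (depth, base filter width).
    import torch
    import torch.nn as nn

    def conv_block(cin, cout):
        return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(),
                             nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU())

    class MiniUNet(nn.Module):
        def __init__(self, depth=2, base=16):
            super().__init__()
            chans = [base * 2 ** i for i in range(depth + 1)]
            self.downs = nn.ModuleList(conv_block(1 if i == 0 else chans[i - 1], chans[i])
                                       for i in range(depth))
            self.bottom = conv_block(chans[depth - 1], chans[depth])
            self.ups = nn.ModuleList(nn.ConvTranspose2d(chans[i + 1], chans[i], 2, stride=2)
                                     for i in reversed(range(depth)))
            self.decs = nn.ModuleList(conv_block(2 * chans[i], chans[i])
                                      for i in reversed(range(depth)))
            self.out = nn.Conv2d(base, 1, 1)

        def forward(self, x):
            skips = []
            for down in self.downs:                    # encoder with skip connections
                x = down(x); skips.append(x); x = nn.functional.max_pool2d(x, 2)
            x = self.bottom(x)
            for up, dec, skip in zip(self.ups, self.decs, reversed(skips)):
                x = dec(torch.cat([up(x), skip], dim=1))
            return torch.sigmoid(self.out(x))          # per-pixel segmentation mask

    mask = MiniUNet(depth=2, base=16)(torch.randn(1, 1, 64, 64))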
Lensless camera: Unraveling the breakthroughs and prospects
Fundam Res. 2024 Mar 30;5(4):1725-1736. doi: 10.1016/j.fmre.2024.03.019. eCollection 2025 Jul.
ABSTRACT
Lensless imaging is an innovative and swiftly advancing technology at the intersection of optics, imaging technology, and computational science. It captures a scene by directly recording the interference or diffraction patterns of light, then uses algorithms to reconstruct the original image from these patterns. Lensless imaging transforms traditional imaging paradigms, offering newfound design flexibility and the capacity to integrate seamlessly within diverse imaging ecosystems. This paper provides an overview of significant developments in optical modulation elements, image sensors, and reconstruction algorithms. Novel application scenarios that benefit from lensless computational imaging are presented, and the opportunities and challenges associated with lensless cameras are discussed with a view to further improving their performance.
PMID:40777808 | PMC:PMC12327861 | DOI:10.1016/j.fmre.2024.03.019
Advanced skin cancer prediction with medical image data using MobileNetV2 deep learning and optimized techniques
Sci Rep. 2025 Aug 7;15(1):28962. doi: 10.1038/s41598-025-14963-4.
ABSTRACT
Skin cancer, especially melanoma, has become one of the most widespread and deadly diseases today. If melanoma is not treated in its early stages, it can spread aggressively, greatly reducing the chances of successful treatment. Diagnosis is challenging because skin lesions are highly subjective to analyze and the required expertise is exceedingly specialized. As the global prevalence of skin cancer rises, so does the need for automated diagnostic systems that can help medical personnel make appropriate decisions within the requisite timelines. This study proposes a deep learning model built on the MobileNetV2 architecture, with hyperparameters tuned by a memetic algorithm. The memetic algorithm employs both global and local search techniques to fine-tune hyperparameters, including the learning rate, batch size, and number of epochs, boosting the efficacy of the model. This allows the proposed model to achieve high performance while remaining economical on resources, making it suitable for real-world clinical settings. The model achieved exceptional results, with 98.48% accuracy, 97.67% precision, and 100% recall, highlighting its strong ability to detect malignant lesions. The ROC AUC score of 99.79% further demonstrates its outstanding capability to differentiate between benign and malignant lesions. Notably, visualizations such as Grad-CAM heatmaps and superimposed images were crucial in providing interpretability to the model's decision-making process. The Grad-CAM heatmaps highlighted the regions of interest in the lesions, showing how the model focused on key structural features, and the superimposed images combined these heatmaps with the original lesion images, making clear which parts of the lesions influenced the model's classification. These results underscore the potential of deep learning models optimized with a memetic algorithm to significantly improve skin cancer detection. By offering both high accuracy and interpretability, this model presents a valuable tool for dermatologists, facilitating faster and more reliable early diagnosis and ultimately improving patient outcomes.
PMID:40775513 | DOI:10.1038/s41598-025-14963-4
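A minimal sketch of memetic hyperparameter tuning as described (global genetic search plus local hill-climbing); the search space, operators, and the evaluate placeholder are all illustrative assumptions, not the study's exact procedure.

    import random

    SPACE = {"lr": (1e-5, 1e-2), "batch": [16, 32, 64, 128], "epochs": (5, 50)}

    def random_candidate():
        return {"lr": random.uniform(*SPACE["lr"]),
                "batch": random.choice(SPACE["batch"]),
                "epochs": random.randint(*SPACE["epochs"])}

    def mutate(c):
        c = dict(c)
        lo, hi = SPACE["lr"]
        c["lr"] = min(max(c["lr"] * random.uniform(0.5, 2.0), lo), hi)
        if random.random() < 0.5:
            c["batch"] = random.choice(SPACE["batch"])
        e_lo, e_hi = SPACE["epochs"]
        c["epochs"] = min(max(c["epochs"] + random.randint(-5, 5), e_lo), e_hi)
        return c

    def crossover(a, b):
        return {k: random.choice([a[k], b[k]]) for k in a}

    def memetic_search(evaluate, pop_size=8, generations=5):
        # evaluate = "train MobileNetV2 with these settings, return val accuracy";
        # repeated evaluations would be cached in practice.
        pop = [random_candidate() for _ in range(pop_size)]
        for _ in range(generations):
            elite = sorted(pop, key=evaluate, reverse=True)[: pop_size // 2]
            best = elite[0]
            for _ in range(3):                     # local search: the "memetic" step
                neighbour = mutate(best)
                if evaluate(neighbour) > evaluate(best):
                    best = neighbour
            elite[0] = best
            pop = elite + [mutate(crossover(*random.sample(elite, 2)))
                           for _ in range(pop_size - len(elite))]
        return max(pop, key=evaluate)

    # Toy objective standing in for validation accuracy:
    best = memetic_search(lambda c: -abs(c["lr"] - 1e-3) - abs(c["epochs"] - 30) / 100)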
Multi-module UNet++ for colon cancer histopathological image segmentation
Sci Rep. 2025 Aug 7;15(1):28895. doi: 10.1038/s41598-025-13636-6.
ABSTRACT
In the pathological diagnosis of colorectal cancer, precise segmentation of glandular and cellular contours is the fundamental basis for accurate clinical diagnosis. However, this task presents significant challenges due to complex phenomena such as nuclear staining heterogeneity, variation in nuclear size, boundary overlap, and nuclear clustering. With the continuous advancement of deep learning techniques, particularly encoder-decoder architectures, and the emergence of various high-performance functional modules, multi-module collaborative fusion has become an effective approach to enhancing segmentation performance. To this end, this study proposes the RPAU-Net++ model, which integrates a ResNet-50 encoder (R), a Joint Pyramid Fusion Module (JPFM, P), and a Convolutional Block Attention Module (CBAM, A) into the UNet++ framework, forming a multi-module-enhanced segmentation architecture. Specifically, ResNet-50 mitigates gradient vanishing and degradation issues in deep network training through residual skip connections, improving convergence stability and the depth of feature representation. The JPFM achieves progressive fusion of cross-layer features via a multi-scale feature pyramid, enhancing the encoding of complex tissue structures and fine boundary information. The CBAM adaptively allocates weights across spatial and channel dimensions to focus on target-region features while suppressing irrelevant background noise, improving feature discriminability. Comparative experiments on the GlaS and CoNIC colorectal cancer pathology datasets, as well as the more challenging PanNuke dataset, demonstrate that RPAU-Net++ significantly outperforms mainstream models on key segmentation metrics such as IoU and Dice, providing a more accurate solution for pathological image segmentation in colorectal cancer.
PMID:40775016 | DOI:10.1038/s41598-025-13636-6
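The CBAM component is a published, well-known block; a standard implementation is sketched below. Its placement inside the UNet++ skip pathways is this paper's design and is not reproduced here.

    import torch
    import torch.nn as nn

    class CBAM(nn.Module):
        """Standard Convolutional Block Attention Module (Woo et al., 2018):
        channel attention followed by spatial attention."""
        def __init__(self, channels, reduction=16, kernel_size=7):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(channels, channels // reduction), nn.ReLU(),
                nn.Linear(channels // reduction, channels))
            self.spatial = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

        def forward(self, x):                              # x: (B, C, H, W)
            b, c, _, _ = x.shape
            avg = self.mlp(x.mean(dim=(2, 3)))             # channel attention
            mx = self.mlp(x.amax(dim=(2, 3)))
            x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
            s = torch.cat([x.mean(dim=1, keepdim=True),    # spatial attention
                           x.amax(dim=1, keepdim=True)], dim=1)
            return x * torch.sigmoid(self.spatial(s))

    out = CBAM(64)(torch.randn(2, 64, 32, 32))             # shape-preserving reweighting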
NSPLformer: exploration of non-stationary progressively learning model for time series prediction
Sci Rep. 2025 Aug 7;15(1):28904. doi: 10.1038/s41598-025-13680-2.
ABSTRACT
Although Transformers perform well in time series prediction, they struggle with real-world data whose joint distribution changes over time. Previous studies have reduced the non-stationarity of sequences through smoothing, but this strips sequences of their inherent non-stationarity, which may remove predictive cues for sudden real-world events. To address the contradiction between sequence predictability and model capability, this paper proposes an efficient Transformer-based model design for multivariate non-stationary time series. The design rests on two core components: (1) a low-cost non-stationary attention mechanism, which restores intrinsic non-stationary information to time-dependent relationships at reduced computational cost by approximating the distinguishable attention learned on the original sequence; and (2) dual-data-stream progressive learning, which adds an auxiliary output stream to improve information aggregation, enabling the model to learn residuals of the supervised signal layer by layer. The proposed model outperforms mainstream Transformers with an average improvement of 5.3% on multiple datasets, providing theoretical support for the analysis of non-stationary engineering data.
PMID:40775010 | DOI:10.1038/s41598-025-13680-2
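The paper's low-cost non-stationary attention is not reproduced here; the sketch below shows only the baseline ingredient such designs refine: per-instance stationarization, where each series' statistics are removed before the forecaster and restored afterward.

    import torch
    import torch.nn as nn

    class StationarizedWrapper(nn.Module):
        def __init__(self, core: nn.Module):
            super().__init__()
            self.core = core                       # any seq-to-seq forecaster

        def forward(self, x):                      # x: (batch, length, variables)
            mu = x.mean(dim=1, keepdim=True)
            sigma = x.std(dim=1, keepdim=True) + 1e-5
            y = self.core((x - mu) / sigma)        # forecast in stationarized space
            return y * sigma + mu                  # restore scale and level

    # Toy core mapping 48 input steps of 3 variables to a 12-step forecast.
    horizon = 12
    core = nn.Sequential(nn.Flatten(), nn.Linear(48 * 3, horizon * 3),
                         nn.Unflatten(1, (horizon, 3)))
    forecast = StationarizedWrapper(core)(torch.randn(4, 48, 3))   # (4, 12, 3)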
Multiaxial vibration data for blade fault diagnosis in multirotor unmanned aerial vehicles
Sci Data. 2025 Aug 7;12(1):1383. doi: 10.1038/s41597-025-05692-4.
ABSTRACT
This dataset presents multiaxial vibration signals collected from a multirotor unmanned aerial vehicle (UAV) operating in hover mode for the purpose of blade fault diagnosis. Vibration measurements were recorded at the geometric center of the UAV, where the centerlines of the four rotor arms intersect, using a triaxial accelerometer. The dataset captures variations across the X, Y, and Z axes under different blade fault conditions, including healthy, minor imbalance, severe imbalance, and screw loosening scenarios. Each flight scenario was repeated under controlled conditions to ensure consistency and high-quality labeling. The resulting soft-labeled dataset includes time-domain signals from numerous test flights and has been used in multiple prior studies involving classical and deep learning-based fault classification techniques. This curated data collection provides a valuable resource for researchers in UAV health monitoring, vibration analysis, and machine learning-based fault diagnosis. The dataset is particularly useful for the development and benchmarking of signal processing pipelines and classification models aimed at identifying blade-level faults in multirotor UAV systems.
PMID:40774972 | DOI:10.1038/s41597-025-05692-4