Deep learning
Deep learning for detecting and early predicting chronic obstructive pulmonary disease from spirogram time series
NPJ Syst Biol Appl. 2025 Feb 15;11(1):18. doi: 10.1038/s41540-025-00489-y.
ABSTRACT
Chronic Obstructive Pulmonary Disease (COPD) is a chronic lung condition characterized by airflow obstruction. Current diagnostic methods primarily rely on identifying prominent features in spirometry (Volume-Flow time series) to detect COPD, but they are not adept at predicting future COPD risk from subtle data patterns. In this study, we introduce a novel deep learning-based approach, DeepSpiro, aimed at the early prediction of future COPD risk. DeepSpiro consists of four key components: SpiroSmoother for stabilizing the Volume-Flow curve, SpiroEncoder for capturing volume variability patterns through key patches of varying lengths, SpiroExplainer for integrating heterogeneous data and explaining predictions through volume attention, and SpiroPredictor for predicting the disease risk of undiagnosed high-risk patients based on key patch concavity, with prediction horizons of 1-5 years or even longer. Evaluated on the UK Biobank dataset, DeepSpiro achieved an AUC of 0.8328 for COPD detection and demonstrated strong predictive performance for future COPD risk (p-value < 0.001). In summary, DeepSpiro can effectively predict the long-term progression of COPD.
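For illustration only, a minimal sketch of attention pooling over variable-length spirogram patch embeddings is shown below; the class and variable names are hypothetical and this is not the authors' DeepSpiro code.

```python
# Hypothetical illustration of attention pooling over spirogram patch embeddings;
# names and dimensions are assumptions, not the authors' DeepSpiro code.
import torch
import torch.nn as nn

class VolumeAttentionPool(nn.Module):
    """Scores each encoded patch and returns a weighted summary plus the weights."""
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, patches: torch.Tensor, mask: torch.Tensor):
        # patches: (batch, n_patches, dim); mask: (batch, n_patches), 1 = valid patch
        logits = self.score(patches).squeeze(-1)
        logits = logits.masked_fill(mask == 0, float("-inf"))
        weights = torch.softmax(logits, dim=-1)                # attention over key patches
        pooled = torch.einsum("bn,bnd->bd", weights, patches)
        return pooled, weights                                 # weights can be inspected for explanation

# toy usage: two curves encoded into at most 6 patches of 32-d features
pool = VolumeAttentionPool(dim=32)
x = torch.randn(2, 6, 32)
mask = torch.tensor([[1, 1, 1, 1, 0, 0], [1, 1, 1, 1, 1, 1]])
summary, attn = pool(x, mask)
print(summary.shape, attn.shape)  # torch.Size([2, 32]) torch.Size([2, 6])
```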
PMID:39955293 | DOI:10.1038/s41540-025-00489-y
Improving Radiotherapy Plan Quality for Nasopharyngeal Carcinoma With Enhanced UNet Dose Prediction
Cancer Med. 2025 Feb;14(4):e70688. doi: 10.1002/cam4.70688.
ABSTRACT
BACKGROUND: Individualized dose prediction is critical for optimizing radiation treatment planning. This study introduces DESIRE, an enhanced UNet-based dose prediction model with progrEssive feature fuSion and dIfficult Region lEarning, tailored for nasopharyngeal carcinoma (NPC) patients receiving volumetric modulated arc therapy. We aimed to assess the impact of integrating DESIRE into the treatment planning process to improve plan quality.
METHODS: This retrospective study included 131 NPC patients diagnosed at Jiangxi Cancer Hospital between 2017 and 2020. Twenty patients were randomly allocated to a testing cohort, while the remaining 111 comprised a training cohort. Target delineation included three planning target volumes (PTVs): PTV70, PTV60, and PTV55, along with several organs at risk (OARs). The DESIRE model predicted dose distributions, and discrepancies between DESIRE's predictions and the ground truth (GT) were quantified using dosimetric metrics and gamma pass rates. Two junior physicians used DESIRE's predictions for treatment planning, and their plans were compared to the GT.
RESULTS: Most of DESIRE's predicted dosimetric metrics closely aligned with GT (mean difference < 1 Gy), with no significant differences (p > 0.05) in Dmean and D1 values across OARs. While significant differences were observed in PTV metrics, the mean differences in D98, D95, D50, and Dmean between DESIRE and GT did not exceed 1 Gy. Assisted by DESIRE, the junior physicians' plans were comparable to the GT in nearly all OARs, with no significant differences in dosimetric metrics. The conformity index (CI) and homogeneity index (HI) for PTV70 surpassed the GT (0.847 ± 0.036 vs. 0.827 ± 0.037 for CI, and 0.057 ± 0.009 vs. 0.052 ± 0.008 for HI). The average three-dimensional gamma passing rates were 0.85 for PTV70 and 0.87 for the 35-Gy isodose line.
CONCLUSIONS: The DESIRE model shows promise for patient-specific dose prediction, enhancing junior physicians' treatment planning capabilities and improving plan quality.
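For context, the PTV dose metrics reported above can be computed from a dose grid and a structure mask. The sketch below uses the common conventions Dxx (dose received by xx% of the structure volume) and HI = (D2 - D98) / D50 on synthetic data; the study's exact formulas and any DESIRE internals are not implied.

```python
# Illustrative dose-volume metrics from a dose grid and a structure mask, using
# Dxx = the (100 - xx)th percentile of in-structure dose and HI = (D2 - D98) / D50.
# Not the DESIRE implementation.
import numpy as np

def dose_metrics(dose: np.ndarray, mask: np.ndarray) -> dict:
    d = dose[mask > 0]
    dxx = lambda xx: np.percentile(d, 100 - xx)
    return {
        "Dmean": d.mean(),
        "D98": dxx(98),
        "D95": dxx(95),
        "D50": dxx(50),
        "HI": (dxx(2) - dxx(98)) / dxx(50),
    }

# toy example: synthetic 3D dose grid with a cubic "PTV" mask
dose = np.random.default_rng(0).normal(70.0, 1.0, size=(40, 40, 40))
mask = np.zeros_like(dose)
mask[10:30, 10:30, 10:30] = 1
print({k: round(float(v), 3) for k, v in dose_metrics(dose, mask).items()})
```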
PMID:39953816 | DOI:10.1002/cam4.70688
MemoCMT: multimodal emotion recognition using cross-modal transformer-based feature fusion
Sci Rep. 2025 Feb 14;15(1):5473. doi: 10.1038/s41598-025-89202-x.
ABSTRACT
Speech emotion recognition has seen a surge in transformer models, which excel at understanding the overall message by analyzing long-term patterns in speech. However, these models come at a computational cost. In contrast, convolutional neural networks are faster but struggle with capturing these long-range relationships. Our proposed system, MemoCMT, tackles this challenge using a novel "cross-modal transformer" (CMT). This CMT can effectively analyze local and global speech features and their corresponding text. To boost efficiency, MemoCMT leverages recent advancements in pre-trained models: HuBERT extracts meaningful features from the audio, while BERT analyzes the text. The core innovation lies in how the CMT component utilizes and integrates these audio and text features. After this integration, different fusion techniques are applied before final emotion classification. Experiments show that MemoCMT achieves impressive performance, with the CMT using min aggregation achieving the highest unweighted accuracy (UW-Acc) of 81.33% and 91.93% and weighted accuracy (W-Acc) of 81.85% and 91.84% on the benchmark IEMOCAP and ESD corpora, respectively. These results demonstrate the system's generalization capacity and robustness for real-world industrial applications. Moreover, the implementation details of MemoCMT are publicly available at https://github.com/tpnam0901/MemoCMT/ for reproducibility purposes.
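As a rough illustration of cross-modal attention fusion with element-wise min aggregation, a sketch follows; dimensions, pooling, and the classifier head are assumptions, and the released repository linked above is the authoritative implementation.

```python
# Rough sketch of cross-modal attention fusion with min aggregation;
# not the MemoCMT code, just the general idea.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 4, n_classes: int = 4):
        super().__init__()
        self.a2t = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.t2a = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cls = nn.Linear(dim, n_classes)

    def forward(self, audio: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        # audio: (B, Ta, dim), e.g. HuBERT features; text: (B, Tt, dim), e.g. BERT features
        a_ctx, _ = self.a2t(audio, text, text)     # audio queries attend to text
        t_ctx, _ = self.t2a(text, audio, audio)    # text queries attend to audio
        fused = torch.minimum(a_ctx.mean(dim=1), t_ctx.mean(dim=1))  # "min aggregation"
        return self.cls(fused)

model = CrossModalFusion()
logits = model(torch.randn(2, 50, 256), torch.randn(2, 12, 256))
print(logits.shape)  # torch.Size([2, 4])
```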
PMID:39953105 | DOI:10.1038/s41598-025-89202-x
Early detection of Parkinson's disease using a multi area graph convolutional network
Sci Rep. 2025 Feb 14;15(1):5561. doi: 10.1038/s41598-024-82027-0.
ABSTRACT
Parkinson's disease (PD) is a neurological disorder, and early diagnosis is crucial for the treatment and quality of life of patients. Gait movement disorder is a significant manifestation of PD, and automated gait assessment is key to achieving automated detection of PD patients. To improve the accuracy of early PD detection and enhance the robustness of motion recognition models, this study introduces an innovative deep learning approach, the Multi-area Attention Spatiotemporal Directed Graph Convolutional Network (Ma-ST-DGN). The model effectively captures temporal and spatial information from subjects' movement data to better understand subtle movement abnormalities in patients. By reconstructing human skeleton features with directed graphs and introducing a multi-area self-attention mechanism, the model can adaptively focus on key information in different areas and apply more effective fusion strategies to features from different areas, thereby increasing sensitivity to potential signs of Parkinson's disease. By integrating global and local area information more effectively, the model captures subtle manifestations of PD. We use the first Parkinson's disease gait dataset, PD-Walk, consisting of walking videos of 95 PD patients and 96 healthy individuals. Extensive experiments on this clinical video dataset demonstrate that the model achieves the best performance to date, with an accuracy of 88.7%, far surpassing existing sensor- and vision-based Parkinson's gait assessment methods. Therefore, the method proposed in this study may be effective for early diagnosis of PD in clinical practice.
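A minimal sketch of one directed graph-convolution step over skeleton joints is given below for illustration only; the actual Ma-ST-DGN adds multi-area attention and spatiotemporal modeling not shown here.

```python
# Minimal sketch of a directed graph-convolution step over skeleton joints;
# a simplification for illustration, not the Ma-ST-DGN implementation.
import torch
import torch.nn as nn

class DirectedGraphConv(nn.Module):
    def __init__(self, in_dim: int, out_dim: int, adj: torch.Tensor):
        super().__init__()
        # row-normalize the directed adjacency (with self-loops) once
        a = adj + torch.eye(adj.size(0))
        self.register_buffer("adj", a / a.sum(dim=1, keepdim=True))
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, frames, joints, in_dim); aggregate along incoming edges, then project
        agg = torch.einsum("ij,btjc->btic", self.adj, x)
        return torch.relu(self.proj(agg))

# toy skeleton: 5 joints in a chain, edges directed from parent to child
adj = torch.zeros(5, 5)
for parent, child in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    adj[child, parent] = 1.0
layer = DirectedGraphConv(3, 16, adj)
out = layer(torch.randn(2, 30, 5, 3))  # 2 clips, 30 frames, 5 joints, (x, y, confidence)
print(out.shape)  # torch.Size([2, 30, 5, 16])
```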
PMID:39952991 | DOI:10.1038/s41598-024-82027-0
Model-constrained deep learning for online fault diagnosis in Li-ion batteries over stochastic conditions
Nat Commun. 2025 Feb 14;16(1):1651. doi: 10.1038/s41467-025-56832-8.
ABSTRACT
Because battery safety faults are intricate and infrequent, online fault diagnosis under stochastic working conditions is indispensable. In this work, we employ deep learning methods to develop an online fault diagnosis network for lithium-ion batteries operating under unpredictable conditions. The network integrates battery model constraints and employs a framework designed to manage the evolution of stochastic systems, thereby enabling real-time fault determination. We evaluate the performance using a dataset of 18.2 million valid entries from 515 vehicles. The results demonstrate that our proposed algorithm outperforms other relevant approaches, enhancing the true positive rate by over 46.5% within a false positive rate range of 0 to 0.2. Meanwhile, we identify the trigger probability for four types of safety faults, namely electrolyte leakage, thermal runaway, internal short circuit, and excessive aging. The proposed network is adaptable to packs of varying structures, thereby reducing the cost of implementation. Our work explores the application of deep learning for real-state prediction and diagnosis of batteries, demonstrating potential improvements in battery safety and economic benefits.
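The reported gain is the true-positive rate achieved while the false-positive rate stays in [0, 0.2]; a simple way to evaluate such a metric on synthetic scores (not the authors' pipeline) is sketched here.

```python
# Illustration of evaluating the true-positive rate while the false-positive rate
# is constrained to [0, 0.2]; scores below are synthetic placeholders.
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
y_true = np.concatenate([np.ones(200), np.zeros(2000)])            # rare fault events
scores = np.concatenate([rng.normal(1.5, 1.0, 200), rng.normal(0.0, 1.0, 2000)])

fpr, tpr, _ = roc_curve(y_true, scores)
tpr_at_low_fpr = tpr[fpr <= 0.2].max()   # best TPR achievable with FPR <= 0.2
print(f"TPR at FPR <= 0.2: {tpr_at_low_fpr:.3f}")
```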
PMID:39952987 | DOI:10.1038/s41467-025-56832-8
Reducing inference cost of Alzheimer's disease identification using an uncertainty-aware ensemble of uni-modal and multi-modal learners
Sci Rep. 2025 Feb 14;15(1):5521. doi: 10.1038/s41598-025-86110-y.
ABSTRACT
While multi-modal deep learning approaches trained using magnetic resonance imaging (MRI) and fluorodeoxyglucose positron emission tomography (FDG PET) data have shown promise in the accurate identification of Alzheimer's disease, their clinical applicability is hindered by the assumption that both modalities are always available during model inference. In practice, clinicians adjust diagnostic tests based on available information and specific clinical contexts. We propose a novel MRI- and FDG PET-based multi-modal deep learning approach that mimics clinical decision-making by incorporating uncertainty estimates of an MRI-based model (generated using Monte Carlo dropout and evidential deep learning) to determine the necessity of an FDG PET scan, and only inputting the FDG PET to a multi-modal model when required. This approach significantly reduces the reliance on FDG PET scans, which are costly and expose patients to radiation. Our approach reduces the need for FDG PET by up to 92% without compromising model performance, thus optimizing resource use and patient safety. Furthermore, using a global model explanation technique, we provide insights into how anatomical changes in brain regions, such as the entorhinal cortex, amygdala, and ventricles, can positively or negatively influence the need for FDG PET scans in alignment with clinical understanding of Alzheimer's disease.
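A hedged sketch of the uncertainty-gated escalation idea follows, with toy stand-in networks and an arbitrary entropy threshold; the evidential-deep-learning branch described in the paper is not shown.

```python
# Sketch of uncertainty-gated escalation from an MRI-only model to a multi-modal
# model; networks, threshold, and data are toy placeholders, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

def mc_dropout_probs(model: nn.Module, x: torch.Tensor, n_samples: int = 20) -> torch.Tensor:
    model.train()  # keep dropout active at inference time (Monte Carlo dropout)
    with torch.no_grad():
        probs = torch.stack([F.softmax(model(x), dim=-1) for _ in range(n_samples)])
    return probs.mean(dim=0)

def predictive_entropy(p: torch.Tensor) -> torch.Tensor:
    return -(p * p.clamp_min(1e-12).log()).sum(dim=-1)

def diagnose(mri_model, mm_model, mri, pet, entropy_threshold: float = 0.5):
    p_mri = mc_dropout_probs(mri_model, mri)
    needs_pet = predictive_entropy(p_mri) > entropy_threshold   # escalate only uncertain cases
    p_final = p_mri.clone()
    if needs_pet.any():
        with torch.no_grad():
            p_final[needs_pet] = F.softmax(mm_model(mri[needs_pet], pet[needs_pet]), dim=-1)
    return p_final, needs_pet

# toy stand-ins for the uni-modal and multi-modal networks
mri_net = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Dropout(0.3), nn.Linear(32, 2))
class ToyMultiModal(nn.Module):
    def forward(self, mri, pet):
        return torch.randn(mri.size(0), 2)

probs, flags = diagnose(mri_net, ToyMultiModal(), torch.randn(8, 64), torch.randn(8, 16))
print(int(flags.sum()), "of 8 cases would be referred for FDG PET")
```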
PMID:39952976 | DOI:10.1038/s41598-025-86110-y
ROASMI: accelerating small molecule identification by repurposing retention data
J Cheminform. 2025 Feb 14;17(1):20. doi: 10.1186/s13321-025-00968-8.
ABSTRACT
The limited replicability of retention data hinders its application in untargeted metabolomics for small molecule identification. While retention order models hold promise in addressing this issue, their predictive reliability is limited by uncertain generalizability. Here, we present the ROASMI model, which enables reliable prediction of retention order within a well-defined application domain by coupling data-driven molecular representation and mechanistic insights. The generalizability of ROASMI is demonstrated on 71 independent reversed-phase liquid chromatography (RPLC) datasets. The application of ROASMI to four real-world datasets demonstrates its advantages in distinguishing coexisting isomers with similar fragmentation patterns and in annotating detection peaks without informative spectra. ROASMI is flexible enough to be retrained with user-defined reference sets and is compatible with other MS/MS scorers, enabling further improvements in small-molecule identification.
PMID:39953609 | DOI:10.1186/s13321-025-00968-8
Deep learning-assisted screening and diagnosis of scoliosis: segmentation of bare-back images via an attention-enhanced convolutional neural network
J Orthop Surg Res. 2025 Feb 14;20(1):161. doi: 10.1186/s13018-025-05564-y.
ABSTRACT
BACKGROUND: Traditional diagnostic tools for scoliosis screening necessitate a substantial number of specialized personnel and equipment, leading to inconvenience that can result in missed opportunities for early diagnosis and optimal treatment. We have developed a deep learning-based image segmentation model to enhance the efficiency of scoliosis screening.
METHODS: A total of 350 patients with scoliosis and 108 healthy subjects were included in this study. The dataset was created using their bare back images and standing full-length anteroposterior spinal X-rays. An attention mechanism was incorporated into the original U-Net architecture to build a Dual AttentionUNet model for image segmentation. The entire dataset was divided into training (321 cases), validation (46 cases), and test (91 cases) sets in a 7:1:2 ratio. The training set was used to train the Dual AttentionUNet model, and the validation set was used to fine-tune hyperparameters and prevent overfitting during training. The performance of the model was evaluated on the test set. After automatic segmentation of the back contour, a back asymmetry index was calculated via computer vision algorithms to classify scoliosis into different severities. The classification accuracy was statistically compared with that of three clinical experts.
RESULTS: Following the segmentation of bare back images and the application of computer vision algorithms, the Dual AttentionUNet model achieved an accuracy, precision, and recall rate of over 90% in predicting severe scoliosis. Notably, the model achieved an AUC value of 0.93 in identifying whether the subjects had scoliosis, which was higher than the 0.92 achieved by the deputy chief physician. In identifying severe scoliosis, their AUC values were 0.95 and 0.96, respectively.
CONCLUSION: The Dual AttentionUNet model, based on only bare back images, achieved accuracy and precision comparable to clinical physicians in determining scoliosis severity. Radiation-free, cost-saving, easy-to-operate and noninvasive, this model provides a novel option for large-scale scoliosis screening.
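For illustration, a generic additive attention gate of the kind used in attention U-Nets is sketched below; it is not claimed to match the exact Dual AttentionUNet block.

```python
# Generic additive attention gate as commonly used in attention U-Nets;
# shown only to illustrate the idea, not the exact Dual AttentionUNet block.
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, skip_ch: int, gate_ch: int, inter_ch: int):
        super().__init__()
        self.w_skip = nn.Conv2d(skip_ch, inter_ch, kernel_size=1)
        self.w_gate = nn.Conv2d(gate_ch, inter_ch, kernel_size=1)
        self.psi = nn.Conv2d(inter_ch, 1, kernel_size=1)

    def forward(self, skip: torch.Tensor, gate: torch.Tensor) -> torch.Tensor:
        # skip: encoder feature map; gate: decoder feature map at the same resolution
        attn = torch.sigmoid(self.psi(torch.relu(self.w_skip(skip) + self.w_gate(gate))))
        return skip * attn  # suppress irrelevant regions before concatenation

gate = AttentionGate(skip_ch=64, gate_ch=128, inter_ch=32)
out = gate(torch.randn(1, 64, 56, 56), torch.randn(1, 128, 56, 56))
print(out.shape)  # torch.Size([1, 64, 56, 56])
```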
PMID:39953540 | DOI:10.1186/s13018-025-05564-y
Leveraging deep learning for nonlinear shape representation in anatomically parameterized statistical shape models
Int J Comput Assist Radiol Surg. 2025 Feb 14. doi: 10.1007/s11548-025-03330-3. Online ahead of print.
ABSTRACT
PURPOSE: Statistical shape models (SSMs) are widely used for morphological assessment of anatomical structures. However, a key limitation is the need for a clear relationship between the model's shape coefficients and clinically relevant anatomical parameters. To address this limitation, this paper proposes a novel deep learning-based anatomically parameterized SSM (DL-ANATSSM) by introducing a nonlinear relationship between anatomical parameters and bone shape information.
METHODS: Our approach utilizes a multilayer perceptron model trained on a synthetic femoral bone population to learn the nonlinear mapping between anatomical measurements and shape parameters. The trained model is then fine-tuned on a real bone dataset. We compare the performance of DL-ANATSSM with a linear ANATSSM generated using least-squares regression for baseline evaluation.
RESULTS: When applied to a previously unseen femoral bone dataset, DL-ANATSSM demonstrated superior performance in predicting 3D bone shape based on anatomical parameters compared to the linear baseline model. The impact of fine-tuning was also investigated, with results indicating improved model performance after this process.
CONCLUSION: The proposed DL-ANATSSM is therefore a more precise and interpretable SSM, which is directly controlled by clinically relevant parameters. The proposed method holds promise for applications in both morphometry analysis and patient-specific 3D model generation without preoperative images.
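A minimal sketch of the described pipeline is given below, assuming an MLP that maps anatomical measurements to shape coefficients, pre-trained on a synthetic population and then fine-tuned on a smaller real set; all sizes, learning rates, and data are placeholders, not the DL-ANATSSM configuration.

```python
# Sketch of a nonlinear mapping from anatomical parameters to SSM shape coefficients,
# pre-trained on synthetic bones and fine-tuned on real data (all values are assumptions).
import torch
import torch.nn as nn

n_params, n_coeffs = 8, 20          # anatomical measurements -> shape coefficients
mlp = nn.Sequential(nn.Linear(n_params, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(),
                    nn.Linear(64, n_coeffs))
loss_fn = nn.MSELoss()

def fit(x, y, lr, epochs):
    opt = torch.optim.Adam(mlp.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(mlp(x), y)
        loss.backward()
        opt.step()
    return loss.item()

# pre-train on a large synthetic population, then fine-tune on a small real set at a lower rate
x_syn, y_syn = torch.randn(5000, n_params), torch.randn(5000, n_coeffs)
x_real, y_real = torch.randn(200, n_params), torch.randn(200, n_coeffs)
print("synthetic-population loss:", fit(x_syn, y_syn, lr=1e-3, epochs=50))
print("fine-tuned loss:", fit(x_real, y_real, lr=1e-4, epochs=20))
```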
PMID:39953355 | DOI:10.1007/s11548-025-03330-3
Hybrid Approach to Classifying Histological Subtypes of Non-small Cell Lung Cancer (NSCLC): Combining Radiomics and Deep Learning Features from CT Images
J Imaging Inform Med. 2025 Feb 14. doi: 10.1007/s10278-025-01442-5. Online ahead of print.
ABSTRACT
This study aimed to develop a hybrid model combining radiomics and deep learning features derived from computed tomography (CT) images to classify histological subtypes of non-small cell lung cancer (NSCLC). We analyzed CT images and radiomics features from 235 patients with NSCLC, including 110 with adenocarcinoma (ADC) and 112 with squamous cell carcinoma (SCC). The dataset was split into a training set (75%) and a test set (25%). External validation was conducted using the NSCLC-Radiomics database, comprising 24 patients each with ADC and SCC. A total of 1409 radiomics and 8192 deep features underwent principal component analysis (PCA) and ℓ2,1-norm minimization for feature reduction and selection. The optimal feature sets for classification included 27 radiomics features, 20 deep features, and 55 combined features (30 deep and 25 radiomics). The average area under the receiver operating characteristic curve (AUC) for radiomics, deep, and combined features were 0.6568, 0.6689, and 0.7209, respectively, across the internal and external test sets. Corresponding average accuracies were 0.6013, 0.6376, and 0.6564. The combined model demonstrated superior performance in classifying NSCLC subtypes, achieving higher AUC and accuracy in both test datasets. These results suggest that the proposed hybrid approach could enhance the accuracy and reliability of NSCLC subtype classification.
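A simplified sketch of the feature-fusion step follows: PCA-reduce each feature set (fit on training cases only), concatenate, and classify. The paper's ℓ2,1-norm feature selection is omitted, and the data below are synthetic placeholders, so the printed AUC carries no meaning.

```python
# Simplified radiomics + deep feature fusion: per-set PCA, concatenation, classification.
# The l2,1-norm selection used in the paper is not reproduced; data are synthetic.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 200                                   # placeholder cohort size
radiomics = rng.normal(size=(n, 1409))
deep = rng.normal(size=(n, 8192))
labels = rng.integers(0, 2, size=n)       # 0 = ADC, 1 = SCC

idx_tr, idx_te = train_test_split(np.arange(n), test_size=0.25, random_state=0, stratify=labels)
pca_r = PCA(n_components=25, random_state=0).fit(radiomics[idx_tr])
pca_d = PCA(n_components=30, random_state=0).fit(deep[idx_tr])
X_tr = np.hstack([pca_r.transform(radiomics[idx_tr]), pca_d.transform(deep[idx_tr])])
X_te = np.hstack([pca_r.transform(radiomics[idx_te]), pca_d.transform(deep[idx_te])])

clf = LogisticRegression(max_iter=1000).fit(X_tr, labels[idx_tr])
print("AUC on synthetic data:", round(roc_auc_score(labels[idx_te], clf.predict_proba(X_te)[:, 1]), 3))
```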
PMID:39953259 | DOI:10.1007/s10278-025-01442-5
Ischemic Stroke Lesion Core Segmentation from CT Perfusion Scans Using Attention ResUnet Deep Learning
J Imaging Inform Med. 2025 Feb 14. doi: 10.1007/s10278-025-01407-8. Online ahead of print.
ABSTRACT
Accurate segmentation of ischemic stroke lesions is crucial for refining diagnosis, prognosis, and treatment planning. Manual identification is time-consuming and challenging, especially in urgent clinical scenarios. This paper presents a deep learning-based system for automated segmentation of ischemic stroke lesions from Computed Tomography Perfusion (CTP) datasets. The proposed approach integrates Edge Enhancing Diffusion (EED) filtering as a preprocessing step, acting as a form of hard attention to emphasize affected regions, and uses the Attention ResUnet (AttResUnet) architecture with a modified decoder path that incorporates spatial and channel attention mechanisms to capture long-range dependencies. The system was evaluated on the ISLES 2018 challenge dataset with a fivefold cross-validation approach and achieved a noteworthy average Dice Similarity Coefficient (DSC) score of 59%. This performance underscores the effectiveness of combining EED filtering with attention mechanisms in the AttResUnet architecture for accurate stroke lesion segmentation. The fold-wise analysis revealed consistent performance across different data subsets, with slight variations highlighting the model's generalizability. The proposed approach offers a reliable and generalizable tool for automated ischemic stroke lesion segmentation, potentially improving efficiency and accuracy in clinical settings.
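The Dice Similarity Coefficient used for evaluation follows its standard definition; a small self-contained example on binary masks (not the authors' code) is shown below.

```python
# Standard Dice Similarity Coefficient on binary masks.
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-7) -> float:
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

pred = np.zeros((128, 128)); pred[30:80, 30:80] = 1
gt = np.zeros((128, 128));   gt[40:90, 40:90] = 1
print(f"DSC = {dice(pred, gt):.3f}")  # partial overlap -> DSC between 0 and 1
```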
PMID:39953256 | DOI:10.1007/s10278-025-01407-8
Impact of Combined Deep Learning Image Reconstruction and Metal Artifact Reduction Algorithm on CT Image Quality in Different Scanning Conditions for Maxillofacial Region with Metal Implants: A Phantom Study
J Imaging Inform Med. 2025 Feb 14. doi: 10.1007/s10278-024-01287-4. Online ahead of print.
ABSTRACT
This study aims to investigate the impact of combining deep learning image reconstruction (DLIR) and metal artifact reduction (MAR) algorithms on the quality of CT images with metal implants under different scanning conditions. Four images of the maxillofacial region in pigs were acquired using different metal implants for evaluation. The scans were conducted at three different dose levels (CTDIvol: 20/10/5 mGy). The images were reconstructed using three different methods: filtered back projection (FBP), adaptive statistical iterative reconstruction with Veo at a 50% level (AV50), and DLIR at three levels (low, medium, and high). Regions of interest (ROIs) were identified in various tissues (near/far/reference fat, muscle, bone) both with and without metal implants and artifacts. Parameters such as standard deviation (SD), signal-to-noise ratio (SNR), contrast-to-noise ratio (CNR), and metal artifact index (MAI) were calculated. Additionally, two experienced radiologists evaluated the subjective image quality (IQ) using a 5-point Likert scale. (1) Both observers rated the artifact scores of MAR images significantly lower than those of non-MAR images in all types of tissues (P < 0.01), except for the far shadow and bloom in bone (phantoms 1, 3, 4) and the far bloom in muscle (phantom 3), where no significant differences were found (P = 1.0). (2) Under the same scanning condition, DLIR at all three levels produced a smaller SD than FBP and AV50 (P < 0.05). (3) Compared with FBP and AV50, DLIR achieved a greater reduction of MAI and improvement of SNR and CNR (P < 0.05) for most tissues across the four phantoms. (4) Subjective overall IQ improved with increasing DLIR level (P < 0.05), and both observers agreed that DLIR produced better artifact reduction than FBP and AV50. The combination of DLIR and MAR algorithms can enhance image quality, significantly reduce metal artifacts, and offer high clinical value.
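For reference, ROI-based metrics of this kind are commonly computed as in the sketch below; the exact formulas used in the study (especially for MAI) may differ, and the ROI values are synthetic.

```python
# Illustrative ROI-based image-quality metrics with commonly used definitions
# (SNR = mean/SD, CNR = |mean_roi - mean_ref| / SD_ref, MAI = sqrt(SD_roi^2 - SD_ref^2));
# the study's exact definitions may differ, and the HU values here are synthetic.
import numpy as np

def roi_metrics(roi: np.ndarray, ref: np.ndarray) -> dict:
    sd_roi, sd_ref = roi.std(ddof=1), ref.std(ddof=1)
    return {
        "SD": sd_roi,
        "SNR": roi.mean() / sd_roi,
        "CNR": abs(roi.mean() - ref.mean()) / sd_ref,
        "MAI": np.sqrt(max(sd_roi**2 - sd_ref**2, 0.0)),
    }

rng = np.random.default_rng(1)
near_metal = rng.normal(60, 35, size=400)   # HU values in a streak-affected muscle ROI
reference = rng.normal(55, 12, size=400)    # HU values in an artifact-free muscle ROI
print({k: round(float(v), 2) for k, v in roi_metrics(near_metal, reference).items()})
```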
PMID:39953255 | DOI:10.1007/s10278-024-01287-4
CT-based detection of clinically significant portal hypertension predicts post-hepatectomy outcomes in hepatocellular carcinoma
Eur Radiol. 2025 Feb 14. doi: 10.1007/s00330-025-11411-9. Online ahead of print.
ABSTRACT
BACKGROUND: While the CT-based method of detecting clinically significant portal hypertension (CSPH) emerged as a noninvasive alternative for evaluating CSPH, its predictive ability for post-hepatectomy outcomes is unknown. Therefore, this study aimed to evaluate the impact of CT-based CSPH on outcomes following hepatectomy for hepatocellular carcinoma (HCC).
METHODS: This retrospective single-center study included patients with advanced chronic liver disease (ACLD) who underwent hepatectomy for very early or early-stage HCC between January 2017 and December 2018. CSPH was assessed using CT-based criteria, which included splenomegaly determined by deep learning-based spleen volume measurements with personalized reference thresholds, and the presence of gastroesophageal varices (GEV), spontaneous portosystemic shunt or ascites. Logistic regression and competing risk analyses were used to identify factors associated with severe post-hepatectomy liver failure (PHLF), hepatic decompensation, and liver-related death or transplantation. The predictive performance of existing models for PHLF was compared using both CT-based and conventional CSPH criteria (endoscopic GEV or splenomegaly with thrombocytopenia).
RESULTS: Among 593 patients (460 men; mean age 57.9 ± 9.3 years), 41 (6.9%) developed severe PHLF. The median follow-up period was 62 months. CT-based CSPH independently predicted severe PHLF (OR 7.672 [95% CI 3.209-18.346]), hepatic decompensation (subdistribution hazard ratio (sHR) 4.518 [1.868-10.929]), and liver-related death or transplantation (sHR 2.756 [1.315-5.773]). When integrated into existing models, CT-based CSPH outperformed conventional CSPH in predicting severe PHLF (AUC 0.724 vs. 0.694 for EASL algorithm (p = 0.036) and 0.854 vs. 0.830 for Wang's model (p = 0.011)).
CONCLUSIONS: CT-based CSPH is a strong predictor of poor post-hepatectomy outcomes in HCC patients with ACLD, offering a noninvasive surgical risk assessment tool.
KEY POINTS: Question Can CT-based detection of clinically significant portal hypertension (CSPH) serve as a noninvasive predictor of post-hepatectomy outcomes in hepatocellular carcinoma (HCC) patients? Findings CT-based CSPH independently predicted severe post-hepatectomy liver failure, hepatic decompensation, and liver-related death or transplantation, outperforming conventional CSPH criteria in predictive performance. Clinical relevance CT-based CSPH offers a noninvasive and effective tool for surgical risk assessment in HCC patients, potentially improving the selection of candidates for hepatectomy and optimizing patient outcomes.
PMID:39953152 | DOI:10.1007/s00330-025-11411-9
Retrieval-augmented generation improves precision and trust of a GPT-4 model for emergency radiology diagnosis and classification: a proof-of-concept study
Eur Radiol. 2025 Feb 14. doi: 10.1007/s00330-025-11445-z. Online ahead of print.
ABSTRACT
OBJECTIVES: This study evaluated the effect of enhancing a GPT-4 model with retrieval-augmented generation on its ability to diagnose and classify traumatic injuries based on radiology reports.
MATERIALS AND METHODS: In this prospective proof-of-concept study, we used retrieval-augmented generation as a zero-shot learning approach to provide expert knowledge from the RadioGraphics top ten reading list for trauma radiology to the GPT-4 model, creating the context-aware TraumaCB. Radiological report findings of 50 traumatic injuries were independently generated by two radiologists. The performance of the TraumaCB compared to the generic GPT-4 was evaluated by three board-certified radiologists, assessing the accuracy and trustworthiness of the chatbot responses in the 100 reports created.
RESULTS: The TraumaCB achieved 100% correct diagnoses, 96% correct classification, and 87% correct grading, outperforming the generic GPT-4 with 93% correct diagnoses, 70% correct classification, and 48% correct grading. TraumaCB sources consistently achieved a median rating of 5.0 for explanation and trust. Challenges encountered mainly involved traumatic injuries lacking widely accepted classification systems.
CONCLUSION: Augmenting a commercial GPT-4 model with retrieval-augmented generation improves its diagnostic and classification capabilities, positioning it as a valuable tool for efficiently assessing traumatic injuries across various anatomical regions in trauma radiology.
KEY POINTS: Question Retrieval-augmented generation has the potential to enhance generic chatbots with task-specific knowledge of emergency radiology. Findings The TraumaCB excelled in accuracy, particularly in injury classification and grading, and provided explanations along with the sources used, increasing transparency and facilitating verification. Clinical relevance The TraumaCB provides accurate, fast, and transparent access to trauma radiology classifications, potentially increasing the efficiency of image interpretation in emergency departments and enabling customized reports based on local or individual preferences.
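To make the retrieval-augmented workflow concrete, a generic scaffold is sketched below: index reference passages, retrieve the most similar ones for a report, and prepend them to the prompt. The TF-IDF retrieval and placeholder passages stand in for whatever retrieval backend and curated reading list the authors used; this is not the TraumaCB implementation.

```python
# Generic retrieval-augmented-generation scaffold (not the TraumaCB code).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

passages = [
    "AAST grading of splenic injuries: grade I ... grade V shattered spleen.",
    "AO/OTA classification of distal radius fractures ...",
    "Schatzker classification of tibial plateau fractures ...",
]  # placeholder knowledge snippets; the study used a curated trauma reading list

vectorizer = TfidfVectorizer().fit(passages)
index = vectorizer.transform(passages)

def build_prompt(report: str, k: int = 2) -> str:
    sims = cosine_similarity(vectorizer.transform([report]), index)[0]
    top = sims.argsort()[::-1][:k]                       # top-k most similar passages
    context = "\n".join(passages[i] for i in top)
    return (f"Use only the reference excerpts below to diagnose, classify and grade the injury.\n"
            f"References:\n{context}\n\nReport:\n{report}\n")

prompt = build_prompt("Laceration of the spleen with active extravasation on CT.")
print(prompt)  # this prompt would then be sent to the language model
```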
PMID:39953150 | DOI:10.1007/s00330-025-11445-z
Establishing the effect of computed tomography reconstruction kernels on the measure of bone mineral density in opportunistic osteoporosis screening
Sci Rep. 2025 Feb 14;15(1):5449. doi: 10.1038/s41598-025-88551-x.
ABSTRACT
Opportunistic computed tomography (CT) scans, which can assess relevant bones of interest, offer a potential solution for identifying osteoporotic individuals. However, it has been well documented that image protocol parameters, such as reconstruction kernel, impact the quantitative analysis of volumetric bone mineral density (vBMD) from CT scans. The purpose of this study was to investigate the impact that CT reconstruction kernels have on quantitative results for vBMD from clinical CT scans using phantom and internal calibration. Forty-five clinical CT scans were reconstructed using the standard kernel and seven alternative kernels: soft, chest, detail, edge, bone, bone plus and lung [GE HealthCare]. Two methods of image calibration, internal and phantom, were used to calibrate the scans. The total hip and fourth lumbar vertebra (L4) were extracted from the scans via deep learning segmentation. Integral vBMD was calculated based on both calibration techniques from CT scans reconstructed with the eight kernels. Linear regression and Bland-Altman analyses were used to determine the coefficient of determination (R²) and to quantify the agreement between the different kernels. Differences between the reconstruction kernels were determined using paired t tests, and mean differences from the standard were computed. Using internal calibration, the smoothest kernel (soft) yielded a mean difference of -0.95 mg/cc (-0.33%) compared to the reference standard at the L4 vertebra and 2.07 mg/cc (0.51%) at the left femur. The sharpest kernel (lung) yielded a mean difference of 25.36 mg/cc (9.63%) at the L4 vertebra and -25.10 mg/cc (-5.98%) at the left femur. Alternatively, using phantom calibration, soft yielded higher mean differences than internal calibration at both locations, with mean differences of 1.21 mg/cc (0.42%) at the L4 vertebra and 2.53 mg/cc (0.65%) at the left femur. The most error-prone results stemmed from the use of the lung kernel, which displayed a mean difference of -21.90 mg/cc (-7.38%) and -17.24 mg/cc (-4.34%) at the L4 vertebra and femur, respectively. These results indicate that, when performing opportunistic CT analysis, errors due to interchanging the smoothing kernels soft, chest and detail are negligible, but interchanging between sharpening kernels (lung, bone, bone plus, edge) results in large errors that can significantly impact vBMD measures for osteoporosis screening and diagnosis.
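A Bland-Altman comparison of vBMD between two kernels reduces to a mean difference (bias) and limits of agreement; a standard computation on synthetic values (not the study's analysis code) is shown below.

```python
# Bland-Altman agreement between vBMD measured with two reconstruction kernels;
# a standard computation on synthetic values.
import numpy as np

def bland_altman(a: np.ndarray, b: np.ndarray):
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)   # mean difference and 95% limits of agreement

rng = np.random.default_rng(3)
vbmd_standard = rng.normal(260, 40, size=45)            # mg/cc, "standard" kernel
vbmd_soft = vbmd_standard + rng.normal(-1.0, 3.0, 45)   # a smooth kernel with a small offset
bias, loa = bland_altman(vbmd_soft, vbmd_standard)
print(f"bias = {bias:.2f} mg/cc, limits of agreement = ({loa[0]:.2f}, {loa[1]:.2f})")
```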
PMID:39953113 | DOI:10.1038/s41598-025-88551-x
Fourier-inspired single-pixel holography
Opt Lett. 2025 Feb 15;50(4):1269-1272. doi: 10.1364/OL.547399.
ABSTRACT
Fourier-inspired single-pixel holography (FISH) is an effective digital holography (DH) approach that utilizes a single-pixel detector instead of a conventional camera to capture light field information. FISH combines the Fourier single-pixel imaging and off-axis holography technique, allowing one to acquire useful information directly, rather than recording the hologram in the spatial domain and filtering unwanted terms in the Fourier domain. Furthermore, we employ a deep learning technique to jointly optimize the sampling mask and the imaging enhancement model, to achieve high-quality results at a low sampling ratio. Both simulations and experimental results demonstrate the effectiveness of FISH in single-pixel phase imaging. FISH combines the strengths of single-pixel imaging (SPI) and DH, potentially expanding DH's applications to specialized spectral bands and low-light environments while equipping SPI with capabilities for phase detection and coherent gating.
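For background, a textbook simulation of four-step phase-shifting Fourier single-pixel imaging is sketched below; FISH itself additionally incorporates off-axis holography and a learned sampling mask, which are not represented here.

```python
# Textbook four-step phase-shifting Fourier single-pixel imaging: project sinusoidal
# patterns, record single-pixel sums, assemble the Fourier spectrum, and invert it.
import numpy as np

N = 32
yy, xx = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
scene = np.zeros((N, N))
scene[8:24, 8:24] = 1.0  # toy object

spectrum = np.zeros((N, N), dtype=complex)
for fy in range(N):
    for fx in range(N):
        phase = 2 * np.pi * (fx * xx / N + fy * yy / N)
        d = [np.sum(scene * (0.5 + 0.5 * np.cos(phase + p)))     # single-pixel readings
             for p in (0.0, np.pi / 2, np.pi, 3 * np.pi / 2)]
        spectrum[fy, fx] = (d[0] - d[2]) + 1j * (d[1] - d[3])    # 4-step phase shifting

recovered = np.fft.ifft2(spectrum).real  # equals the scene; pattern contrast 0.5 cancels the factor 2
print("max reconstruction error:", float(np.abs(recovered - scene).max()))
```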
PMID:39951780 | DOI:10.1364/OL.547399
Unsupervised cross talk suppression for self-interference digital holography
Opt Lett. 2025 Feb 15;50(4):1261-1264. doi: 10.1364/OL.544342.
ABSTRACT
Self-interference digital holography extends the application of digital holography to non-coherent imaging fields such as fluorescence and scattered light, providing a new solution, to the best of our knowledge, for wide-field 3D imaging of low-coherence or partially coherent signals. However, cross talk information has always been an important factor limiting the resolution of this imaging method. The suppression of cross talk information is a complex nonlinear problem, and deep learning can easily obtain its corresponding nonlinear model through data-driven methods. However, in real experiments, it is difficult to obtain the paired datasets needed to complete such training. Here, we propose an unsupervised cross talk suppression method based on a cycle-consistent generative adversarial network (CycleGAN) for self-interference digital holography. Through the introduction of a saliency constraint, the unsupervised model, named crosstalk suppressing with unsupervised neural network (CS-UNN), can learn the mapping between two image domains without requiring paired training data while avoiding distortions of the image content. Experimental analysis has shown that this method can suppress cross talk information in reconstructed images without the need for training strategies on a large number of paired datasets, providing an effective solution for the application of the self-interference digital holography technology.
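The core of a CycleGAN-style objective is the cycle-consistency term; the sketch below adds a saliency-weighted variant in the spirit of the saliency constraint mentioned above, but the weighting scheme and toy generators are assumptions, not CS-UNN itself.

```python
# L1 cycle-consistency between the two mappings of a CycleGAN-style model,
# with an optional saliency-weighted term; an illustration, not CS-UNN.
import torch
import torch.nn as nn

def cycle_loss(x, y, g_xy, f_yx, saliency=None, lam: float = 10.0):
    """g_xy: X -> Y (e.g., toward crosstalk-suppressed images); f_yx: Y -> X."""
    l1 = nn.L1Loss()
    back_x, back_y = f_yx(g_xy(x)), g_xy(f_yx(y))
    loss = l1(back_x, x) + l1(back_y, y)
    if saliency is not None:  # emphasize salient structures in the X -> Y -> X cycle
        loss = loss + (saliency * (back_x - x).abs()).mean()
    return lam * loss

# toy generators (single convolutions) and random stand-in images
g_xy = nn.Conv2d(1, 1, kernel_size=3, padding=1)
f_yx = nn.Conv2d(1, 1, kernel_size=3, padding=1)
x, y = torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64)
sal = torch.rand(2, 1, 64, 64)
print(float(cycle_loss(x, y, g_xy, f_yx, saliency=sal)))
```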
PMID:39951778 | DOI:10.1364/OL.544342
Application of Surface-Enhanced Raman Spectroscopy in Head and Neck Cancer Diagnosis
Anal Chem. 2025 Feb 14. doi: 10.1021/acs.analchem.4c02796. Online ahead of print.
ABSTRACT
Surface-enhanced Raman spectroscopy (SERS) has emerged as a crucial analytical tool in oncology, particularly for head and neck cancer (HNC), whose diagnosis and treatment present significant challenges. This Review provides an overview of the current status and prospects of SERS applications, highlighting their profound impact on molecular-level diagnosis, tissue-level identification, HNC therapeutic monitoring, and integration with emerging technologies. The application of SERS to single-molecule assays of targets such as the epidermal growth factor receptor and PD-1/PD-L1, gene expression analysis, and tumor microenvironment characterization is also explored. This Review showcases the innovative applications of SERS in liquid biopsies, such as high-throughput lateral flow analysis for ctDNA quantification and salivary diagnostics, which can offer rapid and highly sensitive assays suitable for immediate detection. At the tissue level, SERS enables cancer cell visualization and intraoperative tumor margin identification, enhancing surgical precision and decision-making. The role of SERS in radiotherapy, chemotherapy, and targeted therapy is examined along with its use in real-time pharmacokinetic studies to monitor treatment response. Furthermore, this Review delves into the synergistic relationship between SERS and artificial intelligence, encompassing machine learning and deep learning algorithms, marking the dawn of a new era in precision oncology. The integration of SERS with genomics, metabolomics, transcriptomics, proteomics, and single-cell omics at the multiomics level will revolutionize our comprehension and management of HNC. This Review offers an overview of the transformative impacts of SERS and examines future directions as well as challenges in this dynamic research field.
PMID:39951652 | DOI:10.1021/acs.analchem.4c02796
Fast fault diagnosis of smart grid equipment based on deep neural network model based on knowledge graph
PLoS One. 2025 Feb 14;20(2):e0315143. doi: 10.1371/journal.pone.0315143. eCollection 2025.
ABSTRACT
The smart grid builds on the physical power grid by introducing a range of advanced communication technologies to form a new type of grid. It can not only meet user demand and optimize the allocation of resources, but also improve the safety, economy, and reliability of the power supply, and it has become a major trend in the future development of the electric power industry. On the other hand, the complex network architecture of the smart grid and the application of various high-tech components have greatly increased both the probability of equipment failure and the difficulty of fault diagnosis, so timely detection and diagnosis of problems in smart grid equipment has become a key measure for ensuring safe grid operation. At present, existing fault diagnosis techniques for smart grid equipment are complex to apply and their diagnosis rates are generally low, which greatly reduces the efficiency of smart grid maintenance. Therefore, this paper adopts a multimodal semantic model that combines deep learning with a knowledge graph: building on the original YOLOv4 target detection architecture, it introduces a knowledge graph to unify the representation and storage of the input multimodal information, and it combines the YOLOv4 detection algorithm with the knowledge graph to establish a fault diagnosis model for smart grid equipment. Experiments show that, compared with existing fault detection algorithms, the proposed YOLOv4-based model is more accurate, faster, and easier to operate.
PMID:39951439 | DOI:10.1371/journal.pone.0315143
Hybrid-RViT: Hybridizing ResNet-50 and Vision Transformer for Enhanced Alzheimer's disease detection
PLoS One. 2025 Feb 14;20(2):e0318998. doi: 10.1371/journal.pone.0318998. eCollection 2025.
ABSTRACT
Alzheimer's disease (AD) is a leading cause of disability worldwide. Early detection is critical for preventing progression and formulating effective treatment plans. This study aims to develop a novel deep learning (DL) model, Hybrid-RViT, to enhance the detection of AD. The proposed Hybrid-RViT model integrates the pre-trained convolutional neural network (ResNet-50) with the Vision Transformer (ViT) to classify brain MRI images across different stages of AD. ResNet-50, adopted for transfer learning, provides inductive bias and feature extraction. Concurrently, ViT processes sequences of image patches to capture long-distance relationships via a self-attention mechanism, thereby functioning as a joint local-global feature extractor. The Hybrid-RViT model achieved a training accuracy of 97% and a testing accuracy of 95%, outperforming previous models. This demonstrates its potential efficacy in accurately identifying and classifying AD stages from brain MRI data. The Hybrid-RViT model, combining ResNet-50 and ViT, shows superior performance in AD detection, highlighting its potential as a valuable tool for medical professionals in interpreting and analyzing brain MRI images. This model could significantly improve early diagnosis and intervention strategies for AD.
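A hedged sketch of a ResNet-50 + ViT-style hybrid follows, in which CNN feature-map tokens plus a class token pass through a transformer encoder; depths, widths, and the number of classes are assumptions rather than the exact Hybrid-RViT configuration.

```python
# Sketch of a ResNet-50 backbone feeding a ViT-style transformer encoder;
# an illustration of the hybrid idea, not the published Hybrid-RViT model.
import torch
import torch.nn as nn
from torchvision.models import resnet50

class HybridRViTSketch(nn.Module):
    def __init__(self, dim: int = 256, depth: int = 4, heads: int = 8, n_classes: int = 4):
        super().__init__()
        cnn = resnet50(weights=None)  # load pretrained weights for transfer learning in practice
        self.backbone = nn.Sequential(*list(cnn.children())[:-2])  # (B, 2048, 7, 7) for 224x224 input
        self.proj = nn.Linear(2048, dim)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos = nn.Parameter(torch.zeros(1, 50, dim))            # 49 patch tokens + 1 class token
        layer = nn.TransformerEncoderLayer(dim, heads, dim_feedforward=4 * dim, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(x)                                    # local CNN features
        tokens = self.proj(feats.flatten(2).transpose(1, 2))        # (B, 49, dim) patch tokens
        cls = self.cls_token.expand(x.size(0), -1, -1)
        out = self.encoder(torch.cat([cls, tokens], dim=1) + self.pos)  # global self-attention
        return self.head(out[:, 0])                                 # classify from the class token

model = HybridRViTSketch()
print(model(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 4])
```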
PMID:39951414 | DOI:10.1371/journal.pone.0318998