Deep learning
Urban fabric decoded: High-precision building material identification via deep learning and remote sensing
Environ Sci Ecotechnol. 2025 Feb 3;24:100538. doi: 10.1016/j.ese.2025.100538. eCollection 2025 Mar.
ABSTRACT
Precise identification and categorization of building materials are essential for informing strategies related to embodied carbon reduction, building retrofitting, and circularity in urban environments. However, existing building material databases are typically limited to individual projects or specific geographic areas, offering only approximate assessments. Acquiring large-scale and precise material data is hindered by inadequate records and financial constraints. Here, we introduce a novel automated framework that harnesses recent advances in sensing technology and deep learning to identify roof and facade materials using remote sensing data and Google Street View imagery. The model was initially trained and validated on Odense's comprehensive dataset and then extended to characterize building materials across Danish urban landscapes, including Copenhagen, Aarhus, and Aalborg. Our approach demonstrates the model's scalability and adaptability to different geographic contexts and architectural styles, providing high-resolution insights into material distribution across diverse building types and cities. These findings are pivotal for informing sustainable urban planning, revising building codes to lower carbon emissions, and optimizing retrofitting efforts to meet contemporary standards for energy efficiency and emission reductions.
PMID:40034611 | PMC:PMC11875798 | DOI:10.1016/j.ese.2025.100538
TriSwinUNETR lobe segmentation model for computing DIR-free CT-ventilation
Front Oncol. 2025 Feb 17;15:1475133. doi: 10.3389/fonc.2025.1475133. eCollection 2025.
ABSTRACT
PURPOSE: Functional radiotherapy avoids the delivery of high-radiation dosages to high-ventilated lung areas. Methods to determine CT-ventilation imaging (CTVI) typically rely on deformable image registration (DIR) to calculate volume changes within inhale/exhale CT image pairs. Since DIR is a non-trivial task that can bias CTVI, we hypothesize that lung volume changes needed to calculate CTVI can be computed from AI-driven lobe segmentations in inhale/exhale phases, without DIR. We utilize a novel lobe segmentation pipeline (TriSwinUNETR), and the resulting inhale/exhale lobe volumes are used to calculate CTVI.
METHODS: Our pipeline involves three SwinUNETR networks, each trained on 6,501 CT image pairs from the COPDGene study. An initial network provides right/left lung segmentations used to define bounding boxes for each lung. Bounding boxes are resized to focus on lung volumes and then lobes are segmented with dedicated right and left SwinUNETR networks. Fine-tuning was conducted on CTs from 11 patients treated with radiotherapy for non-small cell lung cancer. Five-fold cross-validation was then performed on 51 LUNA16 cases with manually delineated ground truth. Breathing-induced volume change was calculated for each lobe using AI-defined lobe volumes from inhale/exhale phases, without DIR. Resulting lobar CTVI values were validated with 4DCT and positron emission tomography (PET)-Galligas ventilation imaging for 19 lung cancer patients. Spatial Spearman correlation between TriSwinUNETR lobe ventilation and ground-truth PET-Galligas ventilation was calculated for each patient.
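The DIR-free ventilation step above can be sketched as a per-lobe volume-change calculation from the inhale/exhale segmentations. The specific formula used here, (V_in - V_ex) / V_ex, is an assumption for illustration; the study's exact definition may differ, and the volumes are hypothetical.

```python
# Hypothetical sketch: lobar CT-ventilation from AI-segmented lobe volumes,
# without deformable image registration. The (V_in - V_ex) / V_ex formula is
# an illustrative assumption, not necessarily the paper's exact definition.

LOBES = ["RUL", "RML", "RLL", "LUL", "LLL"]

def lobar_ventilation(inhale_ml, exhale_ml):
    """Fractional volume change per lobe between inhale and exhale phases."""
    vent = {}
    for lobe in LOBES:
        v_in, v_ex = inhale_ml[lobe], exhale_ml[lobe]
        vent[lobe] = (v_in - v_ex) / v_ex  # breathing-induced volume change
    return vent

# Illustrative lobe volumes (mL) from the inhale/exhale segmentations
inhale = {"RUL": 900, "RML": 350, "RLL": 1100, "LUL": 1000, "LLL": 950}
exhale = {"RUL": 750, "RML": 300, "RLL": 850, "LUL": 850, "LLL": 760}

print(lobar_ventilation(inhale, exhale))
```

Because the lobe volumes come directly from the segmentations, no voxel-wise deformation field is needed.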
RESULTS: TriSwinUNETR achieved a state-of-the-art mean Dice score of 93.72% (RUL: 93.49%, RML: 85.78%, RLL: 95.65%, LUL: 97.12%, LLL: 96.58%), outperforming the best previously reported accuracy of 92.81% for the lobe segmentation task. CTVI calculations yielded a median Spearman correlation coefficient of 0.9 across the 19 cases, with 13 cases exhibiting correlations of at least 0.5, indicating strong agreement with PET-Galligas ventilation.
CONCLUSION: Our TriSwinUNETR pipeline demonstrated superior performance in the lobe segmentation task, while our segmentation-based CTVI exhibited strong agreement with PET-Galligas ventilation. Moreover, as our approach leverages deep-learning for segmentation, it provides interpretable ventilation results and facilitates quality assurance, thereby reducing reliance on DIR.
PMID:40034599 | PMC:PMC11872890 | DOI:10.3389/fonc.2025.1475133
Machine learning uncovers novel sex-specific dementia biomarkers linked to autism and eye diseases
J Alzheimers Dis Rep. 2025 Feb 13;9:25424823251317177. doi: 10.1177/25424823251317177. eCollection 2025 Jan-Dec.
ABSTRACT
BACKGROUND: Recently, microRNAs (miRNAs) have attracted significant interest as predictive biomarkers for various types of dementia, including Alzheimer's disease (AD), vascular dementia (VaD), dementia with Lewy bodies (DLB), normal pressure hydrocephalus (NPH), and mild cognitive impairment (MCI). Machine learning (ML) methods enable the integration of miRNAs into highly accurate predictive models of dementia.
OBJECTIVE: To investigate the differential expression of miRNAs across dementia subtypes compared to normal controls (NC) and analyze their enriched biological and disease pathways. Additionally, to evaluate the use of these miRNAs in binary and multiclass ML models for dementia prediction in both overall and sex-specific datasets.
METHODS: Using data comprising 1685 Japanese individuals (GSE120584 and GSE167559), we performed differential expression analysis to identify miRNAs associated with five dementia groups in both overall and sex-specific datasets. Pathway enrichment analyses were conducted to further analyze these miRNAs. ML classifiers were used to create predictive models of dementia.
RESULTS: We identified novel differentially expressed miRNA biomarkers distinguishing NC from five dementia subtypes. Incorporating these miRNAs into ML classifiers resulted in up to a 27% improvement in dementia risk prediction. Pathway analysis highlighted neuronal and eye disease pathways associated with dementia risk. Sex-specific analyses revealed unique biomarkers for males and females, with miR-128-1-5 as a protective factor for males in AD, VaD, and DLB, and miR-4488 as a risk factor for female AD, highlighting distinct pathways and potential therapeutic targets for each sex.
CONCLUSIONS: Our findings support existing dementia etiology research and introduce new potential and sex-specific miRNA biomarkers.
PMID:40034518 | PMC:PMC11864256 | DOI:10.1177/25424823251317177
Contrastive self-supervised learning for neurodegenerative disorder classification
Front Neuroinform. 2025 Feb 17;19:1527582. doi: 10.3389/fninf.2025.1527582. eCollection 2025.
ABSTRACT
INTRODUCTION: Neurodegenerative diseases such as Alzheimer's disease (AD) or frontotemporal lobar degeneration (FTLD) involve specific loss of brain volume, detectable in vivo using T1-weighted MRI scans. Supervised machine learning approaches for classifying neurodegenerative diseases require diagnostic labels for each sample. However, it can be difficult to obtain expert labels for a large amount of data. Self-supervised learning (SSL) offers an alternative for training machine learning models without data labels.
METHODS: We investigated whether SSL models can be applied to distinguish between different neurodegenerative disorders in an interpretable manner. Our method comprises a feature extractor and a downstream classification head. A deep convolutional neural network, trained with a contrastive loss, serves as the feature extractor that learns latent representations. The classification head is a single-layer perceptron trained to perform diagnostic group separation. We used N = 2,694 T1-weighted MRI scans from four data cohorts: two ADNI datasets, AIBL and FTLDNI, including cognitively normal controls (CN), cases with prodromal and clinical AD, and FTLD cases differentiated into its phenotypes.
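The contrastive feature extractor described above can be illustrated with a minimal InfoNCE-style loss; the abstract does not specify the exact contrastive objective used, so this formulation is an assumption.

```python
import math

# Minimal InfoNCE-style contrastive loss sketch in pure Python (an assumption:
# the abstract does not specify the exact contrastive objective used).
# z_anchor and z_positive are L2-normalised embeddings; negatives is a list
# of embeddings from other samples.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def info_nce(z_anchor, z_positive, negatives, temperature=0.1):
    """-log( exp(sim(a,p)/t) / (exp(sim(a,p)/t) + sum_n exp(sim(a,n)/t)) )"""
    pos = math.exp(dot(z_anchor, z_positive) / temperature)
    neg = sum(math.exp(dot(z_anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# A positive pair that agrees yields a lower loss than one that does not.
anchor = [1.0, 0.0]
loss_aligned = info_nce(anchor, [1.0, 0.0], negatives=[[0.0, 1.0]])
loss_misaligned = info_nce(anchor, [0.0, 1.0], negatives=[[1.0, 0.0]])
print(loss_aligned, loss_misaligned)
```

Minimising such a loss pulls augmented views of the same scan together in the latent space, which is what allows a simple linear head to separate diagnostic groups afterwards.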
RESULTS: Our results showed that the feature extractor trained in a self-supervised way provides generalizable and robust representations for downstream classification. For AD vs. CN, our model achieves 82% balanced accuracy on the test subset and 80% on an independent holdout dataset. Similarly, the behavioral variant of frontotemporal dementia (BV) vs. CN model attains 88% balanced accuracy on the test subset. The average feature attribution heatmaps obtained by the Integrated Gradients method highlighted hallmark regions, i.e., temporal gray matter atrophy for AD and insular atrophy for BV.
CONCLUSION: Our models perform comparably to state-of-the-art supervised deep learning approaches. This suggests that the SSL methodology can successfully make use of unannotated neuroimaging datasets as training data while remaining robust and interpretable.
PMID:40034453 | PMC:PMC11873101 | DOI:10.3389/fninf.2025.1527582
Application of Machine Learning in the Diagnosis of Temporomandibular Disorders: An Overview
Oral Dis. 2025 Mar 3. doi: 10.1111/odi.15300. Online ahead of print.
ABSTRACT
OBJECTIVES: Temporomandibular disorders (TMDs) refer to a group of disorders related to the temporomandibular joint (TMJ), the diagnosis of which is important in dental practice but remains challenging for nonspecialists. With the development of machine learning (ML) methods, ML-based TMDs diagnostic models have shown great potential. The purpose of this review is to summarize the application of ML in TMDs diagnosis, as well as future directions and possible challenges.
METHODS: PubMed, Google Scholar, and Web of Science databases were searched for electronic literature published up to October 2024, in order to describe the current application of ML in the classification and diagnosis of TMDs.
RESULTS: We summarized the application of various ML methods in the diagnosis and classification of different subtypes of TMDs and described the role of different imaging modalities in constructing diagnostic models. Ultimately, we discussed future directions and challenges that ML methods may confront in the application of TMDs diagnosis.
CONCLUSIONS: The screening and diagnosis models of TMDs based on ML methods hold significant potential for clinical application, but still need to be further verified by a large number of multicenter data and longitudinal studies.
PMID:40033467 | DOI:10.1111/odi.15300
Machine learning for the rElapse risk eValuation in acute biliary pancreatitis: The deep learning MINERVA study protocol
World J Emerg Surg. 2025 Mar 3;20(1):17. doi: 10.1186/s13017-025-00594-7.
ABSTRACT
BACKGROUND: Mild acute biliary pancreatitis (MABP) presents significant clinical and economic challenges due to its potential for relapse. Current guidelines advocate for early cholecystectomy (EC) during the same hospital admission to prevent recurrent acute pancreatitis (RAP). Despite these recommendations, implementation in clinical practice varies, highlighting the need for reliable and accessible predictive tools. The MINERVA study aims to develop and validate a machine learning (ML) model to predict the risk of RAP (at 30, 60, 90 days, and at 1-year) in MABP patients, enhancing decision-making processes.
METHODS: The MINERVA study will be conducted across multiple academic and community hospitals in Italy. Adult patients with a clinical diagnosis of MABP, in accordance with the revised Atlanta Criteria, who have not undergone EC during index admission will be included. Exclusion criteria encompass non-biliary aetiology, severe pancreatitis, and the inability to provide informed consent. The study involves both retrospective data from the MANCTRA-1 study and prospective data collection. Data will be captured using REDCap. The ML model will utilise convolutional neural networks (CNN) for feature extraction and risk prediction. The model includes the following steps: the spatial transformation of variables using kernel Principal Component Analysis (kPCA), the creation of 2D images from transformed data, the application of convolutional filters, max-pooling, flattening, and final risk prediction via a fully connected layer. Performance metrics such as accuracy, precision, recall, and area under the ROC curve (AUC) will be used to evaluate the model.
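The pipeline steps listed above (variable transformation, 2D image creation, convolution, max-pooling, flattening, fully connected risk output) can be sketched end to end. This is a hedged toy version: the kernel PCA step is replaced by an identity placeholder, and all weights and inputs are illustrative, not the study's.

```python
import math

# Hedged sketch of the MINERVA-style pipeline: tabular variables -> (kPCA
# stand-in) -> 2D "image" -> convolution -> max-pooling -> flatten -> fully
# connected risk output. Weights and inputs are illustrative placeholders.

def to_image(features, side):
    """Arrange a transformed feature vector into a side x side grid."""
    assert len(features) == side * side
    return [features[i * side:(i + 1) * side] for i in range(side)]

def conv2d_valid(img, kernel):
    """Valid-mode 2D convolution (no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(img) - kh + 1):
        row = []
        for j in range(len(img[0]) - kw + 1):
            row.append(sum(img[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

def maxpool2(img):
    """2x2 max-pooling with stride 2."""
    return [[max(img[i][j], img[i][j + 1], img[i + 1][j], img[i + 1][j + 1])
             for j in range(0, len(img[0]) - 1, 2)]
            for i in range(0, len(img) - 1, 2)]

def predict_risk(variables):
    img = to_image(variables, 4)                # kPCA stand-in: identity map
    fmap = conv2d_valid(img, [[1, 0], [0, 1]])  # one illustrative 2x2 filter
    pooled = maxpool2(fmap)                     # downsampled feature map
    flat = [v for row in pooled for v in row]   # flattening
    score = sum(0.1 * v for v in flat)          # "fully connected" layer
    return 1 / (1 + math.exp(-score))           # relapse-risk probability

risk = predict_risk([0.1] * 16)
print(round(risk, 3))
```

The real model would learn the kPCA mapping, filters, and dense weights from the MANCTRA-1 and prospective data rather than use fixed placeholders.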
DISCUSSION: The MINERVA study aims to address the specific gap in predicting RAP risk in MABP patients by leveraging advanced ML techniques. By incorporating a wide range of clinical and demographic variables, the MINERVA score aims to provide a reliable, cost-effective, and accessible tool for healthcare professionals. The project emphasises the practical application of AI in clinical settings, potentially reducing the incidence of RAP and associated healthcare costs.
TRIAL REGISTRATION: ClinicalTrials.gov ID: NCT06124989.
PMID:40033414 | DOI:10.1186/s13017-025-00594-7
Development and validation of a deep learning algorithm for prediction of pediatric recurrent intussusception in ultrasound images and radiographs
BMC Med Imaging. 2025 Mar 3;25(1):67. doi: 10.1186/s12880-025-01582-8.
ABSTRACT
PURPOSES: To develop a predictive model for recurrent intussusception based on abdominal ultrasound (US) images and abdominal radiographs.
METHODS: A total of 3665 cases of intussusception were retrospectively collected from January 2017 to December 2022. The cohort was randomly assigned to training and validation sets at a 6:4 ratio. Two types of images were processed: abdominal grayscale US images and abdominal radiographs. These images served as inputs for the deep learning algorithm and were individually processed by five detection models for training, with each model predicting its respective categories and probabilities. The optimal models were selected individually for decision fusion to obtain the final predicted categories and their probabilities.
RESULTS: With US, the VGG11 model showed the best performance, achieving an area under the receiver operating characteristic curve (AUC) of 0.669 (95% CI: 0.635-0.702). In contrast, with radiographs, the ResNet18 model excelled with an AUC of 0.809 (95% CI: 0.776-0.841). We then employed two fusion methods. In the averaging fusion method, the two models were combined to reach a diagnostic decision. Specifically, a soft voting scheme was used to average the probabilities predicted by each model, resulting in an AUC of 0.877 (95% CI: 0.846-0.908). In the stacking fusion method, a meta-model was built based on the predictions of the two optimal models. This approach notably enhanced the overall predictive performance, with LightGBM emerging as the top performer, achieving an AUC of 0.897 (95% CI: 0.869-0.925). Both fusion methods demonstrated excellent performance.
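The two fusion strategies above can be sketched with illustrative per-case probabilities. The stacking meta-model here is a hand-set logistic combiner standing in for the fitted LightGBM model; the weights are placeholders, not fitted values.

```python
import math

# Sketch of the two fusion strategies: soft-voting average and stacking.
# Probabilities and meta-model weights are illustrative placeholders.

def soft_vote(p_us, p_xray):
    """Averaging fusion: mean of the two models' recurrence probabilities."""
    return (p_us + p_xray) / 2

def stack(p_us, p_xray, w=(2.0, 3.0), bias=-2.5):
    """Stacking fusion: a meta-model over the base models' predictions.
    A hand-set logistic combiner stands in for the fitted LightGBM model."""
    z = w[0] * p_us + w[1] * p_xray + bias
    return 1 / (1 + math.exp(-z))

p_vgg11, p_resnet18 = 0.40, 0.80   # hypothetical per-case outputs
print(soft_vote(p_vgg11, p_resnet18))
print(round(stack(p_vgg11, p_resnet18), 3))
```

Soft voting needs no extra training, while stacking fits a second-level learner on held-out base-model predictions, which is why it edged out averaging in the reported AUCs.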
CONCLUSIONS: Deep learning algorithms developed using multimodal medical imaging may help predict recurrent intussusception.
CLINICAL TRIAL NUMBER: Not applicable.
PMID:40033220 | DOI:10.1186/s12880-025-01582-8
Sugarcane leaf disease classification using deep neural network approach
BMC Plant Biol. 2025 Mar 4;25(1):282. doi: 10.1186/s12870-025-06289-0.
ABSTRACT
OBJECTIVE: The objective is to develop a reliable deep learning (DL)-based model that can accurately diagnose sugarcane leaf diseases, addressing the challenges posed by the traditional approach of manual diagnosis and thereby improving disease control and sugarcane production.
METHODS: To identify diseases in sugarcane leaves, this study used EfficientNet architectures along with other well-known convolutional neural network (ConvNet) models such as DenseNet201, ResNetV2, InceptionV4, MobileNetV3, and RegNetX. The models were trained and tested on the Sugarcane Leaf Dataset (SLD), which consists of 6748 images of healthy and diseased leaves across 11 disease classes. To provide a valid evaluation of the proposed models, the dataset was split into training (70%), validation (15%), and testing (15%) subsets. The models were assessed not only in terms of accuracy but also with respect to their complexity and depth.
RESULTS: EfficientNet-B7 and DenseNet201 achieved the highest classification accuracy rates of 99.79% and 99.50%, respectively, among 14 models tested. To ensure a robust evaluation and reduce potential biases, 5-fold cross-validation was used, further validating the consistency and reliability of the models across different dataset partitions. Analysis revealed no direct correlation between model complexity, depth, and accuracy for the 11-class sugarcane dataset, emphasizing that optimal performance is not solely dependent on the model's architecture or depth but also on its adaptability to the dataset.
DISCUSSION: The study demonstrates the effectiveness of DL models, particularly EfficientNet-B7 and DenseNet201, for fast, accurate, and automatic disease detection in sugarcane leaves. These systems offer a significant improvement over traditional manual methods, enabling farmers and agricultural managers to make timely and informed decisions, ultimately reducing crop loss and enhancing overall sugarcane yield. This work highlights the transformative potential of DL in agriculture.
PMID:40033192 | DOI:10.1186/s12870-025-06289-0
Prediction of Lymph Node Metastasis in Lung Cancer Using Deep Learning of Endobronchial Ultrasound Images With Size on CT and PET-CT Findings
Respirology. 2025 Mar 3. doi: 10.1111/resp.70010. Online ahead of print.
ABSTRACT
BACKGROUND AND OBJECTIVE: Echo features of lymph nodes (LNs) influence target selection during endobronchial ultrasound-guided transbronchial needle aspiration (EBUS-TBNA). This study evaluates deep learning's diagnostic capabilities on EBUS images for detecting mediastinal LN metastasis in lung cancer, emphasising the added value of integrating a region of interest (ROI), LN size on CT, and PET-CT findings.
METHODS: We analysed 2901 EBUS images from 2055 mediastinal LN stations in 1454 lung cancer patients. ResNet18-based deep learning models were developed to classify images of true positive malignant and true negative benign LNs diagnosed by EBUS-TBNA using different inputs: original images, ROI images, and CT size and PET-CT data. Model performance was assessed using the area under the receiver operating characteristic curve (AUROC) and other diagnostic metrics.
RESULTS: The model using only original EBUS images showed the lowest AUROC (0.870) and accuracy (80.7%) in classifying LN images. Adding ROI information slightly increased the AUROC (0.896) without a significant difference (p = 0.110). Further adding CT size resulted in a minimal change in AUROC (0.897), while adding PET-CT (original + ROI + PET-CT) showed a significant improvement (0.912, p = 0.008 vs. original; p = 0.002 vs. original + ROI + CT size). The model combining original and ROI EBUS images with CT size and PET-CT findings achieved the highest AUROC (0.914, p = 0.005 vs. original; p = 0.018 vs. original + ROI + PET-CT) and accuracy (82.3%).
CONCLUSION: Integrating an ROI, LN size on CT, and PET-CT findings into the deep learning analysis of EBUS images significantly enhances the diagnostic capability of models for detecting mediastinal LN metastasis in lung cancer, with the integration of PET-CT data having a substantial impact.
PMID:40033122 | DOI:10.1111/resp.70010
Deep-learning enabled combined measurement of tumour cell density and tumour infiltrating lymphocyte density as a prognostic biomarker in colorectal cancer
BJC Rep. 2025 Mar 3;3(1):12. doi: 10.1038/s44276-025-00123-8.
ABSTRACT
BACKGROUND: Within the colorectal cancer (CRC) tumour microenvironment, tumour infiltrating lymphocytes (TILs) and tumour cell density (TCD) are recognised prognostic markers. Measurement of TILs and TCD using deep-learning (DL) on haematoxylin and eosin (HE) whole slide images (WSIs) could aid management.
METHODS: HE WSIs from the primary tumours of 127 CRC patients were included. DL was used to quantify TILs across different regions of the tumour and TCD at the luminal surface. The relationship between TILs, TCD, and cancer-specific survival was analysed.
RESULTS: Median TIL density was higher at the invasive margin than the luminal surface (963 vs 795 TILs/mm2, P = 0.010). TILs and TCD were independently prognostic in multivariate analyses (HR 4.28, 95% CI 1.87-11.71, P = 0.004; HR 2.72, 95% CI 1.19-6.17, P = 0.017, respectively). Patients with both low TCD and low TILs had the poorest survival (HR 10.0, 95% CI 2.51-39.78, P = 0.001), when compared to those with a high TCD and TILs score.
CONCLUSIONS: DL derived TIL and TCD score were independently prognostic in CRC. Patients with low TILs and TCD are at the highest risk of cancer-specific death. DL quantification of TILs and TCD could be used in combination alongside other validated prognostic biomarkers in routine clinical practice.
PMID:40033106 | DOI:10.1038/s44276-025-00123-8
An intelligent framework for skin cancer detection and classification using fusion of Squeeze-Excitation-DenseNet with Metaheuristic-driven ensemble deep learning models
Sci Rep. 2025 Mar 3;15(1):7425. doi: 10.1038/s41598-025-92293-1.
ABSTRACT
Skin cancer is among the most common and critical forms of cancer worldwide. If not diagnosed and treated in a timely manner, its effects can range from disfigurement to major medical expenditure and even death. Conventional skin cancer recognition requires a complete physical examination by a specialist, which can be time-consuming. Computer-aided diagnostic methods have gained massive popularity due to their efficiency and effectiveness, and can assist dermatologists in the initial recognition of skin cancer, which is significant for early diagnosis. An automatic classification model utilizing deep learning (DL) can help doctors identify the kind of skin lesion and improve patient outcomes. The classification of skin cancer is a hot topic in the research field, alongside the development of DL architectures. This manuscript designs and develops a Detection of Skin Cancer Using an Ensemble Deep Learning Model and Gray Wolf Optimization (DSC-EDLMGWO) method for the recognition and classification of skin cancer in biomedical imaging. The presented DSC-EDLMGWO model first performs image preprocessing at two levels: contrast enhancement using the CLAHE method and noise removal using a Wiener filter (WF). It then extracts features with SE-DenseNet, a fusion of the squeeze-and-excitation (SE) module and DenseNet. For classification, an ensemble of DL models is employed: the long short-term memory (LSTM) technique, the extreme learning machine (ELM) model, and the stacked sparse denoising autoencoder (SSDA) method. Finally, the gray wolf optimization (GWO) method tunes the ensemble models' hyperparameter values, yielding better classification performance. The effectiveness of the DSC-EDLMGWO approach was evaluated on benchmark image databases across various performance metrics, attaining superior accuracy values of 98.38% and 98.17% on the HAM10000 and ISIC datasets, respectively, compared with other techniques.
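The GWO hyperparameter search mentioned above can be sketched with a minimal grey wolf optimizer. Here it minimises a toy sphere function standing in for the real ensemble's validation loss; the search ranges and objective are illustrative assumptions.

```python
import random

# Minimal grey wolf optimization (GWO) sketch for hyperparameter tuning.
# The sphere objective is a placeholder for the ensemble's validation loss.
random.seed(0)

def fitness(x):
    return sum(v * v for v in x)  # placeholder objective (minimise)

def gwo(dim=2, wolves=20, iters=60, lo=-5.0, hi=5.0):
    pack = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(wolves)]
    for t in range(iters):
        pack.sort(key=fitness)
        alpha, beta, delta = pack[0], pack[1], pack[2]  # three best wolves
        a = 2 - 2 * t / iters  # exploration coefficient decays from 2 to 0
        for w in range(wolves):
            new = []
            for d in range(dim):
                x = 0.0
                for leader in (alpha, beta, delta):
                    r1, r2 = random.random(), random.random()
                    A, C = 2 * a * r1 - a, 2 * r2
                    x += leader[d] - A * abs(C * leader[d] - pack[w][d])
                new.append(min(hi, max(lo, x / 3)))  # average of three pulls
            pack[w] = new
    return min(pack, key=fitness)

best = gwo()
print(fitness(best))
```

In the real method, each wolf position encodes a hyperparameter vector for the LSTM/ELM/SSDA ensemble, and fitness is the resulting classification error.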
PMID:40033075 | DOI:10.1038/s41598-025-92293-1
Improving accuracy for inferior alveolar nerve segmentation with multi-label of anatomical adjacent structures using active learning in cone-beam computed tomography
Sci Rep. 2025 Mar 3;15(1):7441. doi: 10.1038/s41598-025-91725-2.
ABSTRACT
Recent advancements in deep learning have revolutionized digital dentistry, highlighting the importance of precise dental segmentation. This study leverages active learning with the three-dimensional (3D) nnU-Net and multi-label training to improve segmentation accuracy for dental anatomies important for implant planning, including the maxillary sinuses, maxilla, mandible, and inferior alveolar nerves (IAN), in 3D cone-beam computed tomography (CBCT) scans. Segmentation accuracy was compared across single-label, adjacent pair-label, and multi-label models of relevant anatomic structures using 60 CBCT scans from Kooalldam Dental Hospital, with external validation on data from Seoul National University Dental Hospital. The dataset was divided into three training stages for active learning. Evaluation metrics were the Dice similarity coefficient (DSC) and mean absolute difference. The overall internal test set DSCs for the multi-label, single-label, and pair-label models were 95%, 91% (paired t-test; p = 0.01), and 93% (p = 0.03), respectively. For the IAN, the internal and external DSCs increased from 83% and 79% (single-label) to 87% and 81% (pair-label), and to 90% and 86% (multi-label), respectively (all p = 0.01). Prediction accuracy improved over time, significantly reducing manual correction time. Our active learning and multi-label strategies facilitated accurate automatic segmentation.
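The Dice similarity coefficient reported above is straightforward to compute; here is a sketch on toy binary masks represented as sets of voxel indices.

```python
# The Dice similarity coefficient (DSC) sketched on toy binary masks
# represented as sets of voxel indices.
def dice(pred, truth):
    """DSC = 2|A ∩ B| / (|A| + |B|)."""
    if not pred and not truth:
        return 1.0  # both masks empty: perfect agreement by convention
    return 2 * len(pred & truth) / (len(pred) + len(truth))

pred  = {(0, 0), (0, 1), (1, 0)}
truth = {(0, 0), (0, 1), (1, 1)}
print(dice(pred, truth))  # 2*2 / (3+3) = 0.666...
```

On real CBCT volumes the masks are 3D voxel arrays, but the formula is identical.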
PMID:40033040 | DOI:10.1038/s41598-025-91725-2
Quantitative analysis and evaluation of winter and summer landscape colors in the Yangzhou ancient Canal utilizing deep learning
Sci Rep. 2025 Mar 3;15(1):7500. doi: 10.1038/s41598-025-91483-1.
ABSTRACT
Color is an important index for human visual evaluation of landscape, and it is a key factor affecting people's recognition and experience of heritage landscapes. In this study, five important sites of the Yangzhou Grand Canal were selected for color quantification analysis using a deep learning (DL) scene parsing algorithm. The color characteristics of the winter and summer landscapes of the five sites were evaluated, along with their Scenic Beauty Estimation (SBE) values, and the correlation between color characteristics and SBE values was analysed to study the relationship between color and landscape beauty. The main results are as follows: (1) The dominant colors of the five sites are blue and green, and building color is mainly orange and yellow in both winter and summer. The dominant plant color is green in summer, changing in winter to yellow (Site 5: YZJGD) or cyan (Site 1: DGGD; Site 3: GZGD). (2) Overall color saturation is low in winter, with the percentage of Very Low Saturation reaching 80-98% at almost every site (except Site 5: YZJGD); summer has Medium Saturation colors, with Mid Saturation of the sky at Site 2 (GMS) reaching 44.87% in summer. (3) The landscapes have low brightness in winter and higher brightness in summer at all sites; the sky is the only category whose High Brightness value exceeds 50% in both seasons. In winter, Low and Medium Brightness predominate; in summer, the percentages of Medium and High Brightness increase. (4) Color diversity varies significantly across sites in winter but only slightly in summer; the highest plant color diversity is found at DGGD (diversity > 1.5). (5) In winter, the highest SBE value is at Site 2: GMS (0.5956) and the lowest at Site 5: YZJGD (-0.8216), a large gap of 1.4172. The highest average SBE value is at Site 2: GMS (0.5062), followed by Site 3: GZGD (0.2091), both above zero. (6) Correlation analysis revealed no statistically significant correlation between saturation and SBE values (p > 0.05), although the Pearson correlation coefficients of -0.625 (winter) and 0.689 (summer) suggest a strong association; likewise, no significant correlation was found between color diversity and SBE values (p > 0.05), despite Pearson coefficients of 0.807 (winter) and -0.747 (summer). This study provides an in-depth examination of Canal landscape color and is hoped to promote the systematic and scientific study of landscape colors, providing a theoretical basis for the scientific design of heritage landscape color.
PMID:40033036 | DOI:10.1038/s41598-025-91483-1
Initial findings creating a temperature prediction model using vibroacoustic signals originating from tissue needle interactions
Sci Rep. 2025 Mar 3;15(1):7393. doi: 10.1038/s41598-025-92202-6.
ABSTRACT
This research explores the acquisition and analysis of vibroacoustic signals generated during tissue-tool interactions, using a conventional aspiration needle enhanced with a proximally mounted MEMS audio sensor, to extract temperature information. Minimally invasive temperature monitoring is critical in thermotherapy applications, but current methods often rely on additional sensors or simulations of typical tissue behavior. In this study, a commercially available needle was inserted into water-saturated foams with temperatures ranging from 25 to 55 °C, varied in 5° increments. Given that temperature affects the speed of sound, water's heat capacity, and the mechanical properties of most tissues, it was hypothesized that the vibroacoustic signals recorded during needle insertion would carry temperature-dependent information. The acquired signals were segmented, processed, and analyzed using signal processing techniques and a deep learning algorithm. Results demonstrated that the audio signals contained distinct temperature-dependent features, enabling temperature prediction with a root mean squared error of approximately 3 °C. We present these initial laboratory findings, highlighting significant potential for refinement. This novel approach could pave the way for a real-time, minimally invasive method for thermal monitoring in medical applications.
PMID:40032997 | DOI:10.1038/s41598-025-92202-6
Model-based convolution neural network for 3D Near-infrared spectral tomography
IEEE Trans Med Imaging. 2025 Jan 14;PP. doi: 10.1109/TMI.2025.3529621. Online ahead of print.
ABSTRACT
Near-infrared spectral tomography (NIRST) is a non-invasive imaging technique that provides functional information about biological tissues. Due to diffuse light propagation in tissue and limited boundary measurements, NIRST image reconstruction presents an ill-posed and ill-conditioned computational problem that is difficult to solve. To address this challenge, we developed a reconstruction algorithm (Model-CNN) that integrates a diffusion equation model with a convolutional neural network (CNN). The CNN learns a regularization prior that restricts solutions to the space of desirable chromophore concentration images. Efficacy of Model-CNN was evaluated by training on numerical simulation data and then applying the network to physical phantom and clinical patient NIRST data. Results demonstrated the superiority of Model-CNN over the conventional Tikhonov regularization approach and a deep learning algorithm (FC-CNN) in terms of absolute bias error (ABE) and peak signal-to-noise ratio (PSNR). Specifically, in comparison to Tikhonov regularization, Model-CNN reduced average ABE by 55% for total hemoglobin (HbT) and by 70% for water (H2O) concentration, while improving PSNR by an average of 5.3 dB for both HbT and H2O images. Meanwhile, image processing time was reduced by 82% relative to Tikhonov regularization. Compared to FC-CNN, Model-CNN achieved a 91% reduction in ABE for HbT and 75% for H2O images, with increases in PSNR of 7.3 dB and 4.7 dB, respectively. Notably, this Model-CNN approach was not trained on patient data; instead, it was trained on simulated phantom data with simpler geometrical shapes and optical source-detector configurations, yet achieved superior image recovery when faced with real-world data.
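The two image-quality metrics reported above can be sketched on toy 1D "images"; the definitions below (mean absolute difference for ABE, standard dB formula for PSNR) are the conventional ones and are assumed to match the paper's usage.

```python
import math

# Sketch of the two image-quality metrics, computed on toy 1D "images".
def abe(recon, truth):
    """Absolute bias error: mean absolute difference from ground truth."""
    return sum(abs(r - t) for r, t in zip(recon, truth)) / len(truth)

def psnr(recon, truth, peak=1.0):
    """Peak signal-to-noise ratio in dB: 10*log10(peak^2 / MSE)."""
    mse = sum((r - t) ** 2 for r, t in zip(recon, truth)) / len(truth)
    return 10 * math.log10(peak ** 2 / mse)

truth = [0.2, 0.4, 0.6, 0.8]
recon = [0.25, 0.35, 0.65, 0.75]
print(abe(recon, truth))          # ≈ 0.05
print(round(psnr(recon, truth), 2))
```

Lower ABE and higher PSNR both indicate a reconstruction closer to the ground-truth chromophore map.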
PMID:40031020 | DOI:10.1109/TMI.2025.3529621
Combining Pre- and Post-Demosaicking Noise Removal for RAW Video
IEEE Trans Image Process. 2025 Jan 15;PP. doi: 10.1109/TIP.2025.3527886. Online ahead of print.
ABSTRACT
Denoising is one of the fundamental steps of the processing pipeline that converts data captured by a camera sensor into a display-ready image or video. It is generally performed early in the pipeline, usually before demosaicking, although studies swapping their order or even conducting them jointly have been proposed. With the advent of deep learning, the quality of denoising algorithms has steadily increased. Even so, modern neural networks still have a hard time adapting to new noise levels and scenes, which is indispensable for real-world applications. With those in mind, we propose a self-similarity-based denoising scheme that weights both a pre- and a post-demosaicking denoiser for Bayer-patterned CFA video data. We show that a balance between the two leads to better image quality, and we empirically find that higher noise levels benefit from a higher influence pre-demosaicking. We also integrate temporal trajectory prefiltering steps before each denoiser, which further improve texture reconstruction. The proposed method only requires an estimation of the noise model at the sensor, accurately adapts to any noise level, and is competitive with the state of the art, making it suitable for real-world videography.
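The weighting idea described above can be sketched as a per-pixel blend of the pre- and post-demosaicking denoised estimates, with the pre-demosaicking result gaining influence as the estimated sensor noise level rises. The linear weight curve below is an illustrative assumption, not the paper's fitted function.

```python
# Sketch: blend a pre-demosaicking and a post-demosaicking denoised estimate,
# weighting the pre-demosaicking result more at higher noise levels.
# The linear weight curve is an illustrative assumption.
def blend_weight(sigma, sigma_max=50.0):
    """Fraction of the pre-demosaicking denoiser in the final estimate."""
    return min(1.0, sigma / sigma_max)

def fuse(pre_pixel, post_pixel, sigma):
    w = blend_weight(sigma)
    return w * pre_pixel + (1 - w) * post_pixel

# Low noise leans on the post-demosaicking result; high noise on the pre- one.
print(fuse(pre_pixel=100.0, post_pixel=120.0, sigma=5.0))
print(fuse(pre_pixel=100.0, post_pixel=120.0, sigma=45.0))
```

In the full method this blend operates on whole frames after temporal trajectory prefiltering, with sigma coming from the sensor noise model estimate.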
PMID:40031011 | DOI:10.1109/TIP.2025.3527886
Torsion Graph Neural Networks
IEEE Trans Pattern Anal Mach Intell. 2025 Jan 13;PP. doi: 10.1109/TPAMI.2025.3528449. Online ahead of print.
ABSTRACT
Geometric deep learning (GDL) models have demonstrated great potential for the analysis of non-Euclidean data. They are developed to incorporate the geometric and topological information of non-Euclidean data into end-to-end deep learning architectures. Motivated by the recent success of discrete Ricci curvature in graph neural networks (GNNs), we propose TorGNN, an analytic-torsion-enhanced graph neural network model. The essential idea is to characterize graph local structures with an analytic-torsion-based weight formula. Mathematically, analytic torsion is a topological invariant that can distinguish spaces which are homotopy equivalent but not homeomorphic. In our TorGNN, a local simplicial complex is identified for each edge, the analytic torsion of this local complex is calculated, and the result is used as the weight of that edge in the message-passing process. Our TorGNN model is validated on link prediction tasks from sixteen different types of networks and node classification tasks from four types of networks. TorGNN achieves superior performance on both tasks and outperforms various state-of-the-art models. This demonstrates that analytic torsion is a highly efficient topological invariant for characterizing graph structures and can significantly boost the performance of GNNs.
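Edge-weighted message passing of this kind can be sketched in a few lines. In the sketch below, `edge_weight(u, v)` is a placeholder for the analytic torsion of the local simplicial complex around edge (u, v); computing true analytic torsion requires the Laplacians of that complex and is omitted here.

```python
import numpy as np

def message_passing(features, edges, edge_weight):
    """One round of edge-weighted aggregation on an undirected graph.

    Each node receives its neighbors' features scaled by a per-edge weight
    (standing in for the analytic torsion term), normalized by the total
    incoming weight.
    """
    n, d = features.shape
    agg = np.zeros((n, d))
    deg = np.zeros(n)
    for u, v in edges:
        w = edge_weight(u, v)
        agg[u] += w * features[v]    # weighted message from v to u
        agg[v] += w * features[u]    # and from u to v (undirected)
        deg[u] += w
        deg[v] += w
    deg[deg == 0] = 1.0              # isolated nodes keep a zero message
    return agg / deg[:, None]        # weight-normalized aggregation

# Toy path graph 0-1-2; uniform weights reduce to mean-neighbor aggregation.
X = np.array([[1.0], [2.0], [4.0]])
E = [(0, 1), (1, 2)]
out = message_passing(X, E, lambda u, v: 1.0)
```

In a full GNN layer this aggregation would be followed by a learned transformation and nonlinearity; the torsion weights simply replace the uniform weights above.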
PMID:40030998 | DOI:10.1109/TPAMI.2025.3528449
Latent Weight Quantization for Integerized Training of Deep Neural Networks
IEEE Trans Pattern Anal Mach Intell. 2025 Jan 9;PP. doi: 10.1109/TPAMI.2025.3527498. Online ahead of print.
ABSTRACT
Existing methods for integerized training speed up deep learning by using low-bitwidth integerized weights, activations, gradients, and optimizer buffers. However, they overlook the issue of full-precision latent weights, which consume excessive memory to accumulate gradient-based updates for optimizing the integerized weights. In this paper, we propose the first latent weight quantization schema for general integerized training, which minimizes quantization perturbation to the training process via residual quantization with an optimized dual quantizer. We leverage residual quantization to eliminate the correlation between the latent weight and the integerized weight, suppressing quantization noise. We further propose a dual quantizer with an optimal nonuniform codebook that avoids frozen weights and ensures a training trajectory as statistically unbiased as that of full-precision latent weights. The codebook is optimized to minimize the disturbance to weight updates under importance guidance and is realized with a three-segment polyline approximation for hardware-friendly implementation. Extensive experiments show that the proposed schema allows integerized training with latent weights as low as 4-bit for various architectures, including ResNets, MobileNetV2, and Transformers, with negligible performance loss in image classification and text generation. Furthermore, we successfully fine-tune large language models with up to 13 billion parameters on a single GPU using the proposed schema.
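The residual-quantization idea can be sketched as storing each latent weight as an integerized weight plus a separately quantized residual. The sketch below is a simplification: it uses uniform per-tensor scales and a uniform residual codebook, whereas the paper optimizes a nonuniform dual quantizer with a polyline approximation.

```python
import numpy as np

def quantize_uniform(x, bits, scale):
    """Round to a signed integer grid with the given scale."""
    qmax = 2 ** (bits - 1) - 1
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

def store_latent(latent, weight_bits=8, residual_bits=4):
    """Store a latent weight as integerized weight + quantized residual.

    Quantizing the residual (latent minus integerized weight) separately
    decorrelates the stored error from the integerized weight, the key idea
    behind residual quantization for latent weights.
    """
    w_scale = np.abs(latent).max() / (2 ** (weight_bits - 1) - 1) + 1e-12
    w_int = quantize_uniform(latent, weight_bits, w_scale)
    residual = latent - w_int
    r_scale = np.abs(residual).max() / (2 ** (residual_bits - 1) - 1) + 1e-12
    r_int = quantize_uniform(residual, residual_bits, r_scale)
    return w_int, r_int

w = np.random.default_rng(1).normal(size=1000)
w_int, r_int = store_latent(w)
recon = w_int + r_int            # reconstructed latent weight
```

Because the residual occupies a much smaller dynamic range than the weight itself, even a coarse low-bit residual grid recovers the latent weight far more faithfully than the integerized weight alone.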
PMID:40030978 | DOI:10.1109/TPAMI.2025.3527498
Learning-Based Modeling and Predictive Control for Unknown Nonlinear System With Stability Guarantees
IEEE Trans Neural Netw Learn Syst. 2025 Jan 10;PP. doi: 10.1109/TNNLS.2024.3525264. Online ahead of print.
ABSTRACT
This work focuses on the safety of learning-based control for unknown nonlinear systems, considering the stability of the learned dynamics and the modeling mismatch between the learned dynamics and the true dynamics. A learning-based scheme imposing a stability constraint is proposed for modeling and stable control of unknown nonlinear systems. Specifically, a linear representation of the unknown nonlinear dynamics is established using Koopman theory. A deep learning approach is then utilized to approximate the embedding functions of the Koopman operator for the unknown system. For safe operation of the proposed scheme in real-world applications, a stability constraint on the learned dynamics and a Lipschitz constraint on the embedding functions are imposed so that the learned model is stable for prediction and control. Moreover, a robust predictive control scheme is adopted to eliminate the effect of the modeling mismatch between the learned dynamics and the true dynamics, such that stabilization of the unknown nonlinear system is achieved. Finally, the effectiveness of the proposed scheme is demonstrated on a tethered space robot (TSR) with unknown nonlinear dynamics.
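The Koopman lifting step can be illustrated with an EDMD-style least-squares fit. The sketch below uses hand-picked embedding functions and a toy system in place of the deep-learned embeddings and stability/Lipschitz constraints of the paper; the toy dynamics are chosen to be exactly linear in the lifted coordinates so the fit is exact.

```python
import numpy as np

def lift(x):
    """Hand-picked embedding functions standing in for the learned deep
    Koopman embeddings: the state plus one nonlinear feature."""
    return np.array([x[0], x[1], x[0] ** 2])

def fit_koopman(pairs):
    """EDMD-style least-squares fit of a linear operator in lifted space."""
    Z = np.array([lift(x) for x, _ in pairs]).T
    Zp = np.array([lift(xn) for _, xn in pairs]).T
    return Zp @ np.linalg.pinv(Z)      # K minimizing ||K Z - Z'||_F

# Toy nonlinear system that closes exactly under the chosen lifting.
def step(x):
    return np.array([0.9 * x[0], 0.8 * x[1] + 0.1 * x[0] ** 2])

rng = np.random.default_rng(0)
pairs = [(x, step(x)) for x in rng.normal(size=(100, 2))]
K = fit_koopman(pairs)                 # learned linear predictor
x0 = np.array([0.5, -0.2])
pred = (K @ lift(x0))[:2]              # one-step prediction of the state
```

Once the dynamics are linear in the lifted coordinates, standard linear predictive control machinery applies; the paper additionally constrains the learned operator so the lifted model is provably stable.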
PMID:40030974 | DOI:10.1109/TNNLS.2024.3525264
Irregular Artificial Vision Optimization Strategies Based on Transformer Saliency Detection
IEEE J Biomed Health Inform. 2025 Jan 10;PP. doi: 10.1109/JBHI.2024.3524642. Online ahead of print.
ABSTRACT
To improve object recognition under artificial prosthetic vision, this study proposes a two-stage method. The first stage extracts a saliency mask and an edge mask of the object (SMP, EMP). The irregular visual information of the object is then processed using Irregularity Correction (IC). We designed eye-hand coordination tasks and simulated artificial vision with retinal prostheses to validate the effectiveness of the strategies, with direct pixelation (DP) as a control group. Each subject retained a phosphene map with the same stochastic pattern across all of his or her trials. Real-time experimental results showed that the deep-saliency-based optimization strategies improved subjects' task performance in terms of head movement, recognition accuracy, response time, and success counts for small-object recognition. Under the SMP strategy, subjects had the smallest average head movement (76.53 deg ± 20.75 deg), higher average object recognition accuracy (91.18% ± 2.52%), less time to finish the task (35.71 s ± 8.66 s), and more successful searches for small target objects (1.35 ± 0.33). When integrated with IC, subjects' average performance further improved to 63.39 ± 15.38 deg, 94.22% ± 3.94%, 25.76 s ± 6.24 s, and 1.05 ± 0.30, respectively, which also significantly outperformed the DP condition. These results indicate that with deep-learning-based saliency detection and IC processing, subjects could shorten the search process and discern target objects more reliably. This work could inform future prosthetic devices that incorporate artificial intelligence techniques.
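The contrast between direct pixelation and saliency-masked rendering can be sketched with a simulated phosphene grid. The sketch below is a simplified illustration: the binary mask stands in for the Transformer-derived saliency map, and edge masking, irregularity correction, and realistic phosphene dropout are omitted.

```python
import numpy as np

def phosphene_render(img, grid=(8, 8), mask=None):
    """Downsample an image to a phosphene grid (simulated prosthetic vision).

    mask=None reproduces direct pixelation (DP); passing a saliency mask
    first suppresses the background, a simplified stand-in for the SMP
    strategy.
    """
    x = img if mask is None else img * mask
    h, w = x.shape
    gh, gw = grid
    # Average each grid cell into one phosphene brightness level.
    return x.reshape(gh, h // gh, gw, w // gw).mean(axis=(1, 3))

img = np.zeros((32, 32))
img[12:20, 12:20] = 1.0                  # bright target object
img += 0.3                               # uniform background clutter
sal = np.zeros((32, 32))
sal[8:24, 8:24] = 1.0                    # binary saliency mask around the target
dp = phosphene_render(img)               # direct pixelation (control)
smp = phosphene_render(img, mask=sal)    # saliency-masked pixelation
```

Masking leaves the target phosphenes at full brightness while zeroing the clutter outside the salient region, which is the mechanism behind the reduced search effort reported above.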
PMID:40030970 | DOI:10.1109/JBHI.2024.3524642