Deep learning
Parametric-MAA: fast, object-centric avoidance of metal artifacts for intraoperative CBCT
Int J Comput Assist Radiol Surg. 2025 Apr 5. doi: 10.1007/s11548-025-03348-7. Online ahead of print.
ABSTRACT
PURPOSE: Metal artifacts remain a persistent issue in intraoperative CBCT imaging. Particularly in orthopedic and trauma applications, these artifacts obstruct clinically relevant areas around the implant, reducing the modality's clinical value. Metal artifact avoidance (MAA) methods have shown potential to improve image quality through trajectory adjustments, but often fail in clinical practice due to their focus on irrelevant objects and high computational demands. To address these limitations, we introduce the novel parametric metal artifact avoidance (P-MAA) method.
METHODS: The P-MAA method first detects keypoints in two scout views using a deep learning model. These keypoints are used to model each clinically relevant object as an ellipsoid, capturing its position, extent, and orientation. We hypothesize that fine details of object shapes are less critical for artifact reduction. Based on these ellipsoidal representations, we devise a computationally efficient metric for scoring view trajectories, enabling fast, CPU-based optimization. A detection model for object localization was trained using both simulated and real data and validated on real clinical cases. The scoring method was benchmarked against a raytracing-based approach.
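As an illustration of the ellipsoid-based scoring idea described above, the sketch below computes the analytic path length of a source-to-target ray through an ellipsoid and sums these lengths over metal objects as a simple view score. It is a minimal reconstruction under assumed conventions (quadratic-form ellipsoids, rays aimed at ROI centres), not the authors' P-MAA metric.

```python
import numpy as np

def ellipsoid_matrix(semi_axes, rotation):
    """Quadratic-form matrix A so that (x - c)^T A (x - c) <= 1 describes the ellipsoid."""
    D = np.diag(1.0 / np.asarray(semi_axes, float) ** 2)
    return rotation @ D @ rotation.T

def chord_length(source, direction, center, A):
    """Analytic length of the forward ray segment inside one ellipsoid (0 if missed)."""
    d = direction / np.linalg.norm(direction)
    p = source - center
    a = d @ A @ d
    b = 2.0 * d @ A @ p
    c = p @ A @ p - 1.0
    disc = b * b - 4.0 * a * c
    if disc <= 0.0:
        return 0.0
    t1 = (-b - np.sqrt(disc)) / (2.0 * a)
    t2 = (-b + np.sqrt(disc)) / (2.0 * a)
    return max(t2, 0.0) - max(t1, 0.0)   # only the part in front of the source

def view_score(source, roi_centers, metal_ellipsoids):
    """Sum of metal path lengths along rays from the X-ray source to each ROI centre."""
    score = 0.0
    for roi in roi_centers:
        ray = np.asarray(roi, float) - np.asarray(source, float)
        for center, A in metal_ellipsoids:
            score += chord_length(np.asarray(source, float), ray, center, A)
    return score
```

Lower-scoring source positions would then be favoured during CPU-based trajectory optimization, in the spirit of the fast scoring described above.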
RESULTS: The trained detection model achieved a mean average recall of 0.78, demonstrating generalizability to unseen clinical cases. The ellipsoid-based scoring method closely approximated results using raytracing and was effective in complex clinical scenarios. Additionally, the ellipsoid method provided a 33-fold increase in speed, without the need for GPU acceleration.
CONCLUSION: The P-MAA approach provides a feasible solution for metal artifact avoidance in CBCT imaging, enabling fast trajectory optimization while focusing on clinically relevant objects. This method represents a significant step toward practical intraoperative implementation of MAA techniques.
PMID:40186717 | DOI:10.1007/s11548-025-03348-7
A magnetic resonance image-based deep learning radiomics nomogram for hepatocyte cytokeratin 7 expression: application to predict cholestasis progression in children with pancreaticobiliary maljunction
Pediatr Radiol. 2025 Apr 5. doi: 10.1007/s00247-025-06225-2. Online ahead of print.
ABSTRACT
BACKGROUND: Hepatocyte cytokeratin 7 (CK7) is a reliable marker for evaluating the severity of cholestasis in chronic cholestatic cholangiopathies. However, there is currently no noninvasive test available to assess the status of hepatocyte CK7 in pancreaticobiliary maljunction patients.
OBJECTIVE: We aimed to develop a deep learning radiomics nomogram using magnetic resonance images (MRIs) to preoperatively identify the hepatocyte CK7 status and assess cholestasis progression in patients with pancreaticobiliary maljunction.
MATERIALS AND METHODS: In total, 180 pancreaticobiliary maljunction patients were retrospectively enrolled and randomly divided into a training cohort (n = 144) and a validation cohort (n = 36). CK7 status was determined through immunohistochemical analysis. Pyradiomics and a pretrained ResNet50 were used to extract radiomics and deep learning features, respectively. To construct the radiomics and deep learning signatures, feature selection methods including minimum redundancy-maximum relevance and the least absolute shrinkage and selection operator were employed. The integrated deep learning radiomics nomogram model was constructed by combining the imaging signatures with the valuable clinical feature.
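The abstract does not provide code; as a rough sketch of the LASSO-style selection step on combined radiomics and deep features, the following uses an L1-penalised logistic model from scikit-learn (the mRMR pre-filtering step and the exact signature construction in the paper are not reproduced here).

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegressionCV

# X: rows = patients, columns = concatenated radiomics + deep (ResNet50) features
# y: CK7 status from immunohistochemistry (0 = low, 1 = high); both assumed precomputed
def lasso_signature(X, y, n_folds=5, seed=0):
    """Select features with an L1-penalised logistic model and return a per-patient score."""
    Xz = StandardScaler().fit_transform(X)
    model = LogisticRegressionCV(
        Cs=20, cv=n_folds, penalty="l1", solver="liblinear",
        scoring="roc_auc", random_state=seed,
    ).fit(Xz, y)
    selected = np.flatnonzero(model.coef_.ravel())               # surviving feature indices
    signature = Xz @ model.coef_.ravel() + model.intercept_[0]   # linear "rad-score"
    return selected, signature
```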
RESULTS: The deep learning signature exhibited superior predictive performance compared with the radiomics signature, as evidenced by a higher area under the curve (AUC) value in the validation cohort (0.92 vs. 0.81). Furthermore, the deep learning radiomics nomogram, which incorporated the radiomics signature, deep learning signature, and Komi classification, demonstrated excellent predictive ability for CK7 expression, with an AUC of 0.95 in the validation cohort.
CONCLUSION: The proposed deep learning radiomics nomogram exhibits promising performance in accurately identifying hepatic CK7 expression, thus facilitating prediction of cholestasis progression and perhaps earlier initiation of treatment in pancreaticobiliary maljunction children.
PMID:40186654 | DOI:10.1007/s00247-025-06225-2
Deep learning-based denoising image reconstruction of body magnetic resonance imaging in children
Pediatr Radiol. 2025 Apr 5. doi: 10.1007/s00247-025-06230-5. Online ahead of print.
ABSTRACT
BACKGROUND: Radial k-space sampling is widely employed in paediatric magnetic resonance imaging (MRI) to mitigate motion and aliasing artefacts. Artificial intelligence (AI)-based image reconstruction has been developed to enhance image quality and accelerate acquisition time.
OBJECTIVE: To assess image quality of deep learning (DL)-based denoising image reconstruction of body MRI in children.
MATERIALS AND METHODS: Children who underwent thoraco-abdominal MRI employing a radial k-space filling technique (PROPELLER) with conventional and DL-based image reconstruction between April 2022 and January 2023 were eligible for this retrospective study. Only cases with previous MRI including comparable PROPELLER sequences with conventional image reconstruction were selected. Image quality was compared between DL-reconstructed axial T1-weighted and T2-weighted images and conventionally reconstructed images from the same PROPELLER acquisition. Quantitative image quality was assessed by signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) of the liver and spleen. Qualitative image quality was evaluated by three observers using a 4-point Likert scale and included presence of noise, motion artefact, depiction of peripheral lung vessels and subsegmental bronchi at the lung bases, sharpness of abdominal organ borders, and visibility of liver and spleen vessels. Image quality was compared with the Wilcoxon signed-rank test. Scan time was compared with that of the prior MRI obtained with conventional image reconstruction.
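A minimal sketch of how ROI-based SNR and CNR of the liver and spleen might be computed from an image and binary masks; the paper's exact noise definition is not stated in the abstract, so the background standard deviation is used here as an assumed noise estimate.

```python
import numpy as np

def roi_stats(image, mask):
    """Mean and standard deviation of the signal inside a region of interest."""
    vals = image[mask]
    return vals.mean(), vals.std()

def snr_cnr(image, organ_mask, background_mask):
    """SNR of the organ and CNR between organ and background, using the
    background standard deviation as the noise estimate (an assumption)."""
    organ_mean, _ = roi_stats(image, organ_mask)
    bg_mean, bg_std = roi_stats(image, background_mask)
    snr = organ_mean / bg_std
    cnr = abs(organ_mean - bg_mean) / bg_std
    return snr, cnr
```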
RESULTS: In the 21 children included (median age 7 years, range 1.5-15.8 years), the SNR and CNR of the liver and spleen on T1-weighted and T2-weighted images were significantly higher with DL reconstruction than with conventional reconstruction (P<0.001). The DL-reconstructed images showed higher overall image quality, with improved delineation of the peripheral vessels and the subsegmental bronchi in the lung bases, sharper abdominal organ margins, and increased visibility of the peripheral vessels in the liver and spleen. Non-respiratory-gated DL-reconstructed T1-weighted images demonstrated more pronounced respiratory motion artefacts than conventional reconstruction (P=0.015), whereas there was no difference for the respiratory-gated T2-weighted images. The median scan time per slice was reduced from 6.3 s (interquartile range, 4.2-7.0 s) to 4.8 s (interquartile range, 4.4-4.9 s) for the T1-weighted images and from 5.6 s (interquartile range, 5.4-5.9 s) to 4.2 s (interquartile range, 3.9-4.8 s) for the T2-weighted images.
CONCLUSION: DL-based denoising image reconstruction of paediatric body MRI sequences employing radial k-space sampling allowed for improved overall image quality at shorter scan times. Respiratory motion artefacts were more pronounced on ungated T1-weighted images.
PMID:40186652 | DOI:10.1007/s00247-025-06230-5
Classification of ocular surface diseases: Deep learning for distinguishing ocular surface squamous neoplasia from pterygium
Graefes Arch Clin Exp Ophthalmol. 2025 Apr 5. doi: 10.1007/s00417-025-06804-x. Online ahead of print.
ABSTRACT
PURPOSE: Given the significance and potential risks associated with Ocular Surface Squamous Neoplasia (OSSN) and the importance of its differentiation from other conditions, we aimed to develop a Deep Learning (DL) model differentiating OSSN from pterygium (PTG) using slit photographs.
METHODS: A dataset comprising slit photographs of 162 patients, including 77 images of OSSN and 85 images of PTG, was assembled. After manual segmentation of the images, a Python-based transfer learning approach utilizing the EfficientNet B7 network was employed for automated image segmentation. GoogleNet, a pre-trained neural network, was used to categorize the images into OSSN or PTG. To evaluate the performance of our DL model, 10-fold cross-validation was implemented, and various performance metrics were measured.
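A hedged sketch of the classification stage: a GoogLeNet backbone from torchvision with its final layer replaced for the two-class OSSN-versus-PTG task, plus stratified 10-fold splitting. Training loops, the EfficientNet B7 segmentation branch, and preprocessing are omitted, and the exact configuration may differ from the authors'.

```python
import torch.nn as nn
from torchvision import models
from sklearn.model_selection import StratifiedKFold

def make_classifier(num_classes=2):
    """GoogLeNet pretrained on ImageNet with the final layer replaced for OSSN vs. PTG."""
    net = models.googlenet(weights=models.GoogLeNet_Weights.IMAGENET1K_V1)
    net.fc = nn.Linear(net.fc.in_features, num_classes)
    return net

def kfold_indices(labels, n_splits=10, seed=0):
    """Stratified 10-fold train/test index pairs over image-level labels (0 = PTG, 1 = OSSN)."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return list(skf.split(range(len(labels)), labels))
```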
RESULTS: There was a statistically significant difference in mean age between the OSSN (63.23 ± 13.74 years) and PTG (47.18 ± 11.53 years) groups (P < 0.001). Furthermore, 84.41% of patients in the OSSN group and 80.00% of patients in the PTG group were male. Our classification model, trained on automatically segmented images, demonstrated reliable performance in distinguishing OSSN from PTG, with an area under the curve (AUC) of 98%; sensitivity, F1 score, and accuracy of 94% each; and a Matthews correlation coefficient (MCC) of 88%.
CONCLUSIONS: This study presents a novel DL model that effectively segments and classifies OSSN from PTG images with a relatively high accuracy. In addition to its clinical use, this model can be potentially used as a telemedicine application.
PMID:40186633 | DOI:10.1007/s00417-025-06804-x
A deep learning model for multiclass tooth segmentation on cone-beam computed tomography scans
Am J Orthod Dentofacial Orthop. 2025 Apr 5:S0889-5406(25)00101-5. doi: 10.1016/j.ajodo.2025.02.014. Online ahead of print.
ABSTRACT
INTRODUCTION: Machine learning, a common artificial intelligence technology in medical image analysis, enables computers to learn statistical patterns from pairs of data and annotated labels. Supervised learning in machine learning allows the computer to predict how a specific anatomic structure should be segmented in new patients. This study aimed to develop and validate a deep learning algorithm that automatically creates 3-dimensional surface models of human teeth from a cone-beam computed tomography scan.
METHODS: A multiresolution dataset was used, with volume dimensions including 216 × 272 × 272, 512 × 512 × 512, and 576 × 768 × 768 voxels. Ground truth labels for tooth segmentation were generated. Random partitioning allocated 140 patients to the training set, 40 to the validation set, and 30 scans to the test set for model performance evaluation. Different evaluation metrics were used for assessment.
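The abstract reports accuracy figures without naming the overlap metrics used; as one common choice for multiclass tooth segmentation, a per-label Dice computation is sketched below under that assumption.

```python
import numpy as np

def dice_per_label(pred, gt, labels):
    """Per-tooth Dice overlap between predicted and ground-truth label volumes."""
    scores = {}
    for lab in labels:
        p, g = (pred == lab), (gt == lab)
        denom = p.sum() + g.sum()
        scores[lab] = 2.0 * np.logical_and(p, g).sum() / denom if denom else np.nan
    return scores
```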
RESULTS: Our tooth identification model achieved an accuracy of 87.92% ± 4.43% on the test set. The general (binary) tooth segmentation model achieved a notably higher accuracy, segmenting the teeth with 93.16% ± 1.18% accuracy.
CONCLUSIONS: The success of our model not only validates the efficacy of using artificial intelligence for dental imaging analysis but also sets a promising foundation for future advancements in automated and precise dental segmentation techniques.
PMID:40186597 | DOI:10.1016/j.ajodo.2025.02.014
Open-source deep-learning models for segmentation of normal structures for prostatic and gynecological high-dose-rate brachytherapy: Comparison of architectures
J Appl Clin Med Phys. 2025 Apr 5:e70089. doi: 10.1002/acm2.70089. Online ahead of print.
ABSTRACT
BACKGROUND: The use of deep learning-based auto-contouring algorithms in various treatment planning services is increasingly common. There is a notable deficit of commercially or publicly available models trained on large or diverse datasets containing high-dose-rate (HDR) brachytherapy treatment scans, leading to poor performance on images that include HDR implants.
PURPOSE: To implement and evaluate automatic organ-at-risk (OAR) segmentation models for use in prostatic and gynecological computed tomography (CT)-guided high-dose-rate brachytherapy treatment planning.
METHODS AND MATERIALS: 1316 computed tomography (CT) scans and corresponding segmentation files from 1105 prostatic or gynecological HDR patients treated at our institution from 2017 to 2024 were used for model training. Data sources comprised six CT scanners, including a mobile CT unit with previously reported susceptibility to image streaking artifacts. Two UNet-derived model architectures, UNet++ and nnU-Net, were investigated for bladder and rectum model training. The models were tested on 100 CT scans and clinically used segmentation files from 62 prostatic or gynecological HDR brachytherapy patients, disjoint from the training set, collected in 2024. Performance was evaluated using the Dice similarity coefficient (DSC) between model-predicted contours and clinically used contours on slices in common with the clinical target volume (CTV). Additionally, a blinded evaluation of ten random test cases was conducted by three experienced planners.
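A small sketch of the evaluation described above: DSC restricted to CTV-containing slices and a rank-sum comparison of per-case DSC distributions between the two architectures. Array layouts (slice axis first) and helper names are assumptions.

```python
import numpy as np
from scipy.stats import ranksums

def dsc(a, b):
    """Dice similarity coefficient between two binary masks."""
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else np.nan

def dsc_on_ctv_slices(pred, ref, ctv):
    """DSC restricted to axial slices that contain the clinical target volume."""
    keep = ctv.any(axis=(1, 2))        # slice axis assumed to be first
    return dsc(pred[keep], ref[keep])

def compare_models(dsc_unetpp, dsc_nnunet):
    """Rank-sum P value comparing per-case DSC distributions of the two architectures."""
    _, p = ranksums(dsc_unetpp, dsc_nnunet)
    return p
```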
RESULTS: Median (interquartile range) 3D DSCs on CTV-containing slices were 0.95 (0.04) and 0.87 (0.09) for the UNet++ bladder and rectum models, respectively, and 0.96 (0.03) and 0.88 (0.10) for the nnU-Net models. The rank-sum test did not reveal statistically significant differences in these DSCs (p = 0.15 and 0.27, respectively). The blinded evaluation scored the trained models higher than the clinically used contours.
CONCLUSION: Both UNet-derived architectures perform similarly on the bladder and rectum and are adequately accurate to reduce contouring time in a review-and-edit context during HDR brachytherapy planning. The UNet++ models were chosen for implementation at our institution due to lower computing hardware requirements and are in routine clinical use.
PMID:40186596 | DOI:10.1002/acm2.70089
Construction and evaluation of glucocorticoid dose prediction model based on genetic and clinical characteristics of patients with systemic lupus erythematosus
Int J Immunopathol Pharmacol. 2025 Jan-Dec;39:3946320251331791. doi: 10.1177/03946320251331791. Epub 2025 Apr 5.
ABSTRACT
Currently, no glucocorticoid dose prediction model is available for clinical practice. This study aimed to utilise machine learning techniques to develop and validate personalised dosage models. Participants were patients with SLE who were registered at Nanfang Hospital and received prednisone. Univariate analysis was used to confirm the feature variables. Subsequently, the random forest (RF) algorithm was utilised to impute missing values of the feature variables. We then assessed the prediction capabilities of 11 machine learning and deep-learning algorithms, including logistic regression, SVM, RF, AdaBoost, Bagging, XGBoost, LightGBM, CatBoost, MLP, and TabNet. Finally, a confusion matrix was used to validate the three dosage regimens. In total, 129 patients met the inclusion criteria. The XGBoost algorithm was selected as the preferred method because of its superior performance, achieving an accuracy of 0.81. The factors exhibiting the highest correlation with prednisone dose were CYP3A4 (rs4646437), albumin (ALB), haemoglobin (HGB), anti-double-stranded DNA antibodies (anti-dsDNA), erythrocyte sedimentation rate (ESR), age, and HLA-DQA1 (rs2187668). On validation, the precision and recall rates were 100% and 40% for low-dose prednisone (⩾5 mg but <7.5 mg/d), 88% and 88% for medium-dose prednisone (⩾7.5 mg but <30 mg/d), and 62% and 100% for high-dose prednisone (⩾30 mg but ⩽100 mg/d), respectively. A robust machine learning model was developed to accurately predict prednisone dosage by integrating the identified genetic and clinical factors.
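A schematic sketch of the modelling pipeline under assumptions: random-forest-based iterative imputation of missing feature values followed by an XGBoost classifier over three dose classes and a confusion matrix. Hyperparameters, the univariate feature screening, and the exact validation scheme are placeholders rather than the study's settings.

```python
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
from xgboost import XGBClassifier

def fit_dose_model(X, y, seed=0):
    """Impute missing feature values with an RF-based iterative imputer, then fit an
    XGBoost classifier over three dose classes (0 = low, 1 = medium, 2 = high)."""
    imputer = IterativeImputer(
        estimator=RandomForestRegressor(n_estimators=100, random_state=seed),
        random_state=seed,
    )
    Xi = imputer.fit_transform(X)
    Xtr, Xte, ytr, yte = train_test_split(Xi, y, test_size=0.2, stratify=y, random_state=seed)
    clf = XGBClassifier(n_estimators=300, max_depth=3, learning_rate=0.1,
                        eval_metric="mlogloss", random_state=seed).fit(Xtr, ytr)
    pred = clf.predict(Xte)
    return clf, accuracy_score(yte, pred), confusion_matrix(yte, pred)
```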
PMID:40186486 | DOI:10.1177/03946320251331791
Deep learning model for detecting cystoid fluid collections on optical coherence tomography in X-linked retinoschisis patients
Acta Ophthalmol. 2025 Apr 4. doi: 10.1111/aos.17495. Online ahead of print.
ABSTRACT
PURPOSE: To validate a deep learning (DL) framework for detecting and quantifying cystoid fluid collections (CFC) on spectral-domain optical coherence tomography (SD-OCT) in X-linked retinoschisis (XLRS) patients.
METHODS: A no-new-U-Net model was trained using 112 OCT volumes from the RETOUCH challenge (70 for training and 42 for internal testing). External validation involved 37 SD-OCT scans from 20 XLRS patients, including 20 randomly sampled B-scans and 17 manually selected central B-scans. Three graders manually delineated the CFC on these B-scans in this external test set. The model's efficacy was evaluated using Dice and intraclass correlation coefficient (ICC) scores, assessed exclusively on the test set comprising B-scans from XLRS patients.
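The abstract reports ICC agreement without specifying the ICC form; the snippet below implements ICC(2,1) (two-way random effects, absolute agreement, single measurement) as one plausible choice for comparing per-B-scan CFC quantifications between the model and a grader.

```python
import numpy as np

def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.
    `ratings` is an (n_targets, k_raters) array, e.g. CFC area per B-scan
    measured by the model and by a grader."""
    n, k = ratings.shape
    grand = ratings.mean()
    row_means = ratings.mean(axis=1)
    col_means = ratings.mean(axis=0)
    ms_rows = k * ((row_means - grand) ** 2).sum() / (n - 1)
    ms_cols = n * ((col_means - grand) ** 2).sum() / (k - 1)
    sse = ((ratings - row_means[:, None] - col_means[None, :] + grand) ** 2).sum()
    ms_err = sse / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)
```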
RESULTS: For the randomly sampled B-scans, the model achieved a mean Dice score of 0.886 (±0.010), compared to 0.912 (±0.014) for the observers. For the manually selected central B-scans, the Dice scores were 0.936 (±0.012) for the model and 0.946 (±0.012) for the graders. ICC scores between the model and reference were 0.945 (±0.014) for the randomly selected and 0.964 (±0.011) for the manually selected B-scans. Among the graders, ICC scores were 0.979 (±0.008) and 0.981 (±0.011), respectively.
CONCLUSIONS: Our validated DL model accurately segments and quantifies CFC on SD-OCT in XLRS, paving the way for reliable monitoring of structural changes. However, systematic overestimation by the DL model was observed, highlighting a key limitation for future refinement.
PMID:40186400 | DOI:10.1111/aos.17495
Diagnostic value of combining ultrafast cine MRI and morphological measurements on gastroesophageal reflux disease
Abdom Radiol (NY). 2025 Apr 5. doi: 10.1007/s00261-025-04890-3. Online ahead of print.
ABSTRACT
PURPOSE: To evaluate the diagnostic performance of combining ultrafast real-time cine MRI with morphological measurements on gastroesophageal reflux disease (GERD).
METHODS: In this prospective study, 40 healthy volunteers and 30 GERD patients underwent real-time cine MRI using an undersampled low-angle gradient echo sequence (50 ms/frame) with deep-learning reconstruction, to monitor the gastroesophageal junction (GEJ) and observe reflux of the contrast agent during the Valsalva maneuver. The width of the lower esophagus, the length of the lower esophageal sphincter (LES), and the end-expiratory and post-Valsalva-maneuver His angles were measured.
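As a simple illustration of the His-angle measurement, the angle at the gastroesophageal junction can be computed from three annotated landmark coordinates; the landmark definitions below are assumptions, not the study's measurement protocol.

```python
import numpy as np

def his_angle(gej, esophagus_pt, fundus_pt):
    """His angle (degrees) at the gastroesophageal junction, computed between
    the vector towards a point on the distal esophagus and the vector towards
    a point on the gastric fundus. Inputs are 2-D image coordinates."""
    u = np.asarray(esophagus_pt, float) - np.asarray(gej, float)
    v = np.asarray(fundus_pt, float) - np.asarray(gej, float)
    cosang = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))
```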
RESULTS: There were no statistical differences between the two groups in either lower esophageal width (14.06 ± 1.50 mm vs. 14.75 ± 1.57 mm, P > 0.05) or LES length (25.20 ± 1.46 mm vs. 24.39 ± 1.68 mm, P > 0.05). The end-expiratory His angle (84.45 ± 18.67°), the post-Valsalva-maneuver His angle (101.53 ± 19.22°), and the difference between them (17.08 ± 5.65°) were greater in the GERD group than in the healthy volunteers (71.51 ± 18.01°, 86.09 ± 18.24°, and 14.57 ± 3.88°, respectively; P < 0.05). Reflux was induced in 8 cases in the GERD group, including 4 cases with hiatus hernia, and was not observed in healthy volunteers. The AUCs for diagnosing GERD were 0.702, 0.737, and 0.634 for the end-expiratory His angle, the post-Valsalva-maneuver His angle, and their difference, respectively; when combined with real-time MRI, the AUC was 0.823, with a sensitivity of 86.67% and a specificity of 67.50%.
CONCLUSION: Real-time MRI can display dynamic swallowing and reflux at the GEJ. The His angle can serve as a morphological indicator for diagnosing GERD with MRI.
PMID:40186014 | DOI:10.1007/s00261-025-04890-3
CGLoop: a neural network framework for chromatin loop prediction
BMC Genomics. 2025 Apr 5;26(1):342. doi: 10.1186/s12864-025-11531-y.
ABSTRACT
BACKGROUND: Chromosomes exhibit a variety of high-dimensional organizational features, among which chromatin loops are fundamental structures of the three-dimensional (3D) genome. Chromatin loops appear as speckled patterns on the Hi-C contact matrix generated by chromosome conformation capture methods. Chromatin loops play an important role in gene expression, and predicting the chromatin loops generated during whole-genome interactions is crucial for a deeper understanding of 3D genome structure and function.
RESULTS: Here, we propose CGLoop, a deep learning-based neural network framework that detects chromatin loops in the Hi-C contact matrix. CGLoop combines a convolutional neural network (CNN) with the Convolutional Block Attention Module (CBAM) and a Bidirectional Gated Recurrent Unit (BiGRU) to capture important features related to chromatin loops by comprehensively analyzing the Hi-C contact matrix, enabling the prediction of candidate chromatin loops. CGLoop then employs a density-based clustering method to filter the candidate chromatin loops predicted by the neural network model. Finally, we compared CGLoop with other chromatin loop prediction methods on several cell lines, including GM12878, K562, IMR90, and mESC. The code is available from https://github.com/wllwuliliwll/CGLoop.
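A minimal sketch of the density-based filtering step applied to candidate loop pixels, using DBSCAN from scikit-learn as a stand-in for the clustering method; CGLoop's actual clustering parameters and scoring are defined in the paper and repository.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def filter_candidate_loops(coords, scores, eps=2, min_samples=2):
    """Cluster candidate loop pixels (bin coordinates on the Hi-C contact matrix)
    with a density-based method and keep the highest-scoring pixel per cluster."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(coords)
    kept = []
    for lab in set(labels):
        if lab == -1:          # noise points are discarded
            continue
        members = np.flatnonzero(labels == lab)
        kept.append(coords[members[np.argmax(scores[members])]])
    return np.array(kept)
```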
CONCLUSIONS: The experimental results show that loops predicted by CGLoop have high APA scores and that multiple transcription factors and binding proteins are enriched at the predicted loop anchors; CGLoop outperforms other methods in terms of the accuracy and validity of chromatin loop prediction.
PMID:40186170 | DOI:10.1186/s12864-025-11531-y
Opening the deep learning box
Nat Neurosci. 2025 Apr 4. doi: 10.1038/s41593-025-01938-x. Online ahead of print.
NO ABSTRACT
PMID:40186074 | DOI:10.1038/s41593-025-01938-x
Fast and Robust Single-Shot Cine Cardiac MRI Using Deep Learning Super-Resolution Reconstruction
Invest Radiol. 2025 Apr 7. doi: 10.1097/RLI.0000000000001186. Online ahead of print.
ABSTRACT
OBJECTIVE: The aim of the study was to compare the diagnostic quality of deep learning (DL) reconstructed balanced steady-state free precession (bSSFP) single-shot (SSH) cine images with standard, multishot (also: segmented) bSSFP cine (standard cine) in cardiac MRI.
METHODS AND MATERIALS: This prospective study was performed in a cohort of participants with clinical indication for cardiac MRI. SSH compressed-sensing bSSFP cine and standard multishot cine were acquired with breath-holding and electrocardiogram-gating in short-axis view at 1.5 Tesla. SSH cine images were reconstructed using an industry-developed DL super-resolution algorithm (DL-SSH cine). Two readers evaluated diagnostic quality (endocardial edge definition, blood pool to myocardium contrast and artifact burden) from 1 (nondiagnostic) to 5 (excellent). Functional left ventricular (LV) parameters were assessed in both sequences. Edge rise distance, apparent signal-to-noise ratio (aSNR) and contrast-to-noise ratio were calculated. Statistical analysis for the comparison of DL-SSH cine and standard cine included the Student's t-test, Wilcoxon signed-rank test, Bland-Altman analysis, and Pearson correlation.
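A brief sketch of two of the listed statistical comparisons, Bland-Altman limits of agreement and a paired Wilcoxon signed-rank test with Pearson correlation, applied to per-patient values from the two sequences; function names and the 1.96 multiplier follow common convention rather than the paper.

```python
import numpy as np
from scipy.stats import wilcoxon, pearsonr

def bland_altman(a, b):
    """Bias and 95% limits of agreement between two measurement series,
    e.g. ejection fraction from DL-SSH cine vs. standard cine."""
    diff = np.asarray(a, float) - np.asarray(b, float)
    bias, sd = diff.mean(), diff.std(ddof=1)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

def paired_tests(a, b):
    """Wilcoxon signed-rank P value and Pearson r for paired per-patient values."""
    _, p = wilcoxon(a, b)
    r, _ = pearsonr(a, b)
    return p, r
```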
RESULTS: Forty-five participants (mean age: 50 years ±18; 30 men) were included. Mean total scan time was 65% lower for DL-SSH cine compared to standard cine (92 ± 8 s vs 265 ± 33 s; P < 0.0001). DL-SSH cine showed high ratings for subjective image quality (eg, contrast: 5 [interquartile range {IQR}, 5-5] vs 5 [IQR, 5-5], P = 0.01; artifacts: 4.5 [IQR, 4-5] vs 5 [IQR, 4-5], P = 0.26), with superior values for sharpness parameters (endocardial edge definition: 5 [IQR, 5-5] vs 5 [IQR, 4-5], P < 0.0001; edge rise distance: 1.9 [IQR, 1.8-2.3] vs 2.5 [IQR, 2.3-2.6], P < 0.0001) compared to standard cine. No significant differences were found in the comparison of objective metrics between DL-SSH and standard cine (eg, aSNR: 49 [IQR, 38.5-70] vs 52 [IQR, 38-66.5], P = 0.74). Strong correlation was found between DL-SSH cine and standard cine for the assessment of functional LV parameters (eg, ejection fraction: r = 0.95). Subgroup analysis of participants with arrhythmia or unreliable breath-holding (n = 14/45, 31%) showed better image quality ratings for DL-SSH cine compared to standard cine (eg, artifacts: 4 [IQR, 4-5] vs 4 [IQR, 3-5], P = 0.04).
CONCLUSIONS: DL reconstruction of SSH cine sequence in cardiac MRI enabled accelerated acquisition times and noninferior diagnostic quality compared to standard cine imaging, with even superior diagnostic quality in participants with arrhythmia or unreliable breath-holding.
PMID:40184545 | DOI:10.1097/RLI.0000000000001186
Relationships Between Familial Factors, Learning Motivation, Learning Approaches, and Cognitive Flexibility Among Vocational Education and Training Students
J Psychol. 2025 Apr 4:1-24. doi: 10.1080/00223980.2025.2456801. Online ahead of print.
ABSTRACT
This study investigated the relationships between familial factors, in terms of parental autonomy support and parental support, and Vocational Education and Training (VET) students' learning motivation, learning approaches, and cognitive flexibility. In this cross-sectional study, a convenience sample of 557 VET students (56.7% male and 43.3% female; mean age = 18.41 years, SD = 0.85) from ten vocational schools in the Bangkok area, Thailand, responded to a questionnaire of adapted scales on familial factors (i.e., parental autonomy support and parental support), learning motivation (i.e., intrinsic motivation, extrinsic motivation, and utility value), learning approaches (i.e., deep learning approaches and surface learning approaches), and cognitive flexibility (i.e., alternatives). Structural equation analyses revealed that parental autonomy support had an indirect relationship with alternatives via learning motivation and deep learning approaches, whereas parental support had both direct and indirect associations with alternatives through learning motivation and deep learning approaches. Surface learning approaches did not significantly predict alternatives. These findings suggest that a familial context that stresses autonomy support and helpful support from parents can motivate VET students to learn and adopt deep approaches to learning, which in turn encourages the development of their cognitive flexibility.
PMID:40184534 | DOI:10.1080/00223980.2025.2456801
MIST: An interpretable and flexible deep learning framework for single-T cell transcriptome and receptor analysis
Sci Adv. 2025 Apr 4;11(14):eadr7134. doi: 10.1126/sciadv.adr7134. Epub 2025 Apr 4.
ABSTRACT
Joint analysis of transcriptomic and T cell receptor (TCR) features at single-cell resolution provides a powerful approach for in-depth research on T cell immune function. Here, we introduce MIST (Multi-insight for T cell), a deep learning framework for single-T cell transcriptome and receptor analysis. MIST features three latent spaces: gene expression, TCR, and a joint latent space. Through analyses of antigen-specific T cells and of T cell datasets related to lung cancer immunotherapy and COVID-19, we demonstrate MIST's interpretability and flexibility. MIST easily and accurately resolves cell function and antigen specificity by vectorizing and integrating the transcriptome and TCR data of T cells. In addition, using MIST, we identified the heterogeneity of CXCL13+ subsets in lung cancer-infiltrating CD8+ T cells and their association with immunotherapy, providing additional insights into the functional transition of CXCL13+ T cells related to anti-PD-1 therapy that were not reported in the original study.
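A schematic, heavily simplified sketch of the three-latent-space idea (separate gene-expression and TCR encoders plus a joint space); MIST's actual architecture, TCR embedding, and training objectives differ and are described in the paper.

```python
import torch
import torch.nn as nn

class DualLatentSketch(nn.Module):
    """Schematic dual-branch encoder: one latent space for gene expression,
    one for the TCR, and a joint latent space formed from both."""
    def __init__(self, n_genes, tcr_dim, z_dim=32):
        super().__init__()
        self.rna_enc = nn.Sequential(nn.Linear(n_genes, 256), nn.ReLU(), nn.Linear(256, z_dim))
        self.tcr_enc = nn.Sequential(nn.Linear(tcr_dim, 128), nn.ReLU(), nn.Linear(128, z_dim))
        self.joint = nn.Sequential(nn.Linear(2 * z_dim, z_dim), nn.ReLU(), nn.Linear(z_dim, z_dim))

    def forward(self, rna, tcr):
        z_rna, z_tcr = self.rna_enc(rna), self.tcr_enc(tcr)
        z_joint = self.joint(torch.cat([z_rna, z_tcr], dim=-1))
        return z_rna, z_tcr, z_joint
```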
PMID:40184452 | DOI:10.1126/sciadv.adr7134
Deep learning-based uncertainty quantification for quality assurance in hepatobiliary imaging-based techniques
Oncotarget. 2025 Apr 4;16:249-255. doi: 10.18632/oncotarget.28709.
ABSTRACT
Recent advances in deep learning models have transformed medical imaging analysis, particularly in radiology. This editorial outlines how uncertainty quantification through embedding-based approaches enhances diagnostic accuracy and reliability in hepatobiliary imaging, with a specific focus on oncological conditions and early detection of precancerous lesions. We explore modern architectures like the Anisotropic Hybrid Network (AHUNet), which leverages both 2D imaging and 3D volumetric data through innovative convolutional approaches. We consider the implications for quality assurance in radiological practice and discuss recent clinical applications.
PMID:40184325 | DOI:10.18632/oncotarget.28709
Hessian-Aware Zeroth-Order Optimization
IEEE Trans Pattern Anal Mach Intell. 2025 Mar 7;PP. doi: 10.1109/TPAMI.2025.3548810. Online ahead of print.
ABSTRACT
Zeroth-order optimization algorithms have recently emerged as a popular research theme in optimization and machine learning, playing important roles in many deep-learning-related tasks such as black-box adversarial attack, deep reinforcement learning, and hyper-parameter tuning. Mainstream zeroth-order optimization algorithms, however, concentrate on exploiting zeroth-order estimates of first-order gradient information of the objective landscape. In this paper, we propose a novel meta-algorithm called Hessian-Aware Zeroth-Order (ZOHA) optimization, which utilizes several canonical variants of zeroth-order-estimated second-order Hessian information of the objective: power-method-based and Gaussian-smoothing-based. We show theoretically that ZOHA enjoys an improved convergence rate compared with existing work that does not incorporate second-order Hessian information into zeroth-order optimization. Empirical studies on logistic regression as well as black-box adversarial attacks validate the effectiveness of ZOHA and its improved success rates with reduced query complexity of the zeroth-order oracle.
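To make the zeroth-order machinery concrete, the sketch below shows a Gaussian-smoothing gradient estimator and a power iteration on finite-difference Hessian-vector products built from it; this illustrates how second-order information can be probed with function values only and is not the ZOHA algorithm itself.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-2, n_samples=50, rng=None):
    """Gaussian-smoothing zeroth-order gradient estimate:
    g ~= E_u[(f(x + mu*u) - f(x)) / mu * u], u ~ N(0, I)."""
    rng = rng or np.random.default_rng()
    g = np.zeros_like(x)
    fx = f(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - fx) / mu * u
    return g / n_samples

def zo_hvp(f, x, v, delta=1e-2, **kw):
    """Hessian-vector product estimated from finite differences of the
    zeroth-order gradient estimate (noisy, for illustration only)."""
    return (zo_gradient(f, x + delta * v, **kw) - zo_gradient(f, x - delta * v, **kw)) / (2 * delta)

def top_eigen_direction(f, x, iters=10, **kw):
    """Power iteration on the zeroth-order Hessian-vector product to probe the
    dominant curvature direction of the objective."""
    v = np.random.default_rng(0).standard_normal(x.shape)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = zo_hvp(f, x, v, **kw)
        v /= np.linalg.norm(v) + 1e-12
    return v
```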
PMID:40184293 | DOI:10.1109/TPAMI.2025.3548810
Short-Term Residential Load Forecasting Framework Based on Spatial-Temporal Fusion Adaptive Gated Graph Convolution Networks
IEEE Trans Neural Netw Learn Syst. 2025 Apr 4;PP. doi: 10.1109/TNNLS.2025.3551778. Online ahead of print.
ABSTRACT
Enhancing the prediction of volatile and intermittent electric loads is one of the pivotal elements that contribute to the smooth functioning of modern power grids. However, conventional deep learning-based forecasting techniques fall short in simultaneously taking into account both the temporal dependencies of historical loads and the spatial structure between residential units, resulting in subpar prediction performance. Furthermore, the representation of the spatial graph structure is frequently inadequate and constrained, and, together with the complexities inherent in Spatial-Temporal data, this impedes effective learning across different households. To alleviate these shortcomings, this article proposes a novel framework, Spatial-Temporal fusion adaptive gated graph convolution networks (STFAG-GCNs), tailored for residential short-term load forecasting (STLF). Spatial-Temporal fusion graph construction is introduced to compensate for correlations for which additional information is not known or not reflected in advance. Through an innovative gated adaptive fusion graph convolution (AFG-Conv) mechanism, the Spatial-Temporal fusion graph convolution network (STFGCN) dynamically models the Spatial-Temporal correlations implicitly. Meanwhile, by integrating a gated temporal convolutional network (Gated TCN) and multiple STFGCNs into a unified Spatial-Temporal fusion layer, STFAG-GCN handles long sequences by stacking layers. Experimental results on real-world datasets validate the accuracy and robustness of STFAG-GCN in forecasting short-term residential loads, highlighting its advancements over state-of-the-art methods. Ablation experiments further reveal its effectiveness and superiority.
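As a rough illustration of learning an adaptive adjacency between households, the module below follows the widely used pattern A = softmax(relu(E1 E2^T)) over learnable node embeddings followed by a graph convolution; STFAG-GCN's gated fusion mechanism is more involved than this sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveGraphConv(nn.Module):
    """Graph convolution over an adaptively learned adjacency between households:
    A = softmax(relu(E1 @ E2^T)) with learnable node embeddings, then X' = A X W."""
    def __init__(self, n_nodes, in_dim, out_dim, emb_dim=10):
        super().__init__()
        self.e1 = nn.Parameter(torch.randn(n_nodes, emb_dim))
        self.e2 = nn.Parameter(torch.randn(n_nodes, emb_dim))
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x):                 # x: (batch, n_nodes, in_dim)
        adj = F.softmax(F.relu(self.e1 @ self.e2.t()), dim=-1)
        return self.lin(torch.einsum("nm,bmf->bnf", adj, x))
```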
PMID:40184286 | DOI:10.1109/TNNLS.2025.3551778
Unknown-Aware Bilateral Dependency Optimization for Defending Against Model Inversion Attacks
IEEE Trans Pattern Anal Mach Intell. 2025 Apr 4;PP. doi: 10.1109/TPAMI.2025.3558267. Online ahead of print.
ABSTRACT
By abusing access to a well-trained classifier, model inversion (MI) attacks pose a significant threat as they can recover the original training data, leading to privacy leakage. Previous studies mitigated MI attacks by imposing regularization to reduce the dependency between input features and outputs during classifier training, a strategy known as unilateral dependency optimization. However, this strategy contradicts the objective of minimizing the supervised classification loss, which inherently seeks to maximize the dependency between input features and outputs. Consequently, there is a trade-off between improving the model's robustness against MI attacks and maintaining its classification performance. To address this issue, we propose the bilateral dependency optimization strategy (BiDO), a dual-objective approach that minimizes the dependency between input features and latent representations, while simultaneously maximizing the dependency between latent representations and labels. BiDO is remarkable for its privacy-preserving capabilities. However, models trained with BiDO exhibit diminished capabilities in out-of-distribution (OOD) detection compared to models trained with standard classification supervision. Given the open-world nature of deep learning systems, this limitation could lead to significant security risks, as encountering OOD inputs, whose label spaces do not overlap with the in-distribution (ID) data used during training, is inevitable. To address this, we leverage readily available auxiliary OOD data to enhance the OOD detection performance of models trained with BiDO. This leads to the introduction of an upgraded framework, unknown-aware BiDO (BiDO+), which mitigates both privacy and security concerns. As a highlight, with comparable model utility, BiDO-HSIC+ reduces the FPR95 by 55.02% and enhances the AUCROC by 9.52% compared to BiDO-HSIC, while also providing superior MI robustness.
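One common biased HSIC estimator with RBF kernels is sketched below, since HSIC is the dependency measure behind the BiDO-HSIC variants named above; the paper's exact kernels, bandwidths, and weighting are not reproduced.

```python
import torch

def rbf_gram(x, sigma=None):
    """RBF Gram matrix; sigma defaults to the median pairwise-distance heuristic."""
    d2 = torch.cdist(x, x).pow(2)
    if sigma is None:
        nz = d2[d2 > 0]
        sigma = nz.median().sqrt() if nz.numel() > 0 else torch.tensor(1.0, device=x.device)
    return torch.exp(-d2 / (2 * sigma ** 2))

def hsic(x, y):
    """Biased empirical HSIC between two batches of representations
    (e.g. inputs vs. latent features, or latent features vs. labels)."""
    n = x.shape[0]
    H = torch.eye(n, device=x.device) - 1.0 / n   # centering matrix
    K, L = rbf_gram(x), rbf_gram(y)
    return torch.trace(K @ H @ L @ H) / (n - 1) ** 2
```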
PMID:40184277 | DOI:10.1109/TPAMI.2025.3558267
LETA: Tooth Alignment Prediction Based on Dual-branch Latent Encoding
IEEE Trans Vis Comput Graph. 2024 Jun 20;PP. doi: 10.1109/TVCG.2024.3413857. Online ahead of print.
ABSTRACT
Accurately determining the clinical position of each tooth is essential in orthodontics, yet most existing solutions heavily rely on inefficient manual design. In this paper, we present LETA, a dual-branch latent-encoding-based 3D tooth alignment framework. Our system takes as input the segmented individual 3D tooth meshes from Intra-oral Scanner (IOS) dental surfaces and automatically predicts the proper 3D pose transformation for each tooth. LETA includes three components: an Encoder that learns a latent code of the dental point cloud, a Projector that transforms the latent code of misaligned teeth into that of predicted aligned teeth, and a Solver that estimates the transformation between different dental latent codes. A key novelty of LETA is that we extract features from the ground truth (GT) aligned teeth to guide network learning during training. To effectively learn tooth features, our Encoder employs an improved point-wise convolutional operation and an attention-based network to extract local shape features and global context features, respectively. Extensive experimental results on a large-scale dataset with 9,868 IOS surfaces demonstrate that LETA achieves state-of-the-art performance. A further clinical applicability study reveals that our method can reduce orthodontists' workload by over 60% compared to starting tooth alignment from scratch, demonstrating the strong potential of deep learning for future digital dentistry.
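As an illustration of the kind of geometric step a tooth-alignment pipeline ultimately needs, the snippet below estimates a least-squares rigid transformation between a tooth's current and predicted aligned point clouds via the Kabsch algorithm; LETA's Solver operates on latent codes and is not reproduced here.

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping point cloud `src`
    onto `dst` (Kabsch algorithm), e.g. a tooth's current pose onto its
    predicted aligned pose. Both are (N, 3) arrays with matched rows."""
    c_src, c_dst = src.mean(0), dst.mean(0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    return R, t
```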
PMID:40184274 | DOI:10.1109/TVCG.2024.3413857
Using generative adversarial deep learning networks to synthesize cerebrovascular reactivity imaging from pre-acetazolamide arterial spin labeling in moyamoya disease
Neuroradiology. 2025 Apr 4. doi: 10.1007/s00234-025-03605-1. Online ahead of print.
ABSTRACT
BACKGROUND: Cerebrovascular reactivity (CVR) assesses vascular health in various brain conditions, but CVR measurement requires a challenge to cerebral perfusion, such as the administration of acetazolamide (ACZ), thus limiting widespread use. We determined whether generative adversarial networks (GANs) can create CVR images from baseline pre-ACZ arterial spin labeling (ASL) MRI.
METHODS: This study included 203 Moyamoya cases with a total of 3248 pre- and post-ACZ ASL cerebral blood flow (CBF) images. Reference CVRs were generated from these CBF slices. From this set, 2640 slices were used to train a Pixel-to-Pixel GAN consisting of a generator and discriminator network, with the remaining 608 slices reserved as a testing set. Following training, the pre-ACZ CBF in the testing set was introduced to the trained model to generate synthesized CVR. The quality of the synthesized CVR was evaluated with the structural similarity index (SSI), spatial correlation coefficient (SCC), and root mean squared error (RMSE), compared with the reference CVR. The segmentations of the low CVR regions were compared using the Dice similarity coefficient (DSC). Reference and synthesized CVRs in single-slice and individual-hemisphere settings were reviewed to assess CVR status, with Cohen's kappa measuring consistency.
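A small sketch of the slice-wise image-quality metrics named above (SSI via scikit-image's structural similarity, SCC approximated here as the Pearson correlation of pixel values, and RMSE); the paper's exact implementations may differ.

```python
import numpy as np
from scipy.stats import pearsonr
from skimage.metrics import structural_similarity

def cvr_similarity(synth, ref):
    """Structural similarity, spatial correlation, and RMSE between a
    synthesized and a reference CVR slice (both 2-D float arrays)."""
    data_range = float(ref.max() - ref.min()) or 1.0
    ssi = structural_similarity(ref, synth, data_range=data_range)
    scc, _ = pearsonr(ref.ravel(), synth.ravel())
    rmse = float(np.sqrt(np.mean((ref - synth) ** 2)))
    return ssi, scc, rmse
```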
RESULTS: The mean SSIs of the CVR in the training and testing sets were 0.943 ± 0.019 and 0.943 ± 0.020, respectively. The mean SCCs of the CVR in the training and testing sets were 0.988 ± 0.009 and 0.987 ± 0.011. The mean RMSEs of the CVR were 0.077 ± 0.015 and 0.079 ± 0.018. The mean DSC of the low-CVR regions in the testing set was 0.593 ± 0.128. Visual interpretation yielded Cohen's kappa values of 0.896 and 0.813 for the training and testing sets in the single-slice setting, and 0.781 and 0.730 in the individual-hemisphere setting.
CONCLUSIONS: CVR synthesized by GANs from baseline ASL without a challenge may be a useful alternative for detecting vascular deficits in clinical applications when the ACZ challenge is not feasible.
PMID:40183965 | DOI:10.1007/s00234-025-03605-1