Deep learning

The impact of partner interaction on brief social buffering in adolescent female rats as analyzed by deep learning-based object detection algorithms

Thu, 2025-05-01 06:00

Physiol Behav. 2025 Apr 29:114934. doi: 10.1016/j.physbeh.2025.114934. Online ahead of print.

ABSTRACT

Social buffering is a phenomenon whereby the stress response of an individual exposed to a distressing stimulus is alleviated by the presence of conspecific(s). In this study, we aimed to determine whether a brief buffering period (only 3 min) with a conspecific immediately after fear conditioning can produce social buffering in adolescent Sprague-Dawley rats (4-5 weeks, male and female) and whether close partner interaction can affect brief social buffering in adolescent female rats. The rats received an electric shock in the black room of a shuttle box, followed by a 3 min buffering period. After two such learning trials, the rats individually performed a passive avoidance test, both immediately and 24 hr later. To reduce human bias and analyze variables not accessible to human observers, data were analyzed using YOLOv8 and BoT-SORT, deep learning-based object detection and tracking algorithms. The Toy group, tested with an object resembling a rat, showed a significant increase in fear-related behavior in both sexes. The Pair group, tested with a partner, showed a significant decrease in fear-related behavior in both sexes during the learning check, but only females maintained this decrease in the retention test. In the Pair female group, black-room preference correlated significantly with the time the rat and its partner spent in the same room and with the time they stayed close together. Therefore, we demonstrated that immediate, brief social contact is sufficient to induce social buffering, especially in female rats. In addition, close social contact appears to be a key factor increasing the efficiency of social buffering.
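The proximity-preference relationship reported above can be illustrated with a plain Pearson correlation; the per-rat values below are hypothetical, not the study's data.

```python
import math

def pearson_r(xs, ys):
    # Pearson correlation coefficient between two equal-length sequences.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-rat measurements: seconds spent close to the partner
# during buffering vs. later black-room preference (fraction of test time).
time_close = [30, 55, 80, 100, 140, 160]
preference = [0.20, 0.25, 0.35, 0.40, 0.55, 0.60]
r = pearson_r(time_close, preference)  # strongly positive here
```

A tracking pipeline such as YOLOv8 + BoT-SORT would supply the proximity durations automatically; the correlation step itself is this simple.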

PMID:40311725 | DOI:10.1016/j.physbeh.2025.114934

Categories: Literature Watch

A deep learning algorithm for automated adrenal gland segmentation on non-contrast CT images

Thu, 2025-05-01 06:00

BMC Med Imaging. 2025 May 1;25(1):142. doi: 10.1186/s12880-025-01682-5.

ABSTRACT

BACKGROUND: The adrenal glands are small retroperitoneal organs, and few reference standards exist for adrenal CT measurements in clinical practice. This study aims to develop a deep learning (DL) model for automated adrenal gland segmentation on non-contrast CT images and to conduct a preliminary large-scale study of age-related volume changes in normal adrenal glands using the model's output values.

METHODS: The model was trained and evaluated on a development dataset of annotated non-contrast CT scans of bilateral adrenal glands, utilizing nnU-Net for the segmentation task. The ground truth was manually established by two experienced radiologists, and model performance was assessed using the Dice similarity coefficient (DSC). Additionally, five radiologists provided annotations on a subset of 20 randomly selected cases to measure inter-observer variability. Following validation, the model was applied to a large-scale normal adrenal gland dataset to segment the adrenal glands.
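The Dice similarity coefficient used to evaluate the model is straightforward to compute; a minimal sketch over toy flattened binary masks (not the paper's code):

```python
def dice_coefficient(mask_a, mask_b):
    # Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|) over binary masks.
    inter = sum(a and b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    return 2 * inter / total if total else 1.0

# Toy flattened binary masks (1 = adrenal gland voxel).
pred  = [0, 1, 1, 1, 0, 1, 0, 0]
truth = [0, 1, 1, 0, 0, 1, 1, 0]
dsc = dice_coefficient(pred, truth)  # 2*3 / (4 + 4) = 0.75
```

In practice the masks are 3D volumes and the same formula is applied per gland, giving the ~0.90 median scores reported below.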

RESULTS: The DL model development dataset contained 1301 CT examinations. In the test set, the median DSC scores of the segmentation model for the left and right adrenal glands were 0.899 and 0.904, respectively; in the independent test set, they were 0.900 and 0.896. Inter-observer DSC for radiologist manual segmentation did not differ from automated machine segmentation (P = 0.541). The large-scale normal adrenal gland dataset contained 2000 CT examinations, and the resulting curve shows that adrenal gland volume first increases and then decreases with age.

CONCLUSION: The developed DL model demonstrates accurate adrenal gland segmentation, and enables a comprehensive study of age-related adrenal gland volume variations.

PMID:40312690 | DOI:10.1186/s12880-025-01682-5

Categories: Literature Watch

Artifact estimation network for MR images: effectiveness of batch normalization and dropout layers

Thu, 2025-05-01 06:00

BMC Med Imaging. 2025 May 1;25(1):144. doi: 10.1186/s12880-025-01663-8.

ABSTRACT

BACKGROUND: Magnetic resonance imaging (MRI) is an essential tool for medical diagnosis. However, artifacts may degrade images obtained through MRI, especially owing to patient movement. Existing methods that mitigate the artifact problem are subject to limitations including extended scan times. Deep learning architectures, such as U-Net, may be able to address these limitations. Optimizing deep learning networks with batch normalization (BN) and dropout layers enhances their convergence and accuracy. However, the influence of this strategy on U-Net has not been explored for artifact removal.

METHODS: This study developed a U-Net-based regression network for the removal of motion artifacts and investigated the impact of combining BN and dropout layers as a strategy for this purpose. A Transformer-based network from a previous study was also adopted for comparison. In total, 1200 images (with and without motion artifacts) were used to train and test three variations of U-Net.

RESULTS: The evaluation results demonstrated a significant improvement in network accuracy when BN and dropout layers were implemented. The peak signal-to-noise ratio of the reconstructed images was approximately doubled and the structural similarity index was improved by approximately 10% compared with those of the artifact images.
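The peak signal-to-noise ratio cited above is derived directly from the mean squared error between a reference image and a reconstruction; a minimal sketch on hypothetical pixel values:

```python
import math

def psnr(reference, test, peak=255.0):
    # Peak signal-to-noise ratio in dB, computed from the mean squared error.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Hypothetical flattened pixel rows: clean image, artifact-corrupted input,
# and the network's reconstruction after artifact removal.
clean    = [100, 120, 130, 140]
noisy    = [110, 115, 128, 150]
denoised = [101, 119, 131, 141]
before = psnr(clean, noisy)
after = psnr(clean, denoised)  # higher dB after artifact removal
```

A "doubled" PSNR, as reported, corresponds to a very large reduction in MSE, since PSNR is logarithmic in the error.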

CONCLUSIONS: Although this study was limited to phantom images, the same strategy may be applied to more complex tasks, such as those directed at improving the quality of MR and CT images. We conclude that the accuracy of motion artifact removal can be improved by integrating BN and dropout layers into a U-Net-based network, with due consideration of the correct location and dropout rate.

PMID:40312665 | DOI:10.1186/s12880-025-01663-8

Categories: Literature Watch

Assessing English language teachers' pedagogical effectiveness using convolutional neural networks optimized by a modified virus colony search algorithm

Thu, 2025-05-01 06:00

Sci Rep. 2025 May 1;15(1):15295. doi: 10.1038/s41598-025-98033-9.

ABSTRACT

Effective teacher performance evaluation is important for enhancing the quality of educational systems. This study presents a novel approach that integrates deep learning and metaheuristics to assess the pedagogical quality of English as a foreign language (EFL) instruction in a classroom setting. A comprehensive index framework is developed, comprising five primary dimensions: instructional design, instructional materials, teaching methods and approaches, teaching effectiveness, and classroom management. Each dimension is further divided into secondary indicators that capture specific aspects of teaching quality, including pronunciation, content coverage, lesson objectives, and student engagement. The proposed approach uses a convolutional neural network (CNN) architecture optimized by a modified virus colony search (VCS) algorithm to analyze audio and video recordings of classroom interactions. The results demonstrate that the VCS/CNN algorithm can accurately evaluate EFL instruction based on multiple criteria and indicators, outperforming existing methods in terms of accuracy, robustness, flexibility, and efficiency. This study contributes to the development of a reliable and efficient teacher evaluation framework that can provide timely feedback, identify teacher strengths and weaknesses, and inform areas for professional development. The proposed approach has the potential to improve the quality of EFL instruction and administration by enhancing teacher performance and student learning outcomes.

PMID:40312557 | DOI:10.1038/s41598-025-98033-9

Categories: Literature Watch

Deep learning HRNet FCN for blood vessel identification in laparoscopic pancreatic surgery

Thu, 2025-05-01 06:00

NPJ Digit Med. 2025 May 1;8(1):235. doi: 10.1038/s41746-025-01663-6.

ABSTRACT

Laparoscopic pancreatic surgery remains highly challenging due to the complexity of the pancreas and surrounding vascular structures, with risk of injuring critical blood vessels such as the Superior Mesenteric Vein (SMV)-Portal Vein (PV) axis and splenic vein. Here, we evaluated the High Resolution Network (HRNet)-Full Convolutional Network (FCN) model for its ability to accurately identify vascular contours and improve surgical safety. Using 12,694 images from 126 laparoscopic distal pancreatectomy (LDP) videos and 35,986 images from 138 Whipple procedure videos, the model demonstrated robust performance, achieving a mean Dice coefficient of 0.754, a recall of 85.00%, and a precision of 91.10%. By combining datasets from LDP and Whipple procedures, the model showed strong generalization across different surgical contexts and achieved real-time processing speeds of 11 frames per second during surgery. These findings highlight HRNet-FCN's potential to recognize anatomical landmarks, enhance surgical precision, reduce complications, and improve outcomes of laparoscopic pancreatic surgery.

PMID:40312536 | DOI:10.1038/s41746-025-01663-6

Categories: Literature Watch

A human pose estimation network based on YOLOv8 framework with efficient multi-scale receptive field and expanded feature pyramid network

Thu, 2025-05-01 06:00

Sci Rep. 2025 May 1;15(1):15284. doi: 10.1038/s41598-025-00259-0.

ABSTRACT

Deep neural networks are used to accurately detect, estimate, and predict human body poses in images or videos through deep learning-based human pose estimation. However, traditional multi-person pose estimation methods face challenges due to partial occlusions and overlaps between multiple human bodies and body parts. To address these issues, we propose EE-YOLOv8, a human pose estimation network based on the YOLOv8 framework, which integrates an Efficient Multi-scale Receptive Field (EMRF) and an Expanded Feature Pyramid Network (EFPN). First, the EMRF module is employed to further enhance the model's feature representation capability. Second, the EFPN optimizes cross-level information exchange and improves multi-scale data integration. Finally, Wise-IoU replaces the traditional Intersection over Union (IoU) to improve detection accuracy through precise overlap measurement between predicted and ground-truth bounding boxes. We evaluate EE-YOLOv8 on the MS COCO 2017 dataset. Compared to YOLOv8-Pose, EE-YOLOv8 achieves an AP of 89.0% at an IoU threshold of 0.5 (an improvement of 3.3%) and an AP of 65.6% over the IoU range of 0.5-0.95 (an improvement of 5.8%). Moreover, EE-YOLOv8 achieves the highest accuracy while maintaining the lowest parameter count and computational complexity among all analyzed algorithms. These results demonstrate that EE-YOLOv8 exhibits superior competitiveness compared to other mainstream methods.
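The IoU measure underlying both the AP thresholds and the Wise-IoU loss is the ratio of box overlap to box union; a minimal sketch (Wise-IoU additionally applies a dynamic focusing weight on top of this base measure, which is not shown here):

```python
def iou(box_a, box_b):
    # Intersection over Union for axis-aligned boxes given as (x1, y1, x2, y2).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

pred  = (0, 0, 10, 10)
truth = (5, 0, 15, 10)
score = iou(pred, truth)  # 50 / (100 + 100 - 50) ≈ 0.333
```

A detection counts as correct at the 0.5 threshold only if this score reaches 0.5; the 0.5-0.95 AP averages over ten such thresholds.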

PMID:40312474 | DOI:10.1038/s41598-025-00259-0

Categories: Literature Watch

Criminal emotion detection framework using convolutional neural network for public safety

Thu, 2025-05-01 06:00

Sci Rep. 2025 May 1;15(1):15279. doi: 10.1038/s41598-025-97879-3.

ABSTRACT

In the era of rapid societal modernization, crime remains an intrinsic facet of society that demands attention and consideration. As communities evolve and adopt technological advancements, the changing landscape of criminal activity requires careful examination and proactive approaches for public safety applications. In this paper, we propose a collaborative approach to detect crime patterns and criminal emotions with the aim of enhancing judicial decision-making. To this end, we utilized two standard datasets: a crime dataset comprising various crime-related features, and an emotion dataset with 135 emotion classes that helps the AI model identify criminal emotions efficiently. A convolutional neural network (CNN) is first trained on the crime dataset to separate crime from non-crime images. Once a crime is detected, criminal faces are extracted using the region of interest and stored in a directory. Different CNN architectures, such as LeNet-5, VGGNet, ResNet-50, and a basic CNN, are used to detect different facial emotions. The trained CNN models are used to detect criminal emotion and support judicial decision-making. The proposed framework is evaluated with different metrics, such as training accuracy, loss, optimizer performance, precision-recall curve, model complexity, training time, and inference time. In crime detection, the CNN model achieves a remarkable accuracy of 92.45%, and in criminal emotion detection, LeNet-5 outperforms the other CNN architectures with an accuracy of 98.6%.

PMID:40312470 | DOI:10.1038/s41598-025-97879-3

Categories: Literature Watch

RaGeoSense for smart home gesture recognition using sparse millimeter wave radar point clouds

Thu, 2025-05-01 06:00

Sci Rep. 2025 May 1;15(1):15267. doi: 10.1038/s41598-025-00065-8.

ABSTRACT

With the growing demand for contactless human-computer interaction in the smart home field, gesture recognition technology shows great market potential. In this paper, a sparse millimeter wave point cloud-based gesture recognition system, RaGeoSense, is proposed for smart home scenarios. RaGeoSense effectively improves recognition performance and system robustness by combining multiple advanced signal processing and deep learning methods. First, the system adopts three denoising methods, namely K-means clustering-based straight-through filtering, frame-difference filtering, and median filtering, to reduce the noise of the raw millimeter wave data, which significantly improves the quality of the point cloud data. Subsequently, the generated point cloud data are processed with sliding sequence sampling and point cloud tiling to extract the spatio-temporal features of each action. To further improve classification performance, the system uses an integrated model architecture that combines GBDT and XGBoost for efficient extraction of nonlinear features and utilizes LSTM gated recurrent units to classify the gesture sequences, thus realizing accurate recognition of eight different one-arm gestures. The experimental results show that RaGeoSense performs well at different distances, angles, and movement speeds, with an average recognition rate of 95.2%; it is almost unaffected by differences between users and has a certain degree of interference resistance.
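Of the three denoising steps, median filtering is the simplest to sketch: a sliding-window median suppresses impulsive radar noise while preserving the underlying trajectory. The sample values below are hypothetical range readings, not RaGeoSense data.

```python
def median_filter(signal, window=3):
    # Sliding-window median; endpoints keep their original values.
    half = window // 2
    out = list(signal)
    for i in range(half, len(signal) - half):
        out[i] = sorted(signal[i - half:i + half + 1])[half]
    return out

# Hypothetical range samples with an impulsive noise spike at index 3.
ranges = [1.0, 1.1, 1.2, 9.0, 1.3, 1.4, 1.5]
filtered = median_filter(ranges)  # spike replaced by a neighborhood median
```

Frame-difference filtering plays a complementary role, removing static background points so that only moving (gesture) points remain in the cloud.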

PMID:40312411 | DOI:10.1038/s41598-025-00065-8

Categories: Literature Watch

A hybrid approach for binary and multi-class classification of voice disorders using a pre-trained model and ensemble classifiers

Thu, 2025-05-01 06:00

BMC Med Inform Decis Mak. 2025 May 1;25(1):177. doi: 10.1186/s12911-025-02978-w.

ABSTRACT

Recent advances in artificial intelligence-based audio and speech processing have increasingly focused on the binary and multi-class classification of voice disorders. Despite progress, achieving high accuracy in multi-class classification remains challenging. This paper proposes a novel hybrid approach using a two-stage framework to enhance voice disorder classification performance and achieve state-of-the-art accuracy in multi-class classification. Our hybrid approach combines deep learning features with several powerful classifiers. In the first stage, high-level feature embeddings are extracted from voice data spectrograms using a pre-trained VGGish model. In the second stage, these embeddings are used as input to four different classifiers: Support Vector Machine (SVM), Logistic Regression (LR), Multi-Layer Perceptron (MLP), and an Ensemble Classifier (EC). Experiments are conducted on a subset of the Saarbruecken Voice Database (SVD) for male, female, and combined speakers. For binary classification, VGGish-SVM achieved the highest accuracy for male speakers (82.45% for healthy vs. disordered; 75.45% for hyperfunctional dysphonia vs. vocal fold paresis), while VGGish-EC performed best for female speakers (71.54% for healthy vs. disordered; 68.42% for hyperfunctional dysphonia vs. vocal fold paresis). In multi-class classification, VGGish-SVM outperformed other models, achieving mean accuracies of 77.81% for male speakers, 63.11% for female speakers, and 70.53% for combined genders. We conducted a comparative analysis against related works, including Mel frequency cepstral coefficients (MFCC), MFCC-glottal features, and features extracted using the wav2vec and HuBERT models with an SVM classifier. Results demonstrate that our hybrid approach consistently outperforms these models, especially in multi-class classification tasks.
The results show the feasibility of a hybrid framework for voice disorder classification, offering a foundation for refining automated tools that could support clinical assessments with further validation.
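The two-stage structure (pre-trained embedding, then a conventional classifier) can be sketched in a few lines. Everything here is a stand-in: the paper uses VGGish embeddings and SVM/ensemble classifiers, whereas this sketch substitutes crude signal statistics and a nearest-centroid rule on invented waveforms, purely to show the data flow.

```python
import math

def embed(waveform):
    # Stand-in for a pre-trained embedding model (the paper uses VGGish):
    # here, just crude energy/extent statistics of the raw signal.
    n = len(waveform)
    energy = sum(x * x for x in waveform) / n
    extent = max(waveform) - min(waveform)
    return (energy, extent)

def train_centroids(samples, labels):
    # Stage-two stand-in for the SVM/ensemble: a nearest-centroid classifier.
    sums = {}
    for s, y in zip(samples, labels):
        e = embed(s)
        acc = sums.setdefault(y, [0.0, 0.0, 0])
        acc[0] += e[0]; acc[1] += e[1]; acc[2] += 1
    return {y: (a / n, b / n) for y, (a, b, n) in sums.items()}

def classify(sample, centroids):
    e = embed(sample)
    return min(centroids, key=lambda y: math.dist(e, centroids[y]))

# Invented low-amplitude ("healthy") and high-amplitude ("disordered") signals.
healthy = [[0.1, -0.1, 0.1, -0.1], [0.2, -0.2, 0.1, -0.1]]
disordered = [[1.0, -1.0, 0.9, -0.9], [0.8, -0.9, 1.0, -0.8]]
model = train_centroids(healthy + disordered,
                        ["healthy"] * 2 + ["disordered"] * 2)
```

The appeal of the design is that the expensive representation learning is done once (pre-training), and only the lightweight second stage is fitted per task.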

PMID:40312383 | DOI:10.1186/s12911-025-02978-w

Categories: Literature Watch

Ge-SAND: an explainable deep learning-driven framework for disease risk prediction by uncovering complex genetic interactions in parallel

Thu, 2025-05-01 06:00

BMC Genomics. 2025 May 1;26(1):432. doi: 10.1186/s12864-025-11588-9.

ABSTRACT

BACKGROUND: Accurate genetic risk prediction and understanding the mechanisms underlying complex diseases are essential for effective intervention and precision medicine. However, current methods often struggle to capture the intricate and subtle genetic interactions contributing to disease risk. This challenge may be further exacerbated by the curse of dimensionality when considering large-scale pairwise genetic combinations with limited samples. Overcoming these limitations could transform biomedicine by providing deeper insights into disease mechanisms, moving beyond black-box models and single-locus analyses, and enabling a more comprehensive understanding of cross-disease patterns.

RESULTS: We developed Ge-SAND (Genomic Embedding Self-Attention Neurodynamic Decoder), an explainable deep learning-driven framework designed to uncover complex genetic interactions at scales exceeding 10⁶ in parallel for accurate disease risk prediction. Ge-SAND leverages genotype and genomic positional information to identify both intra- and interchromosomal interactions associated with disease phenotypes, providing comprehensive insights into pathogenic mechanisms crucial for disease risk prediction. Applied to simulated datasets and UK Biobank cohorts for Crohn's disease, schizophrenia, and Alzheimer's disease, Ge-SAND achieved up to a 20% improvement in AUC-ROC compared to mainstream methods. Beyond its predictive accuracy, through self-attention-based interaction networks, Ge-SAND provided insights into large-scale genotype relationships and revealed genetic mechanisms underlying these complex diseases. For instance, Ge-SAND identified potential genetic interaction pairs, including novel relationships such as ISOC1 and HOMER2, potentially implicating the brain-gut axis in Crohn's and Alzheimer's diseases.

CONCLUSION: Ge-SAND is a novel deep-learning approach designed to address the challenges of capturing large-scale genetic interactions. By integrating disease risk prediction with interpretable insights into genetic mechanisms, Ge-SAND offers a valuable tool for advancing genomic research and precision medicine.

PMID:40312319 | DOI:10.1186/s12864-025-11588-9

Categories: Literature Watch

Machine learning in prediction of epidermal growth factor receptor status in non-small cell lung cancer brain metastases: a systematic review and meta-analysis

Thu, 2025-05-01 06:00

BMC Cancer. 2025 May 1;25(1):818. doi: 10.1186/s12885-025-14221-w.

ABSTRACT

BACKGROUND: Epidermal growth factor receptor (EGFR) mutations are present in 10-60% of all non-small cell lung cancer (NSCLC) patients and are associated with dismal prognosis. Lung cancer brain metastases (LCBM) are a common complication of lung cancer. Prediction of EGFR status can support physicians' decision-making and, by optimizing treatment strategies, lead to more favorable outcomes. This systematic review and meta-analysis evaluated the predictive performance of machine learning (ML)-based models for EGFR status in NSCLC patients with brain metastasis.

METHODS: On December 20, 2024, the four electronic databases, Pubmed, Embase, Scopus, and Web of Science, were systematically searched. Studies that evaluated EGFR status in patients with brain metastasis from NSCLC were included.

RESULTS: Twenty studies with 3517 patients with 6205 NSCLC brain metastatic lesions were included. The majority of the best-performing models were ML-based (70%, 7/10), and deep learning (DL)-based models comprised 30% (6/20) of models. The area under the curve (AUC) and accuracy (ACC) of the best-performing models ranged from 0.765 to 1 and 0.69 to 0.93, respectively. The meta-analysis of the best-performing models revealed a pooled AUC of 0.91 (95%CI: 0.88-0.93) and ACC of 0.82 (95%CI: 0.79-0.86), along with a pooled sensitivity of 0.87 (95%CI: 0.83-0.9), specificity of 0.86 (95%CI: 0.79-0.9), and diagnostic odds ratio (DOR) of 35.2 (95%CI: 21.2-58.4). The subgroup analysis did not show significant differences between ML and DL models.
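The diagnostic odds ratio follows directly from sensitivity and specificity. Note that the pooled DOR of 35.2 is estimated across studies, so it need not equal the value implied by the pooled sensitivity and specificity, which the sketch below computes (≈41):

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    # DOR = odds of a positive test in the diseased divided by the
    # odds of a positive test in the non-diseased:
    # (sens / (1 - sens)) / ((1 - spec) / spec)
    return (sensitivity / (1 - sensitivity)) / ((1 - specificity) / specificity)

# Point estimates from the meta-analysis above.
dor = diagnostic_odds_ratio(0.87, 0.86)  # ≈ 41 from these point estimates
```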

CONCLUSION: ML-based models demonstrated promising predictive outcomes in predicting EGFR status. Applying ML-based models in daily clinical practice can optimize treatment strategies and enhance clinical and radiological outcomes.

PMID:40312289 | DOI:10.1186/s12885-025-14221-w

Categories: Literature Watch

Real-Time, Dual-Physical-Layer Encryption Directly within an Optical Sensor on a Silicon Platform

Thu, 2025-05-01 06:00

ACS Appl Mater Interfaces. 2025 May 1. doi: 10.1021/acsami.5c00535. Online ahead of print.

ABSTRACT

Today, data breaches pose a significant risk, especially those involving image data. Ideally, for ultimate security, image encryption should occur at the moment the image is captured, directly within the sensor. Nonetheless, such optical sensors have not yet been achieved, limited by the physical properties of existing devices. Herein, we demonstrate a pioneering optical sensor that allows real-time, dual-physical-layer encryption directly within the sensor, enabled by the merits of III-nitride nanowires and careful engineering of the photocarrier dynamics within the nanowire heterojunctions. The robustness of the encryption is further tested against deep-learning-assisted cyber-attacks. Self-powered operation is also possible for such devices, representing a reduced energy cost for encryption. Moreover, the sensors are built directly on silicon (Si), making the technology compatible with existing Si electronics platforms. The simple epitaxy process for fabricating such sensors also means reduced time and production costs. This study represents a paradigm shift in image encryption research.

PMID:40310757 | DOI:10.1021/acsami.5c00535

Categories: Literature Watch

LLM-guided Decoupled Probabilistic Prompt for Continual Learning in Medical Image Diagnosis

Thu, 2025-05-01 06:00

IEEE Trans Med Imaging. 2025 May 1;PP. doi: 10.1109/TMI.2025.3566105. Online ahead of print.

ABSTRACT

Traditional deep learning-based diagnostic models typically exhibit limitations when applied to dynamic clinical environments that require handling the emergence of new diseases. Continual learning (CL) offers a promising solution, aiming to learn new knowledge while preserving previously learned knowledge. Though recent rehearsal-free CL methods employing prompt tuning (PT) have shown promise, they rely on deterministic prompts that struggle to handle diverse fine-grained knowledge. Moreover, existing PT methods utilize randomly initialized prompts that are trained under standard classification constraints, impeding expert knowledge integration and optimal performance acquisition. In this paper, we propose an LLM-guided Decoupled Probabilistic Prompt (LDPP) for continual learning in medical image diagnosis. Specifically, we develop an Expert Knowledge Generation (EKG) module that leverages an LLM to acquire decoupled expert knowledge and comprehensive category descriptions. Then, we introduce a Decoupled Probabilistic Prompt pool (DePP), a shared pool of probabilistic prompts derived from the expert knowledge set. These prompts dynamically provide diverse and flexible descriptions for input images. Finally, we design a Steering Prompt Pool (SPP) to enhance intra-class compactness and promote model performance by learning non-shared prompts. With extensive experimental validation, LDPP consistently sets state-of-the-art performance under the challenging class-incremental setting in CL. Code is available at: https://github.com/CUHK-AIM-Group/LDPP.

PMID:40310742 | DOI:10.1109/TMI.2025.3566105

Categories: Literature Watch

Digital Staining with Knowledge Distillation: A Unified Framework for Unpaired and Paired-But-Misaligned Data

Thu, 2025-05-01 06:00

IEEE Trans Med Imaging. 2025 May 1;PP. doi: 10.1109/TMI.2025.3565329. Online ahead of print.

ABSTRACT

Staining is essential in cell imaging and medical diagnostics but poses significant challenges, including high cost, time consumption, labor intensity, and irreversible tissue alterations. Recent advances in deep learning have enabled digital staining through supervised model training. However, collecting large-scale, perfectly aligned pairs of stained and unstained images remains difficult. In this work, we propose a novel unsupervised deep learning framework for digital cell staining that reduces the need for extensive paired data using knowledge distillation. We explore two training schemes: (1) unpaired and (2) paired-but-misaligned settings. For the unpaired case, we introduce a two-stage pipeline, comprising light enhancement followed by colorization, as a teacher model. Subsequently, we obtain a student staining generator through knowledge distillation with hybrid non-reference losses. To leverage the pixel-wise information between adjacent sections, we further extend to the paired-but-misaligned setting, adding a Learning to Align module to utilize pixel-level information. Experimental results on our dataset demonstrate that our proposed unsupervised deep staining method can generate stained images with more accurate positions and shapes of the cell targets in both settings. Compared with competing methods, our method achieves improved results both qualitatively and quantitatively (e.g., NIQE and PSNR). We applied our digital staining method to the White Blood Cell (WBC) dataset, investigating its potential for medical applications.

PMID:40310741 | DOI:10.1109/TMI.2025.3565329

Categories: Literature Watch

Reconstructing and predicting stochastic dynamical systems using probabilistic deep learning

Thu, 2025-05-01 06:00

Chaos. 2025 May 1;35(5):053102. doi: 10.1063/5.0248312.

ABSTRACT

Stochastic effects introduce significant uncertainty into dynamical systems, making the data-driven reconstruction and prediction of these systems highly complex. This study incorporates uncertainty learning into a deep learning model for time-series prediction, proposing a deep stochastic time-delay embedding model to improve prediction accuracy and robustness. First, this model constructs a deep probabilistic catcher to capture uncertainty information in the reconstructed mappings. These uncertainty representations are then integrated as meta-information into the reconstruction process of time-delay embedding, enabling it to fully capture system stochasticity and predict target variables over multiple time steps. Finally, the model is validated on both the Lorenz system and real-world datasets, demonstrating superior performance compared to existing methods, with robust results under noisy conditions.
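The time-delay embedding at the core of the model is a standard construction: each state vector stacks the current observation with lagged copies of itself. A minimal sketch on a toy series:

```python
def delay_embed(series, dim, tau):
    # Time-delay embedding: each vector is
    # (x[t], x[t - tau], ..., x[t - (dim - 1) * tau]).
    start = (dim - 1) * tau
    return [[series[t - k * tau] for k in range(dim)]
            for t in range(start, len(series))]

x = [0, 1, 2, 3, 4, 5, 6, 7]
vectors = delay_embed(x, dim=3, tau=2)
# first embedded vector: [4, 2, 0]
```

The paper's contribution is to attach learned uncertainty representations (the "deep probabilistic catcher") to these embedded vectors before multi-step prediction, which this sketch does not attempt.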

PMID:40310707 | DOI:10.1063/5.0248312

Categories: Literature Watch

Transformer-based Koopman autoencoder for linearizing Fisher's equation

Thu, 2025-05-01 06:00

Chaos. 2025 May 1;35(5):053101. doi: 10.1063/5.0244221.

ABSTRACT

A transformer-based Koopman autoencoder is proposed for linearizing Fisher's reaction-diffusion equation. The primary focus of this study is on using deep learning techniques to find complex spatiotemporal patterns in the reaction-diffusion system. The emphasis is on not just solving the equation but also transforming the system's dynamics into a more comprehensible, linear form. Global coordinate transformations are achieved through the autoencoder, which learns to capture the underlying dynamics by training on a dataset with 60,000 initial conditions. Extensive testing on multiple datasets was used to assess the efficacy of the proposed model, demonstrating its ability to accurately predict the system's evolution as well as to generalize. We provide a thorough comparison study, comparing our suggested design to several comparable methods using experiments on various PDEs, such as the Kuramoto-Sivashinsky equation and Burgers' equation. Results show improved accuracy, highlighting the capabilities of the transformer-based Koopman autoencoder. The proposed architecture is significantly ahead of other architectures in terms of solving different types of PDEs using a single architecture. Our method relies entirely on the data, without requiring any knowledge of the underlying equations. This makes it applicable even to datasets where the governing equations are not known.
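The Koopman idea is that, in the right learned coordinates, nonlinear dynamics advance by a fixed linear operator. The one-dimensional toy below fits such an operator by least squares from trajectory data alone (a scalar analogue of the Koopman/DMD fit; the paper's autoencoder learns the coordinate transformation as well, which is omitted here):

```python
def fit_linear_operator(states, next_states):
    # Least-squares fit of a scalar linear map x' ≈ a * x:
    # a = <x, x'> / <x, x>, a one-dimensional Koopman/DMD-style fit.
    num = sum(x * y for x, y in zip(states, next_states))
    den = sum(x * x for x in states)
    return num / den

# Trajectory generated by the linear map x' = 0.5 * x.
traj = [8.0, 4.0, 2.0, 1.0, 0.5]
a = fit_linear_operator(traj[:-1], traj[1:])  # recovers 0.5 exactly
```

Once the operator is known, multi-step prediction reduces to repeated multiplication, which is what makes the linearized representation attractive.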

PMID:40310706 | DOI:10.1063/5.0244221

Categories: Literature Watch

Deep Learning Model of Primary Tumor and Metastatic Cervical Lymph Nodes From CT for Outcome Predictions in Oropharyngeal Cancer

Thu, 2025-05-01 06:00

JAMA Netw Open. 2025 May 1;8(5):e258094. doi: 10.1001/jamanetworkopen.2025.8094.

ABSTRACT

IMPORTANCE: Primary tumor (PT) and metastatic cervical lymph node (LN) characteristics are highly associated with oropharyngeal squamous cell carcinoma (OPSCC) prognosis. Currently, there is a lack of studies to combine imaging characteristics of both regions for predictions of p16+ OPSCC outcomes.

OBJECTIVES: To develop and validate a computed tomography (CT)-based deep learning classifier that integrates PT and LN features to predict outcomes in p16+ OPSCC and to identify patients with stage I disease who may derive added benefit associated with chemotherapy.

DESIGN, SETTING, AND PARTICIPANTS: In this retrospective prognostic study, radiographic CT scans were analyzed of 811 patients with p16+ OPSCC treated with definitive radiotherapy or chemoradiotherapy from 3 independent cohorts. One cohort from the Cancer Imaging Archive (1998-2013) was used for model development and validation and the 2 remaining cohorts (2002-2015) were used to externally test the model performance. The Swin Transformer architecture was applied to fuse the features from both PT and LN into a multiregion imaging risk score (SwinScore) to predict survival outcomes across and within subpopulations at various stages. Data analysis was performed between February and July 2024.

EXPOSURES: Definitive radiotherapy or chemoradiotherapy treatment for patients with p16+ OPSCC.

MAIN OUTCOMES AND MEASURES: Hazard ratios (HRs), log-rank tests, concordance index (C index), and net benefit were used to evaluate the associations between multiregion imaging risk score and disease-free survival (DFS), overall survival (OS), and locoregional failure (LRF). Interaction tests were conducted to assess whether the association of chemotherapy with outcome significantly differs across dichotomized multiregion imaging risk score subgroups.

RESULTS: The total patient cohort comprised 811 patients with p16+ OPSCC (median age, 59.0 years [IQR, 47.4-70.6 years]; 683 men [84.2%]). In the external test set, the multiregion imaging risk score was found to be prognostic of DFS (HR, 3.76 [95% CI, 1.99-7.10]; P < .001), OS (HR, 4.80 [95% CI, 2.22-10.40]; P < .001), and LRF (HR, 4.47 [95% CI, 1.43-14.00]; P = .01) among all patients with p16+ OPSCC. The multiregion imaging risk score, integrating both PT and LN information, demonstrated a higher C index (0.63) compared with models focusing solely on PT (0.61) or LN (0.58). Chemotherapy was associated with improved DFS only among patients with high scores (HR, 0.09 [95% CI, 0.02-0.47]; P = .004) but not those with low scores (HR, 0.83 [95% CI, 0.32-2.10]; P = .69).
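The C index values reported above measure how often a higher risk score corresponds to an earlier event among comparable patient pairs. A minimal sketch ignoring censoring (survival analyses like this one must also handle censored follow-up, which is omitted here); the scores and times are invented:

```python
def concordance_index(risk_scores, event_times):
    # Fraction of comparable pairs in which the higher-risk subject fails
    # earlier; tied risk scores count as half-concordant. No censoring.
    concordant, ties, total = 0, 0, 0
    n = len(risk_scores)
    for i in range(n):
        for j in range(i + 1, n):
            if event_times[i] == event_times[j]:
                continue  # not a comparable pair
            total += 1
            early, late = (i, j) if event_times[i] < event_times[j] else (j, i)
            if risk_scores[early] > risk_scores[late]:
                concordant += 1
            elif risk_scores[early] == risk_scores[late]:
                ties += 1
    return (concordant + 0.5 * ties) / total

scores = [2.0, 1.5, 1.0, 0.5]   # hypothetical SwinScore-style risk scores
times = [1, 2, 3, 4]            # event times: higher risk fails earlier
c = concordance_index(scores, times)  # perfectly concordant: 1.0
```

A C index of 0.5 is chance-level ranking, so the reported 0.63 for the combined PT + LN model reflects a modest but real gain over the single-region models (0.61 and 0.58).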

CONCLUSIONS AND RELEVANCE: This prognostic study of p16+ OPSCC describes the development of a CT-based imaging risk score integrating PT and metastatic cervical LN features to predict recurrence risk and identify suitable candidates for treatment tailoring. This tool could support treatment modulation of p16+ OPSCC at a highly granular level.

PMID:40310642 | DOI:10.1001/jamanetworkopen.2025.8094

Categories: Literature Watch

PhacoTrainer: Automatic Artificial Intelligence-Generated Performance Ratings for Cataract Surgery

Thu, 2025-05-01 06:00

Transl Vis Sci Technol. 2025 May 1;14(5):2. doi: 10.1167/tvst.14.5.2.

ABSTRACT

PURPOSE: To investigate whether cataract surgical skill performance metrics automatically generated by artificial intelligence (AI) models can differentiate between trainee and faculty surgeons and the correlation between AI metrics and expert-rated skills.

METHODS: Routine cataract surgical videos from residents (N = 28) and attendings (N = 29) were collected. Three video-level metrics were generated by deep learning models: phacoemulsification probe decentration, eye decentration, and zoom level change. Three types of instrument- and landmark-specific metrics were generated for the limbus, pupil, and various surgical instruments: total path length, maximum velocity, and area. Expert human judges assessed the surgical videos using the Objective Structured Assessment of Cataract Surgical Skill (OSACSS). Differences in AI and human-rated scores between attending surgeons and trainees were assessed using t-tests, and correlations between them were examined with Pearson correlation coefficients.

RESULTS: In attending videos, the phacoemulsification probe showed significantly lower total path length, maximum velocity, and area metrics. Attending surgeons also demonstrated better phacoemulsification centration and eye centration. Most AI metrics correlated negatively with OSACSS scores, including phacoemulsification decentration (r = -0.369) and eye decentration (r = -0.394). OSACSS subitems related to eye centration and different steps of surgery also showed significant negative correlations with the corresponding AI metrics (r ranging from -0.77 to -0.49).
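
The negative r values above are ordinary Pearson correlation coefficients between an AI metric (e.g., decentration) and the OSACSS score; a negative r means more decentration accompanies lower rated skill. A minimal stdlib sketch of the computation (not the study's code, which presumably used a statistics package):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance of xs and ys divided by the
    product of their standard deviations. Ranges from -1 to 1."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

For example, a metric that rises exactly as ratings fall gives r = -1.0, the extreme of the -0.77 to -0.49 range reported for the OSACSS subitems.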

CONCLUSIONS: Automatically generated AI metrics can differentiate between attending and trainee surgeries and correlate with expert human evaluations of surgical performance.

TRANSLATIONAL RELEVANCE: AI-generated metrics that correlate with surgeon skill may be useful for improving cataract surgical education.

PMID:40310637 | DOI:10.1167/tvst.14.5.2

Categories: Literature Watch

Age-Related Regional Changes in Choroidal Vascularity in Healthy Emmetropic Eyes

Thu, 2025-05-01 06:00

Transl Vis Sci Technol. 2025 May 1;14(5):3. doi: 10.1167/tvst.14.5.3.

ABSTRACT

PURPOSE: This retrospective cross-sectional study examined regional changes in choroidal vascularity index (CVI) with physiological aging in healthy emmetropes.

METHODS: Deep learning methods were used for segmentation and binarization of enhanced depth imaging optical coherence tomography images of the choroid collected from 280 healthy emmetropic subjects (mean spherical equivalent refraction: +0.39 ± 0.38 D), including 83 children (5-12 years), 77 adolescents (13-17 years), and 120 adults (18-41 years). The CVI, calculated as the ratio of luminal to total choroidal area (in percent), and luminal and stromal choroidal thickness were measured across the 5-mm horizontal macular region centered on the fovea. Linear mixed models were used to examine age-related regional changes in the choroid while controlling for gender and imaging time of day.
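
Since the CVI is defined above as the luminal fraction of the total choroidal area, it reduces to a simple ratio once the segmentation has split the binarized choroid into luminal and stromal components. A minimal sketch of that final step (the area values and function name are illustrative, not from the study):

```python
def choroidal_vascularity_index(luminal_area, stromal_area):
    """CVI in percent: the share of the total choroidal area
    (luminal + stromal) occupied by the luminal (vascular) component."""
    total_area = luminal_area + stromal_area
    return 100.0 * luminal_area / total_area
```

With luminal and stromal areas of 65 and 35 (arbitrary units), this returns 65.0, matching the childhood macular CVI reported in the results.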

RESULTS: The macular CVI decreased significantly from childhood (65% ± 0.5%) and adolescence (63% ± 0.5%) to adulthood (59% ± 0.4%) (P < 0.001). Significant regional variation was observed (P < 0.001), with the CVI increasing from the fovea (61% ± 0.3%) toward the perifovea (64% ± 0.3%) and from the temporal (61.4% ± 0.3%) toward the nasal hemiretina (63% ± 0.3%). The age-related decrease in the CVI was greater in the nasal (-7% ± 0.7%) than the temporal (-6% ± 0.7%) macula (P = 0.014) and was associated with significant nasal stromal thickening (45 ± 5 µm; P < 0.001) and temporal luminal thinning (-16 ± 6 µm; P = 0.033) from childhood to adulthood.

CONCLUSIONS: Physiological aging was associated with a significant region-dependent decline in the CVI, driven primarily by stromal thickening in the nasal macula and luminal thinning in the temporal macula.

TRANSLATIONAL RELEVANCE: These age-related changes in the CVI provide new insights into the physiological morphology of the choroid during aging and may aid clinicians in understanding the spatial and age-associated predilections of certain chorioretinal diseases.

PMID:40310636 | DOI:10.1167/tvst.14.5.3

Categories: Literature Watch

Reirradiation for recurrent glioblastoma: the significance of the residual tumor volume

Thu, 2025-05-01 06:00

J Neurooncol. 2025 May 1. doi: 10.1007/s11060-025-05042-9. Online ahead of print.

ABSTRACT

PURPOSE: Recurrent glioblastoma has a poor prognosis, and its optimal management remains unclear. Reirradiation (re-RT) is a promising treatment option, but long-term outcomes and optimal patient selection criteria are not well established.

METHODS: This study analyzed 71 patients with recurrent CNS WHO grade 4, IDHwt glioblastoma (GBM) who underwent re-RT at the University of Erlangen-Nuremberg between January 2009 and June 2019. Imaging follow-ups were conducted every 3 months. Progression-free survival (PFS) was defined using RANO criteria. Outcomes, feasibility, and toxicity of re-RT were evaluated. Contrast-enhancing tumor volume was measured using a deep learning auto-segmentation pipeline with expert validation and jointly evaluated with clinical and molecular-pathologic factors.

RESULTS: Most patients were prescribed conventionally fractionated re-RT (84.5%) with 45 Gy in 1.8 Gy fractions, combined with temozolomide (TMZ, 49.3%) or lomustine (CCNU, 12.7%). Re-RT was completed as planned in 94.4% of patients. After a median follow-up of 73.8 months, 88.7% of patients had died. The median overall survival was 9.6 months, and the median progression-free survival was 5.3 months. Multivariate analysis identified residual contrast-enhancing tumor volume at re-RT (HR 1.040 per cm3, p < 0.001) as the single dominant predictor of overall survival.
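
The reported HR of 1.040 per cm3 is a per-unit hazard ratio from a Cox model, so the hazard scales multiplicatively with residual tumor volume: each additional cm3 multiplies the hazard by 1.040. A minimal sketch of that interpretation (illustrative only, not the study's analysis code):

```python
def hazard_ratio_for_volume(per_cm3_hr, volume_cm3):
    """Hazard ratio implied by a residual tumor `volume_cm3` larger
    than the reference, given a per-cm3 Cox hazard ratio. In a Cox
    model the per-unit HR compounds multiplicatively: HR ** volume."""
    return per_cm3_hr ** volume_cm3
```

Under this reading, a 10 cm3 larger residual volume would correspond to roughly a 1.48-fold hazard (1.040 ** 10), which illustrates why residual contrast-enhancing volume dominated the multivariate analysis.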

CONCLUSION: Conventional fractionated re-RT is a feasible and effective treatment for recurrent high-grade glioma. The significant prognostic impact of residual tumor volume highlights the importance of combining maximum-safe resection with re-RT for improved outcomes.

PMID:40310485 | DOI:10.1007/s11060-025-05042-9

Categories: Literature Watch
