Deep learning

ATLASS: An AnaTomicaLly-Aware Self-Supervised Learning Framework for Generalizable Retinal Disease Detection

Wed, 2025-08-06 06:00

IEEE J Biomed Health Inform. 2025 Aug 6;PP. doi: 10.1109/JBHI.2025.3595697. Online ahead of print.

ABSTRACT

Medical imaging, particularly retinal fundus photography, plays a crucial role in early disease detection and treatment for various ocular disorders. However, the development of robust diagnostic systems using deep learning remains constrained by the scarcity of expert annotations, which are time-consuming and expensive to obtain. Self-Supervised Learning (SSL) has emerged as a promising solution, but existing models fail to effectively incorporate critical domain knowledge specific to retinal anatomy, potentially limiting their clinical relevance and diagnostic capability. We address this issue by introducing an anatomically aware SSL framework that strategically integrates domain expertise through specialized masking of vital retinal structures during pretraining. Our approach leverages vessel and optic disc segmentation maps to guide the SSL process, enabling the development of clinically relevant feature representations without extensive labeled data. The framework combines a Vision Transformer with dual-masking strategies and anatomically informed loss functions to preserve structural integrity during feature learning. Comprehensive evaluation across multiple datasets demonstrates our method's competitive performance in diverse retinal disease classification tasks, including diabetic retinopathy grading, glaucoma detection, age-related macular degeneration identification, and multi-disease classification. These results establish the effectiveness of anatomically aware SSL in advancing automated retinal disease diagnosis while addressing the fundamental challenge of limited labeled medical data.
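
The specialized masking of vital retinal structures can be pictured with a small sketch: given a vessel/optic-disc segmentation map, rank Vision Transformer patches by how much anatomy they contain and hide the most anatomy-rich ones during pretraining. This is a toy PyTorch illustration, not the paper's dual-masking strategy or its anatomically informed losses; the patch size, masking ratio, and selection rule are assumptions.

```python
import torch

def anatomy_guided_patch_mask(anatomy_mask, patch_size=16, mask_ratio=0.5):
    """Toy illustration: rank ViT patches by overlap with a vessel/optic-disc
    segmentation map and mask the most anatomy-rich patches first.
    anatomy_mask: (H, W) binary tensor from a segmentation model (assumed given)."""
    H, W = anatomy_mask.shape
    gh, gw = H // patch_size, W // patch_size
    grid = anatomy_mask[: gh * patch_size, : gw * patch_size].float()
    grid = grid.reshape(gh, patch_size, gw, patch_size)
    coverage = grid.mean(dim=(1, 3)).flatten()            # anatomy fraction per patch
    n_mask = int(mask_ratio * coverage.numel())
    masked_idx = torch.argsort(coverage, descending=True)[:n_mask]
    mask = torch.zeros(coverage.numel(), dtype=torch.bool)
    mask[masked_idx] = True
    return mask  # True = patch hidden from the encoder during SSL pretraining
```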

PMID:40768461 | DOI:10.1109/JBHI.2025.3595697

Categories: Literature Watch

Transformer-Based Deep Learning Approaches for Speech-Based Dementia Detection: A Systematic Review

Wed, 2025-08-06 06:00

IEEE J Biomed Health Inform. 2025 Aug 6;PP. doi: 10.1109/JBHI.2025.3595781. Online ahead of print.

ABSTRACT

As the population of older adults continues to grow, so will the need for cost-effective approaches to early dementia detection. Deep learning approaches using patient speech samples show promising results. This systematic review examines studies utilizing speech-based deep learning for dementia diagnosis, with the objective of identifying best practices for future data-driven dementia research. Studies researching speech-based deep learning for dementia were obtained from PubMed, Wiley Library, Science Direct, IEEE, Web of Science, Google Scholar, and arXiv; 80 studies were reviewed. Studies were analyzed in terms of model architecture and performance, speech features employed, and databases used. We observed that transformer-based approaches were most frequent, achieving an average accuracy of 85.71%, and that linguistic features outperformed acoustic features. Our review identified several limitations, including a lack of dataset diversity, inconsistent classification of dementia severity levels across studies, and variability in how sample sizes and model performance metrics (e.g., accuracy, sensitivity, specificity) are reported. These inconsistencies hinder direct comparisons between studies and limit the reproducibility of findings. Still, our findings suggest that incorporating transformers into current speech-based deep learning models can further improve detection of cognitive impairment. Consideration of our observations in future data-driven dementia research will lead to advancements in the development of diagnostic decision support systems for clinical practice.

PMID:40768460 | DOI:10.1109/JBHI.2025.3595781

Categories: Literature Watch

Real-World Adversarial Defense against Patch Attacks based on Diffusion Model

Wed, 2025-08-06 06:00

IEEE Trans Pattern Anal Mach Intell. 2025 Aug 6;PP. doi: 10.1109/TPAMI.2025.3596462. Online ahead of print.

ABSTRACT

Adversarial patches present significant challenges to the robustness of deep learning models, making the development of effective defenses critical for real-world applications. This paper introduces DIFFender, a novel DIFfusion-based DeFender framework that leverages the power of a text-guided diffusion model to counter adversarial patch attacks. At the core of our approach is the discovery of the Adversarial Anomaly Perception (AAP) phenomenon, which enables the diffusion model to accurately detect and locate adversarial patches by analyzing distributional anomalies. DIFFender seamlessly integrates the tasks of patch localization and restoration within a unified diffusion model framework, enhancing defense efficacy through their close interaction. Additionally, DIFFender employs an efficient few-shot prompt-tuning algorithm, facilitating the adaptation of the pre-trained diffusion model to defense tasks without the need for extensive retraining. Our comprehensive evaluation, covering image classification and face recognition tasks as well as real-world scenarios, demonstrates DIFFender's robust performance against adversarial attacks. The framework's versatility and generalizability across various settings, classifiers, and attack methodologies mark a significant advancement in adversarial patch defense strategies. Beyond the visible domain, DIFFender offers a further advantage: it extends readily to the infrared domain. Consequently, DIFFender can flexibly defend against both infrared and visible adversarial patch attacks within a single universal defense framework.
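
The localize-then-restore idea can be sketched with an off-the-shelf diffusion inpainting pipeline. This is only a hedged stand-in: the paper derives the patch mask from its AAP mechanism and uses prompt tuning, whereas here the mask is assumed to be available and a public inpainting checkpoint from the diffusers library does the restoration; file names and the prompt are placeholders.

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

# Assumed inputs: the attacked image and a binary mask of the suspected patch
# region (white = region to restore). DIFFender's own localization is not shown.
image = Image.open("attacked_input.png").convert("RGB").resize((512, 512))
patch_mask = Image.open("suspected_patch_mask.png").convert("L").resize((512, 512))

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

# Restore the masked region so the downstream classifier sees a benign image.
restored = pipe(prompt="a natural, unmodified photo",
                image=image, mask_image=patch_mask).images[0]
restored.save("defended_input.png")
```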

PMID:40768456 | DOI:10.1109/TPAMI.2025.3596462

Categories: Literature Watch

Artificial Intelligence Iterative Reconstruction Algorithm Combined with Low-Dose Aortic CTA for Preoperative Access Assessment of Transcatheter Aortic Valve Implantation: A Prospective Cohort Study

Wed, 2025-08-06 06:00

J Imaging Inform Med. 2025 Aug 6. doi: 10.1007/s10278-025-01622-3. Online ahead of print.

ABSTRACT

This study aimed to explore whether an artificial intelligence iterative reconstruction (AIIR) algorithm combined with low-dose aortic computed tomography angiography (CTA) demonstrates clinical effectiveness in assessing preoperative access for transcatheter aortic valve implantation (TAVI). A total of 109 patients were prospectively recruited for aortic CTA scans and divided into two groups: group A (n = 51) with standard-dose CT examinations (SDCT) and group B (n = 58) with low-dose CT examinations (LDCT). Group B was further subdivided into groups B1 and B2. Groups A and B2 used the hybrid iterative algorithm (HIR: Karl 3D), whereas Group B1 used the AIIR algorithm. CT attenuation and noise of different vessel segments were measured, and the contrast-to-noise ratio (CNR) and signal-to-noise ratio (SNR) were calculated. Two radiologists, who were blinded to the study details, rated the subjective image quality on a 5-point scale. The effective radiation doses were also recorded for groups A and B. Group B1 demonstrated the highest CT attenuation, SNR, and CNR and the lowest image noise among the three groups (p < 0.05). The scores of subjective image noise, vessel and non-calcified plaque edge sharpness, and overall image quality in Group B1 were higher than those in groups A and B2 (p < 0.001). Group B2 had the highest artifact scores compared with groups A and B1 (p < 0.05). The radiation dose in group B was reduced by 50.33% compared with that in group A (p < 0.001). The AIIR algorithm combined with low-dose CTA yielded better diagnostic images before TAVI than the Karl 3D algorithm.
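
For reference, SNR and CNR as conventionally computed from vessel and background regions of interest are sketched below; the study's exact ROI placement and noise definition may differ.

```python
import numpy as np

def snr_cnr(vessel_roi: np.ndarray, background_roi: np.ndarray):
    """Conventional definitions assumed here:
    SNR = mean(vessel) / SD(background)
    CNR = (mean(vessel) - mean(background)) / SD(background)"""
    noise = background_roi.std()
    snr = vessel_roi.mean() / noise
    cnr = (vessel_roi.mean() - background_roi.mean()) / noise
    return snr, cnr
```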

PMID:40768017 | DOI:10.1007/s10278-025-01622-3

Categories: Literature Watch

Improved early-stage crop classification using a novel fusion-based machine learning approach with Sentinel-2A and Landsat 8-9 data

Wed, 2025-08-06 06:00

Environ Monit Assess. 2025 Aug 6;197(9):982. doi: 10.1007/s10661-025-14420-9.

ABSTRACT

Crop classification during the early stages is challenging because of the striking similarity in spectral and texture features among various crops. To improve classification accuracy, this study proposes a novel fusion-based deep learning approach. The approach integrates textural and spectral features from a fused dataset generated by merging Landsat 8-9 and Sentinel-2A data using the Gram-Schmidt fusion approach. The textural features were extracted using the multi-patch Gray Level Co-occurrence Matrix (GLCM) technique. The spectral features, namely the Enhanced Vegetation Index (EVI) and Normalized Difference Vegetation Index (NDVI), were obtained using the spectral index method. Five machine learning methods (deep neural network, 1D convolutional neural network, decision tree, support vector machine, and random forest) were trained using textural and spectral parameters to develop classifiers. The proposed approach achieves promising results using a deep neural network (DNN), with an accuracy of 0.89, precision of 0.88, recall of 0.91, and F1-score of 0.90. These results demonstrate the effectiveness of the fusion-based deep learning approach in enhancing classification accuracy for early-stage crops.
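
The two spectral indices named in the abstract have standard definitions, sketched below for reflectance-scaled bands; the study's preprocessing (Gram-Schmidt fusion, band scaling, GLCM texture extraction) is not reproduced here.

```python
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    # NDVI = (NIR - Red) / (NIR + Red)
    return (nir - red) / (nir + red + 1e-8)

def evi(nir: np.ndarray, red: np.ndarray, blue: np.ndarray,
        G=2.5, C1=6.0, C2=7.5, L=1.0) -> np.ndarray:
    # EVI = G * (NIR - Red) / (NIR + C1*Red - C2*Blue + L)
    return G * (nir - red) / (nir + C1 * red - C2 * blue + L + 1e-8)
```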

PMID:40767980 | DOI:10.1007/s10661-025-14420-9

Categories: Literature Watch

Exploration of Fully-Automated Body Composition Analysis Using Routine CT-Staging of Lung Cancer Patients for Survival Prognosis

Wed, 2025-08-06 06:00

J Cachexia Sarcopenia Muscle. 2025 Aug;16(4):e70021. doi: 10.1002/jcsm.70021.

ABSTRACT

BACKGROUND: AI-driven automated body composition analysis (BCA) may provide quantitative prognostic biomarkers derived from routine staging CTs. This two-centre study evaluates the prognostic value of these volumetric markers for overall survival in lung cancer patients.

METHODS: Lung cancer cohorts from Hospital A (n = 3345, median age 65, 86% NSCLC, 40% M1, 40% female) and B (n = 1364, median age 66, 87% NSCLC, 37% M1, 38% female) underwent automated BCA of abdominal CTs acquired within ±60 days of primary diagnosis. A deep learning network segmented muscle, bone and adipose tissues (visceral = VAT, subcutaneous = SAT, intra-/intermuscular = IMAT and total = TAT) to derive three markers: Sarcopenia Index (SI = Muscle/Bone), Myosteatotic Fat Index (MFI = IMAT/TAT) and Abdominal Fat Index (AFI = VAT/SAT). Kaplan-Meier survival analysis, Cox proportional hazards modelling and machine learning-based survival prediction were performed. A survival model including clinical data (BMI, ECOG, L3-SMI, -SATI, -VATI and -IMATI) was fitted on Hospital A data and validated on Hospital B data.
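
The three volumetric markers are simple ratios of the segmented tissue volumes. A minimal sketch, assuming the segmentation network's outputs have already been reduced to per-tissue volumes (the dictionary keys are illustrative):

```python
def body_composition_indices(volumes):
    """volumes: dict of tissue volumes from the segmentation network, e.g.
    {"muscle": ..., "bone": ..., "imat": ..., "tat": ..., "vat": ..., "sat": ...}.
    Index definitions follow the abstract; units cancel, so mL or cm^3 both work."""
    si  = volumes["muscle"] / volumes["bone"]   # Sarcopenia Index (SI)
    mfi = volumes["imat"]  / volumes["tat"]     # Myosteatotic Fat Index (MFI)
    afi = volumes["vat"]   / volumes["sat"]     # Abdominal Fat Index (AFI)
    return {"SI": si, "MFI": mfi, "AFI": afi}
```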

RESULTS: In nonmetastatic NSCLC, high SI predicted longer survival across centres for males (Hospital A: 24.6 vs. 46.0 months; Hospital B: 13.3 vs. 28.9 months; both p < 0.001) and females (Hospital A: 37.9 vs. 53.6 months, p = 0.008; Hospital B: 23.0 vs. 28.6 months, p = 0.018). High MFI indicated reduced survival in males at both hospitals (Hospital A: 43.7 vs. 28.2 months; Hospital B: 28.8 vs. 14.3 months; both p ≤ 0.001) but showed center-dependent effects in females (significant only in Hospital A, p < 0.01). In metastatic disease, SI remained prognostic for males at both centres (p < 0.05), while MFI was significant only in Hospital A (p ≤ 0.001) and AFI only in Hospital B (p = 0.042). Multivariate Cox regression confirmed that higher SI was protective (A: HR 0.53, B: 0.59, p ≤ 0.001), while MFI was associated with shorter survival (A: HR 1.31, B: 1.12, p < 0.01). The multivariate survival model trained on Hospital A's data demonstrated prognostic differentiation of groups in internal (n = 209, p ≤ 0.001) and external (Hospital B, n = 361, p = 0.044) validation, with SI feature importance (0.037) ranking below ECOG (0.082) and M-status (0.078), outperforming all other features including conventional L3-single-slice measurements.

CONCLUSION: CT-based volumetric BCA provides prognostic biomarkers in lung cancer with varying significance by sex, disease stage and centre. SI was the strongest prognostic marker, outperforming conventional L3-based measurements, while fat-related markers showed varying associations. Our multivariate model suggests that BCA markers, particularly SI, may enhance risk stratification in lung cancer, pending centre-specific and sex-specific validation. Integration of these markers into clinical workflows could enable personalized care and targeted interventions for high-risk patients.

PMID:40767951 | DOI:10.1002/jcsm.70021

Categories: Literature Watch

Artificial intelligence and digital health in vascular surgery: a 2-decade bibliometric analysis of research landscapes and evolving frontiers

Wed, 2025-08-06 06:00

J Robot Surg. 2025 Aug 6;19(1):453. doi: 10.1007/s11701-025-02583-z.

ABSTRACT

To analyze the structural and temporal evolution of artificial intelligence (AI) and digital health applications in vascular surgery over the past two decades, identifying historical development trajectories, research focal points, and emerging frontiers. Publications on AI and digital health applications in vascular surgery were retrieved from WoSCC and analyzed through CiteSpace and HistCite to track temporal development, thematic shifts, and innovation patterns within the domain. Active themes have emerged over time, with 123 related disciplines, 505 keywords, and 675 outbreak papers cited. Keyword clustering anchors seven emerging research subfields, namely #0 deep learning, #2 machine learning, #3 peripheral arterial disease, #4 renal cell carcinoma, #5 aortic aneurysm, #6 pulmonary embolism, and #7 nanocarrier. The alluvial map indicates that the most enduring research concepts within the domain include bypass and revascularisation, among others, while emerging keywords include chronic limb-threatening ischemia and peripheral vascular intervention. Reference clustering identifies seven recent subfields of research: nephrectomy #0, force #1, artificial intelligence #2, navigation #4, prediction #5, augmented reality #9, and telemedicine #13. This study provides a comprehensive mapping of AI and digital health adoption in vascular surgery, delineating paradigm shifts from traditional surgical techniques to computational prediction models and intelligent intervention systems. The findings establish foundational references for prioritizing research investments and developing standardized evaluation metrics for emerging technologies.

PMID:40767924 | DOI:10.1007/s11701-025-02583-z

Categories: Literature Watch

Fully Automated Anatomy Labeling for Intracardiac Echocardiography Using Deep Learning

Wed, 2025-08-06 06:00

JACC Clin Electrophysiol. 2025 Jul 17:S2405-500X(25)00471-2. doi: 10.1016/j.jacep.2025.06.009. Online ahead of print.

ABSTRACT

Intracardiac echocardiography (ICE) is increasingly being used to guide electrophysiologic (EP) procedures but requires a considerable learning curve. ICE images collected from 2 separate institutions (605 EP procedures, 196,768 images) were used to develop an automated deep learning-based algorithm to detect anatomic structures from the right atrium. Fifteen of 21 anatomic structures were correctly identified with >70% precision and recall. Mislabeling of one anatomic structure for another was rare. This fully automated anatomy labeling algorithm can serve as an education tool or can be used as a navigation tool to guide ICE operators in EP procedures.
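
The "correctly identified with >70% precision and recall" criterion can be reproduced with a short per-structure computation. A hedged scikit-learn sketch, where y_true and y_pred are per-image anatomic-structure labels and the variable names are illustrative, not the authors' code:

```python
from sklearn.metrics import precision_recall_fscore_support

def well_identified_structures(y_true, y_pred, structures, threshold=0.70):
    """Return the structures recognized with both precision and recall above threshold."""
    precision, recall, _, _ = precision_recall_fscore_support(
        y_true, y_pred, labels=structures, zero_division=0)
    return [s for s, p, r in zip(structures, precision, recall)
            if p > threshold and r > threshold]
```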

PMID:40767798 | DOI:10.1016/j.jacep.2025.06.009

Categories: Literature Watch

Artificial intelligence in the diagnosis and management of dysphagia: a scoping review

Wed, 2025-08-06 06:00

Codas. 2025 Aug 8;37(4):e20240305. doi: 10.1590/2317-1782/e20240305en. eCollection 2025.

ABSTRACT

PURPOSE: This scoping review aimed to map and synthesize evidence on technological advancements using Artificial Intelligence in the diagnosis and management of dysphagia. We followed the PRISMA guidelines and those of the Joanna Briggs Institute, focusing on research about technological innovations in dysphagia.

RESEARCH STRATEGIES: The protocol was registered on the Open Science Framework platform. The databases consulted included EMBASE, Latin American and Caribbean Health Sciences Literature (LILACS), Livivo, PubMed/Medline, Scopus, Cochrane Library, Web of Science, and grey literature.

SELECTION CRITERIA: The PCC (Population, Concept, Context) framework was used to determine the eligibility of studies for this review.

DATA ANALYSIS: After removing duplicates, 56 articles were initially selected. A subsequent update resulted in 205 articles, of which 61 were included after applying the selection criteria.

RESULTS: Videofluoroscopy of swallowing was used as the reference examination in most studies. Neurological conditions predominated among the underlying diseases of the patients who participated in the studies. The algorithms used spanned Machine Learning, Deep Learning, and Computer Vision, with Deep Learning predominating.

CONCLUSION: Technological advancements in artificial intelligence for the diagnosis and management of dysphagia have been mapped, highlighting the predominance and applicability of Deep Learning in examinations such as videofluoroscopy. The findings suggest significant potential to improve diagnostic accuracy and clinical management effectiveness, particularly in neurological patients. Identified research gaps require further investigations to solidify the clinical applicability and impact of these technologies.

PMID:40767676 | DOI:10.1590/2317-1782/e20240305en

Categories: Literature Watch

Automated Deep Learning-based Segmentation of the Dentate Nucleus Using Quantitative Susceptibility Mapping MRI

Wed, 2025-08-06 06:00

Radiol Artif Intell. 2025 Aug 6:e240478. doi: 10.1148/ryai.240478. Online ahead of print.

ABSTRACT

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. Purpose To develop a dentate nucleus (DN) segmentation tool using deep learning (DL) applied to brain MRI-based quantitative susceptibility mapping (QSM) images. Materials and Methods Brain QSM images from healthy controls and individuals with cerebellar ataxia or multiple sclerosis were collected from nine different datasets (2016-2023) worldwide for this retrospective study (ClinicalTrials.gov Identifier: NCT04349514). Manual delineation of the DN was performed by experienced raters. Automated segmentation performance was evaluated against manual reference segmentations following training with several DL architectures. A two-step approach was used, consisting of a localization model followed by DN segmentation. Performance metrics included intraclass correlation coefficient (ICC), Dice score, and Pearson correlation coefficient. Results The training and testing datasets comprised 328 individuals (age range, 11-64 years; 171 female), including 141 healthy individuals and 187 with cerebellar ataxia or multiple sclerosis. The manual tracing protocol produced reference standards with high intrarater (average ICC 0.91) and interrater reliability (average ICC 0.78). Initial DL architecture exploration indicated that the nnU-Net framework performed best. The two-step localization plus segmentation pipeline achieved a Dice score of 0.90 ± 0.03 and 0.89 ± 0.04 for left and right DN segmentation, respectively. In external testing, the proposed algorithm outperformed the current leading automated tool (mean Dice scores for left and right DN: 0.86 ± 0.04 vs 0.57 ± 0.22, P < .001; 0.84 ± 0.07 vs 0.58 ± 0.24, P < .001). The model demonstrated generalizability across datasets unseen during the training step, with automated segmentations showing high correlation with manual annotations (left DN: r = 0.74; P < .001; right DN: r = 0.48; P = .03). Conclusion The proposed model accurately and efficiently segmented the DN from brain QSM images. The model is publicly available (https://github.com/art2mri/DentateSeg). ©RSNA, 2025.

PMID:40767617 | DOI:10.1148/ryai.240478

Categories: Literature Watch

Segmenting Whole-Body MRI and CT for Multiorgan Anatomic Structure Delineation

Wed, 2025-08-06 06:00

Radiol Artif Intell. 2025 Aug 6:e240777. doi: 10.1148/ryai.240777. Online ahead of print.

ABSTRACT

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. Purpose To develop and validate MRSegmentator, a retrospective cross-modality deep learning model for multiorgan segmentation of MRI scans. Materials and Methods This retrospective study trained MRSegmentator on 1,200 manually annotated UK Biobank Dixon MRI sequences (50 participants), 221 in-house abdominal MRI sequences (177 patients), and 1228 CT scans from the TotalSegmentator-CT dataset. A human-in-the-loop annotation workflow leveraged cross-modality transfer learning from an existing CT segmentation model to segment 40 anatomic structures. The model's performance was evaluated on 900 MRI sequences from 50 participants in the German National Cohort (NAKO), 60 MRI sequences from AMOS22 dataset, and 29 MRI sequences from TotalSegmentator-MRI. Reference standard manual annotations were used for comparison. Metrics to assess segmentation quality included Dice Similarity Coefficient (DSC). Statistical analyses included organ-and sequence-specific mean ± SD reporting and two-sided t tests for demographic effects. Results 139 participants were evaluated; demographic information was available for 70 (mean age 52.7 years ± 14.0 [SD], 36 female). Across all test datasets, MRSegmentator demonstrated high class wise DSC for well-defined organs (lungs: 0.81-0.96, heart: 0.81-0.94) and organs with anatomic variability (liver: 0.82-0.96, kidneys: 0.77-0.95). Smaller structures showed lower DSC (portal/splenic veins: 0.64-0.78, adrenal glands: 0.56-0.69). The average DSC on the external testing using NAKO data, ranged from 0.85 ± 0.08 for T2-HASTE to 0.91 ± 0.05 for in-phase sequences. The model generalized well to CT, achieving mean DSC of 0.84 ± 0.12 on AMOS CT data. Conclusion MRSegmentator accurately segmented 40 anatomic structures on MRI and generalized to CT; outperforming existing open-source tools. Published under a CC BY 4.0 license.

PMID:40767616 | DOI:10.1148/ryai.240777

Categories: Literature Watch

Integrating Physics-Based Simulations with Data-Driven Deep Learning Represents a Robust Strategy for Developing Inhibitors Targeting the Main Protease

Wed, 2025-08-06 06:00

J Chem Inf Model. 2025 Aug 6. doi: 10.1021/acs.jcim.5c01307. Online ahead of print.

ABSTRACT

The coronavirus main protease, essential for viral replication, is a well-validated antiviral target. Here, we present Deep-CovBoost, a computational pipeline integrating deep learning with free energy perturbation (FEP) simulations to guide the structure-based optimization of inhibitors targeting the coronavirus main protease. Starting from a reported noncovalent inhibitor, the pipeline generated and prioritized analogs using predictive modeling, followed by rigorous validation through FEP and molecular dynamics simulations. This approach led to the identification of optimized compounds (e.g., I3C-1, I3C-2, I3C-35) that enhance binding affinity by engaging the underexploited S4 and S5 subpockets. These results highlight the potential of combining physics-based and AI-driven approaches to accelerate lead optimization and antiviral design.

PMID:40767530 | DOI:10.1021/acs.jcim.5c01307

Categories: Literature Watch

Harnessing artificial intelligence to advance insights in systemic sclerosis skin and lung disease

Wed, 2025-08-06 06:00

Curr Opin Rheumatol. 2025 Aug 7. doi: 10.1097/BOR.0000000000001114. Online ahead of print.

ABSTRACT

PURPOSE OF REVIEW: The purpose of this review is to summarize the uses of artificial intelligence for advancing systemic sclerosis (SSc) skin and lung disease research through 2024.

RECENT FINDINGS: Applications of AI in SSc research have expanded markedly in recent years. The most common artificial intelligence method identified was supervised machine learning for predictive modeling. Supervised machine learning uses input data labeled with a known outcome to train a model to predict outcomes when encountering new data. Machine learning-assisted feature selection and post-training feature importance techniques have also highlighted key predictors within complex datasets, informing possible mechanisms underlying heterogeneous patient outcomes. Additionally, unsupervised machine learning approaches have been used to identify patient subsets with distinct clinical trajectories. Unsupervised machine learning identifies groups with similar characteristics within a dataset, without considering a specific outcome. Digital image analysis using deep learning has also been undertaken in lung imaging studies to quantify interstitial lung disease (ILD) extent and automate ILD subtype classification, as well as in skin biopsy analysis to quantify histologic changes. These scalable tools could efficiently automate prognostic assessments for use across centers of varying local expertise.
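
To make the supervised-learning description above concrete, here is a generic scikit-learn sketch on synthetic data; the features and outcome are illustrative stand-ins, not any of the reviewed SSc models.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a labeled cohort: rows are patients, columns are
# clinical/imaging features, y is a known outcome (e.g., disease progression).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("accuracy on unseen patients:", clf.score(X_test, y_test))
# Post-training feature importance, as referenced in the review:
print("top features:", clf.feature_importances_.argsort()[::-1][:5])
```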

SUMMARY: Artificial intelligence represents a tool for analyzing high-dimensional, complex datasets to derive robust results, even within relatively small SSc cohorts. To date, artificial intelligence-driven insights into SSc skin and lung disease have focused on identifying patient subsets, quantifying disease severity, and building predictive models to inform personalized patient care.

PMID:40767529 | DOI:10.1097/BOR.0000000000001114

Categories: Literature Watch

Computer-vision based automatic rider helmet violation detection and vehicle identification in Indian smart city scenarios using NVIDIA TAO toolkit and YOLOv8

Wed, 2025-08-06 06:00

Front Artif Intell. 2025 Jul 22;8:1582257. doi: 10.3389/frai.2025.1582257. eCollection 2025.

ABSTRACT

Two-wheeler traffic offenses are common on Indian roads. In addition to endangering the offenders, these offenses also endanger other commuters. Two-wheeler traffic violations can take many different forms, such as overloading, triple riding, and helmetless riding. Effective identification and enforcement strategies are necessary because these offenses pose a serious risk to public safety. Because traditional traffic monitoring and enforcement techniques are inadequate, advanced technology-based solutions are now required. Deep learning-based systems have demonstrated significant promise in identifying and stopping such infractions in recent years. We propose a two-step deep learning approach that leverages the strengths of pre-trained object detection models to detect two-wheeler riders and specialized helmet classifiers to identify helmet wear status as well as detect number plates. In the first stage, we utilized DetectNet (Model 1), a highly efficient, robust, and accurate object detection framework developed by NVIDIA that uses the ResNet18 Convolutional Neural Network (CNN) architecture and is part of the Transfer Learning Toolkit known as TAO (Train, Adapt, Optimize). The second stage requires accurately detecting a helmet on the identified rider and extracting numbers from the violator's license plate using an OCR module in real time. We employed YOLOv8 (Model 2), a deep learning-based architecture that has proven effective in several real-time object detection applications. It predicts bounding boxes and class probabilities for objects within an image using a single neural network, making it well suited to real-time applications such as rider helmet violation detection and number plate processing. Because of a lack of publicly available traffic datasets, we created a custom dataset containing motorcycle rider images captured under complex scenarios for training and validating our models. Experimental analysis shows that our proposed two-step model achieved a promising helmet detection accuracy of 98.56% and a 97.6% number plate detection accuracy for riders not wearing helmets. The major objective of our proposed study is to enforce stringent traffic laws in real time to decrease rider helmet violations.
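
A hedged sketch of the two-stage idea using the ultralytics YOLO API: the paper's first stage actually uses NVIDIA TAO DetectNet, so YOLO stands in for both stages here; the weight files and class names are hypothetical placeholders, and the OCR step is omitted.

```python
from ultralytics import YOLO

# Hypothetical weights: the authors' custom dataset and trained models are not public.
rider_detector = YOLO("rider_detector.pt")   # stage 1 stand-in (paper uses TAO DetectNet)
helmet_plate   = YOLO("helmet_plate.pt")     # stage 2: helmet / no-helmet / number plate

results = rider_detector("traffic_frame.jpg")
for box in results[0].boxes:
    x1, y1, x2, y2 = map(int, box.xyxy[0].tolist())
    rider_crop = results[0].orig_img[y1:y2, x1:x2]   # crop the detected rider
    stage2 = helmet_plate(rider_crop)[0]
    labels = [stage2.names[int(c)] for c in stage2.boxes.cls]
    if "no_helmet" in labels and "number_plate" in labels:
        # the plate crop would be passed to an OCR module here
        print("violation detected:", labels)
```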

PMID:40766945 | PMC:PMC12321817 | DOI:10.3389/frai.2025.1582257

Categories: Literature Watch

Implementation of generative AI for the assessment and treatment of autism spectrum disorders: a scoping review

Wed, 2025-08-06 06:00

Front Psychiatry. 2025 Jul 22;16:1628216. doi: 10.3389/fpsyt.2025.1628216. eCollection 2025.

ABSTRACT

INTRODUCTION: Autism spectrum disorder (ASD) is characterized by persistent deficits in social communication and restrictive, repetitive behaviors. Current diagnostic and intervention pathways rely heavily on clinician expertise, leading to delays and limited scalability. Generative artificial intelligence (GenAI) offers emerging opportunities for automatically assisting and personalizing ASD care, though technical and ethical concerns persist.

METHODS: We conducted systematic searches in Embase, PsycINFO, PubMed, Scopus, and Web of Science (January 2014 to February 2025). Two reviewers independently screened and extracted eligible studies reporting empirical applications of GenAI in ASD screening, diagnosis, or intervention. Data were charted across GenAI architectures, application domains, evaluation metrics, and validation strategies. Comparative performance against baseline methods was synthesized where available.

RESULTS: From 553 records, 10 studies met the inclusion criteria across three domains: (1) screening and diagnosis (e.g., transformer-based classifiers and GAN-based data augmentation), (2) assessment and intervention (e.g., multimodal emotion recognition and feedback systems), and (3) caregiver education and support (e.g., LLM-based chatbots). While most studies reported potential performance improvements, they also highlighted limitations such as small sample sizes, data biases, limited validation, and model hallucinations. Comparative analyses were sparse and lacked standardized metrics.

DISCUSSION: This review (i) maps GenAI applications in ASD care, (ii) compares GenAI and traditional approaches, (iii) highlights methodological and ethical challenges, and (iv) proposes future research directions. Our findings underscore GenAI's emerging potential in autism care and the prerequisites for its ethical, transparent, and clinically validated implementation.

SYSTEMATIC REVIEW REGISTRATION: https://osf.io/4gsyj/, identifier DOI: 10.17605/OSF.IO/4GSYJ.

PMID:40766925 | PMC:PMC12322814 | DOI:10.3389/fpsyt.2025.1628216

Categories: Literature Watch

On the Utility of Virtual Staining for Downstream Applications as it relates to Task Network Capacity

Wed, 2025-08-06 06:00

ArXiv [Preprint]. 2025 Jul 31:arXiv:2508.00164v1.

ABSTRACT

Virtual staining, or in-silico-labeling, has been proposed to computationally generate synthetic fluorescence images from label-free images by use of deep learning-based image-to-image translation networks. In most reported studies, virtually stained images have been assessed only using traditional image quality measures such as structural similarity or signal-to-noise ratio. However, in biomedical imaging, images are typically acquired to facilitate an image-based inference, which we refer to as a downstream biological or clinical task. This study systematically investigates the utility of virtual staining for facilitating clinically relevant downstream tasks (like segmentation or classification) with consideration of the capacity of the deep neural networks employed to perform the tasks. Comprehensive empirical evaluations were conducted using biological datasets, assessing task performance by use of label-free, virtually stained, and ground truth fluorescence images. The results demonstrated that the utility of virtual staining is largely dependent on the ability of the segmentation or classification task network to extract meaningful task-relevant information, which is related to the concept of network capacity. Examples are provided in which virtual staining does not improve, or even degrades, segmentation or classification performance when the capacity of the associated task network is sufficiently large. The results demonstrate that task network capacity should be considered when deciding whether to perform virtual staining.

PMID:40766889 | PMC:PMC12324553

Categories: Literature Watch

Single Capture Quantitative Oblique Back-Illumination Microscopy

Wed, 2025-08-06 06:00

bioRxiv [Preprint]. 2025 Aug 1:2025.07.29.667497. doi: 10.1101/2025.07.29.667497.

ABSTRACT

Quantitative oblique back-illumination microscopy (qOBM) has emerged as a powerful technique for label-free, 3D quantitative phase imaging of arbitrarily thick biological specimens. However, in its initial embodiment, qOBM requires multiple captures for phase recovery, which reduces imaging speed and increases system complexity. In this work, we present a novel advancement in qOBM: single-capture qOBM (SCqOBM) which utilizes a deep learning model to accurately reconstruct phase information from a single oblique back-illumination capture. We demonstrate that SCqOBM achieves remarkable phase imaging accuracy, closely matching the results of traditional four-capture qOBM in diverse biological samples. We first highlight the unique potential of SCqOBM for non-invasive, in-vivo imaging applications by visualizing blood flow in mouse brain and human arm. Additionally, we demonstrate single-slice (en-face) quantitative phase imaging at 2 kHz and volumetric refractive index tomography at speeds up to 10 volumes per second. SCqOBM offers transformative advantages in speed, simplicity, and system accessibility, making it highly suitable for dynamic and real-time imaging applications. Its ability to produce high-resolution, quantitative phase and refractive index images with minimal hardware complexity opens new frontiers in biomedical research and clinical diagnostics, including non-invasive hematological assessments and in-vivo tissue imaging.

PMID:40766649 | PMC:PMC12324366 | DOI:10.1101/2025.07.29.667497

Categories: Literature Watch

Tranquillyzer: A Flexible Neural Network Framework for Structural Annotation and Demultiplexing of Long-Read Transcriptomes

Wed, 2025-08-06 06:00

bioRxiv [Preprint]. 2025 Jul 31:2025.07.25.666829. doi: 10.1101/2025.07.25.666829.

ABSTRACT

Long-read single-cell RNA sequencing using platforms such as Oxford Nanopore Technologies (ONT) enables full-length transcriptome profiling at single-cell resolution. However, high sequencing error rates, diverse library architectures, and increasing dataset scale introduce major challenges for accurately identifying cell barcodes (CBCs) and unique molecular identifiers (UMIs) - key prerequisites for reliable demultiplexing and deduplication, respectively. Existing pipelines rely on hard-coded heuristics or local transition rules that cannot fully capture this broader structural context and often fail to robustly interpret reads with indel-induced shifts, truncated segments, or non-canonical element ordering. We introduce Tranquillyzer (TRANscript QUantification In Long reads-anaLYZER), a flexible, architecture-aware deep learning framework for processing long-read single-cell RNA-seq data. Tranquillyzer employs a hybrid neural network architecture and a global, context-aware design, and enables precise identification of structural elements - even when elements are shifted, partially degraded, or repeated due to sequencing noise or library construction variability. In addition to supporting established single-cell protocols, Tranquillyzer accommodates custom library formats through rapid, one-time model training on user-defined label schemas, typically completed within a few hours on standard GPUs. Additional features such as scalability across large datasets and comprehensive visualization capabilities further position Tranquillyzer as a flexible and scalable solution for processing long-read single-cell transcriptomic datasets.

PMID:40766630 | PMC:PMC12324178 | DOI:10.1101/2025.07.25.666829

Categories: Literature Watch

What does it take to learn the rules of RNA base pairing? A lot less than you may think

Wed, 2025-08-06 06:00

bioRxiv [Preprint]. 2025 Aug 2:2025.07.31.668042. doi: 10.1101/2025.07.31.668042.

ABSTRACT

Amidst the fast-developing trend of RNA large language models with millions of parameters, we asked what would be the minimum required to rediscover the rules of RNA canonical base pairing, mainly the Watson-Crick-Franklin A:U, G:C and the wobble G:U base pairs (the secondary structure). Here, we conclude that it does not require much at all. It does not require knowing secondary structures; it does not require aligning the sequences; and it does not require many parameters. We selected a probabilistic model of palindromes (a stochastic context-free grammar or SCFG) with a total of just 21 parameters. Using standard deep learning techniques, we estimate its parameters by implementing the generative process in an automatic differentiation (autodiff) framework and applying stochastic gradient descent (SGD). We define and minimize a loss function that does not use any structural or alignment information. Trained on as few as fifty RNA sequences, the rules of RNA base pairing emerge after only a few iterations of SGD. Crucially, the sole inputs are RNA sequences. When optimizing for sequences corresponding to structured RNAs, SGD also yields the rules of RNA base-pair aggregation into helices. Trained on shuffled sequences, the system optimizes by avoiding base pairing altogether. Trained on messenger RNAs, it reveals interactions that are different from those of structural RNAs, and specific to each mRNA. Our results show that the emergence of canonical base-pairing can be attributed to sequence-level signals that are robust and detectable even without labeled structures or alignments, and with very few parameters. Autodiff algorithms for probabilistic models, such as, but not restricted to, SCFGs, have significant potential as they allow these models to be incorporated into end-to-end RNA deep learning methods for discerning transcripts of different functionalities.
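
As a rough illustration of the approach described above (an SCFG whose parameters are fit by autodiff and SGD from sequences alone), the sketch below implements a toy palindrome grammar with 23 parameters in PyTorch. It is not the authors' 21-parameter grammar or loss, and the hairpin-like training sequences are made up; it only shows how pairing preferences can emerge from maximizing sequence likelihood with no structure or alignment labels.

```python
import torch
import torch.nn.functional as F

BASE = {"A": 0, "C": 1, "G": 2, "U": 3}

class TinyPalindromeSCFG(torch.nn.Module):
    """Toy grammar S -> a S b | a S | eps with learnable rule and emission
    probabilities (3 + 4 + 16 = 23 parameters)."""
    def __init__(self):
        super().__init__()
        self.rule_logits = torch.nn.Parameter(torch.zeros(3))    # pair / left / end
        self.unpair_logits = torch.nn.Parameter(torch.zeros(4))  # unpaired emissions
        self.pair_logits = torch.nn.Parameter(torch.zeros(16))   # 4x4 pair emissions

    def log_likelihood(self, seq):
        x = [BASE[c] for c in seq]
        n = len(x)
        rule = F.log_softmax(self.rule_logits, 0)
        unpair = F.log_softmax(self.unpair_logits, 0)
        pair = F.log_softmax(self.pair_logits, 0).view(4, 4)
        # inside[i][j] = log P(S => x[i:j]) (j exclusive), filled by span length
        inside = [[None] * (n + 1) for _ in range(n + 1)]
        for i in range(n + 1):
            inside[i][i] = rule[2]                                 # S -> eps
        for span in range(1, n + 1):
            for i in range(n - span + 1):
                j = i + span
                left = rule[1] + unpair[x[i]] + inside[i + 1][j]   # S -> a S
                if span >= 2:                                      # S -> a S b
                    both = rule[0] + pair[x[i], x[j - 1]] + inside[i + 1][j - 1]
                    inside[i][j] = torch.logaddexp(left, both)
                else:
                    inside[i][j] = left
        return inside[0][n]

# A handful of hairpin-like toy sequences stands in for the training set.
seqs = ["GGGAAACCC", "GCGCAAGCGC", "AUGCAAGCAU", "CCGGAAACCGG"]
model = TinyPalindromeSCFG()
opt = torch.optim.SGD(model.parameters(), lr=0.5)
for step in range(200):
    loss = -sum(model.log_likelihood(s) for s in seqs) / len(seqs)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Learned base-pairing propensities (rows = 5' base, columns = 3' base).
print(F.softmax(model.pair_logits, 0).view(4, 4))
```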

PMID:40766544 | PMC:PMC12324431 | DOI:10.1101/2025.07.31.668042

Categories: Literature Watch

A combinatorial mutational map of active non-native protein kinases by deep learning guided sequence design

Wed, 2025-08-06 06:00

bioRxiv [Preprint]. 2025 Aug 3:2025.08.03.668353. doi: 10.1101/2025.08.03.668353.

ABSTRACT

Mapping protein sequence-function landscapes has either been limited to small steps (only a few mutations) or to sequences similar to those already explored by evolution to maintain activity. Here, we overcome both limitations by applying deep-learning guided redesign to a natural protein tyrosine kinase to generate novel, functional sequences with highly combinatorial mutations. Using cell-free assays, we measure the activities and concentrations of 537 redesigned sequences, which differ from the wild-type by an average of 37 mutations while retaining activity in 85% of variants. These sequences sample 436 unique mutations at 76 different positions throughout the kinase domain. A simple regression model identifies key sequence determinants of function and predicts the function of unseen sequences. Our approach demonstrates how integrating deep-learning guided redesign, functional measurement at scale, and interpretable computational modelling enables functional exploration of highly combinatorial and sparse sequence-function landscapes at mutational scales not possible before.
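
The "simple regression model" on mutational features can be pictured with a generic sketch: encode which substitutions each redesigned variant carries relative to wild type and fit a linear model whose coefficients rank mutation effects. The function name, ridge penalty, and encoding are assumptions, not the authors' exact model.

```python
import numpy as np
from sklearn.linear_model import Ridge

def fit_mutation_regression(variants, activities, wt_seq):
    """variants: list of equal-length sequences; activities: measured values.
    Returns a (position, amino acid) -> estimated-effect mapping."""
    features = sorted({(i, aa) for v in variants
                       for i, (aa, wt) in enumerate(zip(v, wt_seq)) if aa != wt})
    index = {f: k for k, f in enumerate(features)}
    X = np.zeros((len(variants), len(features)))
    for r, v in enumerate(variants):
        for i, (aa, wt) in enumerate(zip(v, wt_seq)):
            if aa != wt:
                X[r, index[(i, aa)]] = 1.0           # one-hot mutation indicator
    model = Ridge(alpha=1.0).fit(X, np.asarray(activities))
    return dict(zip(features, model.coef_))
```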

PMID:40766444 | PMC:PMC12324526 | DOI:10.1101/2025.08.03.668353

Categories: Literature Watch
