Deep learning

Hard-aware Instance Adaptive Self-training for Unsupervised Cross-domain Semantic Segmentation

Tue, 2025-03-18 06:00

IEEE Trans Pattern Anal Mach Intell. 2025 Mar 18;PP. doi: 10.1109/TPAMI.2025.3552484. Online ahead of print.

ABSTRACT

The divergence between labeled training data and unlabeled testing data is a significant challenge for recent deep learning models. Unsupervised domain adaptation (UDA) attempts to solve such problems. Recent works show that self-training is a powerful approach to UDA. However, existing methods have difficulty balancing scalability and performance. In this paper, we propose a hard-aware instance adaptive self-training framework for UDA on the task of semantic segmentation. To effectively improve the quality and diversity of pseudo-labels, we develop a novel pseudo-label generation strategy with an instance adaptive selector. We further enrich the hard class pseudo-labels with inter-image information through a skillfully designed hard-aware pseudo-label augmentation. In addition, we propose region-adaptive regularization to smooth the pseudo-label region and sharpen the non-pseudo-label region. For the non-pseudo-label region, a consistency constraint is also constructed to introduce stronger supervision signals during model optimization. Our method is concise and efficient enough to generalize easily to other UDA methods. Experiments on GTA5 → Cityscapes, SYNTHIA → Cityscapes, and Cityscapes → Oxford RobotCar demonstrate the superior performance of our approach compared with state-of-the-art methods. Our code is available at https://github.com/bupt-ai-cz/HIAST.
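
As a rough illustration of confidence-based pseudo-label selection with thresholds that adapt to each image, the Python sketch below keeps, for every class predicted in a single image, only the most confident share of pixels and marks the rest as ignored. It is a simplified stand-in and not the selector described in the paper; the function name `select_pseudo_labels`, the `keep_fraction` parameter, and the ignore value 255 are assumptions made for this example.

```python
from collections import defaultdict

IGNORE = 255  # assumed "no pseudo-label" marker (hypothetical convention)


def select_pseudo_labels(pred_classes, confidences, keep_fraction=0.5):
    """Keep the most confident pixels of each class within one image.

    pred_classes:  predicted class id per pixel (flat list).
    confidences:   maximum softmax score per pixel (flat list).
    keep_fraction: share of each class's pixels to keep (assumed knob).
    """
    # Group confidences by predicted class for this single image.
    by_class = defaultdict(list)
    for cls, conf in zip(pred_classes, confidences):
        by_class[cls].append(conf)

    # Per-class threshold: confidence of the k-th most confident pixel.
    thresholds = {}
    for cls, confs in by_class.items():
        k = max(1, int(len(confs) * keep_fraction))
        thresholds[cls] = sorted(confs, reverse=True)[k - 1]

    # Pixels below their class threshold are marked IGNORE.
    return [
        cls if conf >= thresholds[cls] else IGNORE
        for cls, conf in zip(pred_classes, confidences)
    ]


# Toy input: four pixels predicted as class 0, two as class 1.
labels = select_pseudo_labels([0, 0, 0, 0, 1, 1],
                              [0.9, 0.8, 0.6, 0.4, 0.7, 0.3])
print(labels)  # [0, 0, 255, 255, 1, 255]
```

Because each threshold is computed per image and per class, a class that is predicted with low confidence overall still keeps some pixels, which is the general motivation for adaptive selection over a single global cutoff.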

PMID:40100655 | DOI:10.1109/TPAMI.2025.3552484

Categories: Literature Watch

Enhancing Patient Outcome Prediction Through Deep Learning With Sequential Diagnosis Codes From Structured Electronic Health Record Data: Systematic Review

Tue, 2025-03-18 06:00

J Med Internet Res. 2025 Mar 18;27:e57358. doi: 10.2196/57358.

ABSTRACT

BACKGROUND: The use of structured electronic health records in health care systems has grown rapidly. These systems collect huge amounts of patient information, including diagnosis codes representing temporal medical history. Sequential diagnostic information has proven valuable for predicting patient outcomes. However, the extent to which these types of data have been incorporated into deep learning (DL) models has not been examined.

OBJECTIVE: This systematic review aims to describe the use of sequential diagnostic data in DL models, specifically to understand how these data are integrated, whether sample size improves performance, and whether the identified models are generalizable.

METHODS: Relevant studies published up to May 15, 2023, were identified using 4 databases: PubMed, Embase, IEEE Xplore, and Web of Science. We included all studies using DL algorithms trained on sequential diagnosis codes to predict patient outcomes. We excluded review articles and non-peer-reviewed papers. We evaluated the following aspects in the included papers: DL techniques, characteristics of the dataset, prediction tasks, performance evaluation, generalizability, and explainability. We also assessed the risk of bias and applicability of the studies using the Prediction Model Study Risk of Bias Assessment Tool (PROBAST). We used the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist to report our findings.

RESULTS: Of the 740 identified papers, 84 (11.4%) met the eligibility criteria. Publications in this area increased yearly. Recurrent neural networks (and their derivatives; 47/84, 56%) and transformers (22/84, 26%) were the most commonly used architectures in DL-based models. Most studies (45/84, 54%) presented their input features as sequences of visit embeddings. Medications (38/84, 45%) were the most common additional feature. Of the 128 predictive outcome tasks, the most frequent was next-visit diagnosis (n=30, 23%), followed by heart failure (n=18, 14%) and mortality (n=17, 13%). Only 7 (8%) of the 84 studies evaluated their models in terms of generalizability. A positive correlation was observed between training sample size and model performance (area under the receiver operating characteristic curve; P=.02). However, 59 (70%) of the 84 studies had a high risk of bias.

CONCLUSIONS: The application of DL for advanced modeling of sequential medical codes has demonstrated remarkable promise in predicting patient outcomes. The main limitation of this study was the heterogeneity of methods and outcomes. However, our analysis found that using multiple types of features, integrating time intervals, and including larger sample sizes were generally related to an improved predictive performance. This review also highlights that very few studies (7/84, 8%) reported on challenges related to generalizability and less than half (38/84, 45%) of the studies reported on challenges related to explainability. Addressing these shortcomings will be instrumental in unlocking the full potential of DL for enhancing health care outcomes and patient care.

TRIAL REGISTRATION: PROSPERO CRD42018112161; https://tinyurl.com/yc6h9rwu.

PMID:40100249 | DOI:10.2196/57358

Categories: Literature Watch

Mining the UniProtKB/Swiss-Prot database for antimicrobial peptides

Tue, 2025-03-18 06:00

Protein Sci. 2025 Apr;34(4):e70083. doi: 10.1002/pro.70083.

ABSTRACT

The ever-growing global health threat of antibiotic resistance is compelling researchers to explore alternatives to conventional antibiotics. Antimicrobial peptides (AMPs) are emerging as a promising solution to fill this need. Naturally occurring AMPs are produced by all forms of life as part of the innate immune system. High-throughput bioinformatics tools have enabled fast and large-scale discovery of AMPs from genomic, transcriptomic, and proteomic resources of selected organisms. Public protein sequence databases, comprising over 200 million records and growing, serve as comprehensive compendia of sequences from a broad range of source organisms. Yet, large-scale in silico probing of those databases for novel AMP discovery using modern deep learning techniques has rarely been reported. In the present study, we propose an AMP mining workflow to predict novel AMPs from the UniProtKB/Swiss-Prot database using the AMP prediction tool, AMPlify, as its discovery engine. Using this workflow, we identified 8008 novel putative AMPs from all eukaryotic sequences in the database. Focusing on the practical use of AMPs as suitable antimicrobial agents with applications in the poultry industry, we prioritized 40 of those AMPs based on their similarities to known chicken AMPs in predicted structures. In our tests, 13 out of the 38 successfully synthesized peptides showed antimicrobial activity against Escherichia coli and/or Staphylococcus aureus. AMPlify and the companion scripts supporting the AMP mining workflow presented herein are publicly available at https://github.com/bcgsc/AMPlify.

PMID:40100125 | DOI:10.1002/pro.70083

Categories: Literature Watch

Deep learning approaches to predict late gadolinium enhancement and clinical outcomes in suspected cardiac sarcoidosis

Tue, 2025-03-18 06:00

Sarcoidosis Vasc Diffuse Lung Dis. 2025 Mar 18;42(1):15378. doi: 10.36141/svdld.v42i1.15378.

NO ABSTRACT

PMID:40100114 | DOI:10.36141/svdld.v42i1.15378

Categories: Literature Watch

A deep learning model based on chest CT to predict benign and malignant breast masses and axillary lymph node metastasis

Tue, 2025-03-18 06:00

Biomol Biomed. 2025 Mar 17. doi: 10.17305/bb.2025.12010. Online ahead of print.

ABSTRACT

Differentiating early-stage breast cancer from benign breast masses is crucial for radiologists. Additionally, accurately assessing axillary lymph node metastasis (ALNM) plays a significant role in clinical management and prognosis for breast cancer patients. Chest computed tomography (CT) is a commonly used imaging modality in physical and preoperative evaluations. This study aims to develop a deep learning model based on chest CT imaging to improve the preliminary assessment of breast lesions, potentially reducing the need for costly follow-up procedures such as magnetic resonance imaging (MRI) or positron emission tomography-CT and alleviating the financial and emotional burden on patients. We retrospectively collected chest CT images from 482 patients with breast masses, classifying them as benign (n = 224) or malignant (n = 258) based on pathological findings. The malignant group was further categorized into ALNM-positive (n = 91) and ALNM-negative (n = 167) subgroups. Patients were randomly divided into training, validation, and test sets in an 8:1:1 ratio, with the test set excluded from model development. All patients underwent non-contrast chest CT before surgery. After preprocessing the images through cropping, scaling, and standardization, we applied ResNet-34, ResNet-50, and ResNet-101 architectures to differentiate between benign and malignant masses and to assess ALNM. Model performance was evaluated using sensitivity, specificity, accuracy, receiver operating characteristic (ROC) curves, and the area under the curve (AUC). The ResNet models effectively distinguished benign from malignant masses, with ResNet-101 achieving the highest performance (AUC: 0.964; 95% CI: 0.948-0.981). It also demonstrated excellent predictive capability for ALNM (AUC: 0.951; 95% CI: 0.926-0.975). 
In conclusion, these deep learning models show strong diagnostic potential for both breast mass classification and ALNM prediction, offering a valuable tool for improving clinical decision-making.
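
The 8:1:1 random division mentioned above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the helper name `split_811`, the seed, and the rounding rule are hypothetical, and the abstract does not say whether the authors stratified by class or how they rounded the set sizes.

```python
import random


def split_811(patient_ids, seed=0):
    """Shuffle ids and split them roughly 8:1:1 into train/val/test."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # local RNG keeps the split reproducible
    n_train = int(0.8 * len(ids))
    n_val = int(0.1 * len(ids))
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # remainder forms the held-out test set
    return train, val, test


# 482 is the cohort size reported in the abstract; the ids are placeholders.
train, val, test = split_811(range(482))
print(len(train), len(val), len(test))  # 385 48 49
```

Splitting by patient id rather than by image keeps all images of one patient in the same set, which avoids leakage between training and test data.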

PMID:40100034 | DOI:10.17305/bb.2025.12010

Categories: Literature Watch

Ratiometric, 3D Fluorescence Spectrum with Abundant Information for Tetracyclines Discrimination via Dual Biomolecules Recognition and Deep Learning

Tue, 2025-03-18 06:00

Anal Chem. 2025 Mar 18. doi: 10.1021/acs.analchem.4c07061. Online ahead of print.

ABSTRACT

Tetracyclines are widely used to treat bacterial infections, but the subtle chemical differences between them make accurate discrimination via biosensors a challenge. A 3D fluorescence spectrum can provide fingerprint structure information for many analytes, but a single probe-based method is prone to information overlap. Here, aptamers are reported for the first time to obtain abundant information in a ratiometric, 3D fluorescence spectrum for deep learning to accurately discriminate tetracyclines. Each tetracycline can thus be related to a distinct, ratiometric, 3D fluorescence spectrum via the strategy of dual biomolecule recognition. One artificial neural network model can efficiently process this fingerprint information, and the qualitative/quantitative analysis of tetracyclines is successfully realized. The proposed dual biomolecule recognition strategy has been demonstrated to show higher accuracy than a conventional single probe method. The ratiometric 3D fluorescence spectrum can therefore enrich the fingerprint information for deep learning, providing a new strategy for 3D fluorescence-based analyte discrimination.

PMID:40099919 | DOI:10.1021/acs.analchem.4c07061

Categories: Literature Watch

A Molecular Representation to Identify Isofunctional Molecules

Tue, 2025-03-18 06:00

Mol Inform. 2025 Mar;44(3):e202400159. doi: 10.1002/minf.202400159.

ABSTRACT

The challenges of drug discovery from hit identification to clinical development sometimes involve addressing scaffold hopping issues, in order to optimise molecular biological activity or ADME properties, or mitigate toxicology concerns of a drug candidate. Docking is usually viewed as the method of choice for identification of isofunctional molecules, i.e. highly dissimilar molecules that share common binding modes with a protein target. However, the structure of the protein may not be suitable for docking because of a low resolution, or may even be unknown. This problem is frequently encountered in the case of membrane proteins, although they constitute an important category of the druggable proteome. In such cases, ligand-based approaches offer promise but are often inadequate to handle large-step scaffold hopping, because they usually rely on molecular structure. Therefore, we propose the Interaction Fingerprints Profile (IFPP), a molecular representation that captures molecules' binding modes based on docking experiments against a panel of diverse high-quality protein structures. Evaluation on the LH benchmark demonstrates the interest of IFPP for identification of isofunctional molecules. Nevertheless, computation of IFPPs is expensive, which limits its scalability for screening very large molecular libraries. We propose to overcome this limitation by leveraging Metric Learning approaches, allowing fast estimation of molecules' IFPP similarities, thus providing an efficient pre-screening strategy that is applicable to very large molecular libraries. Overall, our results suggest that IFPP provides an interesting and complementary tool alongside existing methods, in order to address challenging scaffold hopping problems effectively in drug discovery.

PMID:40099892 | DOI:10.1002/minf.202400159

Categories: Literature Watch

X2-PEC: A Neural Network Model Based on Atomic Pair Energy Corrections

Tue, 2025-03-18 06:00

J Comput Chem. 2025 Mar 30;46(8):e70081. doi: 10.1002/jcc.70081.

ABSTRACT

With the development of artificial neural networks (ANNs), their applications in chemistry have become increasingly widespread, especially in the prediction of various molecular properties. This work introduces the X2-PEC method, that is, the second generalization of the X1 series of ANN methods developed in our group, utilizing pair energy correction (PEC). The essence of the X2 model lies in its feature vector construction, using overlap integrals and core Hamiltonian integrals to incorporate physical and chemical information into the feature vectors to describe atomic interactions. It aims to enhance the accuracy of low-rung density functional theory (DFT) calculations, such as those from the widely used BLYP/6-31G(d) or B3LYP/6-31G(2df,p) methods, to the level of top-rung DFT calculations, such as those from the highly accurate doubly hybrid XYGJ-OS/GTLarge method. Trained on the QM9 dataset, X2-PEC excels in predicting the atomization energies of isomers such as C6H8 and C4H4N2O with varying bonding structures. The performance of the X2-PEC model on standard enthalpies of formation for datasets such as G2-HCNOF, PSH36, ALKANE28, BIGMOL20, and HEDM45, as well as a HCNOF subset of BH9 for reaction barriers, is equally commendable, demonstrating its good generalization ability and predictive accuracy, as well as its potential for further development to achieve greater accuracy. These outcomes highlight the practical significance of the X2-PEC model in elevating the results from lower-rung DFT calculations to the level of higher-rung DFT calculations through deep learning.

PMID:40099806 | DOI:10.1002/jcc.70081

Categories: Literature Watch

Integrating Social Determinants of Health and Established Risk Factors to Predict Cardiovascular Disease Risk Among Healthy Older Adults

Tue, 2025-03-18 06:00

J Am Geriatr Soc. 2025 Mar 18. doi: 10.1111/jgs.19440. Online ahead of print.

ABSTRACT

BACKGROUND: Recent evidence underscores the significant impact of social determinants of health (SDoH) on cardiovascular disease (CVD). However, available CVD risk assessment tools often neglect SDoH. This study aimed to integrate SDoH with traditional risk factors to predict CVD risk.

METHODS: The data was sourced from the ASPirin in Reducing Events in the Elderly (ASPREE) longitudinal study, and its sub-study, the ASPREE Longitudinal Study of Older Persons (ALSOP). The study included 12,896 people (5884 men and 7012 women) aged 70 or older who were initially free of CVD, dementia, and independence-limiting physical disability. The participants were followed for a median of eight years. CVD risk was predicted using state-of-the-art machine learning (ML) and deep learning (DL) models: Random Survival Forest (RSF), DeepSurv, and Neural Multi-Task Logistic Regression (NMTLR), incorporating both SDoH and traditional CVD risk factors as candidate predictors. The permutation-based feature importance method was further utilized to assess the predictive potential of the candidate predictors.

RESULTS: Among men, the RSF model achieved relatively good performance (C-index = 0.732, integrated Brier score (IBS) = 0.071, 5-year and 10-year AUC = 0.657 and 0.676 respectively). For women, DeepSurv was the best-performing model (C-index = 0.670, IBS = 0.042, 5-year and 10-year AUC = 0.676 and 0.677 respectively). Regarding the contribution of the candidate predictors, for men, age, urine albumin-to-creatinine ratio, and smoking, along with SDoH variables, were identified as the most significant predictors of CVD. For women, SDoH variables, such as social network, living arrangement, and education, predicted CVD risk better than the traditional risk factors, with age being the exception.

CONCLUSION: SDoH can improve the accuracy of CVD risk prediction and emerge among the main predictors for CVD. The influence of SDoH was greater for women than for men, reflecting gender-specific impacts of SDoH.
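
For readers unfamiliar with the C-index reported above, the following Python sketch computes Harrell's concordance index for right-censored data. It is a plain O(n²) reference version written for this digest, not the evaluation code used in the study; the function name `concordance_index` and the toy inputs are assumptions.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times:       observed follow-up time per subject.
    events:      1 if the event occurred, 0 if the subject was censored.
    risk_scores: model output, where a higher score means higher risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had the event
            # strictly before subject j's observed time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (not from ASPREE): the third subject is censored.
c = concordance_index(times=[2, 4, 6, 8],
                      events=[1, 1, 0, 1],
                      risk_scores=[0.9, 0.4, 0.5, 0.1])
print(round(c, 3))  # 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of event times by predicted risk.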

PMID:40099367 | DOI:10.1111/jgs.19440

Categories: Literature Watch

Machine learning-based risk predictive models for diabetic kidney disease in type 2 diabetes mellitus patients: a systematic review and meta-analysis

Tue, 2025-03-18 06:00

Front Endocrinol (Lausanne). 2025 Mar 3;16:1495306. doi: 10.3389/fendo.2025.1495306. eCollection 2025.

ABSTRACT

BACKGROUND: Machine learning (ML) models are being increasingly employed to predict the risk of developing and progressing diabetic kidney disease (DKD) in patients with type 2 diabetes mellitus (T2DM). However, the performance of these models still varies, which limits their widespread adoption and practical application. Therefore, we conducted a systematic review and meta-analysis to summarize and evaluate the performance and clinical applicability of these risk predictive models and to identify key research gaps.

METHODS: We conducted a systematic review and meta-analysis to compare the performance of ML predictive models. We searched PubMed, Embase, the Cochrane Library, and Web of Science for English-language studies using ML algorithms to predict the risk of DKD in patients with T2DM, covering the period from database inception to April 18, 2024. The primary performance metric for the models was the area under the receiver operating characteristic curve (AUC) with a 95% confidence interval (CI). The risk of bias was assessed using the Prediction Model Risk of Bias Assessment Tool (PROBAST) checklist.

RESULTS: Twenty-six studies that met the eligibility criteria were included in the meta-analysis. Twenty-five studies performed internal validation, but only 8 studies conducted external validation. A total of 94 ML models were developed, with 81 models evaluated in the internal validation sets and 13 in the external validation sets. The pooled AUC was 0.839 (95% CI 0.787-0.890) in the internal validation and 0.830 (95% CI 0.784-0.877) in the external validation sets. Subgroup analysis based on the type of ML showed that the pooled AUC for traditional regression ML was 0.797 (95% CI 0.777-0.816), for ML was 0.811 (95% CI 0.785-0.836), and for deep learning was 0.863 (95% CI 0.825-0.900). A total of 26 ML models were included, and the AUCs of models that were used three or more times were pooled. Among them, the random forest (RF) models demonstrated the best performance with a pooled AUC of 0.848 (95% CI 0.785-0.911).

CONCLUSION: This meta-analysis demonstrates that ML models exhibit high performance in predicting DKD risk in T2DM patients. However, challenges related to data bias during model development and validation still need to be addressed. Future research should focus on enhancing data transparency and standardization, as well as validating these models' generalizability through multicenter studies.
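
To make the idea of a pooled AUC concrete, the Python sketch below combines study-level AUCs by inverse-variance weighting, recovering each standard error from the reported 95% CI. This is a fixed-effect illustration only: the abstract does not state the pooling model, meta-analyses of this kind often use random-effects models instead, and the function name `pool_fixed_effect` and the input values are hypothetical.

```python
import math


def pool_fixed_effect(estimates):
    """Inverse-variance (fixed-effect) pooling of AUCs with 95% CIs.

    estimates: list of (auc, ci_lower, ci_upper) tuples.
    Returns (pooled_auc, pooled_lower, pooled_upper).
    """
    z = 1.96  # normal quantile for a two-sided 95% interval
    total_weight = 0.0
    weighted_sum = 0.0
    for auc, lower, upper in estimates:
        se = (upper - lower) / (2 * z)  # back out the standard error
        weight = 1.0 / se ** 2          # more precise studies weigh more
        total_weight += weight
        weighted_sum += weight * auc
    pooled = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled - z * pooled_se, pooled + z * pooled_se


# Two hypothetical studies with equally wide intervals (not from the review).
pooled, lower, upper = pool_fixed_effect([(0.80, 0.76, 0.84),
                                          (0.86, 0.82, 0.90)])
print(round(pooled, 3), round(lower, 3), round(upper, 3))  # 0.83 0.802 0.858
```

With equal precision the pooled estimate is the simple mean, and the pooled interval is narrower than either input interval because the two studies contribute independent information.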

SYSTEMATIC REVIEW REGISTRATION: https://inplasy.com/inplasy-2024-9-0038/, identifier INPLASY202490038.

PMID:40099258 | PMC:PMC11911190 | DOI:10.3389/fendo.2025.1495306

Categories: Literature Watch

Statistical Evaluation of Smartphone-Based Automated Grading System for Ocular Redness Associated with Dry Eye Disease and Implications for Clinical Trials

Tue, 2025-03-18 06:00

Clin Ophthalmol. 2025 Mar 13;19:907-914. doi: 10.2147/OPTH.S506519. eCollection 2025.

ABSTRACT

PURPOSE: This study introduces a fully automated approach using deep learning-based segmentation to select the conjunctiva as the region of interest (ROI) for large-scale, multi-site clinical trials. By integrating a precise, objective grading system, we aim to minimize inter- and intra-grader variability due to perceptual biases. We evaluate the impact of adding a "horizontality" parameter to the grading system and assess this method's potential to enhance grading precision, reduce sample size, and improve clinical trial efficiency.

METHODS: We analyzed 29,640 images from 450 subjects in a multi-visit, multi-site clinical trial to assess the performance of an automated grading model compared to expert graders. Images were graded on a 0-4 scale, in 0.5 increments. The model utilizes the DeepLabV3 architecture for image segmentation, extracting two key features: horizontality and redness. The algorithm then uses these features to predict eye redness, validated by comparison with expert grader scores.

RESULTS: The bivariate model using both redness and horizontality performed best, with a Mean Absolute Error (MAE) of 0.450 points (SD=0.334) on the redness scale relative to expert scores. Expert graded scores were within one unit of the mean grade in over 85% of cases, ensuring consistency and an optimal training set for the predictive model. Models incorporating both features outperformed those using only redness, reducing MAE by 5-6%. The optimal generalized model improved predictive accuracy with horizontality such that 93.0% of images were predicted with an absolute error less than one unit difference in grading.

CONCLUSION: This study demonstrates that fully automating image analysis allows thousands of images to be graded efficiently. The addition of the horizontality parameter enhances model performance, reduces error, and supports its relevance to specific Dry Eye manifestations. This automated method provides a continuous scale and greater sensitivity to treatment effects than standard clinical scales.
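
The two agreement measures quoted in the results, mean absolute error and the share of images predicted within one grading unit, can be computed as in this short Python sketch. The function name `grading_errors` and the scores are hypothetical and are not data from the trial.

```python
def grading_errors(predicted, expert):
    """Return (MAE, share of images with absolute error below one unit)."""
    abs_errors = [abs(p - e) for p, e in zip(predicted, expert)]
    mae = sum(abs_errors) / len(abs_errors)
    within_one = sum(err < 1.0 for err in abs_errors) / len(abs_errors)
    return mae, within_one


# Hypothetical scores on the 0-4 redness scale.
predicted = [1.2, 2.6, 0.4, 3.9]
expert = [1.0, 2.0, 1.5, 3.5]
mae, within_one = grading_errors(predicted, expert)
print(round(mae, 3), within_one)  # 0.575 0.75
```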

PMID:40099234 | PMC:PMC11912931 | DOI:10.2147/OPTH.S506519

Categories: Literature Watch

Elucidating the role of artificial intelligence in drug development from the perspective of drug-target interactions

Tue, 2025-03-18 06:00

J Pharm Anal. 2025 Mar;15(3):101144. doi: 10.1016/j.jpha.2024.101144. Epub 2024 Nov 14.

ABSTRACT

Drug development remains a critical issue in the field of biomedicine. With the rapid advancement of information technologies such as artificial intelligence (AI) and the advent of the big data era, AI-assisted drug development has become a new trend, particularly in predicting drug-target associations. To address the challenge of drug-target prediction, AI-driven models have emerged as powerful tools, offering innovative solutions by effectively extracting features from complex biological data, accurately modeling molecular interactions, and precisely predicting potential drug-target outcomes. Traditional machine learning (ML), network-based, and advanced deep learning architectures such as convolutional neural networks (CNNs), graph convolutional networks (GCNs), and transformers play a pivotal role. This review systematically compiles and evaluates AI algorithms for drug- and drug combination-target predictions, highlighting their theoretical frameworks, strengths, and limitations. CNNs effectively identify spatial patterns and molecular features critical for drug-target interactions. GCNs provide deep insights into molecular interactions via relational data, whereas transformers increase prediction accuracy by capturing complex dependencies within biological sequences. Network-based models offer a systematic perspective by integrating diverse data sources, and traditional ML efficiently handles large datasets to improve overall predictive accuracy. Collectively, these AI-driven methods are transforming drug-target predictions and advancing the development of personalized therapy. This review summarizes the application of AI in drug development, particularly in drug-target prediction, and offers recommendations on models and algorithms for researchers engaged in biomedical research. It also provides typical cases to better illustrate how AI can further accelerate development in the fields of biomedicine and drug discovery.

PMID:40099205 | PMC:PMC11910364 | DOI:10.1016/j.jpha.2024.101144

Categories: Literature Watch

Multi-view united transformer block of graph attention network based autism spectrum disorder recognition

Tue, 2025-03-18 06:00

Front Psychiatry. 2025 Feb 20;16:1485286. doi: 10.3389/fpsyt.2025.1485286. eCollection 2025.

ABSTRACT

INTRODUCTION: Autism Spectrum Disorder (ASD) identification poses significant challenges due to its multifaceted and diverse nature, necessitating early detection for effective intervention. Recent studies have widely discussed how deep learning algorithms might improve the diagnosis of ASD by analyzing neuroimaging data.

METHOD: To overcome the drawbacks of current techniques, this research proposes a new model called the Unified Transformer Block for Multi-View Graph Attention Networks (MVUT_GAT). To extract subtle patterns from physical and efficient functional MRI data, MVUT_GAT combines the advantages of multi-view learning with attention mechanisms.

RESULT: A thorough analysis on the ABIDE dataset shows that MVUT_GAT performs better than the Multi-view Site Graph Convolution Network (MVS_GCN), outperforming it in accuracy by +3.40%. This improvement supports the effectiveness of the proposed model in identifying ASD. The result has implications beyond higher accuracy metrics. By improving the accuracy and consistency of ASD diagnosis, MVUT_GAT will help with early intervention and assistance for ASD patients.

DISCUSSION: Moreover, the proposed MVUT_GAT bridges the gap between deep learning models and medical insight by helping to identify biomarkers linked to ASD. Ultimately, this work advances the understanding of autism spectrum disorder recognition and has the potential to improve outcomes and quality of life for affected individuals.

PMID:40099145 | PMC:PMC11913004 | DOI:10.3389/fpsyt.2025.1485286

Categories: Literature Watch

Identification of biomarkers and target drugs for melanoma: a topological and deep learning approach

Tue, 2025-03-18 06:00

Front Genet. 2025 Mar 3;16:1471037. doi: 10.3389/fgene.2025.1471037. eCollection 2025.

ABSTRACT

INTRODUCTION: Melanoma, a highly aggressive malignancy characterized by rapid metastasis and elevated mortality rates, predominantly originates in cutaneous tissues. While surgical interventions, immunotherapy, and targeted therapies have advanced, the prognosis for advanced-stage melanoma remains dismal. Globally, melanoma incidence continues to rise, with the United States alone reporting over 100,000 new cases and 7,000 deaths annually. Despite the exponential growth of tumor data facilitated by next-generation sequencing (NGS), current analytical approaches predominantly emphasize single-gene analyses, neglecting critical insights into complex gene interaction networks. This study aims to address this gap by systematically exploring immune gene regulatory dynamics in melanoma progression.

METHODS: We developed a bidirectional, weighted, signed, and directed topological immune gene regulatory network to compare transcriptional landscapes between benign melanocytic nevi and cutaneous melanoma. Advanced network analysis tools were employed to identify structural disparities and functional module shifts. Key driver genes were validated through topological centrality metrics. Additionally, deep learning models were implemented to predict drug-target interactions, leveraging molecular features derived from network analyses.

RESULTS: Significant topological divergences emerged between nevi and melanoma networks, with dominant functional modules transitioning from cell cycle regulation in benign lesions to DNA repair and cell migration pathways in malignant tumors. A group of genes, including AURKA, CCNE1, APEX2, and EXOC8, were identified as potential orchestrators of immune microenvironment remodeling during malignant transformation. The deep learning framework successfully predicted 23 clinically actionable drug candidates targeting these molecular drivers.

DISCUSSION: The observed module shift from cell cycle to invasion-related pathways provides mechanistic insights into melanoma progression, suggesting early therapeutic targeting of DNA repair machinery might mitigate metastatic potential. The identified hub genes, particularly AURKA and DDX19B, represent novel candidates for immunomodulatory interventions. Our computational drug prediction strategy bridges molecular network analysis with clinical translation, offering a paradigm for precision oncology in melanoma. Future studies should validate these targets in preclinical models and explore network-based biomarkers for early detection.

PMID:40098976 | PMC:PMC11911340 | DOI:10.3389/fgene.2025.1471037

Categories: Literature Watch

Predicting the risk of relapsed or refractory in patients with diffuse large B-cell lymphoma via deep learning

Tue, 2025-03-18 06:00

Front Oncol. 2025 Mar 3;15:1480645. doi: 10.3389/fonc.2025.1480645. eCollection 2025.

ABSTRACT

INTRODUCTION: Diffuse large B-cell lymphoma (DLBCL) is the most common type of non-Hodgkin lymphoma (NHL) in humans, and it is a highly heterogeneous malignancy with a 40% to 50% risk of relapsed or refractory (R/R) disease, leading to a poor prognosis. Early prediction of R/R risk is therefore of great significance for adjusting treatments and improving the prognosis of patients.

METHODS: We collected clinical information and H&E images of 227 patients diagnosed with DLBCL in Xuzhou Medical University Affiliated Hospital from 2015 to 2018. Patients were then divided into an R/R group and a non-relapsed, non-refractory group based on clinical diagnosis, and the two groups were randomly assigned to the training set, validation set and test set in a ratio of 7:1:2. We developed a model to predict the R/R risk of patients based on clinical features utilizing the random forest algorithm. Additionally, a prediction model based on histopathological images was constructed using CLAM, a weakly supervised learning method, after extracting image features with convolutional networks. To improve the prediction performance, we further integrated image features and clinical information for fusion modeling.

RESULTS: The average area under the ROC curve value of the fusion model was 0.71±0.07 in the validation dataset and 0.70±0.04 in the test dataset. This study proposed a novel method for predicting the R/R risk of DLBCL based on H&E images and clinical features.

DISCUSSION: For patients predicted to have high risk, follow-up monitoring can be intensified, and treatment plans can be adjusted promptly.

PMID:40098696 | PMC:PMC11911189 | DOI:10.3389/fonc.2025.1480645

Categories: Literature Watch

Deep learning imaging analysis to identify bacterial metabolic states associated with carcinogen production

Tue, 2025-03-18 06:00

Discov Imaging. 2025;2(1):2. doi: 10.1007/s44352-025-00006-1. Epub 2025 Mar 10.

ABSTRACT

BACKGROUND: Colorectal cancer (CRC) is a globally prevalent cancer. Emerging research implicates the gut microbiome in CRC pathogenesis. Bacteria such as Clostridium scindens can produce the carcinogenic bile acid deoxycholic acid (DCA). It is unknown whether imaging methods can differentiate DCA-producing and DCA-non-producing C. scindens cells.

METHODS: Light microscopy images of anaerobically cultured C. scindens in four conditions were acquired at 100× magnification using the Tissue FAX system: C. scindens in media alone (DCA-non-producing state), C. scindens in media with cholic acid (DCA-producing state), or C. scindens in co-culture with one of two Bacteroides species (intermediate DCA production states). We evaluated three approaches: whole-image classification, per-cell classification, and image segmentation-based classification. For whole-image classification, we used a custom Convolutional Neural Network (CNN), pre-trained DenseNet, pre-trained ResNet, and ResNet enhanced by integrating the Digital Images of Bacterial Species (DIBaS) dataset. For cell detection and classification, we applied thresholding (Otsu or adaptive thresholding) followed by a ResNet model. Finally, image segmentation-based classification was performed using nnU-Net.
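
As an illustration of the thresholding step named above, the following is a minimal sketch of textbook Otsu thresholding on a flat list of 8-bit grey levels. It is not the authors' pipeline; the function name `otsu_threshold` and the toy pixel values are assumptions made for this example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises between-class variance.

    Pixels <= t form the lower class, pixels > t the upper class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of intensities in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break             # upper class empty, nothing left to separate
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

# Toy image: dark background (10) and bright cells (200).
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
mask = [p > t for p in pixels]  # True where a pixel is classed as foreground
```

Otsu picks one global threshold per image, whereas adaptive thresholding computes a local threshold per neighbourhood; the sketch covers only the former.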

RESULTS: For whole-image analysis, DIBaS-enhanced ResNet models achieved the best performance in distinguishing C. scindens states in monoculture (accuracy 0.89 ± 0.006) and in co-cultures (accuracy 0.86 ± 0.004). Per-cell analysis was optimal at a C constant value of 3, with the ResNet model achieving 62-74% accuracy for C. scindens states in monoculture. Segmentation-based analysis using nnU-Net resulted in Dice coefficients of 87% for C. scindens and 74-76% for the Bacteroides species.
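
For readers unfamiliar with the Dice coefficient reported above, a minimal sketch of its standard definition on binary masks follows. The function name `dice_coefficient` and the toy masks are illustrative assumptions, not data from the study.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if size_sum == 0 else 2 * intersection / size_sum

# Toy flattened masks: 1 marks a pixel labelled as bacterium.
pred_mask = [1, 1, 1, 0, 0, 0]
true_mask = [1, 1, 0, 1, 0, 0]
dice = dice_coefficient(pred_mask, true_mask)
```

Here two of the three predicted pixels overlap the three true pixels, so the score is 2 × 2 / (3 + 3), or about 0.67.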

CONCLUSIONS: This study demonstrates the feasibility of image-based deep learning models for identifying health-relevant gut bacterial metabolic states.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s44352-025-00006-1.

PMID:40098681 | PMC:PMC11912549 | DOI:10.1007/s44352-025-00006-1

Categories: Literature Watch

An efficient deep learning strategy for accurate and automated detection of breast tumors in ultrasound image datasets

Tue, 2025-03-18 06:00

Front Oncol. 2025 Mar 3;14:1461542. doi: 10.3389/fonc.2024.1461542. eCollection 2024.

ABSTRACT

BACKGROUND: Breast cancer ranks as one of the leading malignant tumors among women worldwide in terms of incidence and mortality. Ultrasound examination is a critical method for breast cancer screening and diagnosis in China. However, conventional breast ultrasound examinations are time-consuming and labor-intensive, necessitating the development of automated and efficient detection models.

METHODS: We developed a novel approach based on an improved deep learning model for the intelligent auxiliary diagnosis of breast tumors. Combining an optimized U2NET-Lite model with the efficient DeepCardinal-50 model, this method demonstrates superior accuracy and efficiency in the precise segmentation and classification of breast ultrasound images compared to traditional deep learning models such as ResNet and AlexNet.

RESULTS: Our proposed model demonstrated exceptional performance in experimental test sets. For segmentation, the U2NET-Lite model processed breast cancer images with an accuracy of 0.9702, a recall of 0.7961, and an IoU of 0.7063. In classification, the DeepCardinal-50 model excelled, achieving higher accuracy and AUC values compared to other models. Specifically, ResNet-50 achieved accuracies of 0.78 for benign, 0.67 for malignant, and 0.73 for normal cases, while DeepCardinal-50 achieved 0.76, 0.63, and 0.90 respectively. These results highlight our model's superior capability in breast tumor identification and classification.
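
The three segmentation metrics quoted above (accuracy, recall, IoU) can be computed from pixel-level confusion counts. The sketch below uses their standard definitions; the function name `segmentation_metrics` and the toy masks are assumptions for illustration and do not reproduce the paper's data.

```python
def segmentation_metrics(pred, truth):
    """Return pixel accuracy, recall and IoU for two equal-length binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)

    accuracy = (tp + tn) / len(truth)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return {"accuracy": accuracy, "recall": recall, "iou": iou}

# Toy case: 10 tumour pixels among 100; 8 are found, 2 missed, 2 false alarms.
truth = [1] * 10 + [0] * 90
pred = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 88
metrics = segmentation_metrics(pred, truth)
```

In this toy case accuracy is 0.96 while IoU is about 0.67, because correctly labelled background pixels count towards accuracy but not towards IoU. This is a general property of the metrics, not a statement about the study's images.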

CONCLUSION: The automatic detection of benign and malignant breast tumors using deep learning can rapidly and accurately identify breast tumor types at an early stage, which is crucial for the early diagnosis and treatment of malignant breast tumors.

PMID:40098633 | PMC:PMC11911202 | DOI:10.3389/fonc.2024.1461542

Categories: Literature Watch

Rediscovering histology - the application of artificial intelligence in inflammatory bowel disease histologic assessment

Tue, 2025-03-18 06:00

Therap Adv Gastroenterol. 2025 Mar 17;18:17562848251325525. doi: 10.1177/17562848251325525. eCollection 2025.

ABSTRACT

Integrating artificial intelligence (AI) into histologic disease assessment is transforming the management of inflammatory bowel disease (IBD). AI-aided histology enables precise, objective evaluations of disease activity by analysing whole-slide images, facilitating accurate predictions of histologic remission (HR) in ulcerative colitis and Crohn's disease. Additionally, AI shows promise in predicting adverse outcomes and therapeutic responses, making it a promising tool for clinical practice and clinical trials. By leveraging advanced algorithms, AI enhances diagnostic accuracy, reduces assessment variability and streamlines histological workflows in clinical settings. In clinical trials, AI aids in assessing histological endpoints, enabling real-time analysis, standardising evaluations and supporting adaptive trial designs. Recent advancements are further refining AI-aided digital pathology in IBD. New developments in multimodal AI models integrating clinical, endoscopic, histologic and molecular data pave the way for a comprehensive approach to precision medicine in IBD. Automated assessment of intestinal barrier healing - a deeper level of healing beyond endoscopic and HR - shows promise for improved outcome prediction and patient management. Preliminary evidence also suggests that AI applied to colitis-associated neoplasia can aid in the detection, characterisation and molecular profiling of lesions, holding potential for enhanced dysplasia management and organ-sparing approaches. Although challenges remain in standardisation, validation through randomised controlled trials and ethical considerations, AI is poised to revolutionise IBD management by advancing towards a more personalised and efficient care model. The path to full clinical implementation may be lengthy; however, the transformative impact of AI on IBD care is already shining through.

PMID:40098604 | PMC:PMC11912177 | DOI:10.1177/17562848251325525

Categories: Literature Watch

Lit-OTAR Framework for Extracting Biological Evidences from Literature

Mon, 2025-03-17 06:00

Bioinformatics. 2025 Mar 17:btaf113. doi: 10.1093/bioinformatics/btaf113. Online ahead of print.

ABSTRACT

SUMMARY: The lit-OTAR framework, developed through a collaboration between Europe PMC and Open Targets, leverages deep learning to revolutionise drug discovery by extracting evidence from scientific literature for drug target identification and validation. This novel framework combines Named Entity Recognition (NER) for identifying gene/protein (target), disease, organism, and chemical/drug within scientific texts, and entity normalisation to map these entities to databases like Ensembl, Experimental Factor Ontology (EFO), and ChEMBL. Continuously operational, it has processed over 39 million abstracts and 4.5 million full-text articles and preprints to date, identifying more than 48.5 million unique associations that significantly help accelerate the drug discovery process and scientific research (>29.9 m distinct target-disease, 11.8 m distinct target-drug, and 8.3 m distinct disease-drug relationships).

AVAILABILITY AND IMPLEMENTATION: The results are accessible through Europe PMC's SciLite web app (https://europepmc.org/) and its annotations API (https://europepmc.org/annotationsapi), as well as via the Open Targets Platform (https://platform.opentargets.org/). The daily pipeline is available at https://github.com/ML4LitS/otar-maintenance, and the Open Targets ETL processes are available at https://github.com/opentargets.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

PMID:40097274 | DOI:10.1093/bioinformatics/btaf113

Categories: Literature Watch

H2GnnDTI: hierarchical heterogeneous graph neural networks for drug target interaction prediction

Mon, 2025-03-17 06:00

Bioinformatics. 2025 Mar 17:btaf117. doi: 10.1093/bioinformatics/btaf117. Online ahead of print.

ABSTRACT

MOTIVATION: Identifying drug target interactions is a crucial step in drug repurposing and drug discovery. The significant increase in demand and the high cost of experimentally identifying drug target interactions necessitate computational tools for automated prediction and comprehension of drug target interactions. Despite recent advancements, current methods fail to fully leverage the hierarchical information in drug target interactions.

RESULTS: Here we introduce H2GnnDTI, a novel two-level hierarchical heterogeneous graph learning model to predict drug target interactions, by integrating the structures of drugs and proteins via a low-level view GNN (LGNN) and a high-level view GNN (HGNN). The hierarchical graph consists of high-level heterogeneous nodes representing drugs and proteins, connected by edges representing known DTIs. Each drug or protein node is further detailed in a low-level graph, where nodes represent molecules within each drug or amino acids within each protein, accompanied by their respective chemical descriptors. Two distinct low-level graph neural networks are first deployed to capture structural and chemical features specific to drugs and proteins from these low-level graphs. Subsequently, a high-level graph encoder is employed to comprehensively capture and merge interactive features pertaining to drugs and proteins from the high-level graph. The high-level encoder incorporates a structure and attribute information fusion module designed to explicitly integrate representations acquired from both a feature encoder and a graph encoder, facilitating consensus representation learning. Extensive experiments conducted on three benchmark datasets have shown that our proposed H2GnnDTI model consistently outperforms state-of-the-art deep learning methods.
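
The two-level structure described above can be pictured with a small data-structure sketch. It is not the authors' model: the class names `LowLevelGraph` and `HighLevelGraph`, the single mean-aggregation step, and the toy feature vectors are all assumptions, chosen only to show how each high-level node wraps its own low-level graph.

```python
from dataclasses import dataclass

@dataclass
class LowLevelGraph:
    features: list  # one feature vector (list of floats) per low-level node
    edges: list     # undirected (i, j) pairs between low-level nodes

    def embed(self):
        """One mean-aggregation step over neighbours, then mean-pool all nodes."""
        dim = len(self.features[0])
        # Each node aggregates over itself plus its neighbours.
        neighbours = {i: [i] for i in range(len(self.features))}
        for i, j in self.edges:
            neighbours[i].append(j)
            neighbours[j].append(i)
        updated = [
            [sum(self.features[n][d] for n in nbrs) / len(nbrs) for d in range(dim)]
            for nbrs in neighbours.values()
        ]
        return [sum(vec[d] for vec in updated) / len(updated) for d in range(dim)]

@dataclass
class HighLevelGraph:
    drugs: dict      # drug name -> LowLevelGraph
    proteins: dict   # protein name -> LowLevelGraph
    known_dtis: set  # (drug name, protein name) edges for known interactions

    def node_embeddings(self):
        """Initialise every high-level node from its low-level graph."""
        nodes = {**self.drugs, **self.proteins}
        return {name: graph.embed() for name, graph in nodes.items()}

# Toy example with made-up two-dimensional descriptors.
drug = LowLevelGraph(features=[[1.0, 0.0], [0.0, 1.0]], edges=[(0, 1)])
protein = LowLevelGraph(features=[[2.0, 2.0]], edges=[])
graph = HighLevelGraph({"D1": drug}, {"P1": protein}, {("D1", "P1")})
embeddings = graph.node_embeddings()
```

In the abstract's design, learned low-level GNNs and a high-level encoder with a fusion module take the place of the fixed averaging used here; the sketch only shows the nesting of the two graph levels.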

AVAILABILITY AND IMPLEMENTATION: The codes are freely available at https://github.com/LiminLi-xjtu/H2GnnDTI.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

PMID:40097269 | DOI:10.1093/bioinformatics/btaf117

Categories: Literature Watch
