Deep learning

An efficient method for chili pepper variety classification and origin tracing based on an electronic nose and deep learning

Tue, 2025-03-18 06:00

Food Chem. 2025 Mar 12;479:143850. doi: 10.1016/j.foodchem.2025.143850. Online ahead of print.

ABSTRACT

The quality of chili peppers is closely related to their variety and geographical origin. The market often substitutes high-quality chili peppers with inferior ones, and cross-contamination occurs during processing. Existing methods cannot quickly and conveniently distinguish between chili varieties or origins, and they require expensive experimental equipment and professional skills. Techniques such as energy-dispersive X-ray fluorescence and inductively coupled plasma spectroscopy have been used for chili pepper classification and origin tracing, but these methods are either costly or destructive. To address the challenges of accurately identifying chili pepper varieties and tracing their origins, this paper presents a sensor-aware convolutional network (SACNet) integrated with an electronic nose (e-nose). The e-nose system collects gas samples from various chili peppers. We introduce a sensor attention module that adaptively focuses on the importance of each sensor in gathering gas information. Additionally, we introduce a local sensing and wide-area sensing structure to specifically capture gas information features, enabling high-precision identification of chili pepper gases. In comparative experiments with other networks, SACNet demonstrated excellent performance in both variety classification and origin traceability, and it showed significant advantages in terms of parameter quantity. Specifically, SACNet achieved 98.56% accuracy in variety classification with Dataset A, 97.43% accuracy in origin traceability with Dataset B, and 99.31% accuracy with Dataset C. In summary, the combination of SACNet and an e-nose provides an effective strategy for identifying the varieties and origins of chili peppers.
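
The idea of weighting each e-nose sensor by its importance can be illustrated with a minimal sketch. The abstract does not describe the internals of the sensor attention module, so the softmax weighting, the default scoring rule (mean absolute response), and the function name `sensor_attention` are assumptions for illustration only, not SACNet's actual design.

```python
import math


def sensor_attention(signals, scores=None):
    """Reweight each sensor's response curve by a softmax attention weight.

    signals: list of per-sensor time series (one list of floats per sensor).
    scores:  optional per-sensor relevance scores. In a trained network these
             would come from learned layers; here they default to each
             sensor's mean absolute response (an illustrative assumption).
    """
    if scores is None:
        scores = [sum(abs(v) for v in s) / len(s) for s in signals]
    # Numerically stable softmax over the per-sensor scores.
    peak = max(scores)
    exps = [math.exp(x - peak) for x in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Scale every sample of a sensor by that sensor's weight.
    weighted = [[w * v for v in s] for w, s in zip(weights, signals)]
    return weights, weighted
```

In a real network the scores would be produced by trainable layers and optimized jointly with the classifier; the fixed heuristic here only shows the reweighting step.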

PMID:40101378 | DOI:10.1016/j.foodchem.2025.143850

Categories: Literature Watch

UniSAL: Unified Semi-supervised Active Learning for histopathological image classification

Tue, 2025-03-18 06:00

Med Image Anal. 2025 Mar 12;102:103542. doi: 10.1016/j.media.2025.103542. Online ahead of print.

ABSTRACT

Histopathological image classification using deep learning is crucial for accurate and efficient cancer diagnosis. However, annotating a large number of histopathological images for training is costly and time-consuming, leading to a scarcity of available labeled data for training deep neural networks. To reduce human effort and improve annotation efficiency, we propose a Unified Semi-supervised Active Learning framework (UniSAL) that effectively selects informative and representative samples for annotation. First, unlike most existing active learning methods that only train from labeled samples in each round, dual-view high-confidence pseudo training is proposed to utilize both labeled and unlabeled images to train a model for selecting query samples, where two networks operating on different augmented versions of an input image provide diverse pseudo labels for each other, and pseudo label-guided class-wise contrastive learning is introduced to obtain better feature representations for effective sample selection. Second, based on the trained model at each round, we design a novel uncertain and representative sample selection strategy. It contains a Disagreement-aware Uncertainty Selector (DUS) to select informative uncertain samples with inconsistent predictions between the two networks, and a Compact Selector (CS) to remove redundancy of selected samples. We extensively evaluate our method on three public pathological image classification datasets, i.e., CRC5000, Chaoyang and CRC100K, and the results demonstrate that UniSAL significantly surpasses several state-of-the-art active learning methods, and reduces the annotation cost to around 10% while achieving performance comparable to full annotation. Code is available at https://github.com/HiLab-git/UniSAL.
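
The disagreement-based selection step can be sketched as follows. The abstract only states that DUS picks samples whose predictions are inconsistent between the two networks; the ranking by L1 distance between the two probability vectors, the `budget` parameter, and the function name `select_disagreeing` are assumptions made for this illustration, and the authors' repository should be consulted for the real procedure.

```python
def select_disagreeing(probs_a, probs_b, budget):
    """Return indices of unlabeled samples on which two networks disagree.

    probs_a, probs_b: per-sample class-probability lists from the two networks.
    budget:           maximum number of samples to send for annotation.
    """
    candidates = []
    for idx, (pa, pb) in enumerate(zip(probs_a, probs_b)):
        label_a = max(range(len(pa)), key=pa.__getitem__)
        label_b = max(range(len(pb)), key=pb.__getitem__)
        if label_a != label_b:
            # Assumed ranking criterion: L1 gap between the two predictions.
            gap = sum(abs(x - y) for x, y in zip(pa, pb))
            candidates.append((gap, idx))
    # Largest disagreement first; ties broken by sample index.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return [idx for _, idx in candidates[:budget]]
```

A redundancy filter such as the Compact Selector would then operate on the returned indices; that step is not sketched here because the abstract gives no detail on it.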

PMID:40101375 | DOI:10.1016/j.media.2025.103542

Categories: Literature Watch

Deep learning techniques for proton dose prediction across multiple anatomical sites and variable beam configurations

Tue, 2025-03-18 06:00

Phys Med Biol. 2025 Mar 18. doi: 10.1088/1361-6560/adc236. Online ahead of print.

ABSTRACT

OBJECTIVE: To evaluate the impact of beam mask implementation and data aggregation on artificial intelligence-based dose prediction accuracy in proton therapy, with a focus on scenarios involving limited or highly heterogeneous datasets.

APPROACH: In this study, 541 prostate and 632 head and neck (H&N) proton therapy plans were used to train and evaluate convolutional neural networks designed for the task of dose prediction. Datasets were grouped by anatomical site and beam configuration to assess the impact of beam masks (graphical depictions of radiation paths) as a model input. We also evaluated the effect of combining datasets. Model performance was measured using dose-volume histogram (DVH) scores, mean absolute error, mean absolute percent error, Dice similarity coefficients (DSC), and gamma passing rates.

MAIN RESULTS: DSC analysis revealed that the inclusion of beam masks improved dose prediction accuracy, particularly in low-dose regions and for datasets with diverse beam configurations. Data aggregation alone produced mixed results, with improvements in high-dose regions but potential degradation in low-dose areas. Notably, combining beam masks and data aggregation yielded the best overall performance, effectively leveraging the strengths of both strategies. Additionally, the magnitude of the improvements was larger for datasets with greater heterogeneity, with the combined approach increasing the DSC score by as much as 0.2 for a subgroup of H&N cases characterized by small size and heterogeneity in beam arrangement. DVH scores reflected these benefits, showing statistically significant improvements (p < 0.05) for the more heterogeneous H&N datasets.

SIGNIFICANCE: Artificial intelligence-based dose prediction models incorporating beam masks and data aggregation significantly improve accuracy in proton therapy planning, especially for complex cases. This technique could accelerate the planning process, enabling more efficient and effective cancer treatment strategies.
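
Two of the metrics named in this abstract, the Dice similarity coefficient and the mean absolute error, can be computed with a short sketch. The abstract does not say how dose maps were converted to regions for the DSC, so thresholding at an isodose level is an assumption here, as are the function names and the flat-list representation of a dose grid.

```python
def dice_at_isodose(pred, truth, level):
    """Dice coefficient of the regions receiving at least `level` dose.

    pred, truth: flattened dose grids (equal-length lists of floats).
    level:       isodose threshold in the same units as the dose values.
    """
    in_pred = [d >= level for d in pred]
    in_truth = [d >= level for d in truth]
    overlap = sum(1 for a, b in zip(in_pred, in_truth) if a and b)
    size = sum(in_pred) + sum(in_truth)
    # Two empty regions are treated as a perfect match.
    return 1.0 if size == 0 else 2.0 * overlap / size


def mean_absolute_error(pred, truth):
    """Average voxel-wise absolute dose difference."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(pred)
```

Evaluating the Dice score at several isodose levels separates low-dose from high-dose agreement, which is the distinction the results above draw.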

PMID:40101365 | DOI:10.1088/1361-6560/adc236

Categories: Literature Watch

Emotion Forecasting: A Transformer-Based Approach

Tue, 2025-03-18 06:00

J Med Internet Res. 2025 Mar 18;27:e63962. doi: 10.2196/63962.

ABSTRACT

BACKGROUND: Monitoring the emotional states of patients with psychiatric problems has always been challenging due to the noncontinuous nature of clinical assessments, the effect of the health care environment, and the inherent subjectivity of evaluation instruments. However, mental states in psychiatric disorders exhibit substantial variability over time, making real-time monitoring crucial for preventing risky situations and ensuring appropriate treatment.

OBJECTIVE: This study aimed to leverage new technologies and deep learning techniques to enable more objective, real-time monitoring of patients. This was achieved by passively monitoring variables such as step count, patient location, and sleep patterns using mobile devices. We aimed to predict patient self-reports and detect sudden variations in their emotional valence, identifying situations that may require clinical intervention.

METHODS: Data for this project were collected using the Evidence-Based Behavior (eB2) app, which records both passive and self-reported variables daily. Passive data refer to behavioral information gathered via the eB2 app through sensors embedded in mobile devices and wearables. These data were obtained from studies conducted in collaboration with hospitals and clinics that used eB2. We used hidden Markov models (HMMs) to address missing data and transformer deep neural networks for time-series forecasting. Finally, classification algorithms were applied to predict several variables, including emotional state and responses to the Patient Health Questionnaire-9.

RESULTS: Through real-time patient monitoring, we demonstrated the ability to accurately predict patients' emotional states and anticipate changes over time. Specifically, our approach achieved high accuracy (0.93) and a receiver operating characteristic (ROC) area under the curve (AUC) of 0.98 for emotional valence classification. For predicting emotional state changes 1 day in advance, we obtained an ROC AUC of 0.87. Furthermore, we demonstrated the feasibility of forecasting responses to the Patient Health Questionnaire-9, with particularly strong performance for certain questions. For example, in question 9, related to suicidal ideation, our model achieved an accuracy of 0.9 and an ROC AUC of 0.77 for predicting the next day's response. Moreover, we illustrated the enhanced stability of multivariate time-series forecasting when HMM preprocessing was combined with a transformer model, as opposed to other time-series forecasting methods, such as recurrent neural networks or long short-term memory cells.

CONCLUSIONS: The stability of multivariate time-series forecasting improved when HMM preprocessing was combined with a transformer model, as opposed to other time-series forecasting methods (eg, recurrent neural network and long short-term memory), leveraging the attention mechanisms to capture longer time dependencies and gain interpretability. We showed the potential to assess the emotional state of a patient and the scores of psychiatric questionnaires from passive variables in advance. This allows real-time monitoring of patients and hence better risk detection and treatment adjustment.

PMID:40101216 | DOI:10.2196/63962

Categories: Literature Watch

Impact of Clinical Decision Support Systems on Medical Students' Case-Solving Performance: Comparison Study with a Focus Group

Tue, 2025-03-18 06:00

JMIR Med Educ. 2025 Mar 18;11:e55709. doi: 10.2196/55709.

ABSTRACT

BACKGROUND: Health care practitioners use clinical decision support systems (CDSS) as an aid in the crucial task of clinical reasoning and decision-making. Traditional CDSS are online repositories (ORs) and clinical practice guidelines (CPG). Recently, large language models (LLMs) such as ChatGPT have emerged as potential alternatives. They have proven to be powerful, innovative tools, yet they are not devoid of worrisome risks.

OBJECTIVE: This study aims to explore how medical students perform in an evaluated clinical case through the use of different CDSS tools.

METHODS: The authors randomly divided medical students into 3 groups, CPG, n=6 (38%); OR, n=5 (31%); and ChatGPT, n=5 (31%); and assigned each group a different type of CDSS for guidance in answering prespecified questions, assessing how students' speed and ability at resolving the same clinical case varied accordingly. External reviewers evaluated all answers based on accuracy and completeness metrics (score: 1-5). The authors analyzed and categorized group scores according to the skill investigated: differential diagnosis, diagnostic workup, and clinical decision-making.

RESULTS: Answering time showed a trend for the ChatGPT group to be the fastest. The mean scores for completeness were as follows: CPG 4.0, OR 3.7, and ChatGPT 3.8 (P=.49). The mean scores for accuracy were as follows: CPG 4.0, OR 3.3, and ChatGPT 3.7 (P=.02). When scores were aggregated according to the 3 student skill domains, differences among the groups emerged more clearly, with the CPG group performing best in nearly all domains and maintaining almost perfect alignment between its completeness and accuracy.

CONCLUSIONS: This hands-on session provided valuable insights into the potential perks and associated pitfalls of LLMs in medical education and practice. It suggested the critical need to include teachings in medical degree courses on how to properly take advantage of LLMs, as the potential for misuse is evident and real.

PMID:40101183 | DOI:10.2196/55709

Categories: Literature Watch

Forecasting stock prices using long short-term memory involving attention approach: An application of stock exchange industry

Tue, 2025-03-18 06:00

PLoS One. 2025 Mar 18;20(3):e0319679. doi: 10.1371/journal.pone.0319679. eCollection 2025.

ABSTRACT

The stability of the economy is always a great challenge across the world, especially in underdeveloped countries. Over the past several decades, many researchers have contributed to forecasting the stock market and controlling the situation to ensure economic stability. For this purpose, many researchers have built various models and gained benefits. This journey continues to date and will persist for the betterment of the stock market. This study is also a part of this journey, where four learning-based models are tailored for stock price prediction. Daily business data from the Karachi Stock Exchange (100 Index), covering February 22, 2008 to February 23, 2021, are used for training and testing these models. This paper presents four deep learning models with different architectures, namely the Artificial Neural Network model, the Recurrent Neural Network with Attention model, the Long Short-Term Memory Network with Attention model, and the Gated Recurrent Unit with Attention model. The Long Short-Term Memory with Attention model was found to be the top-performing technique for accurately predicting stock exchange prices. During the training, validation, and testing sessions, we observed R-squared values of 0.9996, 0.9980, and 0.9921, respectively, for the proposed model, making it the best-performing model among those mentioned above.
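
For reference, the R-squared values reported here follow the standard coefficient of determination, 1 minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch is given below; the function name and the plain-list inputs are illustrative assumptions, and nothing in it reproduces the paper's data or models.

```python
def r_squared(actual, predicted):
    """Coefficient of determination between observed and predicted values."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

A value of 1 means the predictions match the observations exactly, and 0 means they do no better than always predicting the mean price.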

PMID:40100866 | DOI:10.1371/journal.pone.0319679

Categories: Literature Watch

Deep image features sensing with multilevel fusion for complex convolution neural networks & cross domain benchmarks

Tue, 2025-03-18 06:00

PLoS One. 2025 Mar 18;20(3):e0317863. doi: 10.1371/journal.pone.0317863. eCollection 2025.

ABSTRACT

Efficient image retrieval from a variety of datasets is crucial in today's digital world. Visual properties are represented using primitive image signatures in Content Based Image Retrieval (CBIR). Feature vectors are employed to classify images into predefined categories. This research presents a unique suppression-based feature identification technique that locates interest points by computing a productive sum of pixel derivatives, using the differentials to obtain corner scores. Scale space interpolation is applied to define interest points by combining color features from spatially ordered L2 normalized coefficients with shape and object information. Object based feature vectors are formed using high variance coefficients to reduce complexity and are converted into bag-of-visual-words (BoVW) for effective retrieval and ranking. The presented method encompasses feature vectors for information synthesis and improves the discriminating strength of the retrieval system by extracting deep image features, including primitive, spatial, and overlaid features, using multilayer fusion of Convolutional Neural Networks (CNNs). Extensive experimentation is performed on standard image dataset benchmarks, including ALOT, Cifar-10, Corel-10k, Tropical Fruits, and Zubud. These datasets cover a wide range of categories including shape, color, texture, spatial, and complicated objects. Experimental results demonstrate considerable improvements in precision and recall rates, average retrieval precision and recall, and mean average precision and recall rates across various image semantic groups within versatile datasets. Fusing traditional feature extraction methods with multilevel CNNs advances image sensing and retrieval systems, promising more accurate and efficient image retrieval solutions.

PMID:40100801 | DOI:10.1371/journal.pone.0317863

Categories: Literature Watch

Retraction: Control of hybrid electromagnetic bearing and elastic foil gas bearing under deep learning

Tue, 2025-03-18 06:00

PLoS One. 2025 Mar 18;20(3):e0320337. doi: 10.1371/journal.pone.0320337. eCollection 2025.

NO ABSTRACT

PMID:40100785 | DOI:10.1371/journal.pone.0320337

Categories: Literature Watch

Leveraging Extended Windows in End-to-End Deep Learning for Improved Continuous Myoelectric Locomotion Prediction

Tue, 2025-03-18 06:00

IEEE Trans Neural Syst Rehabil Eng. 2025 Mar 18;PP. doi: 10.1109/TNSRE.2025.3552530. Online ahead of print.

ABSTRACT

Current surface electromyography (sEMG) methods for locomotion mode prediction face limitations in anticipatory capability due to computation delays and constrained window lengths typically below 500 ms, a practice historically tied to stationarity requirements of handcrafted feature extraction. This study investigates whether end-to-end convolutional neural networks (CNNs) processing raw sEMG signals can overcome these constraints through extended window lengths (250 ms to 1500 ms). We systematically evaluate six window lengths paired with three prediction horizons (model forecasts 50 ms to 150 ms ahead) in a continuous locomotion task involving eight modes and 16 transitions. The optimal configuration (1000 ms window with 150 ms horizon) achieved subject-average accuracies of 96.93% (steady states) and 97.50% (transient states), maintaining 95.03% and 85.53% respectively in real-time simulations. With a net averaged anticipation time of 147.9 ms after 2.1 ms computation latency, this approach demonstrates that windows covering 74% of the gait cycle can synergize with deep learning to balance the inherent trade-off between extracting richer information and maintaining system responsiveness to changes in activity.
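
The relationship between window length, prediction horizon, and net anticipation time can be made concrete with a small sketch. It pairs each raw-signal window with the label found a fixed number of samples after the window ends, and computes net anticipation as horizon minus computation latency (150 ms minus 2.1 ms gives the reported 147.9 ms). The function names, the step size, and expressing window and horizon in samples are assumptions; the abstract does not state the sampling rate or the exact segmentation scheme.

```python
def make_windows(signal, labels, window, horizon, step):
    """Pair each window of raw samples with the label `horizon` samples ahead.

    signal, labels: equal-length sequences (one label per sample).
    window:         window length in samples.
    horizon:        how many samples past the window's last sample to predict.
    step:           stride between consecutive window starts, in samples.
    """
    pairs = []
    last_start = len(signal) - window - horizon
    for start in range(0, last_start + 1, step):
        end = start + window  # exclusive end of the window
        target = labels[end - 1 + horizon]
        pairs.append((signal[start:end], target))
    return pairs


def net_anticipation_ms(horizon_ms, latency_ms):
    """Anticipation left after subtracting computation latency."""
    return horizon_ms - latency_ms
```

Longer windows give the network more context per prediction without changing the horizon, so the anticipation time is limited only by latency.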

PMID:40100693 | DOI:10.1109/TNSRE.2025.3552530

Categories: Literature Watch

Privacy-Preserving Data Augmentation for Digital Pathology Using Improved DCGAN

Tue, 2025-03-18 06:00

IEEE J Biomed Health Inform. 2025 Mar 18;PP. doi: 10.1109/JBHI.2025.3551720. Online ahead of print.

ABSTRACT

The intelligent analysis of Whole Slide Images (WSI) in digital pathology is critical for advancing precision medicine, particularly in oncology. However, the availability of WSI datasets is often limited by privacy regulations, which constrains the performance and generalizability of deep learning models. To address this challenge, this paper proposes an improved data augmentation method based on Deep Convolutional Generative Adversarial Network (DCGAN). Our approach leverages self-supervised pretraining with the CTransPath model to extract diverse and representationally rich WSI features, which guide the generation of high-quality synthetic images. We further enhance the model by introducing a least-squares adversarial loss and a frequency domain loss to improve pixel-level accuracy and structural fidelity, while incorporating residual blocks and skip connections to increase network depth, mitigate gradient vanishing, and improve training stability. Experimental results on the PatchCamelyon dataset demonstrate that our improved DCGAN achieves superior SSIM and FID scores compared to traditional models. The augmented datasets significantly enhance the performance of downstream classification tasks, improving accuracy, AUC, and F1 scores.
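
The least-squares adversarial loss mentioned above replaces the usual cross-entropy GAN objective with squared errors against target labels. The sketch below uses the common formulation with targets 1 for real and 0 for fake images; the abstract does not give the paper's exact targets, loss weights, or frequency-domain term, so this is a generic illustration with assumed function names rather than the authors' implementation.

```python
def lsgan_discriminator_loss(d_real, d_fake):
    """Least-squares discriminator loss with targets 1 (real) and 0 (fake).

    d_real: discriminator outputs on real images.
    d_fake: discriminator outputs on generated images.
    """
    real_term = sum((d - 1.0) ** 2 for d in d_real) / len(d_real)
    fake_term = sum(d ** 2 for d in d_fake) / len(d_fake)
    return 0.5 * (real_term + fake_term)


def lsgan_generator_loss(d_fake):
    """Least-squares generator loss: push outputs on fakes toward 1."""
    return 0.5 * sum((d - 1.0) ** 2 for d in d_fake) / len(d_fake)
```

Because the penalty grows with distance from the target, samples that are classified correctly but lie far from the decision boundary still produce gradients, which is the usual motivation given for this loss.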

PMID:40100674 | DOI:10.1109/JBHI.2025.3551720

Categories: Literature Watch

Population-Driven Synthesis of Personalized Cranial Development from Cross-Sectional Pediatric CT Images

Tue, 2025-03-18 06:00

IEEE Trans Biomed Eng. 2025 Mar 18;PP. doi: 10.1109/TBME.2025.3550842. Online ahead of print.

ABSTRACT

OBJECTIVE: Predicting normative pediatric growth is crucial to identify developmental anomalies. While traditional statistical and computational methods have shown promising results predicting personalized development, they either rely on statistical assumptions that limit generalizability or require longitudinal datasets, which are scarce in children. Recent deep learning methods trained with cross-sectional datasets have shown potential to predict temporal changes but have only succeeded at predicting local intensity changes and can hardly model major anatomical changes that occur during childhood. We present a novel deep learning method for image synthesis that can be trained using only cross-sectional data to make personalized predictions of pediatric development.

METHODS: We designed a new generative adversarial network (GAN) with a novel Siamese cyclic encoder-decoder generator architecture and an identity preservation mechanism. Our design allows the encoder to learn age- and sex-independent identity-preserving representations of patient phenotypes from single images by leveraging the statistical distributions in the cross-sectional dataset. The decoder learns to synthesize personalized images from the encoded representations at any age.

RESULTS: Trained using only cross-sectional head CT images from 2,014 subjects (age 0-10 years), our model demonstrated state-of-the-art performance evaluated on an independent longitudinal dataset with images from 51 subjects.

CONCLUSION: Our method can predict pediatric development and synthesize temporal image sequences with state-of-the-art accuracy without requiring longitudinal images for training.

SIGNIFICANCE: Our method enables the personalized prediction of pediatric growth and longitudinal synthesis of clinical images, hence providing a patient-specific reference of normative development.

PMID:40100672 | DOI:10.1109/TBME.2025.3550842

Categories: Literature Watch

Protein Language Pragmatic Analysis and Progressive Transfer Learning for Profiling Peptide-Protein Interactions

Tue, 2025-03-18 06:00

IEEE Trans Neural Netw Learn Syst. 2025 Mar 18;PP. doi: 10.1109/TNNLS.2025.3540291. Online ahead of print.

ABSTRACT

Protein complex structural data are growing at an unprecedented pace, but their complexity and diversity pose significant challenges for protein function research. Although deep learning models have been widely used to capture the syntactic structure, word semantics, or semantic meanings of polypeptide and protein sequences, these models often overlook the complex contextual information of sequences. Here, we propose interpretable interaction deep learning (IIDL)-peptide-protein interaction (PepPI), a deep learning model designed to tackle these challenges using data-driven and interpretable pragmatic analysis to profile PepPIs. IIDL-PepPI constructs bidirectional attention modules to represent the contextual information of peptides and proteins, enabling pragmatic analysis. It then adopts a progressive transfer learning framework to simultaneously predict PepPIs and identify binding residues for specific interactions, providing a solution for multilevel in-depth profiling. We validate the performance and robustness of IIDL-PepPI in accurately predicting peptide-protein binary interactions and identifying binding residues compared with the state-of-the-art methods. We further demonstrate the capability of IIDL-PepPI in peptide virtual drug screening and binding affinity assessment, which is expected to advance artificial intelligence-based peptide drug discovery and protein function elucidation.

PMID:40100664 | DOI:10.1109/TNNLS.2025.3540291

Categories: Literature Watch

Hard-aware Instance Adaptive Self-training for Unsupervised Cross-domain Semantic Segmentation

Tue, 2025-03-18 06:00

IEEE Trans Pattern Anal Mach Intell. 2025 Mar 18;PP. doi: 10.1109/TPAMI.2025.3552484. Online ahead of print.

ABSTRACT

The divergence between labeled training data and unlabeled testing data is a significant challenge for recent deep learning models. Unsupervised domain adaptation (UDA) attempts to solve this problem. Recent works show that self-training is a powerful approach to UDA. However, existing methods have difficulty balancing scalability and performance. In this paper, we propose a hard-aware instance adaptive self-training framework for UDA on the task of semantic segmentation. To effectively improve the quality and diversity of pseudo-labels, we develop a novel pseudo-label generation strategy with an instance adaptive selector. We further enrich the hard class pseudo-labels with inter-image information through a skillfully designed hard-aware pseudo-label augmentation. In addition, we propose region-adaptive regularization to smooth the pseudo-label region and sharpen the non-pseudo-label region. For the non-pseudo-label region, a consistency constraint is also constructed to introduce stronger supervision signals during model optimization. Our method is concise and efficient, so it generalizes easily to other UDA methods. Experiments on GTA5 → Cityscapes, SYNTHIA → Cityscapes, and Cityscapes → Oxford RobotCar demonstrate the superior performance of our approach compared with the state-of-the-art methods. Our code is available at https://github.com/bupt-ai-cz/HIAST.

PMID:40100655 | DOI:10.1109/TPAMI.2025.3552484

Categories: Literature Watch

Enhancing Patient Outcome Prediction Through Deep Learning With Sequential Diagnosis Codes From Structured Electronic Health Record Data: Systematic Review

Tue, 2025-03-18 06:00

J Med Internet Res. 2025 Mar 18;27:e57358. doi: 10.2196/57358.

ABSTRACT

BACKGROUND: The use of structured electronic health records in health care systems has grown rapidly. These systems collect huge amounts of patient information, including diagnosis codes representing temporal medical history. Sequential diagnostic information has proven valuable for predicting patient outcomes. However, the extent to which these types of data have been incorporated into deep learning (DL) models has not been examined.

OBJECTIVE: This systematic review aims to describe the use of sequential diagnostic data in DL models, specifically to understand how these data are integrated, whether sample size improves performance, and whether the identified models are generalizable.

METHODS: Relevant studies published up to May 15, 2023, were identified using 4 databases: PubMed, Embase, IEEE Xplore, and Web of Science. We included all studies using DL algorithms trained on sequential diagnosis codes to predict patient outcomes. We excluded review articles and non-peer-reviewed papers. We evaluated the following aspects in the included papers: DL techniques, characteristics of the dataset, prediction tasks, performance evaluation, generalizability, and explainability. We also assessed the risk of bias and applicability of the studies using the Prediction Model Study Risk of Bias Assessment Tool (PROBAST). We used the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist to report our findings.

RESULTS: Of the 740 identified papers, 84 (11.4%) met the eligibility criteria. Publications in this area increased yearly. Recurrent neural networks (and their derivatives; 47/84, 56%) and transformers (22/84, 26%) were the most commonly used architectures in DL-based models. Most studies (45/84, 54%) presented their input features as sequences of visit embeddings. Medications (38/84, 45%) were the most common additional feature. Of the 128 predictive outcome tasks, the most frequent was next-visit diagnosis (n=30, 23%), followed by heart failure (n=18, 14%) and mortality (n=17, 13%). Only 7 (8%) of the 84 studies evaluated their models in terms of generalizability. A positive correlation was observed between training sample size and model performance (area under the receiver operating characteristic curve; P=.02). However, 59 (70%) of the 84 studies had a high risk of bias.

CONCLUSIONS: The application of DL for advanced modeling of sequential medical codes has demonstrated remarkable promise in predicting patient outcomes. The main limitation of this study was the heterogeneity of methods and outcomes. However, our analysis found that using multiple types of features, integrating time intervals, and including larger sample sizes were generally related to an improved predictive performance. This review also highlights that very few studies (7/84, 8%) reported on challenges related to generalizability and less than half (38/84, 45%) of the studies reported on challenges related to explainability. Addressing these shortcomings will be instrumental in unlocking the full potential of DL for enhancing health care outcomes and patient care.

TRIAL REGISTRATION: PROSPERO CRD42018112161; https://tinyurl.com/yc6h9rwu.

PMID:40100249 | DOI:10.2196/57358

Categories: Literature Watch

Mining the UniProtKB/Swiss-Prot database for antimicrobial peptides

Tue, 2025-03-18 06:00

Protein Sci. 2025 Apr;34(4):e70083. doi: 10.1002/pro.70083.

ABSTRACT

The ever-growing global health threat of antibiotic resistance is compelling researchers to explore alternatives to conventional antibiotics. Antimicrobial peptides (AMPs) are emerging as a promising solution to fill this need. Naturally occurring AMPs are produced by all forms of life as part of the innate immune system. High-throughput bioinformatics tools have enabled fast and large-scale discovery of AMPs from genomic, transcriptomic, and proteomic resources of selected organisms. Public protein sequence databases, comprising over 200 million records and growing, serve as comprehensive compendia of sequences from a broad range of source organisms. Yet, large-scale in silico probing of those databases for novel AMP discovery using modern deep learning techniques has rarely been reported. In the present study, we propose an AMP mining workflow to predict novel AMPs from the UniProtKB/Swiss-Prot database using the AMP prediction tool, AMPlify, as its discovery engine. Using this workflow, we identified 8008 novel putative AMPs from all eukaryotic sequences in the database. Focusing on the practical use of AMPs as suitable antimicrobial agents with applications in the poultry industry, we prioritized 40 of those AMPs based on their similarities to known chicken AMPs in predicted structures. In our tests, 13 out of the 38 successfully synthesized peptides showed antimicrobial activity against Escherichia coli and/or Staphylococcus aureus. AMPlify and the companion scripts supporting the AMP mining workflow presented herein are publicly available at https://github.com/bcgsc/AMPlify.

PMID:40100125 | DOI:10.1002/pro.70083

Categories: Literature Watch

Deep learning approaches to predict late gadolinium enhancement and clinical outcomes in suspected cardiac sarcoidosis

Tue, 2025-03-18 06:00

Sarcoidosis Vasc Diffuse Lung Dis. 2025 Mar 18;42(1):15378. doi: 10.36141/svdld.v42i1.15378.

NO ABSTRACT

PMID:40100114 | DOI:10.36141/svdld.v42i1.15378

Categories: Literature Watch

A deep learning model based on chest CT to predict benign and malignant breast masses and axillary lymph node metastasis

Tue, 2025-03-18 06:00

Biomol Biomed. 2025 Mar 17. doi: 10.17305/bb.2025.12010. Online ahead of print.

ABSTRACT

Differentiating early-stage breast cancer from benign breast masses is crucial for radiologists. Additionally, accurately assessing axillary lymph node metastasis (ALNM) plays a significant role in clinical management and prognosis for breast cancer patients. Chest computed tomography (CT) is a commonly used imaging modality in physical and preoperative evaluations. This study aims to develop a deep learning model based on chest CT imaging to improve the preliminary assessment of breast lesions, potentially reducing the need for costly follow-up procedures such as magnetic resonance imaging (MRI) or positron emission tomography-CT and alleviating the financial and emotional burden on patients. We retrospectively collected chest CT images from 482 patients with breast masses, classifying them as benign (n = 224) or malignant (n = 258) based on pathological findings. The malignant group was further categorized into ALNM-positive (n = 91) and ALNM-negative (n = 167) subgroups. Patients were randomly divided into training, validation, and test sets in an 8:1:1 ratio, with the test set excluded from model development. All patients underwent non-contrast chest CT before surgery. After preprocessing the images through cropping, scaling, and standardization, we applied ResNet-34, ResNet-50, and ResNet-101 architectures to differentiate between benign and malignant masses and to assess ALNM. Model performance was evaluated using sensitivity, specificity, accuracy, receiver operating characteristic (ROC) curves, and the area under the curve (AUC). The ResNet models effectively distinguished benign from malignant masses, with ResNet-101 achieving the highest performance (AUC: 0.964; 95% CI: 0.948-0.981). It also demonstrated excellent predictive capability for ALNM (AUC: 0.951; 95% CI: 0.926-0.975). In conclusion, these deep learning models show strong diagnostic potential for both breast mass classification and ALNM prediction, offering a valuable tool for improving clinical decision-making.
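
As a rough illustration of the evaluation setup described in this abstract, the sketch below splits 482 patient IDs at about 8:1:1 and computes sensitivity, specificity, and accuracy from binary labels. It is not the authors' code: the function names, the fixed seed, the rounding rule for set sizes, and the toy labels are all assumptions, and the abstract does not say whether the split was stratified.

```python
import random


def split_811(ids, seed=0):
    """Shuffle ids and split them into train/validation/test at roughly 8:1:1.

    The rounding rule (10% each for validation and test, remainder for
    training) is an assumption; the abstract only states the ratio.
    """
    ids = list(ids)
    random.Random(seed).shuffle(ids)
    n_test = round(len(ids) * 0.1)
    n_val = round(len(ids) * 0.1)
    test = ids[:n_test]
    val = ids[n_test:n_test + n_val]
    train = ids[n_test + n_val:]
    return train, val, test


def sensitivity_specificity_accuracy(y_true, y_pred):
    """Binary metrics with 1 = malignant (positive) and 0 = benign (negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(pairs) if pairs else float("nan")
    return sensitivity, specificity, accuracy


# 482 hypothetical patient indices, matching the cohort size in the abstract.
train, val, test = split_811(range(482))
print(len(train), len(val), len(test))  # 386 48 48

# Toy labels for illustration only; these are not study data.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(sensitivity_specificity_accuracy(y_true, y_pred))  # (0.75, 0.75, 0.75)
```

With 482 patients, an exact 8:1:1 split is not possible, so the set sizes depend on the rounding rule chosen.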

PMID:40100034 | DOI:10.17305/bb.2025.12010

Categories: Literature Watch

Ratiometric, 3D Fluorescence Spectrum with Abundant Information for Tetracyclines Discrimination via Dual Biomolecules Recognition and Deep Learning

Tue, 2025-03-18 06:00

Anal Chem. 2025 Mar 18. doi: 10.1021/acs.analchem.4c07061. Online ahead of print.

ABSTRACT

Tetracyclines are widely used to treat bacterial infections, but the subtle chemical differences among them make accurate discrimination via biosensors challenging. A 3D fluorescence spectrum can provide fingerprint structural information for many analytes, but a single-probe method is prone to information overlap. Here, aptamers are reported for the first time as a means of obtaining abundant information in a ratiometric 3D fluorescence spectrum, which deep learning then uses to discriminate tetracyclines accurately. Through the dual-biomolecule recognition strategy, each tetracycline can be related to a distinct ratiometric 3D fluorescence spectrum. A single artificial neural network model can efficiently process this fingerprint information, and both qualitative and quantitative analysis of tetracyclines are successfully realized. The proposed dual-biomolecule recognition strategy showed higher accuracy than a conventional single-probe method. The ratiometric 3D fluorescence spectrum can thus enrich the fingerprint information available for deep learning, providing a new strategy for 3D fluorescence-based analyte discrimination.

PMID:40099919 | DOI:10.1021/acs.analchem.4c07061

Categories: Literature Watch

A Molecular Representation to Identify Isofunctional Molecules

Tue, 2025-03-18 06:00

Mol Inform. 2025 Mar;44(3):e202400159. doi: 10.1002/minf.202400159.

ABSTRACT

The challenges of drug discovery, from hit identification to clinical development, sometimes involve addressing scaffold hopping issues in order to optimise molecular biological activity or ADME properties, or to mitigate toxicology concerns of a drug candidate. Docking is usually viewed as the method of choice for identifying isofunctional molecules, i.e. highly dissimilar molecules that share common binding modes with a protein target. However, the structure of the protein may not be suitable for docking because of low resolution, or may even be unknown. This problem is frequently encountered with membrane proteins, although they constitute an important category of the druggable proteome. In such cases, ligand-based approaches offer promise but are often inadequate for large-step scaffold hopping, because they usually rely on molecular structure. Therefore, we propose the Interaction Fingerprints Profile (IFPP), a molecular representation that captures molecules' binding modes based on docking experiments against a panel of diverse, high-quality protein structures. Evaluation on the LH benchmark demonstrates the value of IFPP for identifying isofunctional molecules. Nevertheless, computing IFPPs is expensive, which limits scalability when screening very large molecular libraries. We propose to overcome this limitation by leveraging Metric Learning approaches, which allow fast estimation of IFPP similarities between molecules and thus provide an efficient pre-screening strategy that is applicable to very large molecular libraries. Overall, our results suggest that IFPP provides an interesting and complementary tool alongside existing methods for addressing challenging scaffold hopping problems effectively in drug discovery.
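
To make the idea of comparing interaction profiles concrete, here is a minimal sketch. The abstract does not specify how IFPPs are encoded or compared, so everything here is an assumption: each molecule is represented as a dictionary mapping a hypothetical panel protein to a set of hypothetical interaction labels, and similarity is taken as the mean Tanimoto coefficient over the shared panel proteins.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two sets of interaction labels."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty fingerprints are identical
    return len(a & b) / len(a | b)


def profile_similarity(profile_p, profile_q):
    """Mean per-protein Tanimoto similarity over proteins present in both profiles.

    This averaging scheme is an illustrative assumption, not the published
    IFPP similarity measure.
    """
    shared = profile_p.keys() & profile_q.keys()
    if not shared:
        raise ValueError("profiles share no panel proteins")
    return sum(tanimoto(profile_p[k], profile_q[k]) for k in shared) / len(shared)


# Hypothetical profiles: protein names and interaction labels are made up.
mol_a = {
    "protA": {"hbond:ASP32", "pi:PHE80"},
    "protB": {"hbond:SER10"},
}
mol_b = {
    "protA": {"hbond:ASP32"},
    "protB": {"hbond:SER10"},
}

print(profile_similarity(mol_a, mol_b))  # 0.75
```

Two molecules with very different structures can still score highly under such a measure if docking places them in similar binding modes across the panel, which is the property the abstract relies on for scaffold hopping.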

PMID:40099892 | DOI:10.1002/minf.202400159

Categories: Literature Watch

X2-PEC: A Neural Network Model Based on Atomic Pair Energy Corrections

Tue, 2025-03-18 06:00

J Comput Chem. 2025 Mar 30;46(8):e70081. doi: 10.1002/jcc.70081.

ABSTRACT

With the development of artificial neural networks (ANNs), their applications in chemistry have become increasingly widespread, especially in the prediction of various molecular properties. This work introduces the X2-PEC method, that is, the second generalization of the X1 series of ANN methods developed in our group, utilizing pair energy correction (PEC). The essence of the X2 model lies in its feature vector construction, using overlap integrals and core Hamiltonian integrals to incorporate physical and chemical information into the feature vectors to describe atomic interactions. It aims to enhance the accuracy of low-rung density functional theory (DFT) calculations, such as those from the widely used BLYP/6-31G(d) or B3LYP/6-31G(2df,p) methods, to the level of top-rung DFT calculations, such as those from the highly accurate doubly hybrid XYGJ-OS/GTLarge method. Trained on the QM9 dataset, X2-PEC excels in predicting the atomization energies of isomers such as C6H8 and C4H4N2O with varying bonding structures. The performance of the X2-PEC model on standard enthalpies of formation for datasets such as G2-HCNOF, PSH36, ALKANE28, BIGMOL20, and HEDM45, as well as a HCNOF subset of BH9 for reaction barriers, is equally commendable, demonstrating its good generalization ability and predictive accuracy, as well as its potential for further development to achieve greater accuracy. These outcomes highlight the practical significance of the X2-PEC model in elevating the results from lower-rung DFT calculations to the level of higher-rung DFT calculations through deep learning.
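
The general shape of a pair energy correction can be sketched as a low-level energy plus a sum of per-pair correction terms. The code below is only a toy under stated assumptions: it sums over all atom pairs, looks corrections up in a hand-written table keyed by element pair, and uses made-up numbers. In X2-PEC the corrections come from a neural network fed with overlap and core Hamiltonian integral features, which is not reproduced here.

```python
from itertools import combinations


def corrected_energy(e_low, atoms, pair_correction):
    """Return the low-level energy plus the sum of corrections over atom pairs.

    Summing over every unordered pair of atoms is an assumption made for
    illustration; the abstract does not state which pairs are included.
    """
    return e_low + sum(pair_correction(a, b) for a, b in combinations(atoms, 2))


# Hypothetical correction table (arbitrary units), standing in for a trained model.
TOY_TABLE = {("C", "H"): -0.5, ("H", "H"): 0.25}


def toy_pair_correction(a, b):
    """Look up a correction by element pair, ignoring order; default to zero."""
    return TOY_TABLE.get(tuple(sorted((a, b))), 0.0)


# Toy fragment with one carbon and two hydrogens; the low-level energy is made up.
atoms = ["C", "H", "H"]
print(corrected_energy(-10.0, atoms, toy_pair_correction))  # -10.75
```

The appeal of this form is that the expensive high-level calculation is needed only to generate training targets; at prediction time only the low-level calculation and the learned corrections are evaluated.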

PMID:40099806 | DOI:10.1002/jcc.70081

Categories: Literature Watch
