Deep learning
Towards artificial intelligence application in pain medicine
Recenti Prog Med. 2025 Mar;116(3):156-161. doi: 10.1701/4460.44555.
ABSTRACT
Pain is a complex, multidimensional experience involving significant challenges in both diagnosis and management. While acute pain serves as a critical warning mechanism, chronic pain encompasses intricate biological, psychological, and social components, complicating its assessment and treatment. Artificial intelligence (AI) technologies are revolutionizing medicine and healthcare. Here we present an overview of the recent advances in AI for pain medicine. For example, the emergence of automatic pain assessment (APA) methodologies offers promising avenues for more objective pain evaluation. For APA, AI technologies, including machine learning algorithms and deep learning architectures such as natural language processing systems, have shown potential in analyzing biosignals, facial expressions, and speech patterns related to pain. However, the integration of these objective measures with traditional self-reporting remains essential for a comprehensive approach to pain diagnosis. Notably, APA models can be implemented for pain diagnosis in newborn and non-communicative patients. Additionally, the application of AI extends beyond pain diagnosis to personalized treatment strategies, prediction of opioid use disorders, education and training, clinical trajectory definition, and telehealth and real-time. Despite the potential of these innovations, challenges such as validation, parameter selection, and ethical aspects of technical implementation must be addressed.
PMID:40084580 | DOI:10.1701/4460.44555
Diagnosis and Post-Treatment Follow-Up Evaluation of Melasma Using Optical Coherence Tomography and Deep Learning
J Biophotonics. 2025 Mar 14:e70006. doi: 10.1002/jbio.70006. Online ahead of print.
ABSTRACT
Melasma is a common pigmentary disorder accompanied by tissue changes in composition and structure through the epidermis and dermis. In this study, we propose to employ optical coherence tomography (OCT) combined with deep learning techniques for melasma diagnostics. Specifically, a portable spectral domain OCT system with a handheld probe was developed for clinical skin imaging. Then, a diagnostic model was built based on the VGG16 neural network by adding a spatial attention mechanism. The results show that a good differentiation with an accuracy of 94.2% can be achieved among health datasets from healthy volunteers, and melasma and tissue-around-melasma datasets from melasma patients. Moreover, the same trained model was applied to treatment evaluation, showing a good capability to assess antivascular medicine treatment. Thus, it can be concluded that OCT combined with deep learning techniques has a good potential to aid in clinical diagnosis and treatment evaluation of melasma.
PMID:40084480 | DOI:10.1002/jbio.70006
Patho-Net: enhancing breast cancer classification using deep learning and explainable artificial intelligence
Am J Cancer Res. 2025 Feb 15;15(2):754-768. doi: 10.62347/XKFN1793. eCollection 2025.
ABSTRACT
Breast cancer affects women globally, and early, precise classification is essential for choosing treatment and increasing the survival rate. However, breast cancer classification faces difficulties with scalability, fixed-size input images, and overfitting on limited datasets. To tackle these issues, this work proposes a Patho-Net model for breast cancer classification that overcomes the problems of scalability in color normalization, integrates the Gated Recurrent Unit (GRU) network with the U-Net architecture to process images efficiently without the need for resizing, and addresses the overfitting problems. The proposed model collects and normalizes histopathology images using automated reference image selection with the Reinhard method for color standardization. Enhanced Adaptive Non-Local Means (EANLM) filtering is also used for noise removal while preserving image features. These preprocessed images undergo semantic segmentation to isolate specific parts of an image, followed by feature extraction using an Improved Gray Level Co-occurrence Matrix (I-GLCM) to reveal fine patterns and textures in images. These features serve as input to the classification U-Net model integrated with GRU networks to improve the model performance. Finally, the classification result is expanded, and explainable artificial intelligence (XAI) is used for clear visual explanations of the model's predictions. The proposed Patho-Net model, which uses the 100X BreakHis dataset, achieves an accuracy of 98.90% in the classification of breast cancer.
PMID:40084355 | PMC:PMC11897615 | DOI:10.62347/XKFN1793
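The texture-feature step named in the Patho-Net abstract can be illustrated with a plain gray level co-occurrence matrix. The sketch below is the standard GLCM with a contrast statistic, not the authors' Improved GLCM (I-GLCM), whose details the abstract does not give. The function names, the pixel offset, and the number of gray levels are assumptions made for illustration.

```python
import numpy as np

def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs separated by (dy, dx).

    `image` must hold integer gray levels in the range [0, levels).
    """
    image = np.asarray(image)
    rows, cols = image.shape
    counts = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Count the pair only if the neighbour lies inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r, c], image[r2, c2]] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts

def glcm_contrast(p):
    """Contrast: sum over (i, j) of p[i, j] * (i - j)^2."""
    i, j = np.indices(p.shape)
    return float(np.sum(p * (i - j) ** 2))

# Toy 3x3 patch with three gray levels (hypothetical data).
patch = np.array([[0, 0, 1],
                  [1, 2, 2],
                  [2, 2, 0]])
p = glcm(patch, levels=3)
print(glcm_contrast(p))
```

Contrast grows when neighbouring pixels differ strongly in gray level, which is why statistics of this kind are commonly used as texture descriptors.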
Multi-omics and single-cell analysis reveals machine learning-based pyrimidine metabolism-related signature in the prognosis of patients with lung adenocarcinoma
Int J Med Sci. 2025 Feb 18;22(6):1375-1392. doi: 10.7150/ijms.107694. eCollection 2025.
ABSTRACT
Background: Pyrimidine metabolism is a hallmark of tumor metabolic reprogramming, while its significance in the prognostic and therapeutic implications of patients with lung adenocarcinoma (LUAD) still remains unclear. Methods: In this study, an integrated framework of various machine learning and deep learning algorithms was used to develop the pyrimidine metabolism-related signature (PMRS). Its efficacy in genomic stability, chemotherapy and immunotherapy resistance was evaluated through comprehensive multi-omics analysis. The single-cell landscape of patients between PMRS subgroups was also elucidated. Subsequently, the biological functions of LYPD3, the most important coefficient factor in the PMRS model, were experimentally validated in LUAD cell lines. Results: The PMRS model with "random survival forest" algorithm exhibited the best performance and was utilized for further analysis. It displayed excellent accuracy and stability in various model evaluation assays. Compared to the PMRS-high subgroup, patients with lower PMRS scores had better survival outcomes, more stable genomic characteristics and higher sensitivity to immunotherapy. Single-cell analysis indicated that as PMRS increased, epithelial cells gradually exhibited malignant phenotypes with enhanced pyrimidine metabolism, while PMRS-high patients showed an inhibitory status of tumor immune microenvironment. Further experiments indicated that LYPD3 promoted the malignant progression in LUAD cell lines. Conclusion: Our study constructed the PMRS model, highlighting its potential value in the treatment and prognosis of LUAD patients and providing new insights into the individualized precision treatment for LUAD patients.
PMID:40084259 | PMC:PMC11898844 | DOI:10.7150/ijms.107694
Graph-Based 3-Dimensional Spatial Gene Neighborhood Networks of Single Cells in Gels and Tissues
BME Front. 2025 Mar 13;6:0110. doi: 10.34133/bmef.0110. eCollection 2025.
ABSTRACT
Objective: We developed 3-dimensional spatially resolved gene neighborhood network embedding (3D-spaGNN-E) to find subcellular gene proximity relationships and identify key subcellular motifs in cell-cell communication (CCC). Impact Statement: The pipeline combines 3D imaging-based spatial transcriptomics and graph-based deep learning to identify subcellular motifs. Introduction: Advancements in imaging and experimental technology allow the study of 3D spatially resolved transcriptomics and capture better spatial context than approximating the samples as 2D. However, the third spatial dimension increases the data complexity and requires new analyses. Methods: 3D-spaGNN-E detects single transcripts in 3D cell culture samples and identifies subcellular gene proximity relationships. Then, a graph autoencoder projects the gene proximity relationships into a latent space. We then applied explainability analysis to identify subcellular CCC motifs. Results: We first applied the pipeline to mesenchymal stem cells (MSCs) cultured in hydrogel. After clustering the cells based on the RNA count, we identified cells belonging to the same cluster as homotypic and those belonging to different clusters as heterotypic. We identified changes in local gene proximity near the border between homotypic and heterotypic cells. When applying the pipeline to the MSC-peripheral blood mononuclear cell (PBMC) coculture system, we identified CD4+ and CD8+ T cells. Local gene proximity and autoencoder embedding changes can distinguish strong and weak suppression of different immune cells. Lastly, we compared astrocyte-neuron CCC in mouse hypothalamus and cortex by analyzing 3D multiplexed-error-robust fluorescence in situ hybridization (MERFISH) data and identified regional gene proximity differences. Conclusion: 3D-spaGNN-E distinguished distinct CCCs in cell culture and tissue by examining subcellular motifs.
PMID:40084126 | PMC:PMC11906096 | DOI:10.34133/bmef.0110
Label-free Aβ plaque detection in Alzheimer's disease brain tissue using infrared microscopy and neural networks
Heliyon. 2025 Jan 18;11(4):e42111. doi: 10.1016/j.heliyon.2025.e42111. eCollection 2025 Feb 28.
ABSTRACT
We present a novel method for the label-free detection of amyloid-beta (Aβ) plaques, the key hallmark of Alzheimer's disease, in human brain tissue sections. Conventionally, immunohistochemistry (IHC) is employed for the characterization of Aβ plaques, hindering subsequent analysis. Here, a semi-supervised convolutional neural network (CNN) is trained to detect Aβ plaques in quantum cascade laser infrared (QCL-IR) microscopy images. Laser microdissection (LMD) is then used to precisely extract plaques from snap-frozen, unstained tissue sections. Mass spectrometry-based proteomics reveals a loss of soluble proteins in IHC stained samples. Our method prevents this loss and provides a novel tool that expands the scope of molecular analysis methods to chemically native plaques. Insight into soluble plaque components will complement our understanding of plaques and their role in Alzheimer's disease.
PMID:40083995 | PMC:PMC11903818 | DOI:10.1016/j.heliyon.2025.e42111
Effect of natural and synthetic noise data augmentation on physical action classification by brain-computer interface and deep learning
Front Neuroinform. 2025 Feb 27;19:1521805. doi: 10.3389/fninf.2025.1521805. eCollection 2025.
ABSTRACT
Analysis of electroencephalography (EEG) signals gathered by brain-computer interfaces (BCI) has recently demonstrated that deep neural networks (DNNs) can be used effectively to analyze time sequences for physical action (PA) classification. In this study, a relatively simple DNN with fully connected network (FCN) components and convolutional neural network (CNN) components was used to classify finger-palm-hand manipulations from the grasp-and-lift (GAL) dataset. The main aim of this study was to imitate and investigate environmental influence through the proposed noise data augmentation (NDA) of two kinds: (i) natural NDA, which includes noise EEG data from neighboring regions by increasing the sampling size N and varying the offset values for sample labeling, and (ii) synthetic NDA, which adds generated Gaussian noise. For larger N values, natural NDA yields higher micro and macro area under the curve (AUC) values for the receiver operating characteristic curve than synthetic NDA. Detrended fluctuation analysis (DFA) was applied to investigate the fluctuation properties and calculate the corresponding Hurst exponents H for quantitative characterization of the fluctuation variability. H values for small time window scales (< 2 s) are higher than those for larger time window scales. For example, H is more than 2-3 times higher for some PAs, meaning that shorter EEG fragments (< 2 s) demonstrate scaling behavior of higher complexity than longer fragments. Because these results were obtained with a relatively small DNN with low resource requirements, this approach is promising for porting such models to Edge Computing infrastructures on devices with very limited computational resources.
PMID:40083893 | PMC:PMC11903462 | DOI:10.3389/fninf.2025.1521805
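The synthetic variant of noise data augmentation described in the abstract above (adding generated Gaussian noise) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the window shape, the noise standard deviation, the number of noisy copies, and the function names are all assumptions.

```python
import numpy as np

def add_gaussian_noise(eeg, noise_std, rng):
    """Return a noisy copy of one EEG window of shape (channels, samples)."""
    eeg = np.asarray(eeg, dtype=float)
    return eeg + rng.normal(0.0, noise_std, size=eeg.shape)

def augment(windows, labels, noise_std, copies, seed=0):
    """Keep the original windows and append `copies` noisy versions of each.

    Labels are repeated unchanged for the noisy copies.
    """
    rng = np.random.default_rng(seed)
    aug_x = [np.asarray(w, dtype=float) for w in windows]
    aug_y = list(labels)
    for w, y in zip(windows, labels):
        for _ in range(copies):
            aug_x.append(add_gaussian_noise(w, noise_std, rng))
            aug_y.append(y)
    return np.stack(aug_x), np.array(aug_y)

# Hypothetical data: 4 windows, 32 channels, 100 samples each.
windows = np.zeros((4, 32, 100))
labels = [0, 1, 0, 1]
x, y = augment(windows, labels, noise_std=0.1, copies=2)
print(x.shape, y.shape)
```

The natural variant in the abstract works differently: it changes the sampling size N and the labeling offsets rather than adding generated noise, so it is not covered by this sketch.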
Multi-dimensional interpretable deep learning-radiomics based on intra-tumoral and spatial habitat for preoperative prediction of thymic epithelial tumours risk categorisation
Acta Oncol. 2025 Mar 13;64:391-405. doi: 10.2340/1651-226X.2025.42982.
ABSTRACT
BACKGROUND AND PURPOSE: This study aims to develop and compare combined models based on enhanced CT-based radiomics, multi-dimensional deep learning, clinical-conventional imaging and spatial habitat analysis to achieve accurate prediction of thymoma risk classification.
MATERIALS AND METHODS: 205 consecutive patients with thymoma confirmed by surgical pathology were recruited from three medical centers. Venous phase enhanced CT images were used to delineate the tumor, and radiomics, 2D and 3D deep learning models based on the whole tumor were established and feature extraction was performed. The tumors were divided into different sub-regions by K-means clustering method and the corresponding features were obtained. The clinical-conventional imaging data of the patients were collected and evaluated, and the univariate and multivariate analysis were used for screening. The above types of features were fused with each other to construct a variety of combined models. Quantitative indicators such as area under the receiver operating characteristic (ROC) curve (AUC) were calculated to evaluate the performance of the model.
RESULTS: The AUC of RDLCSM developed based on LightGBM classifier was 0.953 in the training cohort, 0.930 in the internal validation cohort, 0.924 and 0.903 in the two external validation cohorts, respectively. RDLCSM performs better than RDLM (AUC range: 0.831-0.890) and 2DLCSM (AUC range: 0.785-0.916) based on KNN. In addition, RDLCSM had the highest accuracy (0.818-0.882) and specificity (0.926-1.000).
INTERPRETATION: The RDLCSM, which combines whole-tumor radiomics, 2D and 3D deep learning, clinical-visual radiology, and subregional omics, can be used as a non-invasive tool to predict thymoma risk classification.
PMID:40079653 | DOI:10.2340/1651-226X.2025.42982
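The sub-region (habitat) step in the thymoma abstract above relies on K-means clustering. The sketch below is a generic Lloyd's-algorithm implementation in NumPy applied to toy per-voxel feature vectors; the choice of features, the number of clusters, and the function name are assumptions, since the abstract does not specify them.

```python
import numpy as np

def kmeans(features, k, iters=50, seed=0):
    """Plain Lloyd's algorithm.

    `features` has shape (n_voxels, n_features).
    Returns (labels, centers).
    """
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct voxels.
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        # Distance from every voxel to every center.
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Hypothetical voxel intensities forming two clearly separated groups.
voxels = np.array([[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]])
labels, centers = kmeans(voxels, k=2)
print(labels, np.sort(centers.ravel()))
```

Each resulting cluster label can then be mapped back to voxel positions to obtain spatial sub-regions from which features are extracted separately.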
Exploring the repository of de novo-designed bifunctional antimicrobial peptides through deep learning
Elife. 2025 Mar 13;13:RP97330. doi: 10.7554/eLife.97330.
ABSTRACT
Antimicrobial peptides (AMPs) are attractive candidates to combat antibiotic resistance for their capability to target biomembranes and restrict a wide range of pathogens. It is a daunting challenge to discover novel AMPs due to their sparse distributions in a vast peptide universe, especially for peptides that demonstrate potencies for both bacterial membranes and viral envelopes. Here, we establish a de novo AMP design framework by bridging a deep generative module and a graph-encoding activity regressor. The generative module learns hidden 'grammars' of AMP features and produces candidates that sequentially pass an antimicrobial predictor and antiviral classifiers. We discovered 16 bifunctional AMPs and experimentally validated their abilities to inhibit a spectrum of pathogens in vitro and in animal models. Notably, P076 is a highly potent bactericide with a minimal inhibitory concentration of 0.21 μM against multidrug-resistant Acinetobacter baumannii, while P002 broadly inhibits five enveloped viruses. Our study provides a feasible means to uncover sequences that simultaneously encode antimicrobial and antiviral activities, thus bolstering the function spectra of AMPs to combat a wide range of drug-resistant infections.
PMID:40079572 | DOI:10.7554/eLife.97330
Deep learning and robotics enabled approach for audio based emotional pragmatics deficits identification in social communication disorders
Proc Inst Mech Eng H. 2025 Mar 13:9544119251325331. doi: 10.1177/09544119251325331. Online ahead of print.
ABSTRACT
The aim of this study is to develop Deep Learning (DL) enabled robotic systems to identify audio-based emotional pragmatics deficits in individuals with social pragmatic communication deficits. The novelty of the work stems from its integration of deep learning with a robotics platform for identifying emotional pragmatics deficits. The proposed methodology applies machine learning and DL-based classification techniques to a collection of open-source datasets to identify audio emotions. Pre-processing and converting audio signals of different emotions using Mel-Frequency Cepstral Coefficients (MFCC) resulted in improved emotion classification. The data generated using MFCC were used to train the machine learning and DL models. The trained models were then tested on a randomly selected dataset. DL proved more effective in the identification of emotions using the robotic structure. Because the data generated by MFCC are one-dimensional, one-dimensional DL algorithms, such as 1D-Convolution Neural Network, Long Short-Term Memory, and Bidirectional Long Short-Term Memory, were used. The Bidirectional Long Short-Term Memory model achieved higher accuracy (96.24%), precision (92.87%), and recall (92.87%), with a loss of 0.2524, than the other machine learning and DL algorithms. Further, the proposed model was deployed on the robotic structure for real-time detection to improve social-emotional pragmatic responses in individuals with deficits. The approach can serve as a potential tool for individuals with pragmatic communication deficits.
PMID:40079556 | DOI:10.1177/09544119251325331
Harnessing Electronic Health Records and Artificial Intelligence for Enhanced Cardiovascular Risk Prediction: A Comprehensive Review
J Am Heart Assoc. 2025 Mar 13:e036946. doi: 10.1161/JAHA.124.036946. Online ahead of print.
ABSTRACT
Electronic health records (EHR) have revolutionized cardiovascular disease (CVD) research by enabling comprehensive, large-scale, and dynamic data collection. Integrating EHR data with advanced analytical methods, including artificial intelligence (AI), transforms CVD risk prediction and management methodologies. This review examines the advancements and challenges of using EHR in developing CVD prediction models, covering traditional and AI-based approaches. While EHR-based CVD risk prediction has greatly improved, moving from models that integrate real-world data on medication use and imaging, challenges persist regarding data quality, standardization across health care systems, and geographic variability. The complexity of EHR data requires sophisticated computational methods and multidisciplinary approaches for effective CVD risk modeling. AI's deep learning enhances prediction performance but faces limitations in interpretability and the need for validation and recalibration for diverse populations. The future of CVD risk prediction and management increasingly depends on using EHR and AI technologies effectively. Addressing data quality issues and overcoming limitations from retrospective data analysis are critical for improving the reliability and applicability of risk prediction models. Integrating multidimensional data, including environmental, lifestyle, social, and genomic factors, could significantly enhance risk assessment. These models require continuous validation and recalibration to ensure their adaptability to diverse populations and evolving health care environments, providing reassurance about their reliability.
PMID:40079336 | DOI:10.1161/JAHA.124.036946
Seq2Topt: a sequence-based deep learning predictor of enzyme optimal temperature
Brief Bioinform. 2025 Mar 4;26(2):bbaf114. doi: 10.1093/bib/bbaf114.
ABSTRACT
An accurate deep learning predictor is needed for enzyme optimal temperature (T_opt), which quantitatively describes how temperature affects the enzyme catalytic activity. In comparison with existing models, a new model developed in this study, Seq2Topt, reached a superior accuracy on T_opt prediction just using protein sequences (RMSE = 12.26 °C and R² = 0.57), and could capture key protein regions for enzyme T_opt with multi-head attention on residues. Through case studies on thermophilic enzyme selection and predicting enzyme T_opt shifts caused by point mutations, Seq2Topt was demonstrated as a promising computational tool for enzyme mining and in-silico enzyme design. Additionally, accurate deep learning predictors of enzyme optimal pH (Seq2pHopt, RMSE = 0.88 and R² = 0.42) and melting temperature (Seq2Tm, RMSE = 7.57 °C and R² = 0.64) were developed based on the model architecture of Seq2Topt, suggesting that the development of Seq2Topt could potentially give rise to a useful prediction platform of enzymes.
PMID:40079266 | DOI:10.1093/bib/bbaf114
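The two regression metrics quoted in the Seq2Topt abstract above can be computed as in the sketch below. It assumes that R2 denotes the coefficient of determination (the abstract does not say whether it is that or a squared correlation), and the temperatures are made-up values, not data from the study.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the target."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical optimal temperatures in degrees Celsius.
measured = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 73.0, 79.0]
print(rmse(measured, predicted), r2_score(measured, predicted))
```

RMSE carries the unit of the target, which is why the abstract reports it in °C for temperatures and without a unit for pH.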
Thermal Adaptation of Cytosolic Malate Dehydrogenase Revealed by Deep Learning and Coevolutionary Analysis
J Chem Theory Comput. 2025 Mar 13. doi: 10.1021/acs.jctc.4c01774. Online ahead of print.
ABSTRACT
Protein evolution has shaped enzymes that maintain stability and function across diverse thermal environments. While sequence variation, thermal stability and conformational dynamics are known to influence an enzyme's thermal adaptation, how these factors collectively govern stability and function across diverse temperatures remains unresolved. Cytosolic malate dehydrogenase (cMDH), a citric acid cycle enzyme, is an ideal model for studying these mechanisms due to its temperature-sensitive flexibility and broad presence in species from diverse thermal environments. In this study, we employ techniques inspired by deep learning and statistical mechanics to uncover how sequence variation and conformational dynamics shape patterns of cMDH's thermal adaptation. By integrating coevolutionary models with variational autoencoders (VAE), we generate a latent generative landscape (LGL) of the cMDH sequence space, enabling us to explore mutational pathways and predict fitness using direct coupling analysis (DCA). Structure predictions via AlphaFold and molecular dynamics simulations further illuminate how variations in hydrophobic interactions and conformational flexibility contribute to the thermal stability of warm- and cold-adapted cMDH orthologs. Notably, we identify the ratio of hydrophobic contacts between two regions as a predictive order parameter for thermal stability features, providing a quantitative metric for understanding cMDH dynamics across temperatures. The integrative computational framework employed in this study provides mechanistic insights into protein adaptation at both sequence and structural levels, offering unique perspectives on the evolution of thermal stability and creating avenues for the rational design of proteins with optimized thermal properties.
PMID:40079215 | DOI:10.1021/acs.jctc.4c01774
Optical label-free microscopy characterization of dielectric nanoparticles
Nanoscale. 2025 Mar 13. doi: 10.1039/d4nr03860f. Online ahead of print.
ABSTRACT
In order to relate nanoparticle properties to function, fast and detailed particle characterization is needed. The ability to characterize nanoparticle samples using optical microscopy techniques has drastically improved over the past few decades; consequently, there are now numerous microscopy methods available for detailed characterization of particles with nanometric size. However, there is currently no "one size fits all" solution to the problem of nanoparticle characterization. Instead, since the available techniques have different detection limits and deliver related but different quantitative information, the measurement and analysis approaches need to be selected and adapted for the sample at hand. In this tutorial, we review the optical theory of single particle scattering and how it relates to the differences and similarities in the quantitative particle information obtained from commonly used label-free microscopy techniques, with an emphasis on nanometric (submicron) sized dielectric particles. Particular emphasis is placed on how the optical signal relates to mass, size, structure, and material properties of the detected particles and to its combination with diffusivity-based particle sizing. We also discuss emerging opportunities in the wake of new technology development, including examples of adaptable python notebooks for deep learning image analysis, with the ambition to guide the choice of measurement strategy based on various challenges related to different types of nanoparticle samples and associated analytical demands.
PMID:40079204 | DOI:10.1039/d4nr03860f
A multi-objective function for deep learning-based automatic energy efficiency power allocation in multicarrier noma system using hybrid heuristic improvement
Network. 2025 Mar 13:1-32. doi: 10.1080/0954898X.2025.2461046. Online ahead of print.
ABSTRACT
Non-Orthogonal Multiple Access (NOMA) is a successive multiple-access methodology for modern communication devices. Energy Efficiency (EE) is considered in the NOMA system. Under dynamic network conditions, NOMA exhibits high computational complexity, which reduces EE and degrades system performance. This research addresses EE in Multi-Carrier NOMA (MC-NOMA) models using an optimization algorithm. The main aim is to improve EE by optimizing the system parameters with the Hybrid of Sewing Training and Lemur Optimization (HSTLO) algorithm. The developed HSTLO algorithm can have a significant impact on the MC-NOMA system, providing better user capacity while effectively optimizing the system parameters. Moreover, the Dilated Dense Recurrent Neural Network (DDRNN) model is developed. Employing this improved deep learning model in the MC-NOMA system could effectively manage and enhance system performance. The DDRNN model can provide better generalization across different network scenarios, yielding fast and reliable solutions compared to existing methods. The energy consumption problems addressed in this study are analysed to show the advancement in the MC-NOMA system that helps to enhance system performance.
PMID:40079096 | DOI:10.1080/0954898X.2025.2461046
Impact of menopause and age on breast density and background parenchymal enhancement in dynamic contrast-enhanced magnetic resonance imaging
J Med Imaging (Bellingham). 2025 Nov;12(Suppl 2):S22002. doi: 10.1117/1.JMI.12.S2.S22002. Epub 2025 Mar 11.
ABSTRACT
PURPOSE: Breast density (BD) and background parenchymal enhancement (BPE) are important imaging biomarkers for breast cancer (BC) risk. We aim to evaluate longitudinal changes in quantitative BD and BPE in high-risk women undergoing dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI), focusing on the effects of age and transition into menopause.
APPROACH: A retrospective cohort study analyzed 834 high-risk women undergoing breast DCE-MRI for screening between 2005 and 2020. Quantitative BD and BPE were derived using deep-learning segmentation. Linear mixed-effects models assessed longitudinal changes and the effects of age, menopausal status, weeks since the last menstrual period (LMP-wks), body mass index (BMI), and hormone replacement therapy (HRT) on these imaging biomarkers.
RESULTS: BD decreased with age across all menopausal stages, whereas BPE declined with age in postmenopausal women but remained stable in premenopausal women. HRT elevated BPE in postmenopausal women. Perimenopausal women exhibited decreases in both BD and BPE during the menopausal transition, though cross-sectional age at menopause had no significant effect on either measure. Fibroglandular tissue was positively associated with BPE in perimenopausal women.
CONCLUSIONS: Our findings highlight the dynamic impact of menopause on BD and BPE and correlate well with the known relationship between risk and age at menopause. These findings advance the understanding of imaging biomarkers in high-risk populations and may contribute to the development of improved risk assessment leading to personalized chemoprevention and BC screening recommendations.
PMID:40078986 | PMC:PMC11894108 | DOI:10.1117/1.JMI.12.S2.S22002
Medical image classification by incorporating clinical variables and learned features
R Soc Open Sci. 2025 Mar 12;12(3):241222. doi: 10.1098/rsos.241222. eCollection 2025 Mar.
ABSTRACT
Medical image classification plays an important role in medical imaging. In this work, we present a novel approach to enhance deep learning models in medical image classification by incorporating clinical variables without overwhelming that information. Unlike most existing deep neural network models that only consider single-pixel information, our method captures a more comprehensive view. Our method contains two main steps and is effective in tackling the extra challenge raised by the scarcity of medical data. First, we employ a pre-trained deep neural network as a feature extractor to capture meaningful image features. Then, a discriminant analysis is applied to reduce the dimensionality of these features, ensuring that the small number of features remains optimized for the classification task and balanced against the clinical variables. We also develop a way of obtaining class activation maps for our approach to visualize the model's focus on specific regions within the low-dimensional feature space. Thorough experimental results demonstrate improvements of our proposed method over state-of-the-art methods on, for example, tuberculosis and dermatology tasks. Furthermore, a comprehensive comparison with a popular dimensionality reduction technique (principal component analysis) is also conducted.
PMID:40078919 | PMC:PMC11897822 | DOI:10.1098/rsos.241222
Singing to speech conversion with generative flow
EURASIP J Audio Speech Music Process. 2025;2025(1):12. doi: 10.1186/s13636-025-00400-x. Epub 2025 Mar 10.
ABSTRACT
This paper introduces singing to speech conversion (S2S), a cross-domain voice conversion task, and presents the first deep learning-based S2S system. S2S aims to transform singing into speech while retaining the phonetic information, reducing variations in pitch, rhythm, and timbre. Inspired by the Glow-TTS architecture, the proposed model is built using generative flow, with an adjusted alignment module between the latent features. We adapt the original monotonic alignment search (MAS) to the S2S scenario and utilize a duration predictor to deal with the duration differences between the two modalities. Subjective evaluations show that the proposed model outperforms signal processing baselines in naturalness and outperforms a transcribe-and-synthesize baseline in phonetic similarity to the original singing. We further demonstrate that singing-to-speech could be an effective augmentation method for low-resource lyrics transcription.
PMID:40078713 | PMC:PMC11893632 | DOI:10.1186/s13636-025-00400-x
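The monotonic alignment search (MAS) mentioned in the abstract above is a dynamic program. The sketch below follows the original MAS idea from Glow-TTS (each frame is assigned to exactly one token, and the token index never decreases or skips); it is not the adapted S2S variant, which the abstract does not detail. The function name and the toy score matrix are assumptions.

```python
import numpy as np

def monotonic_alignment_search(log_lik):
    """Best monotonic alignment for a (n_tokens, n_frames) score matrix.

    Returns `path`, where path[j] is the token index assigned to frame j.
    Requires n_tokens <= n_frames so that every token gets a frame.
    """
    log_lik = np.asarray(log_lik, dtype=float)
    n_tokens, n_frames = log_lik.shape
    if n_tokens > n_frames:
        raise ValueError("need at least as many frames as tokens")

    # q[i, j]: best total score of an alignment ending at token i, frame j.
    q = np.full((n_tokens, n_frames), -np.inf)
    q[0, 0] = log_lik[0, 0]
    for j in range(1, n_frames):
        for i in range(min(j + 1, n_tokens)):
            stay = q[i, j - 1]                              # same token
            advance = q[i - 1, j - 1] if i > 0 else -np.inf  # next token
            q[i, j] = max(stay, advance) + log_lik[i, j]

    # Backtrack from the last token at the last frame.
    path = np.zeros(n_frames, dtype=int)
    i = n_tokens - 1
    for j in range(n_frames - 1, -1, -1):
        path[j] = i
        if j > 0 and i > 0 and (i == j or q[i - 1, j - 1] >= q[i, j - 1]):
            i -= 1
    return path

# Toy scores: token 0 fits frames 0-1, token 1 fits frames 2-3.
scores = np.array([[0.0, 0.0, -5.0, -5.0],
                   [-5.0, -5.0, 0.0, 0.0]])
print(monotonic_alignment_search(scores))  # [0 0 1 1]
```

The number of frames assigned to each token in the returned path is the kind of duration target that a duration predictor can be trained on.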
Artificial intelligence integration in surgery through hand and instrument tracking: a systematic literature review
Front Surg. 2025 Feb 26;12:1528362. doi: 10.3389/fsurg.2025.1528362. eCollection 2025.
ABSTRACT
OBJECTIVE: This systematic literature review of the integration of artificial intelligence (AI) applications in surgical practice through hand and instrument tracking provides an overview of recent advancements and analyzes current literature on the intersection of surgery with AI. Distinct AI algorithms and specific applications in surgical practice are also examined.
METHODS: An advanced search using medical subject heading terms was conducted in Medline (via PubMed), SCOPUS, and Embase databases for articles published in English. A strict selection process was performed, adhering to PRISMA guidelines.
RESULTS: A total of 225 articles were retrieved. After screening, 77 met inclusion criteria and were included in the review. Use of AI algorithms in surgical practice was uncommon during 2013-2017 but has gained significant popularity since 2018. Deep learning algorithms (n = 62) are increasingly preferred over traditional machine learning algorithms (n = 15). These technologies are used in surgical fields such as general surgery (n = 19), neurosurgery (n = 10), and ophthalmology (n = 9). The most common functional sensors and systems used were prerecorded videos (n = 29), cameras (n = 21), and image datasets (n = 7). The most common applications included laparoscopic (n = 13), robotic-assisted (n = 13), basic (n = 12), and endoscopic (n = 8) surgical skills training, as well as surgical simulation training (n = 8).
CONCLUSION: AI technologies can be tailored to address distinct needs in surgical education and patient care. The use of AI in hand and instrument tracking improves surgical outcomes by optimizing surgical skills training. It is essential to acknowledge the current technical and social limitations of AI and work toward filling those gaps in future studies.
PMID:40078701 | PMC:PMC11897506 | DOI:10.3389/fsurg.2025.1528362
Deep learning-based multi-task prediction of response to neoadjuvant chemotherapy using multiscale whole slide images in breast cancer: A multicenter study
Chin J Cancer Res. 2025 Jan 30;37(1):28-47. doi: 10.21147/j.issn.1000-9604.2025.01.03.
ABSTRACT
OBJECTIVE: Predicting response before neoadjuvant chemotherapy (NAC) begins is crucial for personalized treatment plans for locally advanced breast cancer patients. We aim to develop a multi-task model using multiscale whole slide image (WSI) features to predict the response to breast cancer NAC at a finer level.
METHODS: This work collected 1,670 whole slide images for the training and validation sets, internal testing sets, external testing sets, and prospective testing sets of the weakly-supervised deep learning-based multi-task model (DLMM) for predicting treatment response and pCR to NAC. Our approach models pairwise (two-by-two) feature interactions across scales by concatenating single-scale feature representations, and controls the expressiveness of each representation via a gating-based attention mechanism.
RESULTS: In the retrospective analysis, DLMM exhibited excellent predictive performance for the prediction of treatment response, with area under the receiver operating characteristic curves (AUCs) of 0.869 [95% confidence interval (95% CI): 0.806-0.933] in the internal testing set and 0.841 (95% CI: 0.814-0.867) in the external testing sets. For the pCR prediction task, DLMM reached AUCs of 0.865 (95% CI: 0.763-0.964) in the internal testing and 0.821 (95% CI: 0.763-0.878) in the pooled external testing set. In the prospective testing study, DLMM also demonstrated favorable predictive performance, with AUCs of 0.829 (95% CI: 0.754-0.903) and 0.821 (95% CI: 0.692-0.949) in treatment response and pCR prediction, respectively. DLMM significantly outperformed the baseline models in all testing sets (P<0.05). Heatmaps were employed to interpret the decision-making basis of the model. Furthermore, it was discovered that high DLMM scores were associated with immune-related pathways and cells in the microenvironment during biological basis exploration.
CONCLUSIONS: The DLMM represents a valuable tool that aids clinicians in selecting personalized treatment strategies for breast cancer patients.
PMID:40078559 | PMC:PMC11893347 | DOI:10.21147/j.issn.1000-9604.2025.01.03
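The AUC values reported in the abstract above measure how often a model ranks a positive case above a negative one. The sketch below computes AUC from that pairwise definition (equivalent to the normalized Mann-Whitney U statistic). The labels and scores are toy values, and the confidence intervals in the abstract are not reproduced because the abstract does not state how they were obtained.

```python
import numpy as np

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    labels = np.asarray(labels).astype(bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("need at least one positive and one negative case")
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (len(pos) * len(neg)))

# Toy example: 1 = responder, 0 = non-responder (hypothetical scores).
y = [0, 0, 1, 1]
s = [0.1, 0.4, 0.35, 0.8]
print(auc(y, s))  # 0.75
```

The pairwise form uses memory proportional to the number of positive-negative pairs, which is fine for cohorts of this size; a rank-based formula is preferable for very large datasets.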