Deep learning
Deep Learning-Based Molecular Fingerprint Prediction for Metabolite Annotation
Metabolites. 2025 Feb 14;15(2):132. doi: 10.3390/metabo15020132.
ABSTRACT
Background/Objectives: Liquid chromatography coupled with mass spectrometry (LC-MS) is a commonly used platform for many metabolomics studies. However, metabolite annotation has been a major bottleneck in these studies, in part due to the limited publicly available spectral libraries, which consist of tandem mass spectrometry (MS/MS) data acquired from just a fraction of known compounds. Application of deep learning methods is increasingly reported as an alternative to spectral matching due to their ability to map complex relationships between molecular fingerprints and mass spectrometric measurements. The objectives of this study were to investigate deep learning methods for molecular fingerprint prediction based on MS/MS spectra and to rank putative metabolite IDs according to the similarity of their known and predicted molecular fingerprints. Methods: We trained three types of deep learning methods to model the relationships between molecular fingerprints and MS/MS spectra. Prior to training, various data processing steps, including scaling, binning, and filtering, were performed on MS/MS spectra obtained from the National Institute of Standards and Technology (NIST), MassBank of North America (MoNA), and the Human Metabolome Database (HMDB). Furthermore, selection of the most relevant m/z bins and molecular fingerprints was conducted. The trained deep learning models were evaluated on ranking putative metabolite IDs obtained from a compound database for the challenges in the Critical Assessment of Small Molecule Identification (CASMI) 2016, CASMI 2017, and CASMI 2022 benchmark datasets. Results: Feature selection methods effectively reduced redundant molecular and spectral features prior to model training. Deep learning methods trained with the truncated features showed performance comparable to CSI:FingerID in ranking putative metabolite IDs. Conclusion: The results demonstrate the promising potential of deep learning methods for metabolite annotation.
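The abstract does not name the similarity measure used to rank candidates against a predicted fingerprint; a common choice for binary molecular fingerprints is Tanimoto (Jaccard) similarity. The sketch below is illustrative only: the candidate names and 8-bit fingerprints are invented, not taken from the paper.

```python
import numpy as np

def tanimoto(fp_a: np.ndarray, fp_b: np.ndarray) -> float:
    """Tanimoto (Jaccard) similarity between two binary fingerprints."""
    both = np.logical_and(fp_a, fp_b).sum()
    either = np.logical_or(fp_a, fp_b).sum()
    return both / either if either else 1.0

def rank_candidates(predicted_fp, candidates):
    """Rank candidate metabolite IDs by similarity of their known
    fingerprint to the fingerprint predicted from an MS/MS spectrum."""
    scored = [(cid, tanimoto(predicted_fp, fp)) for cid, fp in candidates.items()]
    return sorted(scored, key=lambda x: x[1], reverse=True)

# Toy 8-bit fingerprints (real fingerprints are thousands of bits long)
pred = np.array([1, 0, 1, 1, 0, 0, 1, 0])
cands = {
    "metabolite_A": np.array([1, 0, 1, 1, 0, 0, 1, 0]),  # identical to prediction
    "metabolite_B": np.array([1, 0, 0, 1, 0, 0, 1, 0]),  # one bit off
    "metabolite_C": np.array([0, 1, 0, 0, 1, 1, 0, 1]),  # disjoint bits
}
ranking = rank_candidates(pred, cands)
print(ranking[0][0])  # metabolite_A ranks first
```

In practice the candidate set would come from a compound database restricted to the precursor mass, as in the CASMI challenges.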
PMID:39997757 | DOI:10.3390/metabo15020132
Metabolic Objectives and Trade-Offs: Inference and Applications
Metabolites. 2025 Feb 6;15(2):101. doi: 10.3390/metabo15020101.
ABSTRACT
Background/Objectives: Determining appropriate cellular objectives is crucial for the system-scale modeling of biological networks for metabolic engineering, cellular reprogramming, and drug discovery applications. The mathematical representation of metabolic objectives can describe how cells manage limited resources to achieve biological goals within mechanistic and environmental constraints. While rapidly proliferating cells like tumors are often assumed to prioritize biomass production, mammalian cell types can exhibit objectives beyond growth, such as supporting tissue functions, developmental processes, and redox homeostasis. Methods: This review addresses the challenge of determining metabolic objectives and trade-offs from multiomics data. Results: Recent advances in single-cell omics, metabolic modeling, and machine/deep learning methods have enabled the inference of cellular objectives at both the transcriptomic and metabolic levels, bridging gene expression patterns with metabolic phenotypes. Conclusions: These in silico models provide insights into how cells adapt to changing environments, drug treatments, and genetic manipulations. We further explore the potential application of incorporating cellular objectives into personalized medicine, drug discovery, tissue engineering, and systems biology.
PMID:39997726 | DOI:10.3390/metabo15020101
AI-based quality assessment methods for protein structure models from cryo-EM
Curr Res Struct Biol. 2025 Feb 2;9:100164. doi: 10.1016/j.crstbi.2025.100164. eCollection 2025 Jun.
ABSTRACT
Cryogenic electron microscopy (cryo-EM) has revolutionized structural biology, with an increasing number of structures being determined by cryo-EM each year, many at higher resolutions. However, challenges remain in accurately interpreting cryo-EM maps. Inaccuracies can arise in regions of locally low resolution, where manual model building is more prone to errors. Validation scores for structure models have been developed to assess both the compatibility between map density and the structure, as well as the geometric and stereochemical properties of protein models. Recent advancements have introduced artificial intelligence (AI) into this field. These emerging AI-driven tools offer unique capabilities in the validation and refinement of cryo-EM-derived protein atomic models, potentially leading to more accurate protein structures and deeper insights into complex biological systems.
PMID:39996138 | PMC:PMC11848767 | DOI:10.1016/j.crstbi.2025.100164
Exploring artificial intelligence in orthopaedics: A collaborative survey from the ISAKOS Young Professional Task Force
J Exp Orthop. 2025 Feb 24;12(1):e70181. doi: 10.1002/jeo2.70181. eCollection 2025 Jan.
ABSTRACT
PURPOSE: Through an analysis of findings from a survey about the use of artificial intelligence (AI) in orthopaedics, the aim of this study was to establish a scholarly foundation for the discourse on AI in orthopaedics and to elucidate key patterns, challenges and potential future trajectories for AI applications within the field.
METHODS: The International Society of Arthroscopy, Knee Surgery and Orthopaedic Sports Medicine (ISAKOS) Young Professionals Task Force developed a survey to collect feedback on issues related to the use of AI in the orthopaedic field. The survey included 26 questions. Data obtained from the completed questionnaires were transferred to a spreadsheet and then analyzed.
RESULTS: Two hundred and eleven orthopaedic surgeons completed the survey. The survey encompassed responses from a diverse cohort of orthopaedic professionals, predominantly male (92.9%), with wide representation across all geographic regions. A notable proportion reported uncertainty about AI (52.1%) or did not differentiate among AI, machine learning and deep learning (47.9%). Respondents identified imaging-based diagnosis (60.2%) as the field of orthopaedics most poised to benefit from AI. A considerable proportion (25.1%) reported using AI in their practice, most commonly for referencing scientific literature/publications (40.3%). The vast majority expressed interest in leveraging AI technologies (95.3%), demonstrating an inclination towards incorporating AI into orthopaedic practice. Respondents indicated specific areas of interest for further study, including prediction of patient outcomes after surgery (30.8%) and image-based diagnosis of osteoarthritis (28%).
CONCLUSIONS: This survey demonstrates that there is currently limited use of AI in orthopaedic practice, mainly due to a lack of knowledge about the subject, a lack of proven evidence of its real utility, and high costs. These findings are in accordance with other surveys in the literature. However, there is also a high level of interest in its future use and in increased study and further research on the subject, so that AI can deliver real benefit and become an integral part of the orthopaedic surgeon's daily work.
LEVEL OF EVIDENCE: Level IV, survey study.
PMID:39996084 | PMC:PMC11848192 | DOI:10.1002/jeo2.70181
Brain analysis to approach human muscles synergy using deep learning
Cogn Neurodyn. 2025 Dec;19(1):44. doi: 10.1007/s11571-025-10228-y. Epub 2025 Feb 22.
ABSTRACT
Brain signals and muscle movements have been analyzed using electroencephalogram (EEG) data in several studies. EEG signals contain substantial noise, including electromyographic (EMG) artifacts. Further studies have sought to improve the quality of the results, and combining these two signals is thought to yield a significant improvement in the synergistic analysis of muscle movements and muscle connections. Using graph theory, this study examined the interaction of EMG and EEG signals during hand movement and estimated the synergy between muscle and brain signals. A brain connectivity map was also developed to reconstruct the muscle signals from the muscle connections in the brain map. The proposed method included noise removal from EEG and EMG signals, graph feature analysis from EEG, and synergy calculation from EMG. Two methods were used to estimate synergy. In the first, after calculating the brain connections, the features of the connectivity graph were extracted and synergy was then estimated with neural networks. In the second, a convolutional network mapped the matrix of brain connections to the synergistic EMG signal. This study reached a high correlation of 99.8% and a maximum MSE of 0.0084. Compared to other graph-based methods, this regression-based method performed very strongly. This research can lead to improved rehabilitation methods and brain-computer interfaces.
PMID:39996071 | PMC:PMC11846801 | DOI:10.1007/s11571-025-10228-y
UAlpha40: A comprehensive dataset of Urdu alphabet for Pakistan sign language
Data Brief. 2025 Jan 28;59:111342. doi: 10.1016/j.dib.2025.111342. eCollection 2025 Apr.
ABSTRACT
Language bridges communication gaps, and Sign Language (SL) is the native language of the deaf and mute community. Every region has its own sign language. In Pakistan, Urdu Sign Language (USL) is a visual gesture language used by the deaf community for communication. The Urdu alphabet in Pakistan Sign Language consists not only of static gestures but also of dynamic gestures. There are a total of 40 alphabets in Urdu sign language, 36 static and 4 dynamic. While researchers have focused on the 36 static gestures, the 4 dynamic gestures have been overlooked. Additionally, there remains a lack of advancement in the development of Pakistan Sign Language (PSL) with respect to the Urdu alphabet. A dataset named UAlpha40 has been compiled, comprising 22,280 images, of which 2,897 are original and 19,383 were created through noise addition or augmentation, representing the 36 static gestures, along with 393 videos representing the 4 dynamic gestures, completing the set of 40 Urdu alphabets. The standard gestures for USL are published by the Family Educational Services Foundation (FESF) for the deaf and mute community of Pakistan. The dataset was prepared in real-world environments under expert supervision, with male and female volunteers aged 20 to 45. This newly developed dataset can be used to train vision-based deep learning models, which in turn can aid the development of sign language translators and finger-spelling systems for USL.
PMID:39996049 | PMC:PMC11848795 | DOI:10.1016/j.dib.2025.111342
Identifying relevant EEG channels for subject-independent emotion recognition using attention network layers
Front Psychiatry. 2025 Feb 10;16:1494369. doi: 10.3389/fpsyt.2025.1494369. eCollection 2025.
ABSTRACT
BACKGROUND: Electrical activity recorded with electroencephalography (EEG) enables the development of predictive models for emotion recognition. These models can be built using two approaches: subject-dependent and subject-independent. Although subject-independent models offer greater practical utility compared to subject-dependent models, they face challenges due to the significant variability of EEG signals between individuals.
OBJECTIVE: One potential solution to enhance subject-independent approaches is to identify EEG channels that are consistently relevant across different individuals for predicting emotion. With the growing use of deep learning in emotion recognition, incorporating attention mechanisms can help uncover these shared predictive patterns.
METHODS: This study explores this method by applying attention mechanism layers to identify EEG channels that are relevant for predicting emotions in three independent datasets (SEED, SEED-IV, and SEED-V).
RESULTS: The model achieved average accuracies of 79.3% (95% CI: 76.0-82.5%), 69.5% (95% CI: 64.2-74.8%) and 60.7% (95% CI: 52.3-69.2%) on these datasets, revealing that EEG channels located along the head circumference, including Fp1, Fp2, F7, F8, T7, T8, P7, P8, O1, and O2, are the most crucial for emotion prediction.
CONCLUSION: These results emphasize the importance of capturing relevant electrical activity from these EEG channels, thereby facilitating the prediction of emotions evoked by audiovisual stimuli in subject-independent approaches.
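The abstract describes attention layers whose learned weights expose channel relevance. A minimal numpy sketch of one softmax-over-channels attention step is given below; the exact attention architecture is not specified in the abstract, so the feature sizes are invented and the parameters are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

n_channels, n_features = 62, 5  # e.g. 62 SEED electrodes, 5 band-power features
X = rng.normal(size=(n_channels, n_features))  # one trial's per-channel features

# Hypothetical attention parameters; in training these would be fit
# jointly with the emotion classifier.
W = rng.normal(size=(n_features, 1))

scores = X @ W                                   # one scalar score per channel
weights = np.exp(scores) / np.exp(scores).sum()  # softmax over channels
pooled = (weights * X).sum(axis=0)               # attention-weighted summary

# Channel relevance = its attention weight; averaging weights across trials
# and subjects highlights channels that matter subject-independently.
top = np.argsort(weights.ravel())[::-1][:10]
print("10 highest-weighted channel indices:", top)
```

Averaged over many trials and subjects, such weights would concentrate on consistently informative electrodes, which is the mechanism the study exploits to identify the circumference channels.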
PMID:39995952 | PMC:PMC11847823 | DOI:10.3389/fpsyt.2025.1494369
Deep Learning for Predicting Biomolecular Binding Sites of Proteins
Research (Wash D C). 2025 Feb 24;8:0615. doi: 10.34133/research.0615. eCollection 2025.
ABSTRACT
The rapid evolution of deep learning has markedly enhanced protein-biomolecule binding site prediction, offering insights essential for drug discovery, mutation analysis, and molecular biology. Advancements in both sequence-based and structure-based methods demonstrate their distinct strengths and limitations. Sequence-based approaches offer efficiency and adaptability, while structure-based techniques provide spatial precision but require high-quality structural data. Emerging trends in hybrid models that combine multimodal data, such as integrating sequence and structural information, along with innovations in geometric deep learning, present promising directions for improving prediction accuracy. This perspective summarizes challenges such as computational demands and dynamic modeling and proposes strategies for future research. The ultimate goal is the development of computationally efficient and flexible models capable of capturing the complexity of real-world biomolecular interactions, thereby broadening the scope and applicability of binding site predictions across a wide range of biomedical contexts.
PMID:39995900 | PMC:PMC11848751 | DOI:10.34133/research.0615
Editorial: Computer vision and image synthesis for neurological applications
Front Comput Neurosci. 2025 Feb 10;19:1561635. doi: 10.3389/fncom.2025.1561635. eCollection 2025.
NO ABSTRACT
PMID:39995891 | PMC:PMC11847876 | DOI:10.3389/fncom.2025.1561635
Contrast quality control for segmentation task based on deep learning models-Application to stroke lesion in CT imaging
Front Neurol. 2025 Feb 10;16:1434334. doi: 10.3389/fneur.2025.1434334. eCollection 2025.
ABSTRACT
INTRODUCTION: Although medical imaging plays a crucial role in stroke management, machine learning (ML) has been increasingly used in this field, particularly in lesion segmentation. Despite advances in acquisition technologies and segmentation architectures, one of the main challenges of subacute stroke lesion segmentation in computed tomography (CT) imaging is image contrast.
METHODS: To address this issue, we propose a method to assess the contrast quality of an image dataset with an ML model trained for segmentation. This method identifies the critical contrast level below which the medical-imaging model fails to learn meaningful content from images. Contrast measurement relies on Fisher's ratio, which estimates how well the stroke lesion is contrasted against the background. The critical contrast is found using three methods: performance, graphical, and clustering analysis. Defining this threshold improves dataset design and accelerates training by excluding low-contrast images.
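The abstract does not give the exact formulation of Fisher's ratio used; a common two-class form is (mu1 - mu2)^2 / (var1 + var2). The sketch below applies it with synthetic intensities standing in for lesion and background pixels, using the 0.05 threshold reported in the results.

```python
import numpy as np

def fisher_ratio(lesion: np.ndarray, background: np.ndarray) -> float:
    """Two-class Fisher's ratio: (mu1 - mu2)^2 / (var1 + var2).
    Higher values mean the lesion is better contrasted from the background."""
    mu1, mu2 = lesion.mean(), background.mean()
    return (mu1 - mu2) ** 2 / (lesion.var() + background.var())

rng = np.random.default_rng(1)
background = rng.normal(30.0, 5.0, size=10_000)  # toy CT background intensities
faint = rng.normal(30.5, 5.0, size=500)          # barely contrasted lesion
clear = rng.normal(45.0, 5.0, size=500)          # well-contrasted lesion

THRESHOLD = 0.05  # critical contrast reported in the paper
for name, lesion in [("faint", faint), ("clear", clear)]:
    fr = fisher_ratio(lesion, background)
    print(f"{name}: Fisher's ratio = {fr:.3f}, keep for training: {fr >= THRESHOLD}")
```

Images whose lesion falls below the threshold would be excluded from the training set, which is how the study reduced the data to 60% and the training time by almost 30%.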
RESULTS: Applying this method to brain lesion segmentation in CT imaging highlights a Fisher's ratio threshold value of 0.05. Training and validating a new model without these images confirmed the threshold, achieving similar results with only 60% of the training data and reducing initial training time by almost 30%. Moreover, the model trained without the low-contrast images performed equally well on all images when tested on another database.
DISCUSSION: This study opens discussion with clinicians concerning the limitations, areas for improvement, and strategies for enhancing datasets and training models. While the methodology was only applied to stroke lesion segmentation in CT images, it has the potential to be adapted to other tasks.
PMID:39995787 | PMC:PMC11849432 | DOI:10.3389/fneur.2025.1434334
Combining pelvic floor ultrasonography with deep learning to diagnose anterior compartment organ prolapse
Quant Imaging Med Surg. 2025 Feb 1;15(2):1265-1274. doi: 10.21037/qims-24-772. Epub 2025 Jan 21.
ABSTRACT
BACKGROUND: Anterior compartment prolapse is a common form of pelvic organ prolapse (POP) that occurs frequently among middle-aged and elderly women; it can cause urinary incontinence, perineal pain and swelling, and seriously affect physical and mental health. At present, pelvic floor ultrasound is the primary examination method, but many primary medical institutions do not offer it because of the substantial up-front training required and variable image quality. There has been great progress in the application of deep learning (DL) to image-based diagnosis in various clinical contexts. The main purpose of this study was to improve the speed and reliability of pelvic floor ultrasound diagnosis of POP by training neural networks to interpret ultrasound images, thereby facilitating the diagnosis and treatment of POP in primary care.
METHODS: This retrospective study analyzed medical records of women with anterior compartment organ prolapse (n=1,605, mean age 45.1±12.2 years) or without (n=200, mean age 38.1±13.4 years) who were examined at West China Second University Hospital between March 2019 and September 2021. Static ultrasound images of the anterior compartment of the pelvic floor (5,281 abnormal, 535 normal) were captured at rest and at maximal Valsalva maneuver, and four convolutional neural network (CNN) models, AlexNet, VGG-16, ResNet-18, and ResNet-50, were trained on 80% of the images and then internally validated on the other 20%. Each model was trained in two ways: with randomly initialized parameters and with transfer learning based on ImageNet pre-training. The diagnostic performance of each network was evaluated by accuracy, precision, recall, and F1-score; the receiver operating characteristic (ROC) curve of each network was drawn for the training and validation sets, and the area under the curve (AUC) was obtained.
RESULTS: All four models, regardless of training method, achieved recognition accuracy of >91%, whereas transfer learning led to more stable and effective feature extraction. Specifically, ResNet-18 and ResNet-50 performed better than AlexNet and VGG-16. Nevertheless, all four transfer-trained networks showed fairly high AUCs, with ResNet-18 performing best: it read images in 13.4 msec and provided a recognition accuracy of 93.53% along with an AUC of 0.852.
CONCLUSIONS: Combining DL with pelvic floor ultrasonography can substantially accelerate diagnosis of anterior compartment organ prolapse in women while improving accuracy.
PMID:39995742 | PMC:PMC11847209 | DOI:10.21037/qims-24-772
Diagnosis of Alzheimer's disease using transfer learning with multi-modal 3D Inception-v4
Quant Imaging Med Surg. 2025 Feb 1;15(2):1455-1467. doi: 10.21037/qims-24-1577. Epub 2025 Jan 20.
ABSTRACT
BACKGROUND: Deep learning (DL) technologies are playing increasingly important roles in computer-aided diagnosis in medicine. In this study, we sought to address issues related to the diagnosis of Alzheimer's disease (AD) based on multi-modal features, and introduced a multi-modal three-dimensional Inception-v4 model that employs transfer learning for AD diagnosis based on magnetic resonance imaging (MRI) and clinical score data.
METHODS: The multi-modal three-dimensional (3D) Inception-v4 model was first pre-trained using data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. Subsequently, independent validation data were used to fine-tune the model with the pre-trained weight parameters. The model was quantitatively evaluated using the mean values obtained from five-fold cross-validation. Further, control experiments were conducted to verify the performance of the model in patients with AD and in the study of disease progression.
RESULTS: In the AD diagnosis task, when a single image marker was used, the average accuracy (ACC) and area under the curve (AUC) were 62.21% and 71.87%, respectively. When transfer learning was not employed, the average ACC and AUC were 75.74% and 83.13%, respectively. Conversely, the combined approach proposed in this study achieved an average ACC of 87.84%, and an average AUC of 90.80% [with an average precision (PRE) of 87.21%, an average recall (REC) of 82.52%, and an average F1 of 83.58%].
CONCLUSIONS: In comparison with existing methods, the performance of the proposed method was superior in terms of diagnostic accuracy. Specifically, the method showed an enhanced ability to accurately distinguish among various stages of AD. Our findings show that multi-modal feature fusion and transfer learning can be valuable resources in the treatment of patients with AD, and in the study of disease progression.
PMID:39995734 | PMC:PMC11847174 | DOI:10.21037/qims-24-1577
CTDNN-Spoof: compact tiny deep learning architecture for detection and multi-label classification of GPS spoofing attacks in small UAVs
Sci Rep. 2025 Feb 24;15(1):6656. doi: 10.1038/s41598-025-90809-3.
ABSTRACT
GPS spoofing presents a significant threat to small Unmanned Aerial Vehicles (UAVs) by manipulating navigation systems, potentially causing safety risks, privacy violations, and mission disruptions. Effective countermeasures include secure GPS signal authentication, anti-spoofing technologies, and continuous monitoring to detect and respond to such threats. Safeguarding small UAVs from GPS spoofing is crucial for their reliable operation in applications such as surveillance, agriculture, and environmental monitoring. In this paper, we propose a compact, tiny deep learning architecture named CTDNN-Spoof for the detection and multi-label classification of GPS spoofing attacks in small UAVs. The architecture uses a sequential neural network with 64 neurons in the input layer (ReLU activation), 32 neurons in the hidden layer (ReLU activation), and 4 neurons in the output layer (linear activation), optimized with the Adam optimizer. We use Mean Squared Error (MSE) loss for regression and accuracy for evaluation. Early stopping with a patience of 10 epochs is implemented to improve training efficiency and restore the best weights. The model is trained for 50 epochs, and its performance is assessed on a separate validation set. Additionally, we compare two other models with CTDNN-Spoof in terms of complexity, loss, and accuracy. The proposed CTDNN-Spoof demonstrates varying accuracies across different labels, achieving the highest performance and promising time complexity. These results highlight the model's effectiveness in mitigating GPS spoofing threats in UAVs. This innovative approach provides a scalable, real-time solution to enhance UAV security, surpassing traditional methods in precision and adaptability.
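The abstract fully specifies the layer sizes and activations (64 ReLU, 32 ReLU, 4 linear), so an untrained forward pass can be sketched directly in numpy. The input feature count is an assumption; the abstract does not list the GPS-signal features, and the paper trains with Adam and MSE rather than the random weights used here.

```python
import numpy as np

rng = np.random.default_rng(2)

# Layer sizes from the abstract: input 64 (ReLU) -> hidden 32 (ReLU) -> 4 (linear).
# The number of GPS-signal features is hypothetical.
n_features = 10
sizes = [n_features, 64, 32, 4]

# He-style initialization suited to ReLU layers (illustrative, untrained)
params = [(rng.normal(0, np.sqrt(2 / fan_in), size=(fan_in, fan_out)),
           np.zeros(fan_out))
          for fan_in, fan_out in zip(sizes[:-1], sizes[1:])]

def forward(x):
    """Forward pass of a CTDNN-Spoof-style network."""
    h = np.maximum(x @ params[0][0] + params[0][1], 0.0)  # 64-unit ReLU layer
    h = np.maximum(h @ params[1][0] + params[1][1], 0.0)  # 32-unit ReLU layer
    return h @ params[2][0] + params[2][1]                # 4 linear outputs

def mse(pred, target):
    """MSE loss, as used for the multi-label regression targets."""
    return np.mean((pred - target) ** 2)

batch = rng.normal(size=(8, n_features))
labels = rng.integers(0, 2, size=(8, 4)).astype(float)  # 4 spoofing labels
out = forward(batch)
print("output shape:", out.shape, "MSE loss:", round(mse(out, labels), 3))
```

In the paper this network is trained with Adam, early stopping (patience 10), and up to 50 epochs; the sketch only shows the architecture's shape and loss.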
PMID:39994281 | DOI:10.1038/s41598-025-90809-3
An enhanced denoising system for mammogram images using deep transformer model with fusion of local and global features
Sci Rep. 2025 Feb 24;15(1):6562. doi: 10.1038/s41598-025-89451-w.
ABSTRACT
Image denoising is a critical problem in low-level computer vision, where the aim is to reconstruct a clean, noise-free image from a noisy input, such as a mammogram image. In recent years, deep learning, particularly convolutional neural networks (CNNs), has shown great success in various image processing tasks, including denoising, image compression, and enhancement. While CNN-based approaches dominate, Transformer models have recently gained popularity for computer vision tasks. However, there have been fewer applications of Transformer-based models to low-level vision problems like image denoising. In this study, a novel denoising network architecture called DeepTFormer is proposed, which leverages Transformer models for the task. The DeepTFormer architecture consists of three main components: a preprocessing module, a local-global feature extraction module, and a reconstruction module. The local-global feature extraction module is the core of DeepTFormer, comprising several groups of ITransformer layers. Each group includes a series of Transformer layers, convolutional layers, and residual connections. These groups are tightly coupled with residual connections, which allow the model to capture both local and global information from the noisy images effectively. The design of these groups ensures that the model can utilize both local features for fine details and global features for larger context, leading to more accurate denoising. To validate the performance of the DeepTFormer model, extensive experiments were conducted using both synthetic and real noise data. Objective and subjective evaluations demonstrated that DeepTFormer outperforms leading denoising methods. The model achieved impressive results, surpassing state-of-the-art techniques in terms of key metrics like PSNR, FSIM, EPI, and SSIM, with values of 0.41, 0.93, 0.96, and 0.94, respectively. 
These results demonstrate that DeepTFormer is a highly effective solution for image denoising, combining the power of Transformer architecture with convolutional layers to enhance both local and global feature extraction.
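Among the metrics reported above, PSNR has a standard closed form, 10·log10(MAX²/MSE), conventionally reported in decibels. A hedged illustration with synthetic patches standing in for mammogram data:

```python
import numpy as np

def psnr(clean: np.ndarray, restored: np.ndarray, data_range: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((clean.astype(float) - restored.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(data_range ** 2 / mse)

rng = np.random.default_rng(3)
clean = rng.integers(0, 256, size=(64, 64)).astype(float)  # stand-in image patch
noisy = clean + rng.normal(0, 10, size=clean.shape)        # additive Gaussian noise
denoised = clean + rng.normal(0, 2, size=clean.shape)      # hypothetical model output

print(f"noisy    PSNR: {psnr(clean, noisy):.1f} dB")
print(f"denoised PSNR: {psnr(clean, denoised):.1f} dB")  # higher = closer to clean
```

A denoiser like DeepTFormer would be judged by how much it raises PSNR (and SSIM/FSIM/EPI) of its output relative to the noisy input.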
PMID:39994276 | DOI:10.1038/s41598-025-89451-w
Deep structured learning with vision intelligence for oral carcinoma lesion segmentation and classification using medical imaging
Sci Rep. 2025 Feb 24;15(1):6610. doi: 10.1038/s41598-025-89971-5.
ABSTRACT
Oral carcinoma (OC) is among the most common malignant cancers globally and has become an increasingly significant public health concern in emerging and low-to-middle-income countries. Late diagnosis, high incidence, and inadequate treatment strategies remain substantial challenges. Diagnosis at an initial phase is important for effective treatment, prognosis, and survival. Despite recent advances in molecular diagnostics, late diagnosis and the lack of precision-medicine approaches for OC patients remain challenges. Machine learning (ML) models have been employed to improve early detection in medicine, aiming to reduce cancer-specific mortality and disease progression. Recent advancements in this approach have significantly enhanced the extraction and diagnosis of critical information from medical images. This paper presents a Deep Structured Learning with Vision Intelligence for Oral Carcinoma Lesion Segmentation and Classification (DSLVI-OCLSC) model for medical imaging. Using medical imaging, the DSLVI-OCLSC model aims to enhance OC classification and recognition outcomes. To accomplish this, the DSLVI-OCLSC model utilizes Wiener filtering (WF) as a pre-processing technique to eliminate noise. In addition, the ShuffleNetV2 method is used to extract higher-level deep features from an input image. A convolutional bidirectional long short-term memory network with a multi-head attention mechanism (MA-CNN-BiLSTM) is utilized for oral carcinoma recognition and identification. Moreover, UNet3+ is employed to segment abnormal regions from the classified images. Finally, the sine cosine algorithm (SCA) is utilized to hyperparameter-tune the DL model. A wide range of simulations was implemented to ensure the enhanced performance of the DSLVI-OCLSC method on the OC images dataset. The experimental analysis of the DSLVI-OCLSC method showed a superior accuracy value of 98.47% over recent approaches.
PMID:39994267 | DOI:10.1038/s41598-025-89971-5
Progress on intelligent metasurfaces for signal relay, transmitter, and processor
Light Sci Appl. 2025 Feb 25;14(1):93. doi: 10.1038/s41377-024-01729-2.
ABSTRACT
Pursuing higher data rates with limited spectral resources is a longstanding topic that has triggered the fast growth of modern wireless communication techniques. However, the massive deployment of active nodes to compensate for propagation loss entails high hardware expenditure, energy consumption, and maintenance cost, as well as complicated network interference issues. Intelligent metasurfaces, composed of a number of subwavelength passive or active meta-atoms, have recently been found to be a new paradigm for actively reshaping the wireless communication environment in a green way, distinct from conventional approaches that passively adapt to their surroundings. In this review, we offer a unified perspective on how intelligent metasurfaces can facilitate wireless communication in three ways: as signal relay, signal transmitter, and signal processor. We start with the basic modeling of the wireless channel and the evolution of metasurfaces from passive and active to intelligent metasurfaces. Integrated with various deep learning algorithms, intelligent metasurfaces adapt to ever-changing environments without human intervention. Then, we overview specific experimental advancements using intelligent metasurfaces. We conclude by identifying key issues in the practical implementation of intelligent metasurfaces and surveying new directions, such as gain metasurfaces and knowledge migration.
PMID:39994200 | DOI:10.1038/s41377-024-01729-2
A PET/CT-based 3D deep learning model for predicting spread through air spaces in stage I lung adenocarcinoma
Clin Transl Oncol. 2025 Feb 24. doi: 10.1007/s12094-025-03870-9. Online ahead of print.
ABSTRACT
PURPOSE: This study evaluates a three-dimensional (3D) deep learning (DL) model based on fluorine-18 fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) for predicting the preoperative status of spread through air spaces (STAS) in patients with clinical stage I lung adenocarcinoma (LUAD).
METHODS: A retrospective analysis of 162 patients with stage I LUAD was conducted, splitting data into training and test sets (4:1). Six 3D DL models were developed, and the top-performing PET and CT models (ResNet50) were fused for optimal prediction. The model's clinical utility was assessed through a two-stage reader study.
RESULTS: The fused PET/CT model achieved an area under the curve (AUC) of 0.956 (95% CI 0.9230-0.9881) in the training set and 0.889 (95% CI 0.7624-1.0000) in the test set. Compared to three physicians, the model demonstrated superior sensitivity and specificity. With artificial intelligence (AI) assistance, the physicians' diagnostic accuracy improved in their subsequent reading session.
CONCLUSION: Our DL model demonstrates potential as a resource to aid physicians in predicting STAS status and preoperative treatment planning for stage I LUAD, though prospective validation is required.
PMID:39994163 | DOI:10.1007/s12094-025-03870-9
scFTAT: a novel cell annotation method integrating FFT and transformer
BMC Bioinformatics. 2025 Feb 25;26(1):62. doi: 10.1186/s12859-025-06061-z.
ABSTRACT
BACKGROUND: Advancements in high-throughput sequencing and deep learning have boosted single-cell RNA studies. However, current methods for annotating single-cell data face challenges due to high data sparsity and tedious manual annotation on large-scale data.
RESULTS: Thus, we proposed a novel annotation model integrating FFT (Fast Fourier Transform) and an enhanced Transformer, named scFTAT. Initially, it reduces data sparsity using LDA (Linear Discriminant Analysis). Subsequently, automatic cell annotation is achieved through a proposed module integrating FFT and an enhanced Transformer. Moreover, the model is fine-tuned to improve training performance by incorporating techniques such as kernel approximation, position encoding enhancement, and attention enhancement modules. Compared to existing popular annotation tools, scFTAT maintains high accuracy and robustness on six typical datasets. Specifically, the model achieves an accuracy of 0.93 on the human kidney data, with an F1 score of 0.84, precision of 0.96, recall of 0.80, and Matthews correlation coefficient of 0.89. The best of the compared methods reaches an accuracy of 0.92, with an F1 score of 0.71, precision of 0.75, recall of 0.73, and Matthews correlation coefficient of 0.85. The compiled code and supplements are available at https://github.com/gladex/scFTAT.
CONCLUSION: In summary, the proposed scFTAT effectively integrates FFT and an enhanced Transformer for automatic feature learning, addressing the challenges of high sparsity and tedious manual annotation in single-cell profiling data. Experiments on six typical scRNA-seq datasets from human and mouse tissues evaluate the model using five metrics: accuracy, F1 score, precision, recall, and Matthews correlation coefficient. Performance comparisons with existing methods further demonstrate the efficiency and robustness of our proposed method.
PMID:39994539 | DOI:10.1186/s12859-025-06061-z
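The two core ingredients named in the scFTAT abstract, an FFT-based feature transform and Transformer-style attention, can be sketched generically (a minimal numpy illustration of the standard building blocks, not the authors' implementation; the function names are hypothetical):

```python
import numpy as np

def fft_features(x):
    # Map a gene-expression vector to frequency-domain features
    # (magnitudes of the real FFT), as an FFT-based feature transform.
    return np.abs(np.fft.rfft(x))

def attention(Q, K, V):
    # Standard scaled dot-product attention, the core of a Transformer block.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)   # row-wise softmax
    return w @ V
```

With zero queries, attention weights are uniform and the output is the mean of the value rows, a quick sanity check on the softmax.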
Tensor-powered insights into neural dynamics
Commun Biol. 2025 Feb 24;8(1):298. doi: 10.1038/s42003-025-07711-x.
ABSTRACT
The complex spatiotemporal dynamics of neurons encompass a wealth of information relevant to perception and decision-making, making the decoding of neural activity a central focus in neuroscience research. Traditional machine learning or deep learning-based neural information modeling approaches have achieved significant results in decoding. Nevertheless, such methodologies require the vectorization of data, a process that disrupts the intrinsic relationships inherent in high-dimensional spaces, consequently impeding their capability to effectively process information in high-order tensor domains. In this paper, we introduce a novel decoding approach, namely the Least Squares Support Tensor Machine (LS-STM), which is based on tensor space and represents a tensorized improvement over traditional vector learning frameworks. In extensive evaluations using human and mouse data, our results demonstrate that LS-STM exhibits superior performance in neural signal decoding tasks compared to traditional vectorization-based decoding methods. Furthermore, LS-STM demonstrates better performance in decoding neural signals with limited samples, and the tensor weights of the LS-STM decoder enable the retrospective identification of key neurons during the neural encoding process. This study introduces a novel tensor computing approach and perspective for decoding high-dimensional neural information.
PMID:39994447 | DOI:10.1038/s42003-025-07711-x
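The key idea behind a support tensor machine is to score a tensor sample against a low-rank weight tensor without ever vectorizing it. For a matrix-shaped sample (e.g. neurons × time bins) and a rank-1 weight W = u ∘ v, the decision value <W, X> + b factorizes into a cheap bilinear form; the neuron-mode weights u then indicate which neurons drive the decoding. This is a minimal numpy sketch of that general principle, not the LS-STM solver itself:

```python
import numpy as np

def stm_score(X, u, v, b=0.0):
    # Rank-1 support tensor machine decision value: <u ∘ v, X> + b,
    # computed as a bilinear form without forming the full weight tensor.
    return u @ X @ v + b

def key_neurons(u, top_k=1):
    # Neurons with the largest absolute mode-1 weights are the
    # most influential in the decoder.
    return np.argsort(-np.abs(u))[:top_k]
```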
Algorithm for pixel-level concrete pavement crack segmentation based on an improved U-Net model
Sci Rep. 2025 Feb 24;15(1):6553. doi: 10.1038/s41598-025-91352-x.
ABSTRACT
Cracks in concrete surfaces are numerous and diverse, and different types of cracks affect road safety to different degrees. Accurately identifying pavement cracks is crucial for assessing road conditions and formulating maintenance strategies. This study improves the original U-shaped convolutional network (U-Net) model through the introduction of two innovations, thereby modifying its structure, reducing the number of parameters, enhancing its ability to distinguish between background and cracks, and improving its speed and accuracy in crack detection tasks. Additionally, datasets with different exposure levels and noise conditions are used to train the network, broadening its predictive ability. A custom dataset of 960 road crack images was added to the public dataset to train and evaluate the model. The test results demonstrate that the proposed U-Net-FML model achieves high accuracy and detection speed in complex environments, with MIoU, F1 score, precision, and recall values of 76.4%, 74.2%, 84.2%, and 66.4%, respectively, significantly surpassing those of the other models. Among the seven comparison models, U-Net-FML has the strongest overall performance, highlighting its engineering value for precise detection and efficient analysis of cracks.
PMID:39994438 | DOI:10.1038/s41598-025-91352-x
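The pixel-level metrics quoted for U-Net-FML (IoU, F1, precision, recall) have standard definitions in terms of true/false positives and false negatives over the binary crack mask. A minimal numpy sketch of those formulas (not the authors' evaluation code):

```python
import numpy as np

def seg_metrics(pred, gt):
    """Pixel-level metrics for binary crack segmentation masks (1 = crack)."""
    tp = np.sum((pred == 1) & (gt == 1))  # crack pixels correctly found
    fp = np.sum((pred == 1) & (gt == 0))  # background labeled as crack
    fn = np.sum((pred == 0) & (gt == 1))  # crack pixels missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou = tp / (tp + fp + fn)             # intersection over union
    return precision, recall, f1, iou
```

MIoU as reported in such studies is typically the IoU averaged over the crack and background classes.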