Deep learning
Deep learning-based image classification reveals heterogeneous execution of cell death fates during viral infection
Mol Biol Cell. 2025 Jan 22:mbcE24100438. doi: 10.1091/mbc.E24-10-0438. Online ahead of print.
ABSTRACT
Cell fate decisions, such as proliferation, differentiation, and death, are driven by complex molecular interactions and signaling cascades. While significant progress has been made in understanding the molecular determinants of these processes, historically, cell fate transitions were identified through light microscopy that focused on changes in cell morphology and function. Modern techniques have shifted towards probing molecular effectors to quantify these transitions, offering more precise quantification and mechanistic understanding. However, challenges remain in cases where the molecular signals are ambiguous, complicating the assignment of cell fate. During viral infection, programmed cell death (PCD) pathways, including apoptosis, necroptosis, and pyroptosis, exhibit complex signaling and molecular crosstalk. This can lead to simultaneous activation of multiple PCD pathways, which confounds assignment of cell fate based on molecular information alone. To address this challenge, we employed deep learning-based image classification of dying cells to analyze PCD in single Herpes Simplex Virus-1 (HSV-1)-infected cells. Our approach reveals that despite heterogeneous activation of signaling, individual cells adopt predominantly prototypical death morphologies. Nevertheless, PCD is executed heterogeneously within a uniform population of virus-infected cells and varies over time. These findings demonstrate that image-based phenotyping can provide valuable insights into cell fate decisions, complementing molecular assays.
PMID:39841552 | DOI:10.1091/mbc.E24-10-0438
PBCS-ConvNeXt: Convolutional Network-Based Automatic Diagnosis of Non-alcoholic Fatty Liver in Abdominal Ultrasound Images
J Imaging Inform Med. 2025 Jan 22. doi: 10.1007/s10278-025-01394-w. Online ahead of print.
ABSTRACT
Non-alcoholic fatty liver disease (NAFLD) is a highly prevalent chronic liver condition characterized by excessive hepatic fat accumulation. Early diagnosis is crucial, as NAFLD can progress to more severe conditions such as steatohepatitis, fibrosis, cirrhosis, and hepatocellular carcinoma without timely intervention. While liver biopsy remains the gold standard for NAFLD assessment, abdominal ultrasound (US) imaging has emerged as a widely adopted non-invasive modality because of its convenience and low cost. However, subjective interpretation of US images is challenging and inconsistent. This study proposes a deep learning-based computer-aided diagnosis (CAD) model, termed potent boosts channel-aware separable intent-ConvNeXt (PBCS-ConvNeXt), for automated NAFLD classification using B-mode US images. The model architecture comprises three key components: the potent stem cell, an advanced trainable preprocessing module for robust feature extraction; enhanced ConvNeXt blocks that amplify channel-wise features to refine processing; and the boosting block that integrates multi-stage features for effective information extraction from US data. Using fatty liver gradings from attenuation imaging (ATI) as the ground truth, the PBCS-ConvNeXt model was evaluated with 5-fold cross-validation, achieving an accuracy of 82%, a sensitivity of 81%, and a specificity of 83% for identifying fatty liver on abdominal US. The proposed CAD system demonstrates high diagnostic performance in NAFLD classification from US images, enabling early detection and informing timely clinical management to prevent disease progression.
PMID:39841370 | DOI:10.1007/s10278-025-01394-w
CDCG-UNet: Chaotic Optimization Assisted Brain Tumor Segmentation Based on Dilated Channel Gate Attention U-Net Model
Neuroinformatics. 2025 Jan 22;23(2):12. doi: 10.1007/s12021-024-09701-6.
ABSTRACT
Brain tumours are one of the most deadly and prominent types of cancer, affecting both children and adults. Major obstacles in brain tumour identification are late diagnosis and the high cost of tumour-detection devices. Most existing approaches use ML algorithms to address these problems, but they suffer from low accuracy, high loss, and high computational cost. To address these challenges, a novel U-Net model for tumour segmentation in magnetic resonance images (MRI) is proposed. Initially, images are taken from the dataset and pre-processed with the Probabilistic Hybrid Wiener filter (PHWF) to remove unwanted noise and improve image quality. To reduce model complexity, the pre-processed images are passed to a feature extraction stage based on a 3D Convolutional Vision Transformer (3D-VT). Segmentation is then performed with the chaotic optimization assisted Dilated Channel Gate attention U-Net (CDCG-UNet) model to delineate brain tumour regions effectively. The proposed approach segments tumour sub-regions as whole tumour (WT), tumour core (TC), and enhancing tumour (ET). The loss function is optimized using the Chaotic Harris Shrinking Spiral optimization algorithm (CHSOA). The proposed CDCG-UNet model is evaluated on three datasets: BRATS 2021, BRATS 2020, and BRATS 2023. On the BRATS 2021 dataset, the CDCG-UNet model obtained Dice scores of 0.972 for ET, 0.987 for TC, and 0.98 for WT. On the BRATS 2020 dataset, it produced Dice scores of 98.87% for ET, 98.67% for TC, and 99.1% for WT. On the BRATS 2023 dataset, it yielded 98.42% for ET, 98.08% for TC, and 99.3% for WT.
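The Dice scores reported above are the standard overlap metric for comparing segmentation masks. Below is a minimal sketch of how a per-region Dice score is typically computed from binary masks with NumPy; the function and the label indices in the comment are illustrative assumptions, not taken from the CDCG-UNet implementation.

```python
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice similarity coefficient between two binary segmentation masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Illustrative use: one Dice value per tumour sub-region, assuming `pred_labels`
# and `gt_labels` are integer label maps and the region indices below are hypothetical.
# scores = {region: dice_score(pred_labels == idx, gt_labels == idx)
#           for region, idx in {"ET": 3, "TC": 1, "WT": 2}.items()}
```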
PMID:39841321 | DOI:10.1007/s12021-024-09701-6
CTCNet: a fine-grained classification network for fluorescence images of circulating tumor cells
Med Biol Eng Comput. 2025 Jan 22. doi: 10.1007/s11517-025-03297-y. Online ahead of print.
ABSTRACT
The identification and categorization of circulating tumor cells (CTCs) in peripheral blood are imperative for advancing cancer diagnostics and prognostics. The intricacy of the various CTC subtypes, coupled with the difficulty of developing exhaustive datasets, has impeded progress in this specialized domain. To date, no methods have been dedicated exclusively to overcoming the classification challenges of CTCs. To address this deficit, we have developed CTCDet, a large-scale dataset meticulously annotated based on the distinctive pathological characteristics of CTCs, aimed at advancing the application of deep learning techniques in oncological research. Furthermore, we introduce CTCNet, an innovative hybrid architecture that merges the capabilities of CNNs and Transformers to achieve precise classification of CTCs. This architecture features the Parallel Token mixer, which integrates local window self-attention with large-kernel depthwise convolution, enhancing the network's ability to model intricate channel and spatial relationships. Additionally, the Deformable Large Kernel Attention (DLKAttention) module leverages deformable convolution and large-kernel operations to adeptly delineate the nuanced features of CTCs, substantially boosting classification efficacy. Comprehensive evaluations on the CTCDet dataset validate the superior performance of CTCNet, confirming its ability to outperform other general-purpose methods in accurate cell classification. The generalizability of CTCNet has also been demonstrated across various datasets, confirming its robustness and applicability. Moreover, the proposed method is suited to clinical application and can assist in cancer diagnosis and treatment. Code and data are available at https://github.com/JasonWu404/CTCs_Classification.
PMID:39841310 | DOI:10.1007/s11517-025-03297-y
Enhanced accuracy and stability in automated intra-pancreatic fat deposition monitoring of type 2 diabetes mellitus using Dixon MRI and deep learning
Abdom Radiol (NY). 2025 Jan 22. doi: 10.1007/s00261-025-04804-3. Online ahead of print.
ABSTRACT
PURPOSE: Intra-pancreatic fat deposition (IPFD) is closely associated with the onset and progression of type 2 diabetes mellitus (T2DM). We aimed to develop an accurate and automated method for assessing IPFD on multi-echo Dixon MRI.
MATERIALS AND METHODS: In this retrospective study, 534 patients from two centers who underwent upper abdomen MRI and completed multi-echo and double-echo Dixon MRI were included. A pancreatic segmentation model was trained on double-echo Dixon water images using nnU-Net. Predicted masks were registered to the proton density fat fraction (PDFF) maps of the multi-echo Dixon sequence. Deep semantic segmentation feature-based radiomics (DSFR) and radiomics features were separately extracted on the PDFF maps and modeled using the support vector machine method with 5-fold cross-validation. The first deep learning radiomics (DLR) model was constructed to distinguish T2DM from non-diabetes and pre-diabetes by averaging the output scores of the DSFR and radiomics models. The second DLR model was then developed to distinguish pre-diabetes from non-diabetes. Two radiologist models were constructed based on the mean PDFF of three pancreatic regions of interest.
RESULTS: The mean Dice similarity coefficient for pancreas segmentation was 0.958 in the total test cohort. The AUCs of the DLR and two radiologist models in distinguishing T2DM from non-diabetes and pre-diabetes were 0.868, 0.760, and 0.782 in the training cohort, and 0.741, 0.724, and 0.653 in the external test cohort, respectively. For distinguishing pre-diabetes from non-diabetes, the AUCs were 0.881, 0.688, and 0.688 in the training cohort, which included data combined from both centers. External testing was not conducted because of the limited number of pre-diabetic patients. Intraclass correlation coefficients between radiologists' pancreatic PDFF measurements were 0.800 and 0.699 at the two centers, suggesting good and moderate reproducibility, respectively.
CONCLUSION: The DLR model demonstrated superior performance over radiologists, providing a more efficient, accurate and stable method for monitoring IPFD and predicting the risk of T2DM and pre-diabetes. This enables IPFD assessment to potentially serve as an early biomarker for T2DM, providing richer clinical information for disease progression and management.
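The Methods above describe building the DLR model by averaging the output scores of the DSFR and radiomics SVM models under 5-fold cross-validation. The snippet below is a minimal sketch of that score-averaging step with scikit-learn, assuming two precomputed feature matrices; the function name, pipeline choices, and hyperparameters are illustrative assumptions, not the authors' code.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

def fused_dlr_scores(X_dsfr, X_rad, y, n_splits=5, seed=0):
    """Average the probability outputs of two SVMs (DSFR and radiomics features)."""
    scores = np.zeros(len(y), dtype=float)
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, test_idx in cv.split(X_dsfr, y):
        fold_probs = []
        for X in (X_dsfr, X_rad):
            model = make_pipeline(StandardScaler(), SVC(probability=True))
            model.fit(X[train_idx], y[train_idx])
            fold_probs.append(model.predict_proba(X[test_idx])[:, 1])
        scores[test_idx] = np.mean(fold_probs, axis=0)  # fused DLR score
    return scores

# Illustrative evaluation, with X_dsfr, X_rad as NumPy feature matrices and y as labels:
# auc = roc_auc_score(y, fused_dlr_scores(X_dsfr, X_rad, y))
```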
PMID:39841227 | DOI:10.1007/s00261-025-04804-3
Impact of Scanner Manufacturer, Endorectal Coil Use, and Clinical Variables on Deep Learning-assisted Prostate Cancer Classification Using Multiparametric MRI
Radiol Artif Intell. 2025 Jan 22:e230555. doi: 10.1148/ryai.230555. Online ahead of print.
ABSTRACT
"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. Purpose To assess the impact of scanner manufacturer and scan protocol on the performance of deep learning models to classify prostate cancer (PCa) aggressiveness on biparametric MRI (bpMRI). Materials and Methods In this retrospective study, 5,478 cases from ProstateNet, a PCa bpMRI dataset with examinations from 13 centers, were used to develop five deep learning (DL) models to predict PCa aggressiveness with minimal lesion information and test how using data from different subgroups-scanner manufacturers and endorectal coil (ERC) use (Siemens, Philips, GE with and without ERC and the full dataset)-impacts model performance. Performance was assessed using the area under the receiver operating characteristic curve (AUC). The impact of clinical features (age, prostate-specific antigen level, Prostate Imaging Reporting and Data System [PI-RADS] score) on model performance was also evaluated. Results DL models were trained on 4,328 bpMRI cases, and the best model achieved AUC = 0.73 when trained and tested using data from all manufacturers. Hold-out test set performance was higher when models trained with data from a manufacturer were tested on the same manufacturer (within-and between-manufacturer AUC differences of 0.05 on average, P < .001). The addition of clinical features did not improve performance (P = .24). Learning curve analyses showed that performance remained stable as training data increased. Analysis of DL features showed that scanner manufacturer and scan protocol heavily influenced feature distributions. Conclusion In automated classification of PCa aggressiveness using bpMRI data, scanner manufacturer and endorectal coil use had a major impact on DL model performance and features. Published under a CC BY 4.0 license.
PMID:39841063 | DOI:10.1148/ryai.230555
Gait patterns in unstable older patients related with vestibular hypofunction. Preliminary results in assessment with time-frequency analysis
Acta Otolaryngol. 2025 Jan 22:1-6. doi: 10.1080/00016489.2025.2450221. Online ahead of print.
ABSTRACT
BACKGROUND: Gait instability and falls significantly affect quality of life, morbidity, and mortality in elderly populations. Early diagnosis of gait disorders is one of the most effective approaches to minimize severe injuries.
OBJECTIVE: To find a gait instability pattern in older adults through an image representation of data collected by a single sensor.
METHODS: A sample of 13 older adults (71-85 years old) with instability due to vestibular hypofunction is compared with a sample of 19 adults (21-75 years old) without instability and with normal vestibular function. Image representations of the gait signals acquired on a specific walk path were generated using a continuous wavelet transform and analyzed as textures, with grey-level co-occurrence matrix metrics used as features. A support vector machine (SVM) algorithm was used to discriminate between subjects.
RESULTS: Initial results show good classification performance. According to the analysis of extracted features, most of the information relevant to instability is concentrated in the medio-lateral acceleration (X axis) and the frontal-plane angular rotation (Z axis gyroscope). Performing ten-fold cross-validation on the first ten seconds of the sample dataset, the algorithm achieves an F1 score of 92.3%, corresponding to 12 true positives, 1 false positive, and 1 false negative.
DISCUSSION: This preliminary report suggests that the method has potential use in assessing gait disorders in controlled and non-controlled environments. It suggests that deep learning methods could be explored given the availability of a larger population and data samples.
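The reported F1 score follows directly from the stated confusion counts; a quick worked check (a sketch, with the counts taken from the Results above):

```python
# Worked check of the reported F1 score from the stated confusion counts
# (12 true positives, 1 false positive, 1 false negative).
tp, fp, fn = 12, 1, 1
precision = tp / (tp + fp)                         # 12/13 ≈ 0.923
recall = tp / (tp + fn)                            # 12/13 ≈ 0.923
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.3f}")                            # 0.923, i.e. the reported 92.3%
```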
PMID:39840938 | DOI:10.1080/00016489.2025.2450221
Gait Video-Based Prediction of Severity of Cerebellar Ataxia Using Deep Neural Networks
Mov Disord. 2025 Jan 22. doi: 10.1002/mds.30113. Online ahead of print.
ABSTRACT
BACKGROUND: Pose estimation algorithms applied to two-dimensional videos can evaluate gait disturbances; however, few studies have used this method to evaluate ataxic gait.
OBJECTIVE: The aim was to assess whether a pose estimation algorithm can predict the severity of cerebellar ataxia by applying it to gait videos.
METHODS: We video-recorded 66 patients with degenerative cerebellar diseases performing the timed up-and-go test. Key points from the gait videos extracted by a pose estimation algorithm were input into a deep learning model to predict the Scale for the Assessment and Rating of Ataxia (SARA) score. We also evaluated video segments that the model focused on to predict ataxia severity.
RESULTS: The model achieved a root-mean-square error of 2.30 and a coefficient of determination of 0.79 in predicting the SARA score. It primarily focused on standing, turning, and body sway to assess severity.
CONCLUSIONS: This study demonstrated that the model may capture gait characteristics from key-point data and has the potential to predict SARA scores. © 2025 International Parkinson and Movement Disorder Society.
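The root-mean-square error and coefficient of determination quoted above are standard regression metrics. Below is a minimal sketch of how they can be computed from predicted versus clinician-rated SARA scores, assuming scikit-learn is available; the function name and placeholder arrays are illustrative.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

def report_regression_metrics(y_true, y_pred):
    """RMSE and coefficient of determination for predicted vs. clinician SARA scores."""
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    r2 = r2_score(y_true, y_pred)
    return rmse, r2

# Illustrative call with hypothetical arrays of true and predicted SARA scores:
# rmse, r2 = report_regression_metrics(sara_true, sara_pred)
```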
PMID:39840857 | DOI:10.1002/mds.30113
AggNet: Advancing protein aggregation analysis through deep learning and protein language model
Protein Sci. 2025 Feb;34(2):e70031. doi: 10.1002/pro.70031.
ABSTRACT
Protein aggregation is critical to various biological and pathological processes and is also an important property in biotherapeutic development. However, experimental methods to profile protein aggregation are costly and labor-intensive, driving the need for more efficient computational alternatives. In this study, we introduce AggNet, a novel deep learning framework based on the protein language model ESM2 and AlphaFold2, which utilizes physicochemical, evolutionary, and structural information to discriminate amyloid from non-amyloid peptides and to identify aggregation-prone regions (APRs) in diverse proteins. Benchmark comparisons show that AggNet outperforms existing methods and achieves state-of-the-art performance on protein aggregation prediction. The predictive ability of AggNet is also stable across proteins with different secondary structures. Feature analysis and visualizations show that the model effectively captures the physicochemical properties of peptides, thereby offering enhanced interpretability. Further validation through a case study on MEDI1912 confirms AggNet's practical utility in analyzing protein aggregation and guiding mutations for aggregation mitigation. This study enhances computational tools for predicting protein aggregation and highlights the potential of AggNet in protein engineering. To improve accessibility, the source code is available at https://github.com/Hill-Wenka/AggNet.
PMID:39840791 | DOI:10.1002/pro.70031
MfGNN: Multi-Scale Feature-Attentive Graph Neural Networks for Molecular Property Prediction
J Comput Chem. 2025 Jan 30;46(3):e70011. doi: 10.1002/jcc.70011.
ABSTRACT
In the realm of artificial intelligence-driven drug discovery (AIDD), accurately predicting the influence of molecular structures on their properties is a critical research focus. While deep learning models based on graph neural networks (GNNs) have made significant advancements in this area, prior studies have primarily concentrated on molecule-level representations, often neglecting the impact of functional group structures and the potential relationships between fragments on molecular property predictions. To address this gap, we introduce the multi-scale feature attention graph neural network (MfGNN), which enhances traditional atom-based molecular graph representations by incorporating fragment-level representations derived from chemically synthesizable BRICS fragments. MfGNN not only effectively captures both the structural information of molecules and the features of functional groups but also pays special attention to the potential relationships between fragments, exploring how they collectively influence molecular properties. This model integrates two core mechanisms: a graph attention mechanism that captures embeddings of molecules and functional groups, and a feature extraction module that systematically processes BRICS fragment-level features to uncover relationships among the fragments. Our comprehensive experiments demonstrate that MfGNN outperforms leading machine learning and deep learning models, achieving state-of-the-art performance in 8 out of 11 learning tasks across various domains, including physical chemistry, biophysics, physiology, and toxicology. Furthermore, ablation studies reveal that the integration of multi-scale feature information and the feature extraction module enhances the richness of molecular features, thereby improving the model's predictive capabilities.
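MfGNN builds its fragment-level representations from chemically synthesizable BRICS fragments. A minimal sketch of obtaining such fragments with RDKit is shown below, assuming RDKit is installed; this is illustrative background on BRICS decomposition, not the authors' pipeline.

```python
from rdkit import Chem
from rdkit.Chem import BRICS

def brics_fragments(smiles: str):
    """Return the BRICS fragments (as SMILES strings) of a molecule."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")
    return sorted(BRICS.BRICSDecompose(mol))

# Example: aspirin decomposes into a small set of synthesizable fragments.
print(brics_fragments("CC(=O)Oc1ccccc1C(=O)O"))
```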
PMID:39840745 | DOI:10.1002/jcc.70011
Utilizing deep learning for automatic segmentation of the cochleae in temporal bone computed tomography
Acta Radiol. 2025 Jan 22:2841851241307333. doi: 10.1177/02841851241307333. Online ahead of print.
ABSTRACT
BACKGROUND: Segmentation of the cochlea in temporal bone computed tomography (CT) is the basis for image-guided otologic surgery. Manual segmentation is time-consuming and laborious.
PURPOSE: To assess the utility of deep learning analysis in automatic segmentation of the cochleae in temporal bone CT to differentiate abnormal images from normal images.
MATERIAL AND METHODS: Three models (3D U-Net, UNETR, and SegResNet) were trained to segment the cochlea on two CT datasets (two CT types: GE 64 and GE 256). One dataset included 77 normal samples, and the other included 154 samples (77 normal and 77 abnormal). A total of 20 samples that contained normal and abnormal cochleae in three CT types (GE 64, GE 256, and SE-DS) were tested on the three models. The Dice similarity coefficient (DSC) and Hausdorff distance (HD) were used to assess the models.
RESULTS: The segmentation performance of the three models improved after adding abnormal cochlear images for training. SegResNet achieved the best performance: its average DSC on the test set was 0.94 and its HD was 0.16 mm, a performance higher than that of the 3D U-Net and UNETR models. The DSCs obtained on the GE 256 CT, SE-DS CT, and GE 64 CT scans were 0.95, 0.94, and 0.93, respectively, and the HDs were 0.15, 0.18, and 0.12 mm, respectively.
CONCLUSION: The SegResNet model is feasible and accurate for automated cochlear segmentation of temporal bone CT images.
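The DSC and HD quoted above are the standard overlap and boundary-distance metrics for segmentation. Below is a minimal sketch of the symmetric Hausdorff distance using SciPy, assuming the predicted and reference cochlear surfaces are given as point sets already scaled to millimetres; the function name and usage are illustrative.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def symmetric_hausdorff(points_a: np.ndarray, points_b: np.ndarray) -> float:
    """Symmetric Hausdorff distance between two point sets (e.g. surface voxels, in mm)."""
    d_ab = directed_hausdorff(points_a, points_b)[0]
    d_ba = directed_hausdorff(points_b, points_a)[0]
    return max(d_ab, d_ba)

# Illustrative use: points_a / points_b would be N x 3 arrays of cochlear surface
# coordinates (scaled by voxel spacing) from the prediction and the ground truth.
```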
PMID:39840644 | DOI:10.1177/02841851241307333
Artificial Intelligence in Detecting and Segmenting Vertical Misfit of Prosthesis in Radiographic Images of Dental Implants: A Cross-Sectional Analysis
Clin Oral Implants Res. 2025 Jan 22. doi: 10.1111/clr.14406. Online ahead of print.
ABSTRACT
OBJECTIVE: This study evaluated ResNet-50 and U-Net models for detecting and segmenting vertical misfit in dental implant crowns using periapical radiographic images.
METHODS: Periapical radiographs of dental implant crowns were classified by two experts based on the presence of vertical misfit (reference group). The misfit area was manually annotated in images exhibiting vertical misfit. The resulting datasets were utilized to train the ResNet-50 and U-Net deep learning models. Then, 70% of the images were allocated for training, while the remaining 30% were used for validation and testing. Five general dentists categorized the testing images as "misfit" or "fit." Inter-rater reliability with Cohen's kappa index and performance metrics were calculated. The average performance metrics of dentists and artificial intelligence (AI) were compared using the paired-samples t test.
RESULTS: A total of 638 radiographs were collected. The kappa values between dentists and AI ranged from 0.93 to 0.98, indicating almost perfect agreement. The ResNet-50 model achieved an accuracy of 92.7% and a precision of 87.5%, whereas dentists had a mean accuracy of 93.3% and a precision of 89.6%. The sensitivity and specificity for AI were 90.3% and 93.8%, respectively, compared with 90.1% and 95.1% for dentists. The Dice coefficient was 88.9% for the ResNet-50 model and 89.5% among the dentists. The U-Net algorithm produced a loss of 0.01 and an accuracy of 0.98. No significant difference was found between the average performance metrics of dentists and AI (p > 0.05).
CONCLUSION: AI can detect and segment vertical misfit of implant prosthetic crowns in periapical radiographs, comparable to clinician performance.
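The kappa values above summarize inter-rater agreement between the dentists and the AI model; on the Landis-Koch scale, 0.81-1.00 is conventionally read as almost perfect agreement. A minimal sketch of computing Cohen's kappa with scikit-learn follows, using short hypothetical fit/misfit label lists (not the study's data).

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical binary ratings (1 = misfit, 0 = fit) for the same radiographs.
dentist_labels = [1, 0, 1, 1, 0, 0, 1, 0]
ai_labels      = [1, 0, 1, 1, 0, 0, 0, 0]

kappa = cohen_kappa_score(dentist_labels, ai_labels)
print(f"Cohen's kappa = {kappa:.2f}")  # 0.75 for these illustrative labels
```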
PMID:39840554 | DOI:10.1111/clr.14406
Improved Efficacy of Triple-Negative Breast Cancer Immunotherapy via Hydrogel-Based Co-Delivery of CAR-T Cells and Mitophagy Agonist
Adv Sci (Weinh). 2025 Jan 22:e2409835. doi: 10.1002/advs.202409835. Online ahead of print.
ABSTRACT
Leaky, structurally abnormal blood vessels and increased pressure in the tumor interstitium reduce the infiltration of CAR-T cells in solid tumors, including triple-negative breast cancer (TNBC). Furthermore, a high burden of tumor cells may reduce the number of infiltrating CAR-T cells and cause their functional exhaustion. In this study, experiments at various effector-to-target (E:T) ratios are established to model CAR-T cell treatment of leukemia (high E:T ratio) and solid tumors (low E:T ratio). The antitumor immune response is found to be decreased in solid tumors with a low E:T ratio. Furthermore, single-cell sequencing is performed to investigate the functional exhaustion at a low ratio, revealing that inhibition of mitophagy-mediated mitochondrial dysfunction diminished the antitumor efficacy of CAR-T-cell therapy. The mitophagy agonist BC1618 is screened via AI-based deep learning and cytokine detection; in vivo and in vitro studies revealed that BC1618 significantly strengthened the antitumor response of CAR-T cells by improving mitophagy. Here, injectable hydrogels are engineered for the controlled co-delivery of CAR-T cells and BC1618 to improve the treatment of TNBC. Local delivery of the hydrogels creates an inflammatory, mitophagy-enhanced microenvironment at the tumor site, which stimulates CAR-T cell proliferation, sustains antitumor activity, and improves treatment efficacy.
PMID:39840546 | DOI:10.1002/advs.202409835
YOLOv7-DWS: tea bud recognition and detection network in multi-density environment via improved YOLOv7
Front Plant Sci. 2025 Jan 7;15:1503033. doi: 10.3389/fpls.2024.1503033. eCollection 2024.
ABSTRACT
INTRODUCTION: Accurate detection and recognition of tea bud images can drive advances in intelligent harvesting machinery for tea gardens and in technology for managing tea bud pests and diseases. This work aims to realize the recognition and grading of tea buds in a complex, multi-density tea garden environment.
METHODS: This paper proposes an improved YOLOv7 object detection algorithm, called YOLOv7-DWS, which focuses on improving the accuracy of tea bud recognition. First, we make a series of improvements to the YOLOv7 algorithm, including a decoupled head that replaces the original YOLOv7 head, to enhance the feature extraction ability of the model and optimize the class decision logic; this addresses the simultaneous detection and classification of one-bud-one-leaf and one-bud-two-leaves tea buds. Second, the WiseIoU loss function replaces the original YOLOv7 loss function, improving the accuracy of the model. Finally, we evaluate different attention mechanisms to enhance the model's focus on key features.
RESULTS AND DISCUSSION: The experimental results show that the improved YOLOv7 algorithm improves significantly over the original algorithm in all evaluation metrics, especially in R_Tea (+6.2%) and mAP@0.5 (+7.7%). These results suggest that the algorithm offers a new perspective and new possibilities for tea image recognition.
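WiseIoU belongs to the family of IoU-based bounding-box losses. As background only, here is a minimal sketch of the plain IoU between two axis-aligned boxes, the quantity such losses build on; the coordinate convention (x1, y1, x2, y2) and the example values are illustrative assumptions.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(box_iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```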
PMID:39840356 | PMC:PMC11747160 | DOI:10.3389/fpls.2024.1503033
A customized convolutional neural network-based approach for weeds identification in cotton crops
Front Plant Sci. 2025 Jan 8;15:1435301. doi: 10.3389/fpls.2024.1435301. eCollection 2024.
ABSTRACT
Smart farming is a hot research area for experts globally to meet the soaring demand for food. Automated approaches based on convolutional neural networks (CNNs) for crop disease identification, weed classification, and monitoring have substantially helped increase crop yields. Plant diseases and pests pose a significant danger to plant health, causing reductions in crop production. The cotton crop is a major cash crop in Asian and African countries and is affected by different types of weeds, leading to reduced yield. Weed infestation starts with the germination of the crop, and diseases subsequently invade the field. Therefore, proper monitoring of the cotton crop throughout all phases of development, from sowing to ripening and reaping, is extremely important for identifying harmful and undesired weeds in a timely and efficient manner so that proper measures can be taken to eradicate them. Most weeds and pests attack cotton plants at different stages of growth. Therefore, timely identification and classification of such weeds by virtue of their symptoms, apparent similarities, and effects can reduce the risk of yield loss. Weed and pest infestation can be controlled through advanced digital devices such as sensors and cameras, which can provide bulk data to work with. Yet efficient management of such large volumes of agricultural data is a cardinal challenge, even for deep learning techniques. In this study, an approach based on a deep CNN architecture is presented. This work covers identifying and classifying cotton weeds efficiently, alongside a comparison with existing CNN models such as VGG-16, ResNet, DenseNet, and Xception. Experimental results show accuracies of 95.4%, 97.1%, 96.9%, and 96.1% for VGG-16, ResNet-101, DenseNet-121, and Xception, respectively. The proposed model achieved an accuracy of 98.3%, outperforming the other models.
PMID:39840351 | PMC:PMC11750437 | DOI:10.3389/fpls.2024.1435301
Recent advances in deep learning and language models for studying the microbiome
Front Genet. 2025 Jan 7;15:1494474. doi: 10.3389/fgene.2024.1494474. eCollection 2024.
ABSTRACT
Recent advancements in deep learning, particularly large language models (LLMs), have made a significant impact on how researchers study microbiome and metagenomics data. Microbial protein and genomic sequences, like natural languages, form a language of life, enabling the adoption of LLMs to extract useful insights from complex microbial ecologies. In this paper, we review applications of deep learning and language models in analyzing microbiome and metagenomics data. We focus on problem formulations, necessary datasets, and the integration of language modeling techniques. We provide an extensive overview of protein and genomic language modeling and their contributions to microbiome studies. We also discuss applications such as novel viromics language modeling, biosynthetic gene cluster prediction, and knowledge integration for metagenomics studies.
PMID:39840283 | PMC:PMC11747409 | DOI:10.3389/fgene.2024.1494474
KalmanFormer: using transformer to model the Kalman Gain in Kalman Filters
Front Neurorobot. 2025 Jan 7;18:1460255. doi: 10.3389/fnbot.2024.1460255. eCollection 2024.
ABSTRACT
INTRODUCTION: Tracking the hidden states of dynamic systems is a fundamental task in signal processing. Recursive Kalman Filters (KF) are widely regarded as an efficient solution for linear and Gaussian systems, offering low computational complexity. However, real-world applications often involve non-linear dynamics, making it challenging for traditional Kalman Filters to achieve accurate state estimation. Additionally, accurately modeling system dynamics and noise in practical scenarios is often difficult. To address these limitations, we propose the KalmanFormer, a hybrid model-driven and data-driven state estimator. By leveraging data, the KalmanFormer improves state estimation performance under non-linear conditions and partial-information scenarios.
METHODS: The proposed KalmanFormer integrates classical Kalman Filter with a Transformer framework. Specifically, it utilizes the Transformer to learn the Kalman Gain directly from data without requiring prior knowledge of noise parameters. The learned Kalman Gain is then incorporated into the standard Kalman Filter workflow, enabling the system to better handle non-linearities and model mismatches. The hybrid approach combines the strengths of data-driven learning and model-driven methodologies to achieve robust state estimation.
RESULTS AND DISCUSSION: To evaluate the effectiveness of the KalmanFormer, we conducted numerical experiments on both synthetic and real-world datasets. The results demonstrate that the KalmanFormer outperforms the classical Extended Kalman Filter (EKF) under the same settings, achieving superior accuracy in tracking hidden states and demonstrating resilience to non-linearities and imprecise system models.
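The Methods above describe replacing the analytically computed Kalman gain with one predicted from data, then running the otherwise standard predict/update recursion. Below is a minimal sketch of one such hybrid step, assuming a linear(ized) state model and a callable `gain_model` standing in for the Transformer; the function and its arguments are illustrative, not the authors' implementation.

```python
import numpy as np

def hybrid_kf_step(x, F, H, z, gain_model, history):
    """One predict/update step where the Kalman gain comes from a learned model.

    x: current state estimate, F: state-transition matrix, H: observation matrix,
    z: new measurement, gain_model: callable returning a gain matrix K from the
    recent innovation history (standing in for the Transformer), history: list of
    past innovations that the learned model conditions on.
    """
    # Predict step (no explicit covariance propagation: the learned gain replaces it).
    x_pred = F @ x
    # Innovation (measurement residual).
    innovation = z - H @ x_pred
    history.append(innovation)
    # Learned Kalman gain instead of the analytic K = P H^T (H P H^T + R)^-1.
    K = gain_model(history)
    # Update step, identical in form to the classical filter.
    return x_pred + K @ innovation

# Illustrative use with a dummy constant-gain "model" in place of the Transformer:
# x = hybrid_kf_step(x, F, H, z, gain_model=lambda h: 0.5 * np.eye(len(x)) @ H.T, history=[])
```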
PMID:39840232 | PMC:PMC11747084 | DOI:10.3389/fnbot.2024.1460255
Mid-infrared spectra of dried and roasted cocoa (Theobroma cacao L.): A dataset for machine learning-based classification of cocoa varieties and prediction of theobromine and caffeine content
Data Brief. 2024 Dec 19;58:111243. doi: 10.1016/j.dib.2024.111243. eCollection 2025 Feb.
ABSTRACT
This paper presents a comprehensive dataset of mid-infrared spectra for dried and roasted cocoa beans (Theobroma cacao L.), along with their corresponding theobromine and caffeine content. Infrared data were acquired using Attenuated Total Reflectance-Fourier Transform Infrared (ATR-FTIR) spectroscopy, while High-Performance Liquid Chromatography (HPLC) was employed to accurately quantify theobromine and caffeine in the dried cocoa beans. The theobromine/caffeine relationship served as a robust chemical marker for distinguishing between different cocoa varieties. This dataset provides a basis for further research, enabling the integration of mid-infrared spectral data with HPLC (as a standard) to fine-tune machine learning and deep learning models that could be used to simultaneously predict the theobromine and caffeine content, as well as cocoa variety in both dried and roasted cocoa samples using a non-destructive approach based on spectral data. The tools developed from this dataset could significantly advance automated processes in the cocoa industry and support decision-making on an industrial scale, facilitating real-time quality control of cocoa-based products, improving cocoa variety classification, and optimizing bean selection, blending strategies, and product formulation, while reducing the need for labor-intensive and costly quantification methods. The dataset is organized into Excel sheets and structured according to experimental conditions and replicates, providing a valuable framework for further analysis, model development, and calibration of multivariate statistical models.
PMID:39840227 | PMC:PMC11748727 | DOI:10.1016/j.dib.2024.111243
Role of Artificial Intelligence in MRI-Based Rectal Cancer Staging: A Systematic Review
Cureus. 2024 Dec 22;16(12):e76185. doi: 10.7759/cureus.76185. eCollection 2024 Dec.
ABSTRACT
Several studies have explored the application of artificial intelligence (AI) in magnetic resonance imaging (MRI)-based rectal cancer (RC) staging, but a comprehensive evaluation remains lacking. This systematic review aims to review the performance of AI models in MRI-based RC staging. PubMed and Embase were searched from the inception of each database until October 2024 without any language or year restrictions. Prospective or retrospective studies evaluating AI models (including machine learning (ML) and deep learning (DL)) for diagnostic performance in MRI-based RC staging against any comparator were included in this review. Performance metrics were considered as outcomes. Two independent reviewers carried out study selection and data extraction to limit bias; any disagreements were resolved through mutual consensus or discussion with a third reviewer. A total of 716 records were identified from the databases, of which 14 studies (1.95%) were included in this review. These studies were published between 2019 and 2024. The studies adopted various MRI technologies and developed multiple AI models, with DL being the most common. The MRI images, including T1-weighted images (14.28%), T2-weighted images (85.71%), diffusion-weighted images (42.85%), or combinations of these from different landscapes and systems, were used to develop the AI models. The models were built using various techniques, mainly DL, such as convolutional neural networks (28.57%), DL reconstruction (14.28%), the Weakly supervISed model DevelOpment fraMework (7.12%), deep neural networks (7.12%), Faster region-based CNN (7.12%), ResNet, a DL-based clinical-radiomics nomogram (7.12%), LASSO (7.12%), and a random forest classifier (7.12%). All models, whether using single-type images or combined imaging modalities, performed better than manual assessment in terms of accuracy, sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, and area under the curve, with scores of >0.75, which is considered good performance. The current review indicates that MRI-based AI models for RC staging show great promise, with high performance.
PMID:39840208 | PMC:PMC11748814 | DOI:10.7759/cureus.76185
Optical coherence tomography-enabled classification of the human venoatrial junction
J Biomed Opt. 2025 Jan;30(1):016005. doi: 10.1117/1.JBO.30.1.016005. Epub 2025 Jan 21.
ABSTRACT
SIGNIFICANCE: Radiofrequency ablation to treat atrial fibrillation (AF) involves isolating the pulmonary vein from the left atrium to prevent AF from occurring. However, creating ablation lesions within the pulmonary veins can cause adverse complications.
AIM: We propose automated classification algorithms to classify optical coherence tomography (OCT) volumes of human venoatrial junctions.
APPROACH: A dataset of comprehensive OCT volumes of 26 venoatrial junctions was used for this study. Texture, statistical, and optical features were extracted from OCT patches. Patches were classified as left atrium or pulmonary vein using random forest (RF), logistic regression (LR), and convolutional neural networks (CNNs). The features were input into the RF and LR classifiers. The inputs to the CNNs included: (1) patches and (2) an ensemble of patches and patch-derived features.
RESULTS: Using sevenfold cross-validation, the patch-only CNN best balances sensitivity and specificity, with an area under the receiver operating characteristic (AUROC) curve of 0.84 ± 0.109 across the test sets. RF is more sensitive than LR, with an AUROC of 0.78 ± 0.102.
CONCLUSIONS: Cardiac tissues can be identified in benchtop OCT images by automated analysis. Extending this analysis to data obtained in vivo is required to tune automated analysis further. Performing this classification in vivo could aid doctors in identifying substrates of interest and treating AF.
PMID:39840147 | PMC:PMC11747903 | DOI:10.1117/1.JBO.30.1.016005