Deep learning
Multimodal data integration using machine learning to predict the risk of clear cell renal cancer metastasis: a retrospective multicentre study
Abdom Radiol (NY). 2024 Jun 15. doi: 10.1007/s00261-024-04418-1. Online ahead of print.
ABSTRACT
PURPOSE: To develop and validate a predictive combined model for metastasis in patients with clear cell renal cell carcinoma (ccRCC) by integrating multimodal data.
MATERIALS AND METHODS: In this retrospective study, the clinical and imaging data (CT and ultrasound) of patients with pathologically confirmed ccRCC from three tertiary hospitals in different regions were collected from January 2013 to January 2023. We developed three models: a clinical model, a radiomics model, and a combined model. The performance of the models was evaluated based on their discriminative power and clinical utility. The evaluation indicators included the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, negative predictive value, positive predictive value, and decision curve analysis (DCA).
RESULTS: A total of 251 patients were evaluated. Patients (n = 166) from Shandong University Qilu Hospital (Jinan) formed the training cohort, of whom 50 developed metastases; patients (n = 37) from Shandong University Qilu Hospital (Qingdao) formed the internal testing cohort, of whom 15 developed metastases; and patients (n = 48) from Changzhou Second People's Hospital formed the external testing cohort, of whom 13 developed metastases. In the training set, the combined model showed the highest performance in predicting lymph node metastasis (LNM) (AUC, 0.924), while the clinical and radiomics models had AUCs of 0.845 and 0.870, respectively. In the internal testing cohort, the combined model again had the highest performance (AUC, 0.877), while the AUCs of the clinical and radiomics models were 0.726 and 0.836, respectively. In the external testing cohort, the combined model had the highest performance (AUC, 0.849), while the AUCs of the clinical and radiomics models were 0.708 and 0.804, respectively. Decision curve analysis showed that the combined model provided greater clinical utility for predicting the risk of LNM in ccRCC patients than either the clinical or the radiomics model.
CONCLUSION: The combined model was superior to the clinical and radiomics models in predicting LNM in ccRCC patients.
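To make the evaluation concrete, below is a minimal sketch of how clinical and radiomics feature tables could be combined and compared by AUC. The feature names, file paths, and the choice of logistic regression are illustrative assumptions; the paper does not specify its modelling algorithm here.

```python
# Hypothetical sketch: combine clinical and radiomics feature tables and compare AUCs.
# Column names, file names, and the logistic-regression classifier are assumptions.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

clinical = pd.read_csv("clinical_features.csv")    # e.g. age, stage, tumor size
radiomics = pd.read_csv("radiomics_features.csv")  # e.g. texture/shape features from CT
labels = pd.read_csv("labels.csv")["lnm"]          # 1 = lymph node metastasis

def fit_and_auc(X, y):
    """Fit a standardized logistic regression and return the training-set AUC."""
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X, y)
    return roc_auc_score(y, model.predict_proba(X)[:, 1])

print("clinical AUC:", fit_and_auc(clinical, labels))
print("radiomics AUC:", fit_and_auc(radiomics, labels))
print("combined AUC:", fit_and_auc(pd.concat([clinical, radiomics], axis=1), labels))
```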
PMID:38879708 | DOI:10.1007/s00261-024-04418-1
A deep-learning-based model for assessment of autoimmune hepatitis from histology: AI(H)
Virchows Arch. 2024 Jun 15. doi: 10.1007/s00428-024-03841-5. Online ahead of print.
ABSTRACT
Histological assessment of autoimmune hepatitis (AIH) is challenging. Partly as a result of these challenges, nonclassical features such as bile duct injury remain understudied in AIH. We aim to develop a deep learning tool (artificial intelligence for autoimmune hepatitis [AI(H)]) that analyzes liver biopsies and provides reproducible, quantifiable, and interpretable results directly from routine pathology slides. A total of 123 pre-treatment liver biopsy whole-slide images with a confirmed AIH diagnosis, drawn from the archives of the Institute of Pathology at University Hospital Basel, were used to train several convolutional neural network models in the Aiforia artificial intelligence (AI) platform. The performance of the AI models was evaluated on independent test-set slides against pathologists' manual annotations. The AI models were 99.4%, 88.0%, 83.9%, 81.7%, and 79.2% accurate (ratios of correct predictions) for tissue detection, liver microanatomy, necroinflammation features, bile duct damage detection, and portal inflammation detection, respectively, on hematoxylin and eosin-stained slides. Additionally, the immune cell model could detect and classify different immune cells (lymphocytes, plasma cells, macrophages, eosinophils, and neutrophils) with 72.4% accuracy. On Sirius red-stained slides, the test accuracies were 99.4%, 94.0%, and 87.6% for tissue detection, liver microanatomy, and fibrosis detection, respectively. Additionally, AI(H) detected bile duct injury in 81 AIH cases (68.6%). The AI models were found to be accurate and efficient in predicting various morphological components of AIH biopsies. The computational analysis of biopsy slides provides detailed spatial and density data on immune cells in the AIH landscape, which are difficult to obtain by manual counting. AI(H) can aid in improving the reproducibility of AIH biopsy assessment and bring new descriptive and quantitative aspects to AIH histology.
PMID:38879691 | DOI:10.1007/s00428-024-03841-5
Traffic planning in modern large cities Paris and Istanbul
Sci Rep. 2024 Jun 15;14(1):13829. doi: 10.1038/s41598-024-64483-w.
ABSTRACT
The enhancement of flexibility, energy efficiency, and environmental friendliness constitutes a widely acknowledged trend in the development of urban infrastructure. The proliferation of various types of transportation vehicles exacerbates the complexity of traffic regulation. Intelligent transportation systems, leveraging real-time traffic status prediction technologies such as velocity estimation, emerge as viable solutions for the effective management and control of urban road networks. The objective of this project is to address the complex task of improving the accuracy of large-scale traffic condition prediction using deep learning techniques. To accomplish the objective of the study, historical traffic data of Paris and Istanbul within a certain timeframe were used, considering the impact of variables such as speed, traffic volume, and direction. Specifically, traffic movie clips based on 2 years of real-world data for the two cities were utilized. The movies were generated from HERE data derived from over 100 billion GPS (Global Positioning System) probe points collected from a substantial fleet of automobiles. Unlike most previous models, the model presented here takes into account the cumulative impact of speed, flow, and direction. The developed model showed better results than well-known models, in particular the SR-ResNet model. The pixel-wise MAE (mean absolute error) values for Paris and Istanbul were 4.299 and 3.884, respectively, compared to 4.551 and 3.993 for SR-ResNet. Thus, the created model demonstrated the possibility of further enhancing the accuracy and efficacy of intelligent transportation systems, particularly in large urban centres, thereby facilitating heightened safety, energy efficiency, and convenience for road users. The obtained results will be useful for local policymakers responsible for infrastructure development planning, as well as for specialists and researchers in the field. Future research should investigate how to incorporate additional sources of information into the deep learning framework, in particular prior information from physical traffic flow models and information about weather conditions, as well as further increasing throughput capacity and reducing processing time.
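For reference, the pixel-wise MAE metric reported above can be computed as follows; the frame dimensions and channel layout below are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of pixel-wise MAE between predicted and observed traffic "movie" frames.
import numpy as np

def pixelwise_mae(pred, target):
    """Mean absolute error averaged over all frames, pixels, and channels."""
    pred = np.asarray(pred, dtype=np.float32)
    target = np.asarray(target, dtype=np.float32)
    return np.abs(pred - target).mean()

# e.g. 12 future frames of a city grid with speed/volume/direction channels (shapes assumed)
pred = np.random.rand(12, 495, 436, 3) * 255
target = np.random.rand(12, 495, 436, 3) * 255
print(pixelwise_mae(pred, target))
```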
PMID:38879639 | DOI:10.1038/s41598-024-64483-w
Deep cell phenotyping and spatial analysis of multiplexed imaging with TRACERx-PHLEX
Nat Commun. 2024 Jun 15;15(1):5135. doi: 10.1038/s41467-024-48870-5.
ABSTRACT
The growing scale and dimensionality of multiplexed imaging require reproducible and comprehensive yet user-friendly computational pipelines. TRACERx-PHLEX performs deep learning-based cell segmentation (deep-imcyto), automated cell-type annotation (TYPEx) and interpretable spatial analysis (Spatial-PHLEX) as three independent but interoperable modules. PHLEX generates single-cell identities, cell densities within tissue compartments, marker positivity calls and spatial metrics such as cellular barrier scores, along with summary graphs and spatial visualisations. PHLEX was developed using imaging mass cytometry (IMC) in the TRACERx study, validated using published Co-detection by indexing (CODEX), IMC and orthogonal data and benchmarked against state-of-the-art approaches. We evaluated its use on different tissue types, tissue fixation conditions, image sizes and antibody panels. As PHLEX is an automated and containerised Nextflow pipeline, manual assessment, programming skills or pathology expertise are not essential. PHLEX offers an end-to-end solution in a growing field of highly multiplexed data and provides clinically relevant insights.
PMID:38879602 | DOI:10.1038/s41467-024-48870-5
Microscopy Image Dataset for Deep Learning-Based Quantitative Assessment of Pulmonary Vascular Changes
Sci Data. 2024 Jun 15;11(1):635. doi: 10.1038/s41597-024-03473-z.
ABSTRACT
Pulmonary hypertension (PH) is a syndrome complex that accompanies a number of diseases of different etiologies, is associated with basic mechanisms of structural and functional changes of the pulmonary circulation vessels, and manifests as increased pressure in the pulmonary artery. The structural changes in the pulmonary circulation vessels are the main limiting factor determining the prognosis of patients with PH. Thickening and irreversible deposition of collagen in the walls of the pulmonary artery branches lead to rapid disease progression and decreased therapy effectiveness. In this regard, histological examination of the pulmonary circulation vessels is critical both in preclinical studies and in clinical practice. However, measurement of quantitative parameters such as the average vessel outer diameter, the vessel wall area, and the hypertrophy index requires a significant time investment and specialist training to analyze micrographs. A dataset of pulmonary circulation vessels for pathology assessment using deep learning-based semantic segmentation techniques is presented in this work. It comprises 609 original microphotographs of vessels, numerical data from experts' measurements, and microphotographs with outlines of these measurements for each of the vessels. Furthermore, we provide an example of a deep learning pipeline using the U-Net semantic segmentation model to extract vascular regions. The presented database will be useful for the development of new software solutions for the analysis of histological micrographs.
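As a companion to the U-Net pipeline mentioned above, here is a minimal training sketch assuming the segmentation_models_pytorch library; the backbone, loss, and hyperparameters are placeholders and not the settings used for the published dataset.

```python
# Sketch of a U-Net setup for segmenting vascular regions in micrographs (assumptions noted).
import torch
import segmentation_models_pytorch as smp

model = smp.Unet(
    encoder_name="resnet34",        # backbone choice is an assumption
    encoder_weights="imagenet",     # transfer learning from ImageNet
    in_channels=3,                  # RGB micrographs
    classes=1,                      # binary mask: vessel vs. background
)

loss_fn = smp.losses.DiceLoss(mode="binary")
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images, masks):
    """One optimization step on a batch of (B, 3, H, W) images and (B, 1, H, W) masks."""
    model.train()
    optimizer.zero_grad()
    logits = model(images)
    loss = loss_fn(logits, masks)
    loss.backward()
    optimizer.step()
    return loss.item()
```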
PMID:38879569 | DOI:10.1038/s41597-024-03473-z
Deep learning classification and quantification of pejorative and non-pejorative architectures in resected hepatocellular carcinoma from digital histopathological images
Am J Pathol. 2024 Jun 13:S0002-9440(24)00206-2. doi: 10.1016/j.ajpath.2024.05.007. Online ahead of print.
ABSTRACT
Liver resection is one of the best treatments for small hepatocellular carcinoma, but post-resection recurrence is frequent. Biotherapies have emerged as an efficient adjuvant treatment, making the identification of patients at high risk of recurrence critical. Microvascular invasion, poor differentiation, pejorative macrotrabecular architecture, and "vessels encapsulating tumor clusters" architecture are the most accurate histological predictors of recurrence, but their evaluation is time-consuming and imperfect. A supervised deep learning approach with ResNet34 on 680 whole-slide images from 107 liver resection specimens was used to build an algorithm for the identification and quantification of these pejorative architectures. This model achieved an accuracy of 0.864 at the patch level and 0.823 at the whole-slide-image level. To assess its robustness, it was validated on an external cohort of 29 hepatocellular carcinomas from another hospital, reaching an accuracy of 0.787 at the whole-slide-image level and confirming its generalization capability. Moreover, the largest connected areas of the pejorative architectures extracted from the model correlated positively with the presence of microvascular invasion and the number of tumor emboli. These results suggest that the identification of pejorative architectures could be an efficient surrogate of microvascular invasion and have strong predictive value for the risk of recurrence. This study is the first step in the construction of a composite predictive algorithm for early post-resection recurrence of hepatocellular carcinoma that includes artificial intelligence-based features.
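The following is a hedged sketch of a ResNet34 patch classifier with a simple mean-probability aggregation to a slide-level prediction; the class list and the aggregation rule are illustrative assumptions and may differ from the published model.

```python
# Illustrative ResNet34 patch classifier and slide-level aggregation (assumed classes/rule).
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 3  # e.g. macrotrabecular, VETC-like, non-pejorative (placeholder labels)

model = models.resnet34(weights=models.ResNet34_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

@torch.no_grad()
def slide_level_prediction(patches):
    """patches: (N, 3, 224, 224) tensor of tiles from one whole-slide image."""
    model.eval()
    probs = torch.softmax(model(patches), dim=1)   # per-patch class probabilities
    slide_probs = probs.mean(dim=0)                # average over all patches
    return slide_probs.argmax().item(), slide_probs
```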
PMID:38879083 | DOI:10.1016/j.ajpath.2024.05.007
Deep Learning for Grading Endometrial Cancer
Am J Pathol. 2024 Jun 13:S0002-9440(24)00202-5. doi: 10.1016/j.ajpath.2024.05.003. Online ahead of print.
ABSTRACT
Endometrial cancer is the fourth most common cancer in females in the United States, with a lifetime risk of developing the disease of approximately 2.8% in women. Precise histologic evaluation and molecular classification of endometrial cancer are important for effective patient management and determining the best treatment modalities. This study introduces EndoNet, which uses convolutional neural networks for extracting histologic features and a vision transformer for aggregating these features and classifying slides, based on their visual characteristics, into high- and low-grade. The model was trained on 929 digitized hematoxylin and eosin-stained whole-slide images of endometrial cancer from hysterectomy cases at Dartmouth-Health. It classifies these slides into low-grade (endometrioid grades 1 and 2) and high-grade (endometrioid carcinoma FIGO grade 3, uterine serous carcinoma, carcinosarcoma) categories. EndoNet was evaluated on an internal test set of 110 patients and an external test set of 100 patients from the public TCGA database. The model achieved a weighted average F1-score of 0.91 (95% CI: 0.86-0.95) and an AUC of 0.95 (95% CI: 0.89-0.99) on the internal test set, and an F1-score of 0.86 (95% CI: 0.80-0.94) and an AUC of 0.86 (95% CI: 0.75-0.93) on the external test set. Pending further validation, EndoNet has the potential to support pathologists in classifying the grades of gynecologic pathology tumors without the need for manual annotations.
PMID:38879079 | DOI:10.1016/j.ajpath.2024.05.003
An Emerging Network for COVID-19 CT-Scan Classification using an ensemble deep transfer learning model
Acta Trop. 2024 Jun 13:107277. doi: 10.1016/j.actatropica.2024.107277. Online ahead of print.
ABSTRACT
Over the past few years, the widespread outbreak of COVID-19 has caused the death of millions of people worldwide. Early diagnosis of the virus is essential to control its spread and provide timely treatment. Artificial intelligence methods are often used as powerful tools to reach a COVID-19 diagnosis via computed tomography (CT) samples. In this paper, artificial intelligence-based methods are introduced to diagnose COVID-19. First, a network called CT6-CNN is designed; then, two ensemble deep transfer learning models are developed based on Xception, ResNet-101, DenseNet-169, and CT6-CNN to reach a COVID-19 diagnosis from CT samples. The publicly available SARS-CoV-2 CT dataset, comprising 2481 CT scans, is utilized for our implementation. The dataset is separated into 2108, 248, and 125 images for training, validation, and testing, respectively. Based on the experimental results, the CT6-CNN model achieved 94.66% accuracy, 94.67% precision, 94.67% sensitivity, and a 94.65% F1-score. Moreover, the ensemble learning models reached 99.2% accuracy. The experimental results affirm the effectiveness of the designed models, especially the ensemble deep learning models, in reaching a diagnosis of COVID-19.
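A common way to build such an ensemble is soft voting over pretrained backbones; the sketch below averages softmax outputs of Xception, ResNet-101, and DenseNet-169 heads. The custom CT6-CNN member and the paper's exact ensembling rule are not reproduced, and the input size and head design are assumptions.

```python
# Illustrative soft-voting ensemble over pretrained backbones (assumed configuration).
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.applications import Xception, ResNet101, DenseNet169

def build_member(backbone_cls, input_shape=(224, 224, 3), num_classes=2):
    backbone = backbone_cls(include_top=False, weights="imagenet",
                            input_shape=input_shape, pooling="avg")
    out = layers.Dense(num_classes, activation="softmax")(backbone.output)
    return models.Model(backbone.input, out)

members = [build_member(b) for b in (Xception, ResNet101, DenseNet169)]
# ... fine-tune each member on the CT training split ...

def ensemble_predict(images):
    """Average the softmax probabilities of all members (soft voting)."""
    probs = np.mean([m.predict(images, verbose=0) for m in members], axis=0)
    return probs.argmax(axis=1)
```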
PMID:38878849 | DOI:10.1016/j.actatropica.2024.107277
DentalSegmentator: robust open source deep learning-based CT and CBCT image segmentation
J Dent. 2024 Jun 13:105130. doi: 10.1016/j.jdent.2024.105130. Online ahead of print.
ABSTRACT
OBJECTIVES: Segmentation of anatomical structures on dento-maxillo-facial (DMF) computed tomography (CT) or cone beam computed tomography (CBCT) scans is increasingly needed in digital dentistry. The main aim of this research was to propose and evaluate a novel open source tool called DentalSegmentator for fully automatic segmentation of five anatomic structures on DMF CT and CBCT scans: maxilla/upper skull, mandible, upper teeth, lower teeth, and the mandibular canal.
METHODS: A retrospective sample of 470 CT and CBCT scans was used as a training/validation set. The performance and generalizability of the tool was evaluated by comparing segmentations provided by experts and automatic segmentations in two hold-out test datasets: an internal dataset of 133 CT and CBCT scans acquired before orthognathic surgery and an external dataset of 123 CBCT scans randomly sampled from routine examinations in 5 institutions.
RESULTS: The mean overall results in the internal test dataset (n = 133) were a Dice similarity coefficient (DSC) of 92.2 ± 6.3% and a normalised surface distance (NSD) of 98.2 ± 2.2%. The mean overall results on the external test dataset (n = 123) were a DSC of 94.2 ± 7.4% and a NSD of 98.4 ± 3.6%.
CONCLUSIONS: The results obtained from this highly diverse dataset demonstrate that this tool can provide fully automatic and robust multiclass segmentation for DMF CT and CBCT scans. To encourage the clinical deployment of DentalSegmentator, the pre-trained nnU-Net model has been made publicly available along with an extension for the 3D Slicer software.
CLINICAL SIGNIFICANCE: The DentalSegmentator open-source 3D Slicer extension provides a free, robust, and easy-to-use approach to obtaining patient-specific three-dimensional models from CT and CBCT scans. These models serve various purposes in a digital dentistry workflow, such as visualization, treatment planning, intervention, and follow-up.
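For reference, the Dice similarity coefficient reported above can be computed per structure as follows; the mask shapes are illustrative. The normalised surface distance additionally requires surface extraction and voxel spacing and is not shown here.

```python
# Minimal sketch of the Dice similarity coefficient between two binary segmentation masks.
import numpy as np

def dice_coefficient(pred, gt, eps=1e-8):
    """pred, gt: boolean arrays of the same shape (e.g. a 3D mask for one structure)."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    intersection = np.logical_and(pred, gt).sum()
    return 2.0 * intersection / (pred.sum() + gt.sum() + eps)

# per-structure evaluation, e.g. automatic vs. expert mandible mask (random toy volumes here)
print(dice_coefficient(np.random.rand(64, 64, 64) > 0.5, np.random.rand(64, 64, 64) > 0.5))
```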
PMID:38878813 | DOI:10.1016/j.jdent.2024.105130
CVAE-DF: A hybrid deep learning framework for fertilization status detection of pre-incubation duck eggs based on VIS/NIR spectroscopy
Spectrochim Acta A Mol Biomol Spectrosc. 2024 May 31;320:124569. doi: 10.1016/j.saa.2024.124569. Online ahead of print.
ABSTRACT
Unfertilized duck eggs that are not removed prior to incubation deteriorate quickly, posing a risk of contaminating the normally fertilized duck eggs. Thus, detecting the fertilization status of breeding duck eggs as early as possible is a meaningful and challenging task. Most existing work focuses on the characteristics of chicken eggs during mid-term hatching, while little attention has been paid to the detection of duck eggs prior to incubation. In this paper, we present a novel hybrid deep learning detection framework for the fertilization status of pre-incubation duck eggs, termed CVAE-DF, based on visible/near-infrared (VIS/NIR) transmittance spectroscopy. The framework comprises the encoder of a convolutional variational autoencoder (CVAE) and an improved deep forest (DF) model. More specifically, we first collected transmittance spectral data (400-1000 nm) of 255 duck eggs before hatching. The multiplicative scatter correction (MSC) method was then used to eliminate noise and extraneous information from the raw spectral data. Two efficient data augmentation methods were adopted to provide sufficient data. After that, the CVAE was applied to extract representative features and reduce the feature dimension for the detection task. Finally, an improved DF model was employed to build the classification model on the enhanced feature set. The CVAE-DF model achieved an overall accuracy of 95.94% on the test dataset. The experimental results across four metrics demonstrate that our CVAE-DF method outperforms traditional methods by a significant margin. Furthermore, the results also indicate that CVAE holds great promise as a novel feature extraction method for the VIS/NIR spectral analysis of other agricultural products, making it highly beneficial for practical engineering.
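Multiplicative scatter correction, as mentioned above, is commonly implemented by regressing each spectrum on the mean spectrum and correcting with the fitted slope and intercept. The sketch below follows that standard recipe; the array dimensions are illustrative assumptions.

```python
# Standard multiplicative scatter correction (MSC) for a matrix of raw spectra.
import numpy as np

def msc(spectra, reference=None):
    """spectra: (n_samples, n_wavelengths) array of raw transmittance spectra."""
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, deg=1)   # s ≈ slope * ref + intercept
        corrected[i] = (s - intercept) / slope
    return corrected

raw = np.random.rand(255, 601)   # e.g. 255 eggs, 400-1000 nm at 1 nm steps (assumed grid)
print(msc(raw).shape)
```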
PMID:38878719 | DOI:10.1016/j.saa.2024.124569
Coronary computed tomographic angiography-derived anatomic and hemodynamic plaque characteristics in prediction of cardiovascular events
Int J Cardiovasc Imaging. 2024 Jun 15. doi: 10.1007/s10554-024-03149-0. Online ahead of print.
ABSTRACT
This study investigated the association of anatomic and hemodynamic plaque characteristics, derived from deep learning analysis of coronary computed tomography angiography (CCTA), with high-risk plaques that caused subsequent major adverse cardiovascular events (MACE). A retrospective analysis was conducted on patients who underwent CCTA between 1 month and 3 years prior to the occurrence of a MACE. Deep learning and computational fluid dynamics algorithms based on CCTA were applied to extract adverse plaque characteristics (low-attenuation plaque, positive remodeling, napkin-ring sign, and spotty calcification) and hemodynamic parameters (fractional flow reserve derived from CCTA [FFRCT], change in FFRCT across the lesion [△FFRCT], wall shear stress [WSS], and axial plaque stress [APS]). Correlation analysis, logistic regression, and Cox proportional hazards analysis were conducted to understand the relationship between these measures and the occurrence of MACE and to assess the value of hemodynamic parameters in predicting the incidence of MACE and prognosis. Our study included 86 patients with a total of 134 vessels exhibiting plaque formation and 83 culprit vessels with a subsequent coronary event. Culprit vessels differed in percent diameter stenosis (%DS; 0.54 ± 0.16 vs. 0.62 ± 0.13, P = 0.003) and had larger non-calcified plaque volume (45.8 vs. 101.7, P < 0.001), larger low-attenuation plaque volume (3.6 vs. 14.5, P < 0.001), more lesions with ≥ 3 adverse plaque characteristics (APC) (4 vs. 26, P = 0.002), and worse hemodynamic features of adverse plaque. FFRCT demonstrated better visualization of maximum achievable flow in the presence of coronary stenosis and better correlation with stenosis severity, while the maximum wall shear stress (WSSmax) was highly correlated with low-attenuation plaques and APC. The inclusion of hemodynamic parameters improved the efficacy of the predictive model, and a high WSS suggested a higher probability of MACE. Hemodynamic parameters based on CCTA are significantly correlated with plaque morphology. Importantly, integrating CCTA-derived parameters can refine the prediction of MACE occurrence.
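A Cox proportional hazards analysis of the kind mentioned above could be set up as sketched below with the lifelines library; the column names and toy rows are hypothetical placeholders, not the paper's variables or data, and a real analysis would use the full cohort.

```python
# Hypothetical Cox proportional hazards sketch relating CCTA-derived parameters to time-to-MACE.
import pandas as pd
from lifelines import CoxPHFitter

df = pd.DataFrame({
    "time_to_event_months": [14, 30, 8, 22, 18, 26, 11, 36],  # follow-up time (toy values)
    "mace":                 [1, 0, 1, 0, 1, 0, 0, 1],          # 1 = MACE occurred
    "ffr_ct":               [0.72, 0.88, 0.69, 0.91, 0.80, 0.78, 0.85, 0.75],
    "wss_max":              [31.0, 12.5, 40.2, 10.1, 22.3, 25.0, 18.4, 28.7],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="time_to_event_months", event_col="mace")
cph.print_summary()   # hazard ratios for each covariate
```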
PMID:38878147 | DOI:10.1007/s10554-024-03149-0
Selection of convolutional neural network model for bladder tumor classification of cystoscopy images and comparison with humans
J Endourol. 2024 Jun 15. doi: 10.1089/end.2024.0250. Online ahead of print.
ABSTRACT
PURPOSE: Various convolutional neural network (CNN)-based deep learning algorithms were investigated to select an appropriate artificial intelligence (AI) model for bladder tumor classification on cystoscopy images, and the diagnostic performance of the selected model was compared against that of medical students and urologists.
METHODS: A total of 3,731 cystoscopic images, including 2,191 tumor images, obtained from 543 bladder tumor cases and 219 normal cases were evaluated. A total of 17 CNN models were trained for tumor classification with various hyperparameters. The diagnostic performance of the selected AI model was compared with the results obtained from urologists and medical students by using receiver operating characteristic (ROC) curves and related metrics.
RESULTS: EfficientNetB0 was selected as the appropriate AI model. In the test results, EfficientNetB0 achieved a balanced accuracy of 81%, sensitivity of 88%, specificity of 74%, and an AUC of 92%. In contrast, human-derived diagnostic statistics for the test data showed an average balanced accuracy of 75%, sensitivity of 94%, and specificity of 55%. Specifically, urologists had an average balanced accuracy of 91%, sensitivity of 95%, and specificity of 88%, while medical students had an average balanced accuracy of 69%, sensitivity of 94%, and specificity of 44%.
CONCLUSIONS: Among the various AI models, we suggest that EfficientNetB0 is an appropriate classification model for determining the presence of bladder tumors in cystoscopic images. EfficientNetB0 showed the highest performance among the evaluated models and higher accuracy and specificity than the medical students. This AI technology will be helpful for less experienced urologists or non-urologists in making diagnoses. Image-based deep learning classifies bladder cancer using cystoscopy images and shows promise for generalized applications in biomedical image analysis and clinical decision-making.
PMID:38877795 | DOI:10.1089/end.2024.0250
Deep recognition of rice disease images: how many training samples do we really need?
J Sci Food Agric. 2024 Jun 15. doi: 10.1002/jsfa.13636. Online ahead of print.
ABSTRACT
BACKGROUND: With the rapid development of deep learning, the recognition of rice disease images using deep neural networks has become a hot research topic. However, most previous studies focus only on modifying deep learning models and lack systematic, scientific exploration of how dataset size affects image recognition of rice diseases. In this study, a functional model was developed to predict the relationship between dataset size and model recognition accuracy.
RESULTS: Training VGG16 deep learning models with different quantities of images of rice blast-diseased leaves and healthy rice leaves, it was found that the test accuracy of the resulting models could be well fitted with an exponential model (A = 0.9965 − e^(−0.0603 × I50 − 1.6693)). Experimental results showed that as image quantity increases, the recognition accuracy of deep learning models rises rapidly at first. Yet once the image quantity exceeds a certain threshold, the accuracy of image classification improves little and the marginal benefit diminishes. This trend remained similar when the composition of the dataset was changed, regardless of whether (i) the disease class was changed, (ii) the number of classes was increased or (iii) the image data were augmented.
CONCLUSIONS: This study provided a scientific basis for the impact of data size on the accuracy of rice disease image recognition, and may also serve as a reference for researchers for database construction. © 2024 Society of Chemical Industry.
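An exponential saturation curve of this form can be fit with SciPy as sketched below; the (image quantity, accuracy) pairs and the initial guess are invented placeholders, not values from the study.

```python
# Sketch: fit an accuracy-vs-dataset-size curve of the form A = a - exp(-b*x - c).
import numpy as np
from scipy.optimize import curve_fit

def accuracy_curve(x, a, b, c):
    return a - np.exp(-b * x - c)

n_images = np.array([50, 100, 200, 400, 800, 1600], dtype=float)   # placeholder data
accuracy = np.array([0.62, 0.78, 0.88, 0.94, 0.97, 0.985])

params, _ = curve_fit(accuracy_curve, n_images, accuracy, p0=[1.0, 0.01, 0.5])
print("fitted a, b, c:", params)
print("predicted accuracy at 3200 images:", accuracy_curve(3200, *params))
```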
PMID:38877787 | DOI:10.1002/jsfa.13636
Deep learning-based pathological prediction of lymph node metastasis for patient with renal cell carcinoma from primary whole slide images
J Transl Med. 2024 Jun 14;22(1):568. doi: 10.1186/s12967-024-05382-6.
ABSTRACT
BACKGROUND: Patients with metastatic renal cell carcinoma (RCC) have an extremely high mortality rate. A predictive model for RCC micrometastasis based on pathomics could help clinicians make treatment decisions.
METHODS: A total of 895 formalin-fixed and paraffin-embedded whole slide images (WSIs) derived from three cohorts, namely the Shanghai General Hospital (SGH), Clinical Proteomic Tumor Analysis Consortium (CPTAC) and The Cancer Genome Atlas (TCGA) cohorts, together with another 588 frozen-section WSIs from the TCGA dataset, were included in the study. A deep learning-based strategy for predicting lymphatic metastasis was developed from the WSIs using a clustering-constrained-attention multiple-instance learning method and verified across the three cohorts. The performance of the model was further verified on frozen-section WSIs. In addition, the model was also tested for prognosis prediction in patients with RCC across the multi-source patient cohorts.
RESULTS: The AUCs for lymphatic metastasis prediction were 0.836, 0.865 and 0.812 in the TCGA, SGH and CPTAC cohorts, respectively. On frozen-section WSIs, the model achieved an AUC of 0.801. Patients with high deep learning-predicted lymph node metastasis scores showed worse prognosis.
CONCLUSIONS: In this study, we developed and verified a deep learning-based strategy for predicting lymphatic metastasis from primary RCC WSIs. The strategy can also be applied to frozen-section WSIs and can act as a prognostic factor for RCC, distinguishing patients with worse survival outcomes.
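As an illustration of attention-based multiple-instance learning over patch embeddings, here is a gated-attention pooling head in the spirit of clustering-constrained attention MIL; the clustering constraint and the exact published architecture are not reproduced, and all dimensions are placeholders.

```python
# Illustrative gated-attention MIL head: patch embeddings -> slide-level prediction.
import torch
import torch.nn as nn

class GatedAttentionMIL(nn.Module):
    def __init__(self, feat_dim=1024, hidden_dim=256, num_classes=2):
        super().__init__()
        self.attn_v = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.Tanh())
        self.attn_u = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.Sigmoid())
        self.attn_w = nn.Linear(hidden_dim, 1)
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, patch_feats):            # (N_patches, feat_dim) for one slide
        a = self.attn_w(self.attn_v(patch_feats) * self.attn_u(patch_feats))  # (N, 1)
        a = torch.softmax(a, dim=0)
        slide_feat = (a * patch_feats).sum(dim=0)      # attention-weighted pooling
        return self.classifier(slide_feat), a          # slide logits + attention weights

model = GatedAttentionMIL()
logits, attention = model(torch.randn(500, 1024))      # 500 patch embeddings (toy input)
```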
PMID:38877591 | DOI:10.1186/s12967-024-05382-6
A deep learning framework for predicting disease-gene associations with functional modules and graph augmentation
BMC Bioinformatics. 2024 Jun 14;25(1):214. doi: 10.1186/s12859-024-05841-3.
ABSTRACT
BACKGROUND: The exploration of gene-disease associations is crucial for understanding the mechanisms underlying disease onset and progression, with significant implications for prevention and treatment strategies. Advances in high-throughput biotechnology have generated a wealth of data linking diseases to specific genes. While graph representation learning has recently introduced groundbreaking approaches for predicting novel associations, existing studies have often overlooked the cumulative impact of functional modules such as protein complexes and the incompleteness of important data such as protein interactions, which limits detection performance.
RESULTS: Addressing these limitations, we introduce here a deep learning framework called ModulePred for predicting disease-gene associations. ModulePred performs graph augmentation on the protein interaction network using L3 link prediction. It builds a heterogeneous module network by integrating disease-gene associations, protein complexes and augmented protein interactions, and develops a novel graph embedding for the heterogeneous module network. Subsequently, a graph neural network is constructed to learn node representations by collectively aggregating information from the topological structure, and gene prioritization is carried out using the disease and gene embeddings obtained from the graph neural network. Experimental results underscore the superiority of ModulePred, showcasing the effectiveness of incorporating functional modules and graph augmentation in predicting disease-gene associations. This research introduces innovative ideas and directions, enhancing the understanding and prediction of gene-disease relationships.
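For readers unfamiliar with L3 link prediction, the score for a candidate protein pair sums degree-normalized contributions over paths of length three. The sketch below follows that common formulation; the toy graph and the thresholding step are illustrative, not ModulePred's actual augmentation settings.

```python
# Degree-normalized L3 link-prediction score over paths of length 3 (toy PPI graph).
import math
import networkx as nx

def l3_score(G, u, v):
    """Sum over u-a-b-v paths of 1 / sqrt(deg(a) * deg(b))."""
    score = 0.0
    for a in G.neighbors(u):
        for b in G.neighbors(a):
            if b != u and b != v and a != v and G.has_edge(b, v):
                score += 1.0 / math.sqrt(G.degree(a) * G.degree(b))
    return score

G = nx.Graph([("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P1", "P5"), ("P5", "P3")])
# score all non-adjacent protein pairs and keep the highest-scoring ones as augmented edges
candidates = [(u, v, l3_score(G, u, v)) for u in G for v in G
              if u < v and not G.has_edge(u, v)]
print(sorted(candidates, key=lambda t: t[2], reverse=True)[:3])
```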
PMID:38877401 | DOI:10.1186/s12859-024-05841-3
Deep Kernel learning for reaction outcome prediction and optimization
Commun Chem. 2024 Jun 14;7(1):136. doi: 10.1038/s42004-024-01219-x.
ABSTRACT
Recent years have seen a rapid growth in the application of various machine learning methods for reaction outcome prediction. Deep learning models have gained popularity due to their ability to learn representations directly from the molecular structure. Gaussian processes (GPs), on the other hand, provide reliable uncertainty estimates but are unable to learn representations from the data. We combine the feature learning ability of neural networks (NNs) with the uncertainty quantification of GPs in a deep kernel learning (DKL) framework to predict the reaction outcome. The DKL model is observed to obtain very good predictive performance across different input representations. It significantly outperforms standard GPs and provides comparable performance to graph neural networks, but with uncertainty estimation. Additionally, the uncertainty estimates on predictions provided by the DKL model facilitated its incorporation as a surrogate model for Bayesian optimization (BO). The proposed method, therefore, has great potential to accelerate reaction discovery by integrating accurate predictive models that provide reliable uncertainty estimates with BO.
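One common way to implement deep kernel learning is to feed a neural feature extractor into a GP kernel and train both jointly by maximizing the marginal likelihood, as in the GPyTorch sketch below. The descriptor dimensions, network sizes, and toy data are assumptions, not the authors' configuration.

```python
# Sketch of deep kernel learning with GPyTorch: NN features feeding an RBF kernel GP.
import torch
import gpytorch

class FeatureExtractor(torch.nn.Sequential):
    def __init__(self, in_dim, out_dim=2):
        super().__init__(torch.nn.Linear(in_dim, 64), torch.nn.ReLU(),
                         torch.nn.Linear(64, out_dim))

class DKLRegression(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood, in_dim):
        super().__init__(train_x, train_y, likelihood)
        self.feature_extractor = FeatureExtractor(in_dim)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        z = self.feature_extractor(x)                    # learned representation
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(z), self.covar_module(z))

train_x = torch.randn(100, 128)   # toy reaction descriptors (e.g. fingerprints)
train_y = torch.rand(100)         # toy yields
likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = DKLRegression(train_x, train_y, likelihood, in_dim=128)

model.train(); likelihood.train()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
for _ in range(100):
    optimizer.zero_grad()
    loss = -mll(model(train_x), train_y)
    loss.backward()
    optimizer.step()

model.eval(); likelihood.eval()
with torch.no_grad():
    pred = likelihood(model(torch.randn(5, 128)))
    print(pred.mean, pred.stddev)   # predicted outcome with uncertainty
```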
PMID:38877182 | DOI:10.1038/s42004-024-01219-x
Ultrasensitive plasma-based monitoring of tumor burden using machine-learning-guided signal enrichment
Nat Med. 2024 Jun 14. doi: 10.1038/s41591-024-03040-4. Online ahead of print.
ABSTRACT
In solid tumor oncology, circulating tumor DNA (ctDNA) is poised to transform care through accurate assessment of minimal residual disease (MRD) and therapeutic response monitoring. To overcome the sparsity of ctDNA fragments in low tumor fraction (TF) settings and increase MRD sensitivity, we previously leveraged genome-wide mutational integration through plasma whole-genome sequencing (WGS). Here we now introduce MRD-EDGE, a machine-learning-guided WGS ctDNA single-nucleotide variant (SNV) and copy-number variant (CNV) detection platform designed to increase signal enrichment. MRD-EDGESNV uses deep learning and a ctDNA-specific feature space to increase SNV signal-to-noise enrichment in WGS by ~300× compared to previous WGS error suppression. MRD-EDGECNV also reduces the degree of aneuploidy needed for ultrasensitive CNV detection through WGS from 1 Gb to 200 Mb, vastly expanding its applicability within solid tumors. We harness the improved performance to identify MRD following surgery in multiple cancer types, track changes in TF in response to neoadjuvant immunotherapy in lung cancer and demonstrate ctDNA shedding in precancerous colorectal adenomas. Finally, the radical signal-to-noise enrichment in MRD-EDGESNV enables plasma-only (non-tumor-informed) disease monitoring in advanced melanoma and lung cancer, yielding clinically informative TF monitoring for patients on immune-checkpoint inhibition.
PMID:38877116 | DOI:10.1038/s41591-024-03040-4
Development of a real-time cattle lameness detection system using a single side-view camera
Sci Rep. 2024 Jun 14;14(1):13734. doi: 10.1038/s41598-024-64664-7.
ABSTRACT
Recent advancements in machine learning and deep learning have revolutionized various computer vision applications, including object detection, tracking, and classification. This research investigates the application of deep learning to cattle lameness detection in dairy farming. Our study employs image processing techniques and deep learning methods for cattle detection, tracking, and lameness classification. We utilize two powerful object detection algorithms: Mask R-CNN from Detectron2 and the popular YOLOv8. Their performance is compared to identify the most effective approach for this application. Bounding boxes are drawn around detected cattle to assign unique local IDs, enabling individual tracking and isolation throughout the video sequence. Additionally, mask regions generated by the chosen detection algorithm provide valuable data for feature extraction, which is crucial for subsequent lameness classification. The extracted cattle mask region values serve as the basis for feature extraction, capturing relevant information indicative of lameness. These features, combined with the local IDs assigned during tracking, are used to compute a lameness score for each animal. We explore the efficacy of various established machine learning algorithms, such as Support Vector Machines (SVM) and AdaBoost, in analyzing the extracted lameness features. Evaluation of the proposed system was conducted across three key domains: detection, tracking, and lameness classification. Notably, the detection module employing Detectron2 achieved an impressive accuracy of 98.98%. Similarly, the tracking module attained a high accuracy of 99.50%. In lameness classification, AdaBoost emerged as the most effective algorithm, yielding the highest overall average accuracy (77.9%). Other established machine learning algorithms, including Decision Trees (DT), Support Vector Machines (SVM), and Random Forests, also demonstrated promising performance (DT: 75.32%, SVM: 75.20%, Random Forest: 74.9%). The presented approach demonstrates a successful implementation of cattle lameness detection. The proposed system has the potential to revolutionize dairy farm management by enabling early lameness detection and facilitating effective monitoring of cattle health. Our findings contribute valuable insights into the application of advanced computer vision methods for livestock health management.
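A minimal sketch of one such detect-track-classify pipeline is shown below, using Ultralytics YOLOv8 tracking followed by an AdaBoost classifier on per-animal features. The video path, the placeholder gait-feature function, and the toy training data are hypothetical; the paper's actual feature extraction from mask regions is not reproduced.

```python
# Hypothetical detect-track-classify sketch: YOLOv8 tracking + AdaBoost lameness classifier.
import numpy as np
from ultralytics import YOLO
from sklearn.ensemble import AdaBoostClassifier

model = YOLO("yolov8n-seg.pt")                              # pretrained segmentation model
results = model.track("barn_side_view.mp4", persist=True)   # per-frame masks + track IDs

def gait_features(track_masks):
    """Placeholder: reduce a sequence of per-frame masks to a fixed-length gait descriptor."""
    areas = [m.sum() for m in track_masks]
    return np.array([np.mean(areas), np.std(areas)])

# X: one feature vector per tracked animal, y: 0 = sound, 1 = lame (expert labels, toy data here)
X, y = np.random.rand(40, 2), np.random.randint(0, 2, 40)
clf = AdaBoostClassifier(n_estimators=100).fit(X, y)
print("lameness scores:", clf.predict_proba(X)[:5, 1])
```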
PMID:38877097 | DOI:10.1038/s41598-024-64664-7
Deep learning-driven hybrid model for short-term load forecasting and smart grid information management
Sci Rep. 2024 Jun 14;14(1):13720. doi: 10.1038/s41598-024-63262-x.
ABSTRACT
Accurate power load forecasting is crucial for the sustainable operation of smart grids. However, the complexity and uncertainty of load, along with large-scale and high-dimensional energy information, present challenges in handling intricate dynamic features and long-term dependencies. This paper proposes a computational approach to address these challenges in short-term power load forecasting and energy information management, with the goal of accurately predicting future load demand. The study introduces a hybrid method that combines multiple deep learning models: the Gated Recurrent Unit (GRU) is employed to capture long-term dependencies in time series data, while the Temporal Convolutional Network (TCN) efficiently learns patterns and features in load data. Additionally, an attention mechanism is incorporated to automatically focus on the input components most relevant to the load prediction task, further enhancing model performance. In the experimental evaluation conducted on four public datasets, including GEFCom2014, the proposed algorithm outperforms the baseline models on various metrics such as prediction accuracy, efficiency, and stability. Notably, on the GEFCom2014 dataset, FLOPs are reduced by over 48.8%, inference time is shortened by more than 46.7%, and MAPE is improved by 39%. The proposed method significantly enhances the reliability, stability, and cost-effectiveness of smart grids, facilitating risk assessment optimization and operational planning in the context of information management for smart grid systems.
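To illustrate the recurrent-plus-attention part of such a hybrid, here is a compact GRU forecaster with attention over its hidden states; the TCN branch is omitted for brevity, and all dimensions and horizons are placeholders rather than the paper's settings.

```python
# Illustrative GRU + attention short-term load forecaster (TCN branch omitted).
import torch
import torch.nn as nn

class GRUAttentionForecaster(nn.Module):
    def __init__(self, n_features=8, hidden_dim=64, horizon=24):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden_dim, batch_first=True)
        self.attn = nn.Linear(hidden_dim, 1)
        self.head = nn.Linear(hidden_dim, horizon)

    def forward(self, x):                      # x: (batch, seq_len, n_features)
        h, _ = self.gru(x)                     # (batch, seq_len, hidden_dim)
        weights = torch.softmax(self.attn(h), dim=1)   # attention over time steps
        context = (weights * h).sum(dim=1)     # (batch, hidden_dim)
        return self.head(context)              # next `horizon` load values

model = GRUAttentionForecaster()
past_load = torch.randn(16, 168, 8)            # one week of hourly history, 8 covariates
print(model(past_load).shape)                  # torch.Size([16, 24])
```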
PMID:38877081 | DOI:10.1038/s41598-024-63262-x
Attention and sentiment of Chinese public toward rural landscape based on Sina Weibo
Sci Rep. 2024 Jun 14;14(1):13724. doi: 10.1038/s41598-024-64527-1.
ABSTRACT
Rural landscapes, as products of the interaction between humans and nature, not only reflect the history and culture of rural areas but also symbolize economic and social progress. This study proposes a deep learning-based model for Weibo data analysis aimed at exploring the development direction of rural landscapes from the perspective of the Chinese public. The research reveals that the Chinese public's attention to rural landscapes has significantly increased with the evolution of government governance concepts. Most people express a high level of satisfaction and happiness with the existing rural landscapes, while a minority harbor negative emotions towards unreasonable new rural construction. Through the analysis of public opinion regarding rural landscapes, this study will assist decision-makers in understanding the mechanisms of public discourse on social media. It will also aid relevant scholars and designers in providing targeted solutions, which hold significant importance for policy formulation and the exploration of specific development patterns.
PMID:38877046 | DOI:10.1038/s41598-024-64527-1