Deep learning
Time-Dependent Deep Learning Prediction of Multiple Sclerosis Disability
J Imaging Inform Med. 2024 Jun 13. doi: 10.1007/s10278-024-01031-y. Online ahead of print.
ABSTRACT
The majority of deep learning models in medical image analysis concentrate on single snapshot timepoint circumstances, such as the identification of current pathology on a given image or volume. This is often in contrast to the diagnostic methodology in radiology where presumed pathologic findings are correlated to prior studies and subsequent changes over time. For multiple sclerosis (MS), the current body of literature describes various forms of lesion segmentation with few studies analyzing disability progression over time. For the purpose of longitudinal time-dependent analysis, we propose a combinatorial analysis of a video vision transformer (ViViT) benchmarked against traditional recurrent neural network of Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) architectures and a hybrid Vision Transformer-LSTM (ViT-LSTM) to predict long-term disability based upon the Extended Disability Severity Score (EDSS). The patient cohort was procured from a two-site institution with 703 patients' multisequence, contrast-enhanced MRIs of the cervical spine between the years 2002 and 2023. Following a competitive performance analysis, a VGG-16-based CNN-LSTM was compared to ViViT with an ablation analysis to determine time-dependency of the models. The VGG16-LSTM predicted trinary classification of EDSS score in 6 years with 0.74 AUC versus the ViViT with 0.84 AUC (p-value < 0.001 per 5 × 2 cross-validation F-test) on an 80:20 hold-out testing split. However, the VGG16-LSTM outperformed ViViT when the analysis was restricted to patients with only 2 years of MRIs (n = 94) (0.75 AUC versus 0.72 AUC, respectively). Exact EDSS classification was investigated for both models using both classification and regression strategies but showed collectively worse performance. Our experimental results demonstrate the ability of time-dependent deep learning models to predict disability in MS using trinary stratification of disability, mimicking clinical practice. Further work includes external validation and subsequent observational clinical trials.
PMID:38871944 | DOI:10.1007/s10278-024-01031-y
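The abstract above reports its significance level from a 5 × 2 cross-validation F-test. A minimal sketch of how such a statistic is usually computed follows, assuming the combined 5x2cv F test attributed to Alpaydin; the function name and the input layout (five replications, each with two per-fold differences in a score such as AUC between the two models) are illustrative assumptions, not the authors' code.

```python
def five_by_two_cv_f(diffs):
    """Combined 5x2cv F statistic.

    diffs: five (fold_1, fold_2) pairs, each value being the difference in a
    score (e.g. AUC or error rate) between two models on that fold.
    The layout is an assumption made for this sketch.
    """
    if len(diffs) != 5 or any(len(pair) != 2 for pair in diffs):
        raise ValueError("expected 5 replications of 2-fold cross-validation")

    # Numerator: sum of all ten squared differences.
    numerator = sum(p * p for pair in diffs for p in pair)

    # Denominator: twice the sum of the per-replication variance estimates.
    variance_sum = 0.0
    for p1, p2 in diffs:
        mean = (p1 + p2) / 2.0
        variance_sum += (p1 - mean) ** 2 + (p2 - mean) ** 2
    if variance_sum == 0:
        raise ZeroDivisionError("all replications have zero variance")

    # Under the null hypothesis this is approximately F-distributed
    # with (10, 5) degrees of freedom.
    return numerator / (2.0 * variance_sum)
```

A p-value would then be read from the upper tail of an F(10, 5) distribution, for example with `scipy.stats.f.sf(statistic, 10, 5)` if SciPy is available.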
Multi-label classification of retinal diseases based on fundus images using Resnet and Transformer
Med Biol Eng Comput. 2024 Jun 14. doi: 10.1007/s11517-024-03144-6. Online ahead of print.
ABSTRACT
Retinal disorders are a major cause of irreversible vision loss, which can be mitigated through accurate and early diagnosis. Conventionally, fundus images are used as the diagnostic gold standard in detecting retinal diseases. In recent years, more and more researchers have employed deep learning methods for diagnosing ophthalmic diseases using fundus photography datasets. Most of these studies focus on diagnosing a single disease in fundus images, so the diagnosis of multiple diseases remains challenging. In this paper, we propose a framework that combines ResNet and Transformer for multi-label classification of retinal disease. This model employs ResNet to extract image features, utilizes Transformer to capture global information, and enhances the relationships between categories through learnable label embedding. On the publicly available Ocular Disease Intelligent Recognition (ODIR-5K) dataset, the proposed method achieves a mean average precision of 92.86%, an area under the curve (AUC) of 97.27%, and a recall of 90.62%, which outperforms other state-of-the-art approaches for the multi-label classification. The proposed method represents a significant advancement in the field of retinal disease diagnosis, offering a more accurate, efficient, and comprehensive model for the detection of multiple retinal conditions.
PMID:38871856 | DOI:10.1007/s11517-024-03144-6
Robust diagnosis and meta visualizations of plant diseases through deep neural architecture with explainable AI
Sci Rep. 2024 Jun 13;14(1):13695. doi: 10.1038/s41598-024-64601-8.
ABSTRACT
Deep learning has emerged as a highly effective and precise method for classifying images. The presence of plant diseases poses a significant threat to food security. However, accurately identifying these diseases in plants is challenging due to limited infrastructure and techniques. Fortunately, the recent advancements in deep learning within the field of computer vision have opened up new possibilities for diagnosing plant pathology. Detecting plant diseases at an early stage is crucial, and this research paper proposes a deep convolutional neural network model that can rapidly and accurately identify plant diseases. Given the minimal variation in image texture and color, deep learning techniques are essential for robust recognition. In this study, we introduce a deep, explainable neural architecture specifically designed for recognizing plant diseases. Fine-tuned deep convolutional neural network is designed by freezing the layers and adjusting the weights of learnable layers. By extracting deep features from a down sampled feature map of a fine-tuned neural network, we are able to classify these features using a customized K-Nearest Neighbors Algorithm. To train and validate our model, we utilize the largest standard plant village dataset, which consists of 38 classes. To evaluate the performance of our proposed system, we estimate specificity, sensitivity, accuracy, and AUC. The results demonstrate that our system achieves an impressive maximum validation accuracy of 99.95% and an AUC of 1, making it the most ideal and highest-performing approach compared to current state-of-the-art deep learning methods for automatically identifying plant diseases.
PMID:38871765 | DOI:10.1038/s41598-024-64601-8
Occurrence and Distribution of Antibacterial Quaternary Ammonium Compounds in Chinese Estuaries Revealed by Machine Learning-Assisted Mass Spectrometric Analysis
Environ Sci Technol. 2024 Jun 13. doi: 10.1021/acs.est.4c02380. Online ahead of print.
ABSTRACT
Antimicrobial resistance (AMR) undermines the United Nations Sustainable Development Goals of good health and well-being. Antibiotics are known to exacerbate AMR, but nonantibiotic antimicrobials, such as quaternary ammonium compounds (QACs), are now emerging as another significant driver of AMR. However, assessing the AMR risks of QACs in complex environmental matrices remains challenging due to the ambiguity in their chemical structures and antibacterial activity. By machine learning prediction and high-resolution mass spectrometric analysis, a list of antibacterial QACs (n = 856) from industrial chemical inventories is compiled, and it leads to the identification of 50 structurally diverse antibacterial QACs in sediments, including traditional hydrocarbon-based compounds and new subclasses that bear additional functional groups, such as choline, ester, betaine, aryl ether, and pyridine. Urban wastewater, aquaculture, and hospital discharges are the main factors influencing QAC distribution patterns in estuarine sediments. Toxic unit calculations and metagenomic analysis revealed that these QACs can influence antibiotic resistance genes (particularly sulfonamide resistance genes) through cross- and coresistances. The potential to influence the AMR is related to their environmental persistence. These results suggest that controlling the source, preventing the co-use of QACs and sulfonamides, and prioritizing control of highly persistent molecules will lead to global stewardship and sustainable use of QACs.
PMID:38871667 | DOI:10.1021/acs.est.4c02380
Multi-feature Fusion Network on Gray Scale Ultrasonography: Effective Differentiation of Adenolymphoma and Pleomorphic Adenoma
Acad Radiol. 2024 Jun 12:S1076-6332(24)00308-8. doi: 10.1016/j.acra.2024.05.023. Online ahead of print.
ABSTRACT
RATIONALE AND OBJECTIVES: To develop a deep learning radiomics graph network (DLRN) that integrates deep learning features extracted from gray scale ultrasonography, radiomics features and clinical features, for distinguishing parotid pleomorphic adenoma (PA) from adenolymphoma (AL).
MATERIALS AND METHODS: A total of 287 patients (162 in training cohort, 70 in internal validation cohort and 55 in external validation cohort) from two centers with histologically confirmed PA or AL were enrolled. Deep transfer learning features and radiomics features extracted from gray scale ultrasound images were input to machine learning classifiers including logistic regression (LR), support vector machines (SVM), KNN, RandomForest (RF), ExtraTrees, XGBoost, LightGBM, and MLP to construct deep transfer learning radiomics (DTL) models and Rad models respectively. Deep learning radiomics (DLR) models were constructed by integrating the two features and DLR signatures were generated. Clinical features were further combined with the signatures to develop a DLRN model. The performance of these models was evaluated using receiver operating characteristic (ROC) curve analysis, calibration, decision curve analysis (DCA), and the Hosmer-Lemeshow test.
RESULTS: In the internal validation cohort and external validation cohort, compared with Clinic (AUC=0.767 and 0.777), Rad (AUC=0.841 and 0.748), DTL (AUC=0.740 and 0.825) and DLR (AUC=0.863 and 0.859), the DLRN model showed the greatest discriminatory ability (AUC=0.908 and 0.908).
CONCLUSION: The DLRN model built based on gray scale ultrasonography significantly improved the diagnostic performance for benign salivary gland tumors. It can provide clinicians with a non-invasive and accurate diagnostic approach, which holds important clinical significance and value. Ensemble of multiple models helped alleviate overfitting on the small dataset compared to using Resnet50 alone.
PMID:38871552 | DOI:10.1016/j.acra.2024.05.023
Other possible perspectives for solving the negative outcome penalty paradox in the application of artificial intelligence in clinical diagnostics
J Med Ethics. 2024 Jun 13:jme-2024-109968. doi: 10.1136/jme-2024-109968. Online ahead of print.
ABSTRACT
Artificial intelligence (AI), represented by machine learning, artificial neural networks and deep learning, is impacting all areas of medicine, including translational research (from bench to bedside to health policy), clinical medicine (including diagnosis, treatment, prognosis and healthcare resource allocation) and public health. At a time when almost everyone is focused on how to better realise the promise of AI to transform the entire healthcare system, Dr Appel calls for public attention to the AI in medicine and the negative outcome penalty paradox. Proposing this topic has deepened our thinking about the application of AI in clinical diagnostics, and also prompted us to find more effective ways to integrate AI more effectively into future clinical practice. In addition to Dr Appel's insightful advice, I hope to offer three other possible perspectives, including changing public perceptions, re-engineering clinical practice processes and introducing more stakeholders, to further the discussion on this topic.
PMID:38871400 | DOI:10.1136/jme-2024-109968
A Deep Learning Approach to Predict Recanalization First-Pass Effect following Mechanical Thrombectomy in Patients with Acute Ischemic Stroke
AJNR Am J Neuroradiol. 2024 Jun 13. doi: 10.3174/ajnr.A8272. Online ahead of print.
ABSTRACT
BACKGROUND AND PURPOSE: Following endovascular thrombectomy in patients with large-vessel occlusion stroke, successful recanalization from 1 attempt, known as the first-pass effect, has correlated favorably with long-term outcomes. Pretreatment imaging may contain information that can be used to predict the first-pass effect. Recently, applications of machine learning models have shown promising results in predicting recanalization outcomes, albeit requiring manual segmentation. In this study, we sought to construct completely automated methods using deep learning to predict the first-pass effect from pretreatment CT and MR imaging.
MATERIALS AND METHODS: Our models were developed and evaluated using a cohort of 326 patients who underwent endovascular thrombectomy at UCLA Ronald Reagan Medical Center from 2014 to 2021. We designed a hybrid transformer model with nonlocal and cross-attention modules to predict the first-pass effect on MR imaging and CT series.
RESULTS: The proposed method achieved a mean 0.8506 (SD, 0.0712) for cross-validation receiver operating characteristic area under the curve (ROC-AUC) on MR imaging and 0.8719 (SD, 0.0831) for cross-validation ROC-AUC on CT. When evaluated on the prospective test sets, our proposed model achieved a mean ROC-AUC of 0.7967 (SD, 0.0335) with a mean sensitivity of 0.7286 (SD, 0.1849) and specificity of 0.8462 (SD, 0.1216) for MR imaging and a mean ROC-AUC of 0.8051 (SD, 0.0377) with a mean sensitivity of 0.8615 (SD, 0.1131) and specificity 0.7500 (SD, 0.1054) for CT, respectively, representing the first classification of the first-pass effect from MR imaging alone and the first automated first-pass effect classification method in CT.
CONCLUSIONS: Results illustrate that both nonperfusion MR imaging and CT from admission contain signals that can predict a successful first-pass effect following endovascular thrombectomy using our deep learning methods without requiring time-intensive manual segmentation.
PMID:38871371 | DOI:10.3174/ajnr.A8272
The use of artificial intelligence algorithms to detect macroplastics in aquatic environments: A critical review
Sci Total Environ. 2024 Jun 11:173843. doi: 10.1016/j.scitotenv.2024.173843. Online ahead of print.
ABSTRACT
The presence of macroplastics (MPs) is having serious consequences on natural ecosystems, directly affecting biota and human wellbeing. Given this scenario, estimating MPs' abundance is crucial for assessing the issue and formulating effective waste management strategies. In this context, the main objective of this critical review is to analyze the use of machine learning (ML) techniques, with a particular interest in deep learning (DL) approaches to detect, classify and quantify MPs in aquatic environments, supported by datasets such as satellite or aerial images and video recordings taken by unmanned aerial vehicles. This article provides a concise overview of artificial intelligence concepts, followed by a bibliometric analysis and a critical review. The search methodology aimed to categorize the scientific contributions through a temporal and spatial criterion for bibliometric analysis, whereas the critical review was based on generating homogeneous groups according to the complexity of ML and DL methods, as well as the type of dataset. In light of the review carried out, classical ML techniques, such as random forest or support vector machines, showed robustness in MPs detection. However, it seems that achieving optimal efficiencies in multiclass classification is a limitation for these methods. Consequently, more advanced techniques such as DL approaches are taking the lead for the detection and multiclass classification of MPs. A series of architectures based on convolutional neural networks, and the use of complex pre-trained models through the transfer learning, are currently being explored (e.g., VGG16 and YOLO models), although currently the computational expense is high due to the need for processing large volumes of data. Additionally, there seems to be a trend towards detecting smaller plastic, necessitating higher resolution images. 
Finally, it is important to stress that since 2020 there has been a significant increase in scientific research focusing on transformer-based architectures for object detection. Although this can be considered the current state of the art, no studies have been identified that utilize these architectures for MP detection.
PMID:38871326 | DOI:10.1016/j.scitotenv.2024.173843
Harnessing the power of hybrid deep learning algorithm for the estimation of global horizontal irradiance
Sci Total Environ. 2024 Jun 11:173958. doi: 10.1016/j.scitotenv.2024.173958. Online ahead of print.
ABSTRACT
Accurately and precisely estimating global horizontal irradiance (GHI) poses significant challenges due to the unpredictable nature of climate parameters and geographical limitations. To address this challenge, this study proposes a forecasting framework using an integrated model of the convolutional neural network (CNN), long short-term memory (LSTM), and gated recurrent unit (GRU). The proposed model uses a dataset of four different districts in Rajasthan, each with unique solar irradiance patterns. Firstly, the data was preprocessed and then trained with the optimized parameters of the standalone and hybrid models and compared. It can be observed that the proposed hybrid model (CNN-LSTM-GRU) consistently outperformed all other models regarding Mean absolute error (MAE) and Root mean squared error (RMSE). The experimental results demonstrate that the proposed method forecasts accurate GHI with a RMSE of 0.00731, 0.00730, 0.00775, 0.00810 and MAE of 0.00516, 0.00524, 0.00552, 0.00592 for Barmer, Jaisalmer, Jodhpur and Bikaner respectively. This indicates that the model is better at minimizing prediction errors and providing more accurate GHI estimates. Additionally, the proposed model achieved a higher coefficient of determination (R2), suggesting that it best fits the dataset. A higher R2 value signifies that the proposed model could explain a significant portion of the variance in the GHI dataset, further emphasizing its predictive capabilities. In conclusion, this work demonstrates the effectiveness of the hybrid algorithm in improving adaptability and enhancing prediction accuracy for GHI estimation.
PMID:38871320 | DOI:10.1016/j.scitotenv.2024.173958
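The three figures of merit named in the abstract above (MAE, RMSE and the coefficient of determination) follow standard definitions. A minimal sketch in plain Python is given here for reference; the function name and the toy values are illustrative and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R2) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # R2 = 1 - SS_res / SS_tot, where SS_tot is measured around the mean of y_true.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ZeroDivisionError("R2 is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot

    return mae, rmse, r2


# Toy example (made-up numbers): one of four predictions is off by 1.
mae, rmse, r2 = regression_metrics([1, 2, 3, 4], [1, 2, 3, 5])
```

Because RMSE squares the errors before averaging, it is never smaller than MAE on the same data, which matches the ordering of the values reported for each district.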
Using artificial intelligence and deep learning to optimise the selection of adult congenital heart disease patients in S-ICD screening
Indian Pacing Electrophysiol J. 2024 Jun 11:S0972-6292(24)00073-1. doi: 10.1016/j.ipej.2024.06.003. Online ahead of print.
ABSTRACT
INTRODUCTION: The risk of complications associated with transvenous ICDs makes the subcutaneous implantable cardiac defibrillator (S-ICD) a valuable alternative in patients with adult congenital heart disease (ACHD). However, higher S-ICD ineligibility and higher inappropriate shock rates, mostly caused by T wave oversensing (TWO), are observed in this population. We report a novel application of deep learning methods to screen patients for S-ICD eligibility over a longer period than conventional screening.
METHODS: Adult patients with ACHD and a control group of normal subjects were fitted with 24-h Holter monitors to record their S-ICD vectors. Their T:R ratio was analysed utilising phase space reconstruction matrices and a deep learning-based model to provide an in-depth description of the T:R variation plot for each vector. T:R variation was compared statistically using a t-test.
RESULTS: 13 patients (age 37.4 ± 7.89 years, 61.5 % male, 6 ACHD and 7 control subjects) were enrolled. A significant difference was observed in the mean and median T:R values between the two groups (p < 0.001). There was also a significant difference in the standard deviation of T:R between both groups (p = 0.04).
CONCLUSIONS: T:R ratio, a main determinant for S-ICD eligibility, is significantly higher with more tendency to fluctuate in ACHD patients when compared to a population with normal hearts. We hypothesise that our novel model could be used to select S-ICD eligible patients by better characterisation of T:R ratio, reducing the risk of TWO and inappropriate shocks in the ACHD patient cohort.
PMID:38871179 | DOI:10.1016/j.ipej.2024.06.003
The prediction of Recombination Hotspot Based on Automated Machine Learning
J Mol Biol. 2024 Jun 11:168653. doi: 10.1016/j.jmb.2024.168653. Online ahead of print.
ABSTRACT
Meiotic recombination plays a pivotal role in genetic evolution. Genetic variation induced by recombination is a crucial factor in generating biodiversity and a driving force for evolution. At present, the development of recombination hotspot prediction methods has encountered challenges related to insufficient feature extraction and limited generalization capabilities. This paper focuses on recombination hotspot prediction methods. We explored deep learning-based recombination hotspot prediction and scrutinized the shortcomings of prevalent models in addressing this task. To address these deficiencies, an automated machine learning approach was utilized to construct a recombination hotspot prediction model. The model combined sequence information with physicochemical properties by employing TF-IDF-Kmer and DNA composition components to acquire more effective feature data. Experimental results validate the effectiveness of the feature extraction method and automated machine learning technology used in this study. The final model was validated on three distinct datasets and yielded accuracy rates of 97.14%, 79.71%, and 98.73%, surpassing the current leading models by 2%, 2.56%, and 4%, respectively. In addition, we incorporated tools such as SHAP and AutoGluon to analyze the interpretability of black-box models, delved into the impact of individual features on the results, and investigated the reasons behind misclassification of samples. Finally, a website for recombination hotspot prediction was established to facilitate easy access to necessary information and tools for researchers. The research outcomes of this paper underscore the enormous potential of automated machine learning methods in gene sequence prediction.
PMID:38871176 | DOI:10.1016/j.jmb.2024.168653
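The abstract above mentions TF-IDF-Kmer features without giving a formula. The sketch below shows one common way to weight k-mer counts by TF-IDF, assuming overlapping k-mers, term frequency normalised by sequence length and an unsmoothed logarithmic inverse document frequency. These choices, the function names and the default k are assumptions for illustration and may differ from the paper's implementation.

```python
import math
from collections import Counter


def kmers(sequence, k):
    """Overlapping substrings of length k (empty list if the sequence is shorter than k)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def tfidf_kmer(sequences, k=3):
    """Return one {kmer: tf-idf weight} dict per input DNA sequence."""
    counts = [Counter(kmers(seq, k)) for seq in sequences]
    n_docs = len(counts)

    # Document frequency: in how many sequences each k-mer occurs at least once.
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())

    features = []
    for c in counts:
        total = sum(c.values())
        features.append({
            kmer: (count / total) * math.log(n_docs / doc_freq[kmer])
            for kmer, count in c.items()
        })
    return features


# Toy example: "AA" occurs in both sequences, so its weight is zero;
# "AT" occurs only in the second one and receives a positive weight.
weights = tfidf_kmer(["AAAA", "AAAT"], k=2)
```

With this weighting, k-mers shared by every sequence contribute nothing, so the features emphasise k-mers that discriminate between sequences.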
SEP-AlgPro: An efficient allergen prediction tool utilizing traditional machine learning and deep learning techniques with protein language model features
Int J Biol Macromol. 2024 Jun 11:133085. doi: 10.1016/j.ijbiomac.2024.133085. Online ahead of print.
ABSTRACT
Allergy is a hypersensitive condition in which individuals develop objective symptoms when exposed to harmless substances at a dose that would cause no harm to a "normal" person. Most current computational methods for allergen identification rely on homology or conventional machine learning using a limited set of feature descriptors or validation on specific datasets, making them inefficient and inaccurate. Here, we propose SEP-AlgPro for the accurate identification of allergen proteins from sequence information. We analyzed 10 conventional protein-based features and 14 different features derived from protein language models to gauge their effectiveness in differentiating allergens from non-allergens using 15 different classifiers. However, the final optimized model employs the top 10 feature descriptors with the top seven machine learning classifiers. Results show that the features derived from protein language models exhibit superior discriminative capabilities compared to traditional feature sets. This enabled us to select the most discriminatory baseline models, whose predicted outputs were aggregated and used as input to a deep neural network for the final allergen prediction. Extensive case studies showed that SEP-AlgPro outperforms state-of-the-art predictors in accurately identifying allergens. A user-friendly web server was developed and made freely available at https://balalab-skku.org/SEP-AlgPro/, making it a powerful tool for identifying potential allergens.
PMID:38871100 | DOI:10.1016/j.ijbiomac.2024.133085
DP-site: A dual deep learning-based method for protein-peptide interaction site prediction
Methods. 2024 Jun 11:S1046-2023(24)00143-9. doi: 10.1016/j.ymeth.2024.06.001. Online ahead of print.
ABSTRACT
BACKGROUND: Protein-peptide interaction prediction is an important topic for several applications, including understanding biological processes, protein function, and abnormal cellular behaviors, as well as drug discovery and the treatment of diseases. Over the years, studies have shown that experimental methods have improved the identification of this bio-molecular interaction. However, predicting protein-peptide interactions using these methods is laborious, time-consuming, dependent on third-party tools, and costly.
METHOD: To address these previous drawbacks, this study introduces a computational framework called DP-Site. The proposed framework concentrates on using a compound of a dual pipeline along with a combination predictor. A deep convolutional neural network for feature extraction and classification is embedded in pipeline 1. In addition, pipeline 2 includes a deep long-short-term memory-based and a random forest classifier for feature extraction and classification. In this investigation, the evolutionary, structure-based, sequence-based, and physicochemical information of proteins is utilized for identifying protein-peptide interaction at the residue level.
RESULTS: The proposed method is evaluated on both the ten-fold cross-validation and independent test sets. The robust and consistent results between cross-validation and independent test sets confirm the ability of the proposed method to predict peptide binding residues in proteins. Moreover, experimental findings demonstrate that DP-Site has significantly outperformed other state-of-the-art sequence-based and structure-based methods. The proposed method achieves a remarkable balance between a specificity of 0.799 and a sensitivity of 0.770, along with the best f-measure of 0.661 and the highest precision of 0.580 using an independent test set.
CONCLUSIONS: The outcomes of various experiments confirm the proficiency of the proposed method, which outperforms state-of-the-art sequence-based and structure-based methods in terms of the mentioned criteria. DP-Site can be accessed at https://github.com/shafiee 95/shima.shafiee.DP-Site.
PMID:38871095 | DOI:10.1016/j.ymeth.2024.06.001
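The F-measure quoted in the abstract above can be cross-checked against the reported precision and sensitivity, since the F1 score is the harmonic mean of the two. The short sketch below performs that arithmetic; the function name is illustrative, and the small mismatch in the third decimal is presumably due to the inputs being rounded in the abstract.

```python
def f_measure(precision, recall):
    """F1 score: harmonic mean of precision and recall (sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Values reported for the independent test set in the abstract.
reported_precision = 0.580
reported_sensitivity = 0.770

f1 = f_measure(reported_precision, reported_sensitivity)
# f1 is about 0.6616, in line with the reported F-measure of 0.661.
```

The relatively low precision next to a specificity of 0.799 is what one would expect when binding residues are a minority class, although the abstract does not state the class balance.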
Prediction of epidermal growth factor receptor mutation subtypes in non-small cell lung cancer from hematoxylin and eosin-stained slides using deep learning
Lab Invest. 2024 Jun 11:102094. doi: 10.1016/j.labinv.2024.102094. Online ahead of print.
ABSTRACT
Accurate assessment of epidermal growth factor receptor (EGFR) mutation status and subtype are critical for the treatment of non-small cell lung cancer (NSCLC) patients. Conventional molecular testing methods for detecting EGFR mutations have limitations. In this study, an artificial intelligence-powered deep learning framework was developed for weakly supervised prediction of EGFR mutations in NSCLC from hematoxylin and eosin (H&E)-stained histopathology whole-slide images (WSIs). The study cohort was partitioned into training and validation subsets. Foreground regions containing tumor tissue were extracted from WSIs. A convolutional neural network (CNN) employing a contrastive learning paradigm was implemented to extract patch-level morphological features. These features were aggregated using a vision-transformer-based model to predict EGFR mutation status and classify patient cases. The established prediction model was validated on unseen datasets. In internal validation with a cohort from (USTC)(n=172), the model achieved patient-level areas under the receiver operating characteristic (ROC) curve (AUCs) of 0.927 and 0.907, sensitivities of 81.6% and 93.0%, and specificities of 83.3% and 92.3%, for surgical resection and biopsy specimens in EGFR mutation subtype prediction, respectively. External validation with cohorts from the Second Affiliated Hospital of Anhui Medical University (AMU) and the First Affiliated Hospital of Wannan Medical College (WMC) (n=193) yielded patient-level AUCs of 0.849 and 0.871, sensitivities of 75.7% and 72.1%, and specificities of 90.5% and 90.3% for surgical and biopsy specimens, respectively. Further validation with The Cancer Genome Atlas (TCGA) dataset (n=81) showed an AUC of 0.861, sensitivity of 84.6%, and specificity of 90.5%. Deep learning solutions demonstrate potential advantages for automated, non-invasive, fast, cost-effective, and accurate inference of EGFR alterations from histomorphology. 
Integration of such artificial intelligence frameworks into routine digital pathology workflows could augment existing molecular testing pipelines.
PMID:38871058 | DOI:10.1016/j.labinv.2024.102094
Binary classification of dead detector elements in flat panel detectors using convolutional neural networks
Biomed Phys Eng Express. 2024 Jun 13. doi: 10.1088/2057-1976/ad57cd. Online ahead of print.
ABSTRACT
This work aims to provide a novel deep learning technique that can be used to generate dead detector maps for flat panel detectors in the absence of ground truth maps. These maps are useful in monitoring the overall health of a flat panel detector, and in many cases are not readily available to the medical physicist responsible for quality assurance.
Approach: We greatly expand upon a previous work by providing a novel technique for classifying dead detector elements at single pixel resolution. We also demonstrate that this technique can be trained on one detector, and then tested and validated on another with moderate success, which demonstrates some ability to generalize to different detectors. The technique requires 3 flat field, or "noise", images to be taken to predict the dead detector element maps for the system.
Main Results: Models using only for-processing pixel data were unable to successfully generalize from one detector to the other. Models preprocessed using the standard deviation across three for-processing images were able to classify dead detector element maps with an F1 score ranging from 0.4527 to 0.8107 and recall ranging from 0.5420 to 0.9303 with better performance, on average, observed using the low exposure data set. 
Significance: Many physicists do not have access to the dead detector maps for their diagnostic systems. CNNs are capable of predicting the dead detector maps of flat panel detectors with single pixel resolution. Physicists can implement this tool by acquiring three flat field images and then inputting it into the model. Model performance saw a marginal increase when trained on the low exposure set data, as opposed to the high exposure set data, indicating high exposure, low relative noise images may not be necessary for optimal performance. Model performance across detectors manufactured by different vendors requires further investigation. 
PMID:38870913 | DOI:10.1088/2057-1976/ad57cd
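The preprocessing step described in the abstract above, taking the standard deviation across three flat field images, can be sketched as a per-pixel computation. The version below uses plain nested lists and the population standard deviation; the function name, the image layout and the choice of population rather than sample standard deviation are assumptions, and the sketch covers only the preprocessing, not the CNN classifier.

```python
from statistics import pstdev


def pixelwise_std(images):
    """Per-pixel standard deviation across a stack of equally sized 2D images.

    images: list of images, each a list of rows, each row a list of pixel values.
    Returns a single 2D list with the same height and width.
    """
    if not images:
        raise ValueError("at least one image is required")
    height, width = len(images[0]), len(images[0][0])
    if any(len(img) != height or any(len(row) != width for row in img) for img in images):
        raise ValueError("all images must have the same shape")

    return [
        [pstdev(img[r][c] for img in images) for c in range(width)]
        for r in range(height)
    ]


# Toy 2x2 example (made-up values): the pixels at (0, 1) and (1, 1) never change
# between exposures, so their standard deviation is zero.
flat_fields = [
    [[10, 5], [7, 0]],
    [[12, 5], [9, 0]],
    [[14, 5], [8, 0]],
]
std_map = pixelwise_std(flat_fields)
```

A detector element that does not respond to radiation shows little or no frame-to-frame noise, which is presumably why a standard deviation map makes such elements easier for a model to pick out than raw pixel values.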
Scaling DEPP phylogenetic placement to ultra-large reference trees: a tree-aware ensemble approach
Bioinformatics. 2024 Jun 13:btae361. doi: 10.1093/bioinformatics/btae361. Online ahead of print.
ABSTRACT
MOTIVATION: Phylogenetic placement of a query sequence on a backbone tree is increasingly used across biomedical sciences to identify the content of a sample from its DNA content. The accuracy of such analyses depends on the density of the backbone tree, making it crucial that placement methods scale to very large trees. Moreover, a new paradigm has been recently proposed to place sequences on the species tree using single-gene data. The goal is to better characterize the samples and to enable combined analyses of marker-gene (e.g., 16S rRNA gene amplicon) and genome-wide data. The recent method DEPP enables performing such analyses using metric learning. However, metric learning is hampered by a need to compute and save a quadratically growing matrix of pairwise distances during training. Thus, the training phase of DEPP does not scale to more than roughly ten thousand backbone species, a problem that we faced when trying to use our recently released Greengenes2 (GG2) reference tree containing 331,270 species.
RESULTS: This paper explores divide-and-conquer for training ensembles of DEPP models, culminating in a method called C-DEPP. While divide-and-conquer has been extensively used in phylogenetics, applying divide-and-conquer to data-hungry machine learning methods needs nuance. C-DEPP uses carefully crafted techniques to enable quasi-linear scaling while maintaining accuracy. C-DEPP enables placing twenty million 16S fragments on the GG2 reference tree in 41 hours of computation.
AVAILABILITY AND IMPLEMENTATION: The dataset and C-DEPP software are freely available at https://github.com/yueyujiang/dataset_cdepp/.
SUPPLEMENTARY INFORMATION: Supplementary note is available at Bioinformatics online.
PMID:38870525 | DOI:10.1093/bioinformatics/btae361
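The scaling problem described in the abstract above can be made concrete with a rough back-of-the-envelope calculation. The sketch below assumes a dense, full n × n matrix stored as 4-byte floats; the abstract does not state how DEPP actually stores its distances, so both the storage layout and the helper name `pairwise_matrix_gib` are illustrative assumptions.

```python
def pairwise_matrix_gib(n_species: int, bytes_per_value: int = 4) -> float:
    """Memory in GiB for a dense n x n pairwise distance matrix.

    Assumption: every entry is stored (no symmetry savings) using
    bytes_per_value bytes per entry (4 = float32).
    """
    return n_species * n_species * bytes_per_value / 2**30


# Compare the rough DEPP training limit with the GG2 tree size
# quoted in the abstract.
for n in (10_000, 331_270):
    print(f"{n:>7} species: {pairwise_matrix_gib(n):8.1f} GiB")
```

Under these assumptions, ten thousand species need well under 1 GiB, while 331,270 species need more than 400 GiB, which is consistent with the motivation for training on subsets of the tree rather than on the full matrix.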
Deep learning based ECG segmentation for delineation of diverse arrhythmias
PLoS One. 2024 Jun 13;19(6):e0303178. doi: 10.1371/journal.pone.0303178. eCollection 2024.
ABSTRACT
Accurate delineation of key waveforms in an ECG is a critical step in extracting relevant features to support the diagnosis and treatment of heart conditions. Although deep learning based methods using segmentation models to locate P, QRS, and T waves have shown promising results, their ability to handle arrhythmias has not been studied in any detail. In this paper we investigate the effect of arrhythmias on delineation quality and develop strategies to improve performance in such cases. We introduce a U-Net-like segmentation model for ECG delineation with a particular focus on diverse arrhythmias. This is followed by a post-processing algorithm which removes noise and automatically determines the boundaries of P, QRS, and T waves. Our model has been trained on a diverse dataset and evaluated against the LUDB and QTDB datasets to show strong performance, with F1-scores exceeding 99% for QRS and T waves, and over 97% for P waves in the LUDB dataset. Furthermore, we assess various models across a wide array of arrhythmias and observe that models with a strong performance on standard benchmarks may still perform poorly on arrhythmias that are underrepresented in these benchmarks, such as tachycardias. We propose solutions to address this discrepancy.
PMID:38870233 | DOI:10.1371/journal.pone.0303178
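The abstract above mentions a post-processing step that removes noise and determines wave boundaries from the segmentation output. A minimal sketch of that general idea follows; it is not the authors' algorithm. It assumes a hypothetical per-sample label encoding (0 = background, 1 = P, 2 = QRS, 3 = T) and a hypothetical `min_len` threshold below which a run of labels is treated as noise.

```python
from itertools import groupby

# Hypothetical label encoding; 0 is background and is never reported.
WAVE_NAMES = {1: "P", 2: "QRS", 3: "T"}


def mask_to_intervals(labels, min_len=3):
    """Turn per-sample class labels into (wave, onset, offset) tuples.

    Runs of identical labels shorter than min_len samples are treated
    as noise and dropped. Onset and offset are inclusive sample indices.
    """
    intervals = []
    pos = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        if label in WAVE_NAMES and length >= min_len:
            intervals.append((WAVE_NAMES[label], pos, pos + length - 1))
        pos += length
    return intervals


# A toy mask with one spurious single-sample QRS prediction at index 7.
mask = [0, 0, 1, 1, 1, 1, 0, 2, 0, 2, 2, 2, 2, 0, 0, 3, 3, 3, 3, 3, 0]
print(mask_to_intervals(mask))
# → [('P', 2, 5), ('QRS', 9, 12), ('T', 15, 19)]
```

This version only discards short runs; a fuller post-processor might also merge fragments of the same wave that are separated by a brief gap, which matters for irregular rhythms.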
Investigating molecular descriptors in cell-penetrating peptides prediction with deep learning: Employing N, O, and hydrophobicity according to the Eisenberg scale
PLoS One. 2024 Jun 13;19(6):e0305253. doi: 10.1371/journal.pone.0305253. eCollection 2024.
ABSTRACT
Cell-penetrating peptides comprise a group of molecules that can naturally cross the lipid bilayer membrane that protects cells, sharing physicochemical and structural properties, and having several pharmaceutical applications, particularly in drug delivery. Investigations of molecular descriptors have provided not only an improvement in the performance of classifiers but also less computational complexity and an enhanced understanding of membrane permeability. Furthermore, the employment of new technologies, such as the construction of deep learning models using overfitting treatment, offers advantages in tackling this problem. In this study, the descriptors nitrogen, oxygen, and hydrophobicity on the Eisenberg scale were investigated, using the proposed ConvBoost-CPP, composed of an improved convolutional neural network with overfitting treatment and an XGBoost model with adjusted hyperparameters. The results favored the use of ConvBoost-CPP with nitrogen, oxygen, and hydrophobicity as input together with ten other descriptors previously investigated in this research line, showing an increase in accuracy from 88% to 91.2% in cross-validation and from 82.6% to 91.3% in the independent test.
PMID:38870192 | DOI:10.1371/journal.pone.0305253
Research on load clustering algorithm based on variational autoencoder and hierarchical clustering
PLoS One. 2024 Jun 13;19(6):e0303977. doi: 10.1371/journal.pone.0303977. eCollection 2024.
ABSTRACT
Time series data complexity presents new challenges in clustering analysis across fields such as electricity, energy, industry, and finance. Despite advances in representation learning and clustering with Variational Autoencoders (VAE) based deep learning techniques, issues like the absence of discriminative power in feature representation, the disconnect between instance reconstruction and clustering objectives, and scalability challenges with large datasets persist. This paper introduces a novel deep time series clustering approach integrating VAE with metric learning. It leverages a VAE based on Gated Recurrent Units for temporal feature extraction, incorporates metric learning for joint optimization of latent space representation, and employs the sum of log likelihoods as the clustering merging criterion, markedly improving clustering accuracy and interpretability. Experimental findings demonstrate a 27.16% improvement in average clustering accuracy and a 47.15% increase in speed on industrial load data. This study offers novel insights and tools for the thorough analysis and application of time series data, with further exploration of VAE's potential in time series clustering anticipated in future research.
PMID:38870191 | DOI:10.1371/journal.pone.0303977
FaceTouch: Detecting hand-to-face touch with supervised contrastive learning to assist in tracing infectious diseases
PLoS One. 2024 Jun 13;19(6):e0288670. doi: 10.1371/journal.pone.0288670. eCollection 2024.
ABSTRACT
Many viruses and diseases frequently spread from one person to another through the respiratory system. Covid-19 served as an example of how crucial it is to track down and cut back on contacts to stop its spread. There is a clear gap in finding automatic methods that can detect hand-to-face contact in complex urban scenes or indoors. In this paper, we introduce a computer vision framework, called FaceTouch, based on deep learning. It comprises deep sub-models to detect humans and analyse their actions. FaceTouch seeks to detect hand-to-face touches in the wild, such as through video chats, bus footage, or CCTV feeds. Despite partial occlusion of faces, the introduced system learns to detect face touches from the RGB representation of a given scene by utilising the representation of body gestures such as arm movement. This has been demonstrated to be useful in complex urban scenarios beyond simply identifying hand movement and its closeness to faces. Relying on Supervised Contrastive Learning, the introduced model is trained on our collected dataset, given the absence of other benchmark datasets. The framework validates strongly on unseen datasets, which opens the door for potential deployment.
PMID:38870182 | DOI:10.1371/journal.pone.0288670