Deep learning
Artificial intelligence methods for modeling gasification of waste biomass: a review
Environ Monit Assess. 2024 Feb 26;196(3):309. doi: 10.1007/s10661-024-12443-2.
ABSTRACT
Gasification is a highly promising thermochemical process that shows considerable potential for the efficient conversion of waste biomass into syngas. The assessment of the feasibility and comparative advantages of different biomass and waste gasification schemes is contingent upon a multifaceted combination of interrelated criteria. Conventional analytical approaches employed to facilitate decision-making rely on a multitude of inadequately defined parameters. Consequently, substantial efforts have been directed toward enhancing the efficiency and productivity of thermochemical conversion processes. In recent times, artificial intelligence (AI)-based models and algorithms have gained prominence, serving as indispensable tools for expediting these processes and formulating strategies to address the growing demand for energy. Notably, machine learning (ML) and deep learning (DL) have emerged as cutting-edge AI models, demonstrating exceptional effectiveness and profound relevance in the realm of thermochemical conversion systems. This study provides an overview of the ML and DL approaches utilized during gasification and evaluates their benefits and drawbacks. AI algorithms are used across many industries and applications related to energy conversion systems; predicting the output of conversion systems and optimizing their operation are two critical applications. This review sheds light on the burgeoning utility of AI, particularly ML and DL, which have garnered significant attention due to their applications in productivity prediction, process optimization, real-time process monitoring, and control. Furthermore, the integration of hybrid models has become commonplace, primarily owing to their demonstrated success in modeling and optimization tasks.
Importantly, the adoption of these algorithms significantly enhances the model's capability to tackle intricate challenges, as DL methodologies have evolved to offer heightened accuracy and reduced susceptibility to errors. Within the scope of this study, an exhaustive exploration of ML and DL techniques and their applications has been conducted, uncovering existing research knowledge gaps. Based on a comprehensive critical analysis, this review offers recommendations for future research directions, accentuating the pivotal findings and conclusions derived from the study.
PMID:38407668 | DOI:10.1007/s10661-024-12443-2
Deep-learning-assisted spectroscopic single-molecule localization microscopy based on spectrum-to-spectrum denoising
Nanoscale. 2024 Feb 26. doi: 10.1039/d3nr05870k. Online ahead of print.
ABSTRACT
Spectroscopic single-molecule localization microscopy (sSMLM) simultaneously captures spatial localizations and spectral signatures, enabling multiplexed and functional subcellular imaging applications. However, extracting accurate spectral information in sSMLM remains challenging due to the poor signal-to-noise ratio (SNR) of spectral images, imposed by the limited photon budget of single-molecule fluorescence emission and inherent electronic noise during image acquisition with digital cameras. Here, we report a novel spectrum-to-spectrum (Spec2Spec) framework, a self-supervised deep-learning network that can significantly suppress noise and accurately recover low-SNR emission spectra from a single-molecule localization event. The training strategy of Spec2Spec was designed for sSMLM data by exploiting correlated spectral information in spatially adjacent pixels, which contain independent noise. By validating the qualitative and quantitative performance of Spec2Spec on simulated and experimental sSMLM data, we demonstrated that Spec2Spec can improve the SNR and the structural similarity index measure (SSIM) of single-molecule spectra by about 6-fold and 3-fold, respectively, further facilitating 94.6% spectral classification accuracy and a nearly 100% data utilization ratio in dual-color sSMLM imaging.
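The core idea, that spatially adjacent pixels observe the same spectrum but carry independent noise, can be illustrated without the network itself. A minimal numpy sketch (the Gaussian emission spectrum, noise level, and naive two-pixel averaging are illustrative assumptions, not the paper's trained Spec2Spec model):

```python
import numpy as np

rng = np.random.default_rng(0)

def snr_db(clean, noisy):
    """Signal-to-noise ratio of a noisy spectrum in dB."""
    noise = noisy - clean
    return 10 * np.log10(np.sum(clean**2) / np.sum(noise**2))

# Clean single-molecule emission spectrum: a Gaussian peak over wavelength.
wavelength = np.linspace(550, 750, 256)
clean = np.exp(-((wavelength - 660) ** 2) / (2 * 15**2))

# Two spatially adjacent pixels see the same underlying spectrum
# but carry independent shot/readout noise.
noisy_a = clean + rng.normal(0, 0.3, clean.shape)
noisy_b = clean + rng.normal(0, 0.3, clean.shape)

# The noise2noise principle behind self-supervised denoising: regressing one
# noisy observation onto an independently noisy copy converges toward the
# clean signal; even naive averaging halves the noise variance (~3 dB gain).
denoised = 0.5 * (noisy_a + noisy_b)

print(f"single-pixel SNR : {snr_db(clean, noisy_a):.1f} dB")
print(f"averaged SNR     : {snr_db(clean, denoised):.1f} dB")
```

A trained network exploits the same pairing but learns a far stronger prior than averaging, which is how the reported ~6-fold SNR gain becomes possible.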
PMID:38407360 | DOI:10.1039/d3nr05870k
Direct Observation and Automated Measurement of Stomatal Responses to Pseudomonas syringae pv. tomato DC3000 in Arabidopsis thaliana
J Vis Exp. 2024 Feb 9;(204). doi: 10.3791/66112.
ABSTRACT
Stomata are microscopic pores found in the plant leaf epidermis. Regulation of stomatal aperture is pivotal not only for balancing carbon dioxide uptake for photosynthesis and transpirational water loss but also for restricting bacterial invasion. While plants close stomata upon recognition of microbes, pathogenic bacteria, such as Pseudomonas syringae pv. tomato DC3000 (Pto), reopen the closed stomata to gain access into the leaf interior. In conventional assays for assessing stomatal responses to bacterial invasion, leaf epidermal peels, leaf discs, or detached leaves are floated on bacterial suspension, and then stomata are observed under a microscope followed by manual measurement of stomatal aperture. However, these assays are cumbersome and may not reflect stomatal responses to natural bacterial invasion in a leaf attached to the plant. Recently, a portable imaging device was developed that can observe stomata by pinching a leaf without detaching it from the plant, together with a deep learning-based image analysis pipeline designed to automatically measure stomatal aperture from leaf images captured by the device. Here, building on these technical advances, a new method to assess stomatal responses to bacterial invasion in Arabidopsis thaliana is introduced. This method consists of three simple steps: spray inoculation of Pto mimicking natural infection processes, direct observation of stomata on a leaf of the Pto-inoculated plant using the portable imaging device, and automated measurement of stomatal aperture by the image analysis pipeline. This method was successfully used to demonstrate stomatal closure and reopening during Pto invasion under conditions that closely mimic the natural plant-bacteria interaction.
PMID:38407316 | DOI:10.3791/66112
Deep Learning based Retinal Vessel Caliber Measurement and the Association with Hypertension
Curr Eye Res. 2024 Feb 26:1-11. doi: 10.1080/02713683.2024.2319755. Online ahead of print.
ABSTRACT
PURPOSE: To develop a highly efficient and fully automated method that measures retinal vessel caliber using digital retinal photographs and evaluate the association between retinal vessel caliber and hypertension.
METHODS: The subjects of this study were from two sources in Beijing, China: a hypertension case-control study from Tongren Hospital (Tongren study) and a community-based atherosclerosis cohort from Peking University First Hospital (Shougang study). Retinal vessel segmentation and arteriovenous classification were achieved simultaneously by a customized deep learning model. Two experienced ophthalmologists evaluated whether retinal vessels were correctly segmented and classified, and the ratio of incorrectly segmented and classified retinal vessels was used to measure the accuracy of the model's recognition. Central retinal artery equivalents, central retinal vein equivalents, and the arteriolar-to-venular diameter ratio were computed to analyze the association between retinal vessel caliber and the risk of hypertension. The association was then compared to that derived from the widely used semi-automated software (Integrative Vessel Analysis).
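Central retinal artery/vein equivalents are conventionally summarized from the calibers of the six largest vessels. A sketch assuming the widely used Knudtson revision of the Parr-Hubbard formulas (branch coefficients 0.88 for arterioles, 0.95 for venules); whether this study used exactly this procedure is an assumption, and the pairing loop below is a simplification of the published iteration:

```python
import numpy as np

def summarize_caliber(widths, k):
    """Iteratively pair the largest with the smallest caliber and combine
    with w = k * sqrt(w1^2 + w2^2) until one summary value remains
    (Knudtson-style revised formulas; k=0.88 arterioles, k=0.95 venules)."""
    w = sorted(float(x) for x in widths)
    while len(w) > 1:
        combined = k * np.hypot(w.pop(0), w.pop(-1))  # smallest + largest
        w.append(combined)
        w.sort()
    return w[0]

def retinal_summary(arteriole_widths, venule_widths):
    crae = summarize_caliber(arteriole_widths, 0.88)  # artery equivalent
    crve = summarize_caliber(venule_widths, 0.95)     # vein equivalent
    return crae, crve, crae / crve                    # AVR = CRAE / CRVE

# Hypothetical calibers (micrometers) of the six largest vessels of each type.
crae, crve, avr = retinal_summary([120, 115, 110, 100, 95, 90],
                                  [150, 145, 140, 130, 120, 115])
print(f"CRAE={crae:.1f} um, CRVE={crve:.1f} um, AVR={avr:.2f}")
```

An inverse association between these summary calibers and hypertension, as reported in the abstract, would appear as lower CRAE and AVR values in hypertensive subjects.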
RESULTS: The deep learning model achieved an arterial recognition error rate of 1.26%, a vein recognition error rate of 0.79%, and a total error rate of 1.03%. Central retinal artery equivalents and arteriolar-to-venular diameter ratio measured by both Integrative Vessel Analysis and deep learning methods were inversely associated with the odds of hypertension in both Tongren and Shougang studies. The comparisons of areas under the receiver operating characteristic curves from the proposed deep learning method and Integrative Vessel Analysis were all not significantly different (p > .05).
CONCLUSION: The proposed deep learning method showed a comparable diagnostic value to Integrative Vessel Analysis software. Compared with semi-automatic software, our deep learning model has significant advantage in efficiency and can be applied to population screening and risk evaluation.
PMID:38407139 | DOI:10.1080/02713683.2024.2319755
Identification and Structural Characterization of Twisted Atomically Thin Bilayer Materials by Deep Learning
Nano Lett. 2024 Feb 26. doi: 10.1021/acs.nanolett.3c04815. Online ahead of print.
ABSTRACT
Two-dimensional materials are expected to play an important role in next-generation electronic and optoelectronic devices. Recently, twisted bilayer graphene and transition metal dichalcogenides have attracted significant attention due to their unique physical properties and potential applications. In this study, we describe the use of optical microscopy to collect the color space of chemical vapor deposition (CVD)-grown molybdenum disulfide (MoS2) and the application of a semantic segmentation convolutional neural network (CNN) to accurately and rapidly identify the thicknesses of MoS2 flakes. A second CNN model is trained to provide precise predictions of the twist angle of CVD-grown bilayer flakes. This model harnessed a data set comprising over 10,000 synthetic images, encompassing geometries spanning from hexagonal to triangular shapes. Subsequent validation of the deep learning predictions on twist angles was executed through second-harmonic generation and Raman spectroscopy. Our results introduce a scalable methodology for automated inspection of twisted atomically thin CVD-grown bilayers.
PMID:38407030 | DOI:10.1021/acs.nanolett.3c04815
EnAMP: A novel deep learning ensemble antibacterial peptide recognition algorithm based on multi-features
J Bioinform Comput Biol. 2024 Feb 26:2450001. doi: 10.1142/S021972002450001X. Online ahead of print.
ABSTRACT
Antimicrobial peptides (AMPs), as preferred alternatives to antibiotics, have wide application and good prospects. Identifying AMPs through wet-lab experiments remains expensive, time-consuming, and challenging. Many machine learning methods have been proposed to predict AMPs and have achieved good results. In this work, we combine two kinds of word-embedding features with statistical features of peptide sequences to develop an ensemble classifier, named EnAMP, in which two deep neural networks are trained on Word2vec and GloVe word-embedding features of peptide sequences, respectively, while random forest and support vector machine classifiers are trained on statistical features of the sequences. The final prediction is the average of the four classifiers' outputs. Compared with other state-of-the-art algorithms on six datasets, EnAMP outperforms most existing models with similar computational costs; even when compared with high-computational-cost algorithms based on Bidirectional Encoder Representations from Transformers (BERT), the performance of our model is comparable. EnAMP source code and data are available at https://github.com/ruisue/EnAMP.
PMID:38406833 | DOI:10.1142/S021972002450001X
Optical imaging technologies for in vivo cancer detection in low-resource settings
Curr Opin Biomed Eng. 2023 Dec;28:100495. doi: 10.1016/j.cobme.2023.100495. Epub 2023 Aug 23.
ABSTRACT
Cancer continues to affect underserved populations disproportionately. Novel optical imaging technologies, which can provide rapid, non-invasive, and accurate cancer detection at the point of care, have great potential to improve global cancer care. This article reviews the recent technical innovations and clinical translation of low-cost optical imaging technologies, highlighting the advances in both hardware and software, especially the integration of artificial intelligence, to improve in vivo cancer detection in low-resource settings. Additionally, this article provides an overview of existing challenges and future perspectives of adapting optical imaging technologies into clinical practice, which can potentially contribute to novel insights and programs that effectively improve cancer detection in low-resource settings.
PMID:38406798 | PMC:PMC10883072 | DOI:10.1016/j.cobme.2023.100495
CAISHI: A benchmark histopathological H&E image dataset for cervical adenocarcinoma in situ identification, retrieval and few-shot learning evaluation
Data Brief. 2024 Feb 9;53:110141. doi: 10.1016/j.dib.2024.110141. eCollection 2024 Apr.
ABSTRACT
A benchmark histopathological Hematoxylin and Eosin (H&E) image dataset for cervical adenocarcinoma in situ (CAISHI) is established to fill the current data gap; it contains 2240 histopathological images, of which 1010 are images of normal cervical glands and another 1230 are images of cervical adenocarcinoma in situ (AIS). The sampling method is endoscopic biopsy. Pathological sections are obtained by H&E staining from Shengjing Hospital, China Medical University. These images have a magnification of 100× and are captured by an Axio Scope.A1 microscope. The size of each image is 3840 × 2160 pixels, and the format is ".png". The collection of CAISHI is subject to an ethical review by China Medical University with approval number 2022PS841K. These images are analyzed at multiple levels, including classification tasks and image retrieval tasks. A variety of computer vision and machine learning methods are used to evaluate the performance of the data. For classification tasks, a variety of classical machine learning classifiers such as k-means, support vector machines (SVM), and random forests (RF), as well as convolutional neural network classifiers such as Residual Network 50 (ResNet50), Vision Transformer (ViT), Inception version 3 (Inception-V3), and Visual Geometry Group Network 16 (VGG-16), are used. In addition, a Siamese network is used to evaluate few-shot learning tasks. For image retrieval, color features, texture features, and deep learning features are extracted, and their performances are tested. CAISHI can help with the early diagnosis and screening of cervical cancer. Researchers can use this dataset to develop new computer-aided diagnostic tools that could improve the accuracy and efficiency of cervical cancer screening and advance the development of automated diagnostic algorithms.
PMID:38406254 | PMC:PMC10885606 | DOI:10.1016/j.dib.2024.110141
Notation of Javanese Gamelan dataset for traditional music applications
Data Brief. 2024 Feb 6;53:110116. doi: 10.1016/j.dib.2024.110116. eCollection 2024 Apr.
ABSTRACT
The Javanese gamelan notation dataset comprises Javanese gamelan gendhing (song) notations for various gamelan instruments. This dataset includes 35 songs categorized into 7 song structures, which are similar to genres in modern music. Each song in this dataset includes the primary melody and notations for various instrument groups, including the balungan instrument group (saron, demung, and slenthem), the bonang barung and bonang penerus instruments, the peking instrument group, and the structural instrument group (kenong, kethuk, kempyang, kempul, and gong). The primary melody is derived from https://www.gamelanbvg.com/gendhing/index.php, a collection of Javanese gamelan songs, while the notation for each instrument group is our own creation, following the rules of gamelan playing on each instrument. Javanese gamelan songs are usually written with only the main melody in numerical notation, together with the characteristics of the song, such as the title, song structure type, rhythm, scale, and mode. This is naturally not easy for a beginner gamelan player, and a more complete notation makes it easier for anyone who wants to play gamelan. Each song is compiled into a sheet of music, which is presented as a Portable Document Format (PDF) file. This dataset is valuable for developing deep learning models to classify or recognize Javanese gamelan songs based on their instrument notations, as previous gamelan research has mostly used audio data. Furthermore, this dataset enables the automatic generation of Javanese gamelan notation for songs of similar types. Additionally, it will be useful for educational purposes, to facilitate the learning of Javanese gamelan songs, and for the preservation of traditional Javanese gamelan music.
PMID:38406241 | PMC:PMC10885543 | DOI:10.1016/j.dib.2024.110116
Establishment and comparison of <em>in situ</em> detection models for foodborne pathogen contamination on mutton based on SWIR-HSI
Front Nutr. 2024 Feb 9;11:1325934. doi: 10.3389/fnut.2024.1325934. eCollection 2024.
ABSTRACT
INTRODUCTION: Rapid and accurate detection of food-borne pathogens on mutton is of great significance to ensure the safety of mutton and its products and the health of consumers.
OBJECTIVES: The feasibility of short-wave infrared hyperspectral imaging (SWIR-HSI) in detecting the contamination status and species of Escherichia coli (EC), Staphylococcus aureus (SA) and Salmonella typhimurium (ST) contaminated on mutton was explored.
MATERIALS AND METHODS: Hyperspectral images of uncontaminated mutton samples and of samples contaminated with different concentrations (10^8, 10^7, 10^6, 10^5, 10^4, 10^3, and 10^2 CFU/mL) of EC, SA, and ST were acquired. A one-dimensional convolutional neural network (1D-CNN) model was constructed, and the influence of structural hyperparameters on the model was explored. The effects of different spectral preprocessing methods on partial least squares-discriminant analysis (PLS-DA), support vector machine (SVM), and 1D-CNN models were discussed. In addition, the feasibility of using characteristic wavelengths to establish simplified models was explored.
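The forward pass of such a 1D-CNN over a spectrum can be sketched with plain numpy. The kernel widths, the crude channel mixing by averaging (a real layer mixes channels with learned weights), and the random untrained parameters are simplifying assumptions, shown only to make the layer structure concrete:

```python
import numpy as np

rng = np.random.default_rng(42)

def conv1d(x, kernels):
    """Valid 1D convolution of a spectrum with a bank of kernels."""
    return np.stack([np.convolve(x, k, mode="valid") for k in kernels])

# Hypothetical SWIR reflectance spectrum (256 bands) and random weights.
spectrum = rng.normal(size=256)
layer1 = rng.normal(scale=0.1, size=(64, 7))   # 64 kernels, width 7
layer2 = rng.normal(scale=0.1, size=(16, 5))   # 16 kernels, width 5

h = np.tanh(conv1d(spectrum, layer1))          # (64, 250), tanh activation
h = np.tanh(conv1d(h.mean(axis=0), layer2))    # crude channel mix -> (16, 246)
features = h.max(axis=1)                       # global max pooling -> (16,)

# Linear head over 4 classes: uncontaminated / EC / SA / ST.
logits = features @ rng.normal(scale=0.1, size=(16, 4))
print("class scores:", logits)
```

The (64, 16) and (32, 16) kernel counts reported in the results correspond to the sizes of the two kernel banks in this structure.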
RESULTS AND DISCUSSION: The best full-band model was the 1D-CNN model established from the original spectra, with convolution kernel numbers of (64, 16) and the tanh activation function; its accuracies on the training, test, and external validation sets were 100.00%, 92.86%, and 97.62%, respectively. The optimal simplified model was the genetic algorithm-optimized support vector machine (GA-SVM). For discriminating the pathogen species, the accuracies of the SVM models established from full-band spectra preprocessed by 2D, and of all 1D-CNN models with convolution kernel numbers of (32, 16) and the tanh activation function, were 100.00%. In addition, the accuracies of all simplified models were 100.00% except for the 1D-CNN models. Considering the complexity of features and model calculation, the 1D-CNN models established from the original spectra were the optimal models for determining pathogenic bacteria contamination status and species. The simplified models provide a basis for developing multispectral detection instruments.
CONCLUSION: The results proved that SWIR-HSI combined with machine learning and deep learning could accurately detect foodborne pathogen contamination on mutton, and that the performance of the deep learning models was better than that of the machine learning models. This study can promote the application of HSI technology in the detection of foodborne pathogens on meat.
PMID:38406188 | PMC:PMC10884184 | DOI:10.3389/fnut.2024.1325934
Identifying Reproducibly Important EEG Markers of Schizophrenia with an Explainable Multi-Model Deep Learning Approach
bioRxiv [Preprint]. 2024 Feb 13:2024.02.09.579600. doi: 10.1101/2024.02.09.579600.
ABSTRACT
The diagnosis of schizophrenia (SZ) can be challenging due to its diverse symptom presentation. As such, many studies have sought to identify diagnostic biomarkers of SZ using explainable machine learning methods. However, the generalizability of identified biomarkers in many machine learning-based studies is highly questionable given that most studies only analyze explanations from a small number of models. In this study, we present (1) a novel feature interaction-based explainability approach and (2) several new approaches for summarizing multi-model explanations. We implement our approach within the context of electroencephalogram (EEG) spectral power data. We further analyze both training and test set explanations with the goal of extracting generalizable insights from the models. Importantly, our analyses identify effects of SZ upon the α, β, and θ frequency bands, the left hemisphere of the brain, and interhemispheric interactions across a majority of folds. We hope that our analysis will provide helpful insights into SZ and inspire the development of robust approaches for identifying neuropsychiatric disorder biomarkers from explainable machine learning models.
PMID:38405889 | PMC:PMC10888920 | DOI:10.1101/2024.02.09.579600
Automated 3D analysis of social head-gaze behaviors in freely moving marmosets
bioRxiv [Preprint]. 2024 Feb 18:2024.02.16.580693. doi: 10.1101/2024.02.16.580693.
ABSTRACT
Social communication relies on the ability to perceive and interpret the direction of others' attention, which is commonly conveyed through head orientation and gaze direction in both humans and non-human primates. However, traditional social gaze experiments in non-human primates require restraining head movements, which significantly limit their natural behavioral repertoire. Here, we developed a novel framework for accurately tracking facial features and three-dimensional head gaze orientations of multiple freely moving common marmosets (Callithrix jacchus). To accurately track the facial features of marmoset dyads in an arena, we adapted computer vision tools using deep learning networks combined with triangulation algorithms applied to the detected facial features to generate dynamic geometric facial frames in 3D space, overcoming common occlusion challenges. Furthermore, we constructed a virtual cone, oriented perpendicular to the facial frame, to model the head gaze directions. Using this framework, we were able to detect different types of interactive social gaze events, including partner-directed gaze and jointly-directed gaze to a shared spatial location. We observed clear effects of sex and familiarity on both interpersonal distance and gaze dynamics in marmoset dyads. Unfamiliar pairs exhibited more stereotyped patterns of arena occupancy, more sustained levels of social gaze across inter-animal distance, and increased gaze monitoring. On the other hand, familiar pairs exhibited higher levels of joint gazes. Moreover, males displayed significantly elevated levels of gazes toward females' faces and the surrounding regions irrespective of familiarity. Our study lays the groundwork for a rigorous quantification of primate behaviors in naturalistic settings.
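The virtual-cone test for partner-directed gaze reduces to an angle threshold between the head-gaze axis and the head-to-partner vector. A sketch (the 10° cone half-angle is an illustrative choice, not necessarily the study's parameter):

```python
import numpy as np

def is_gaze_directed(head_pos, gaze_dir, target_pos, half_angle_deg=10.0):
    """Return True if the target lies inside the virtual gaze cone: the
    angle between the head-gaze axis and the head-to-target vector is
    below the cone's half-angle."""
    to_target = np.asarray(target_pos, float) - np.asarray(head_pos, float)
    to_target /= np.linalg.norm(to_target)
    gaze = np.asarray(gaze_dir, float) / np.linalg.norm(np.asarray(gaze_dir, float))
    angle = np.degrees(np.arccos(np.clip(gaze @ to_target, -1.0, 1.0)))
    return bool(angle <= half_angle_deg)

# Marmoset A at the origin looking along +x; partner B slightly off-axis.
print(is_gaze_directed([0, 0, 0], [1, 0, 0], [2.0, 0.2, 0.0]))  # ~5.7 deg -> True
print(is_gaze_directed([0, 0, 0], [1, 0, 0], [0.0, 1.0, 0.0]))  # 90 deg  -> False
```

Joint gaze to a shared location follows by running the same test for both animals against the same target point and requiring both to return True.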
PMID:38405818 | PMC:PMC10888878 | DOI:10.1101/2024.02.16.580693
TopoFormer: Multiscale Topology-enabled Structure-to-Sequence Transformer for Protein-Ligand Interaction Predictions
Res Sq [Preprint]. 2024 Feb 9:rs.3.rs-3640878. doi: 10.21203/rs.3.rs-3640878/v1.
ABSTRACT
Pre-trained deep Transformers have had tremendous success in a wide variety of disciplines. However, in computational biology, essentially all Transformers are built upon the biological sequences, which ignores vital stereochemical information and may result in crucial errors in downstream predictions. On the other hand, three-dimensional (3D) molecular structures are incompatible with the sequential architecture of Transformer and natural language processing (NLP) models in general. This work addresses this foundational challenge by a topological Transformer (TopoFormer). TopoFormer is built by integrating NLP and a multiscale topology techniques, the persistent topological hyperdigraph Laplacian (PTHL), which systematically converts intricate 3D protein-ligand complexes at various spatial scales into a NLP-admissible sequence of topological invariants and homotopic shapes. Element-specific PTHLs are further developed to embed crucial physical, chemical, and biological interactions into topological sequences. TopoFormer surges ahead of conventional algorithms and recent deep learning variants and gives rise to exemplary scoring accuracy and superior performance in ranking, docking, and screening tasks in a number of benchmark datasets. The proposed topological sequences can be extracted from all kinds of structural data in data science to facilitate various NLP models, heralding a new era in AI-driven discovery.
PMID:38405777 | PMC:PMC10889053 | DOI:10.21203/rs.3.rs-3640878/v1
Evaluation and optimization of sequence-based gene regulatory deep learning models
bioRxiv [Preprint]. 2024 Feb 17:2023.04.26.538471. doi: 10.1101/2023.04.26.538471.
ABSTRACT
Neural networks have emerged as immensely powerful tools in predicting functional genomic regions, notably evidenced by recent successes in deciphering gene regulatory logic. However, a systematic evaluation of how model architectures and training strategies impact genomics model performance is lacking. To address this gap, we held a DREAM Challenge where competitors trained models on a dataset of millions of random promoter DNA sequences and corresponding expression levels, experimentally determined in yeast, to best capture the relationship between regulatory DNA and gene expression. For a robust evaluation of the models, we designed a comprehensive suite of benchmarks encompassing various sequence types. While some benchmarks produced similar results across the top-performing models, others differed substantially. All top-performing models used neural networks, but diverged in architectures and novel training strategies, tailored to genomics sequence data. To dissect how architectural and training choices impact performance, we developed the Prix Fixe framework to divide any given model into logically equivalent building blocks. We tested all possible combinations for the top three models and observed performance improvements for each. The DREAM Challenge models not only achieved state-of-the-art results on our comprehensive yeast dataset but also consistently surpassed existing benchmarks on Drosophila and human genomic datasets. Overall, we demonstrate that high-quality gold-standard genomics datasets can drive significant progress in model development.
PMID:38405704 | PMC:PMC10888977 | DOI:10.1101/2023.04.26.538471
Noninvasive total counting of cultured cells using a home-use scanner with a pattern sheet
iScience. 2024 Feb 9;27(3):109170. doi: 10.1016/j.isci.2024.109170. eCollection 2024 Mar 15.
ABSTRACT
The inherent variability in cell culture techniques hinders their reproducibility. To address this issue, we introduce a comprehensive cell observation device. This new approach enhances the features of existing home-use scanners by adding a pattern sheet. Compared with fluorescent staining, our method over- or underestimated the cell count by a mere 5%. The proposed technique showed a strong correlation with conventional methodologies, displaying R2 values of 0.91 and 0.99 compared with the standard chamber and fluorescence methods, respectively. Simulations of microscopic observations indicated the potential to accurately estimate the total cell count using just 20 fields of view. Our proposed cell-counting device offers a straightforward, noninvasive means of measuring the number of cultured cells. By harnessing the power of deep learning, this device ensures data integrity, making it an attractive option for future cell culture research.
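The extrapolation from ~20 fields of view to a total count is a simple sampling estimate: scale the mean per-field count by the total number of fields. A sketch with hypothetical Poisson-distributed field counts (the density and vessel size are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate per-field cell counts over a culture vessel of 1000 fields.
true_density = 180.0                                 # cells per field, hypothetical
all_fields = rng.poisson(true_density, size=1000)    # whole-vessel "ground truth"
total_true = all_fields.sum()

# Sample only 20 fields of view and extrapolate to the whole vessel.
sample = rng.choice(all_fields, size=20, replace=False)
total_est = sample.mean() * len(all_fields)

error_pct = 100 * abs(total_est - total_true) / total_true
print(f"true {total_true}, estimated {total_est:.0f}, error {error_pct:.1f}%")
```

Because the standard error of the mean shrinks as 1/sqrt(n), 20 fields already give a tight estimate when cells are fairly uniformly distributed; strong spatial clumping would require more fields.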
PMID:38405610 | PMC:PMC10884908 | DOI:10.1016/j.isci.2024.109170
Multi-modal physiological time-frequency feature extraction network for accurate sleep stage classification
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Feb 25;41(1):26-33. doi: 10.7507/1001-5515.202306010.
ABSTRACT
Sleep stage classification is essential for clinical disease diagnosis and sleep quality assessment. Most existing methods for sleep stage classification are based on a single-channel or single-modal signal and extract features using a single-branch deep convolutional network, which not only hinders the capture of diverse sleep-related features and increases the computational cost, but also affects the accuracy of sleep stage classification. To solve this problem, this paper proposes an end-to-end multi-modal physiological time-frequency feature extraction network (MTFF-Net) for accurate sleep stage classification. First, multi-modal physiological signals comprising the electroencephalogram (EEG), electrocardiogram (ECG), electrooculogram (EOG), and electromyogram (EMG) are converted into two-dimensional time-frequency images using the short-time Fourier transform (STFT). Then, a time-frequency feature extraction network combining a multi-scale EEG compact convolution network (Ms-EEGNet) and a bidirectional gated recurrent unit (Bi-GRU) network is used to obtain multi-scale spectral features related to sleep feature waveforms and time-series features related to sleep stage transitions. Following the American Academy of Sleep Medicine (AASM) EEG sleep stage classification criteria, the model achieved 84.3% accuracy on the five-class task on the third subgroup of the Institute of Systems and Robotics of the University of Coimbra Sleep Dataset (ISRUC-S3), with a macro F1 score of 83.1% and a Cohen's kappa coefficient of 79.8%. The experimental results show that the proposed model achieves higher classification accuracy and promotes the application of deep learning algorithms in assisting clinical decision-making.
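The first step, turning a 1D physiological signal into a time-frequency image via the STFT, can be sketched in numpy. The toy 10 Hz "EEG" signal, window length, and hop size are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def stft_magnitude(signal, win_len=64, hop=32):
    """Short-time Fourier transform magnitude: slide a Hann window along
    the signal and FFT each frame, yielding a time-frequency image."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack([signal[i * hop : i * hop + win_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T   # (freq bins, time frames)

# Toy single-channel "EEG" at 100 Hz: a 10 Hz oscillation plus noise.
fs = 100
t = np.arange(0, 30, 1 / fs)
eeg = np.sin(2 * np.pi * 10 * t) + 0.3 * np.random.default_rng(0).normal(size=t.size)

tf_image = stft_magnitude(eeg)
peak_bin = int(tf_image.mean(axis=1).argmax())
print(f"image shape {tf_image.shape}, peak near {peak_bin * fs / 64:.1f} Hz")
```

Each modality (EEG, ECG, EOG, EMG) produces one such image, and the stacked images form the 2D input that the downstream convolutional and recurrent branches consume.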
PMID:38403601 | DOI:10.7507/1001-5515.202306010
Research on Parkinson's disease recognition algorithm based on sample enhancement
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Feb 25;41(1):17-25. doi: 10.7507/1001-5515.202304011.
ABSTRACT
Parkinson's disease patients have early vocal cord damage, and their voiceprint characteristics differ significantly from those of healthy individuals, which can be used to identify Parkinson's disease. However, the samples in voiceprint datasets of Parkinson's disease patients are insufficient, so this paper proposes a double self-attention deep convolutional generative adversarial network model for sample enhancement to generate high-resolution spectrograms, based on which deep learning is used to recognize Parkinson's disease. The model improves the texture clarity of samples by increasing network depth and combining gradient penalty and spectral normalization techniques, and a ConvNeXt (a family of pure convolutional neural networks) classification network based on transfer learning is constructed to extract voiceprint features and classify them, which improves the accuracy of Parkinson's disease recognition. Validation experiments are carried out on the Parkinson's disease speech dataset. Compared with pre-enhancement samples, both the clarity of the samples generated by the proposed model and the Fréchet inception distance (FID) are improved, and the network model achieves an accuracy of 98.8%. These results show that the Parkinson's disease recognition algorithm based on double self-attention deep convolutional generative adversarial network sample enhancement can accurately distinguish between healthy individuals and Parkinson's disease patients, helping to solve the problem of insufficient samples for early voiceprint-based recognition of Parkinson's disease. In summary, the method effectively improves the classification accuracy on a small-sample Parkinson's disease speech dataset and provides an effective approach for early Parkinson's disease speech diagnosis.
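Spectral normalization, one of the two stabilization techniques the model combines with a gradient penalty, constrains each weight matrix's largest singular value to about 1, estimated cheaply by power iteration. A numpy sketch of the idea (in practice this is applied per layer inside the discriminator at every forward pass):

```python
import numpy as np

def spectral_normalize(W, n_iter=50):
    """Divide a weight matrix by its dominant singular value, estimated by
    power iteration, so its spectral norm is ~1. This bounds the layer's
    Lipschitz constant and stabilizes GAN discriminator training."""
    u = np.random.default_rng(0).normal(size=W.shape[0])
    for _ in range(n_iter):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ W @ v          # dominant singular value estimate
    return W / sigma

W = np.random.default_rng(7).normal(size=(32, 16))
W_sn = spectral_normalize(W)
print("spectral norm before:", np.linalg.svd(W, compute_uv=False)[0].round(2))
print("spectral norm after :", np.linalg.svd(W_sn, compute_uv=False)[0].round(2))
```

Keeping the discriminator's Lipschitz constant bounded is also the goal of the gradient penalty, which is why the two techniques are commonly combined.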
PMID:38403600 | DOI:10.7507/1001-5515.202304011
A multi-class brain tumor grading system based on histopathological images using a hybrid YOLO and RESNET networks
Sci Rep. 2024 Feb 26;14(1):4584. doi: 10.1038/s41598-024-54864-6.
ABSTRACT
Gliomas are primary brain tumors arising from glial cells. Classification and grading of these cancers are crucial for prognosis and treatment planning. Deep learning (DL) can potentially improve the digital pathology investigation of brain tumors. In this paper, we developed a technique for visualizing a predictive tumor-grading model on histopathology images to help guide doctors by emphasizing characteristics and heterogeneity in its forecasts. The proposed technique is a hybrid model based on YOLOv5 and ResNet50. YOLOv5 localizes and classifies the tumor in large histopathological whole slide images (WSIs). The technique incorporates ResNet into the feature extraction of the YOLOv5 framework, and the detection results show that our hybrid network is effective for identifying brain tumors in histopathological images. Next, we estimate the glioma grade using an extreme gradient boosting classifier, which handles well the high-dimensional features and nonlinear interactions present in histopathology images. DL techniques have been used in previous computer-aided diagnosis systems for brain tumors. However, by combining the YOLOv5 and ResNet50 architectures into a hybrid model specifically designed for accurate tumor localization and predictive grading within histopathological WSIs, our study presents a new approach that advances the field. By drawing on the strengths of both models, this integration goes beyond traditional techniques to produce improved tumor-localization accuracy and thorough feature extraction. Additionally, integrating ResNet50 into the YOLOv5 framework addresses concerns about gradient explosion, ensuring stable training dynamics and strong model performance. The proposed technique is tested on The Cancer Genome Atlas dataset, on which our model outperforms other standard methods.
Our results indicate that the proposed hybrid model substantially impacts tumor subtype discrimination between low-grade glioma (LGG) II and LGG III. With 97.2% accuracy, 97.8% precision, 98.6% sensitivity, and a Dice similarity coefficient of 97%, the proposed model performs well in classifying the four grades. These results outperform current approaches for distinguishing LGG from high-grade glioma and provide competitive performance in classifying the four categories of glioma reported in the literature.
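The Dice similarity coefficient reported above measures overlap between predicted and ground-truth segmentations. The following is a generic stdlib sketch for binary masks, not the authors' evaluation code:

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks given as
    flat sequences of 0/1 labels: 2|A ∩ B| / (|A| + |B|).
    Returns 1.0 for two empty masks by convention."""
    intersection = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 1.0 if total == 0 else 2.0 * intersection / total
```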
PMID:38403597 | DOI:10.1038/s41598-024-54864-6
Low-tube-voltage whole-body CT angiography with extremely low iodine dose: a comparison between hybrid-iterative reconstruction and deep-learning image-reconstruction algorithms
Clin Radiol. 2024 Feb 15:S0009-9260(24)00096-5. doi: 10.1016/j.crad.2024.02.002. Online ahead of print.
ABSTRACT
AIM: To evaluate arterial enhancement, its depiction, and image quality in low-tube-voltage whole-body computed tomography (CT) angiography (CTA) with an extremely low iodine dose, comparing the results obtained with hybrid-iterative reconstruction (IR) and deep-learning image-reconstruction (DLIR) methods.
MATERIALS AND METHODS: This prospective study included 34 consecutive participants (27 men; mean age, 74.2 years) who underwent whole-body CTA at 80 kVp for evaluating aortic diseases between January and July 2020. Contrast material (240 mg iodine/ml) with simultaneous administration of its quarter volume of saline, which corresponded to 192 mg iodine/ml, was administered. CT raw data were reconstructed using adaptive statistical IR-Veo of 40% (hybrid-IR), DLIR with medium- (DLIR-M), and high-strength level (DLIR-H). A radiologist measured CT attenuation of the arteries and background noise, and the signal-to-noise ratio (SNR) was then calculated. Two reviewers qualitatively evaluated the arterial depictions and diagnostic acceptability on axial, multiplanar-reformatted (MPR), and volume-rendered (VR) images.
RESULTS: The mean contrast material volume and iodine weight administered were 64.1 ml and 15.4 g, respectively. The SNRs of the arteries were significantly higher in the order DLIR-H > DLIR-M > hybrid-IR (p<0.001). Depictions of six arteries on axial, three on MPR, and four on VR images were significantly superior with DLIR-M or hybrid-IR than with DLIR-H (p≤0.009 for each). Diagnostic acceptability was significantly better with DLIR-M and DLIR-H than with hybrid-IR (p<0.001-0.005).
CONCLUSION: DLIR-M showed well-balanced arterial depictions and image quality compared with the hybrid-IR and DLIR-H.
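The SNR described in the materials and methods (mean arterial CT attenuation divided by the standard deviation of background noise) can be sketched as follows; the function and variable names are illustrative, not from the study:

```python
def signal_to_noise_ratio(roi_attenuations_hu, background_sd_hu):
    """SNR as commonly computed in CTA image-quality studies: the mean
    CT attenuation of the arterial ROI (in Hounsfield units) divided by
    the standard deviation of the background noise."""
    if background_sd_hu <= 0:
        raise ValueError("background noise SD must be positive")
    mean_hu = sum(roi_attenuations_hu) / len(roi_attenuations_hu)
    return mean_hu / background_sd_hu
```

Under this definition, a denoising reconstruction such as DLIR raises the SNR mainly by lowering the background SD in the denominator, consistent with the ordering reported in the results.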
PMID:38403540 | DOI:10.1016/j.crad.2024.02.002
ECG-only Explainable Deep Learning Algorithm Predicts the Risk for Malignant Ventricular Arrhythmia in Phospholamban Cardiomyopathy
Heart Rhythm. 2024 Feb 23:S1547-5271(24)00210-8. doi: 10.1016/j.hrthm.2024.02.038. Online ahead of print.
ABSTRACT
BACKGROUND: Phospholamban (PLN) p.(Arg14del) variant carriers are at risk of developing malignant ventricular arrhythmias (MVA). Accurate risk stratification allows for timely implantation of intracardiac defibrillators (ICD) and is currently performed using a multimodality prediction model.
OBJECTIVE: This study aims to investigate whether an explainable deep learning-based approach allows for risk prediction using only electrocardiogram (ECG) data.
METHODS: A total of 679 PLN p.(Arg14del) carriers without MVA at baseline were identified. A deep learning-based variational auto-encoder, trained on 1.1 million ECGs, was used to convert the 12-lead baseline ECG into its FactorECG, a compressed version of the ECG which summarizes it into 32 explainable factors. Prediction models were developed using Cox regression.
RESULTS: The deep learning-based, ECG-only approach predicted MVA with a c-statistic of 0.79 [95% CI 0.76 - 0.83], comparable to the current multimodality prediction model (c-statistic 0.83 [95% CI 0.79 - 0.88], p = 0.064) and outperforming a model based on conventional ECG parameters (low-voltage ECG and negative T waves; c-statistic 0.65 [95% CI 0.58 - 0.73], p < 0.001). Clinical simulations showed that a two-step approach, ECG-only screening followed by a full work-up, resulted in 60% fewer additional diagnostic tests while outperforming use of the multimodality prediction model in all patients. A tool providing interactive visualizations was created (https://pln.ecgx.ai).
CONCLUSION: Our deep learning-based algorithm, using ECG data only, accurately predicts the occurrence of MVA in PLN p.(Arg14del) carriers, enabling more efficient stratification of the patients who need additional diagnostic testing and follow-up.
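The c-statistic used to compare the prediction models measures the fraction of case pairs ranked correctly by the risk score. The sketch below is a simplified version that ignores censoring times (reducing to the AUC for a binary outcome); the study's survival-model c-statistic additionally accounts for follow-up duration:

```python
def c_statistic(scores, events):
    """Concordance for a binary outcome, ignoring censoring: the
    fraction of (event, non-event) pairs in which the event case
    received the higher risk score; tied scores count as 0.5."""
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect discrimination, which is the scale on which the reported 0.79, 0.83, and 0.65 should be read.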
PMID:38403235 | DOI:10.1016/j.hrthm.2024.02.038