Deep learning
UAdam: Unified Adam-Type Algorithmic Framework for Nonconvex Optimization
Neural Comput. 2024 Jul 18:1-27. doi: 10.1162/neco_a_01692. Online ahead of print.
ABSTRACT
Adam-type algorithms have become a preferred choice for optimization in the deep learning setting; however, despite their success, their convergence is still not well understood. To this end, we introduce a unified framework for Adam-type algorithms, termed UAdam. It is equipped with a general form of the second-order moment, which makes it possible to include Adam and its existing and future variants as special cases, such as NAdam, AMSGrad, AdaBound, AdaFom, and Adan. The approach is supported by a rigorous convergence analysis of UAdam in the general nonconvex stochastic setting, showing that UAdam converges to a neighborhood of stationary points with a rate of O(1/T). Furthermore, the size of the neighborhood decreases as the parameter β1 increases. Importantly, our analysis only requires the first-order momentum factor to be close enough to 1, without any restrictions on the second-order momentum factor. Theoretical results also reveal the convergence conditions of vanilla Adam, together with the selection of appropriate hyperparameters. This provides a theoretical guarantee for the analysis, application, and further development of the whole class of Adam-type algorithms. Finally, several numerical experiments are provided to support our theoretical findings.
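The unifying idea is easy to state in code: keep Adam's first-order momentum but leave the second-order moment as a pluggable rule. Below is a minimal NumPy sketch of that template; the names (`uadam_step`, `adam_v`, `amsgrad_v`) are illustrative rather than the paper's notation, and bias correction is omitted for brevity.

```python
import numpy as np

def uadam_step(theta, grad, state, second_moment, lr=1e-3, beta1=0.9, eps=1e-8):
    """One generic Adam-type update.

    `state` holds the moment buffers (e.g. {'m': zeros, 'v': zeros});
    `second_moment(state, grad)` returns the new second-order moment v_t,
    and swapping it in recovers specific variants.
    """
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad   # first-order momentum
    state["v"] = second_moment(state, grad)                # generalized v_t
    return theta - lr * state["m"] / (np.sqrt(state["v"]) + eps)

def adam_v(state, grad, beta2=0.999):
    # Vanilla Adam: exponential moving average of squared gradients.
    return beta2 * state["v"] + (1 - beta2) * grad ** 2

def amsgrad_v(state, grad, beta2=0.999):
    # AMSGrad: keep the running maximum of the Adam moment.
    v = beta2 * state["v"] + (1 - beta2) * grad ** 2
    state["v_hat"] = np.maximum(state.get("v_hat", v), v)
    return state["v_hat"]
```

Other variants named in the abstract (NAdam, AdaBound, AdaFom, Adan) correspond to other choices of the second-moment rule or momentum scheme within the same template.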
PMID:39106463 | DOI:10.1162/neco_a_01692
Deep learning-based dose prediction for magnetic resonance-guided prostate radiotherapy
Med Phys. 2024 Aug 6. doi: 10.1002/mp.17312. Online ahead of print.
ABSTRACT
BACKGROUND: Daily adaptive radiotherapy, as performed with the Elekta Unity MR-Linac, requires choosing between different adaptation methods, namely ATP (Adapt to Position) and ATS (Adapt to Shape), where the latter requires daily re-contouring to obtain a dose plan tailored to the daily anatomy. These steps are inherently resource-intensive, and quickly predicting the dose distribution and the dosimetric evaluation criteria while the patient is on the table could facilitate fast selection of the adaptation method and decrease treatment times.
PURPOSE: In this work, we aimed to develop a deep-learning-based dose-prediction pipeline for prostate MR-Linac treatments.
METHODS: Two hundred twelve MR images, structure sets, and dose distributions from 35 prostate patients treated with 6.1 Gy in 6 or 7 fractions at our MR-Linac were included and split into train/test partitions of 152/60 images, respectively. A deep-learning segmentation network was trained to segment the CTV (prostate), bladder, and rectum. A second network was trained to predict the dose distribution based on manually delineated structures. At inference, the predicted segmentations acted as input to the dose prediction network, and the predicted dose was compared to the true (optimized in the treatment planning system) dose distribution.
RESULTS: Median DSC values from the segmentation network were 0.90/0.94/0.87 for CTV/bladder/rectum. Predicted segmentations as input to the dose prediction resulted in mean differences between predicted and true doses of 0.7%/0.7%/1.7% (relative to the prescription dose) for D98%/D95%/D2% for the CTV. For the bladder, the difference was 0.7%/0.3% for Dmean/D2% and for the rectum 0.1/0.2/0.2 pp (percentage points) for V33Gy/V38Gy/V41Gy. In comparison, true segmentations as input resulted in differences of 1.1%/0.9%/1.6% for CTV, 0.5%/0.4% for bladder, and 0.7/0.4/0.3 pp for the rectum. Only D2% for CTV and Dmean/D2% for bladder were found to be statistically significantly better when using true structures instead of predicted structures as input to the dose prediction.
CONCLUSIONS: Small differences in the fulfillment of clinical dose-volume constraints are seen between using deep-learning-predicted structures and manual structures as input to the dose prediction network. Overall mean differences of <2% indicate that the dose-prediction pipeline is useful as a decision-support tool in situations where differences >2% are of concern.
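The pipeline itself is a simple composition at inference time: the segmentation network's output becomes the dose network's input. A hedged PyTorch-style sketch follows; the tensor layout and the use of soft masks are assumptions, since the abstract does not specify the architectures.

```python
import torch

@torch.no_grad()
def predict_dose(seg_net, dose_net, mr_image: torch.Tensor) -> torch.Tensor:
    """Chain the two trained networks at inference.

    seg_net, dose_net: the trained segmentation and dose-prediction models;
    mr_image: (1, 1, D, H, W) MR volume. Returns a predicted 3D dose map.
    """
    logits = seg_net(mr_image)              # CTV / bladder / rectum channels
    masks = torch.softmax(logits, dim=1)    # predicted structures
    # The predicted structures stand in for the manual contours the dose
    # network was trained with.
    return dose_net(torch.cat([mr_image, masks], dim=1))
```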
PMID:39106418 | DOI:10.1002/mp.17312
Single-Image-Based Deep Learning for Precise Atomic Defect Identification
Nano Lett. 2024 Aug 6. doi: 10.1021/acs.nanolett.4c02654. Online ahead of print.
ABSTRACT
Defect engineering is widely used to impart desired functionalities to materials. Despite the widespread application of atomic-resolution scanning transmission electron microscopy (STEM), traditional methods for defect analysis are highly sensitive to random noise and human bias. While deep learning (DL) presents a viable alternative, it requires extensive amounts of training data with labeled ground truth. Herein, employing cycle generative adversarial networks (CycleGAN) and U-Nets, we propose a defect-detection method based on a single experimental STEM image, tackling both high annotation costs and image noise. Not only atomic defects but also oxygen dopants in monolayer MoS2 are visualized. The method can be readily extended to other two-dimensional systems, as the training is based on unit-cell-level images. Our results therefore outline novel ways to train models with minimal data sets, offering great opportunities to fully exploit the power of DL in the materials science community.
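Since training operates at the unit-cell level, one concrete preprocessing step is cropping fixed-size patches around detected atomic columns. The sketch below shows only that step, under assumed inputs (site coordinates and patch half-width); the CycleGAN denoising and U-Net detection stages are not reproduced here.

```python
import numpy as np

def unit_cell_patches(image: np.ndarray, sites: np.ndarray, half: int = 8):
    """Crop a (2*half, 2*half) patch around each lattice site.

    image: 2D STEM micrograph; sites: (N, 2) array of (row, col) atomic-column
    coordinates. Each patch becomes one unit-cell-level training sample.
    """
    patches = []
    for r, c in sites.astype(int):
        if half <= r < image.shape[0] - half and half <= c < image.shape[1] - half:
            patches.append(image[r - half:r + half, c - half:c + half])
    return np.stack(patches)
```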
PMID:39106329 | DOI:10.1021/acs.nanolett.4c02654
An Eye Movement Classification Method based on Cascade Forest
IEEE J Biomed Health Inform. 2024 Aug 6;PP. doi: 10.1109/JBHI.2024.3439568. Online ahead of print.
ABSTRACT
Eye tracking technology has become increasingly important in scientific research and practical applications. In eye tracking research, analysis of eye movement data is crucial, particularly classifying raw eye movement data into eye movement events. Current classification methods vary considerably in how well they adapt across participants, and the issues of class imbalance and data scarcity in eye movement classification remain to be addressed. In the current study, we introduce a novel eye movement classification method based on cascade forest (EMCCF), which comprises two modules: (1) a feature extraction module that employs a multi-scale time window method to extract features from raw eye movement data; (2) a classification module that employs a layered ensemble architecture, integrating the cascade forest structure with ensemble learning principles, specifically for eye movement classification. EMCCF thus not only enhances the accuracy and efficiency of eye movement classification but also represents an advance in applying ensemble learning techniques within this domain. Furthermore, experimental results indicate that EMCCF outperforms existing deep learning-based classification models on several metrics and demonstrates robust performance across different datasets and participants.
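To make the multi-scale time-window idea in module (1) concrete, here is a sketch that computes simple speed and dispersion statistics over several window widths centered on each sample; the specific features and window sizes are illustrative choices, as the abstract does not list the paper's feature set.

```python
import numpy as np

def multiscale_features(x, y, t, windows=(3, 7, 11)):
    """Per-sample gaze features at several temporal scales.

    x, y: gaze coordinates; t: timestamps in seconds (all 1D arrays).
    Returns an (n_samples, 2 * len(windows)) feature matrix.
    """
    n = len(x)
    feats = np.zeros((n, 2 * len(windows)))
    for j, w in enumerate(windows):
        h = w // 2
        for i in range(n):
            lo, hi = max(0, i - h), min(n, i + h + 1)
            dt = np.diff(t[lo:hi]) + 1e-9
            speed = np.hypot(np.diff(x[lo:hi]), np.diff(y[lo:hi])) / dt
            feats[i, 2 * j] = speed.mean()                             # mean velocity
            feats[i, 2 * j + 1] = np.ptp(x[lo:hi]) + np.ptp(y[lo:hi])  # dispersion
    return feats
```

Each row can then be passed to the cascade-forest classifier in module (2).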
PMID:39106144 | DOI:10.1109/JBHI.2024.3439568
DMAMP: A deep-learning model for detecting antimicrobial peptides and their multi-activities
IEEE/ACM Trans Comput Biol Bioinform. 2024 Aug 6;PP. doi: 10.1109/TCBB.2024.3439541. Online ahead of print.
ABSTRACT
Owing to their broad-spectrum, highly efficient antibacterial activity, antimicrobial peptides (AMPs) and their functions have been studied extensively in drug discovery. Detecting AMPs and their corresponding activities through biological experiments is costly, whereas computational methods are far cheaper. Currently, most computational methods treat the identification of AMPs and of their activities as two independent tasks, ignoring the relationship between them. Combining and sharing patterns across the two tasks is therefore a crucial open problem. In this study, we propose a deep learning model, called DMAMP, for detecting AMPs and their activities simultaneously, which benefits from multi-task learning. The first stage utilizes convolutional neural network models and residual blocks to extract hidden features shared by the two related tasks. The next stage uses two fully connected layers to learn task-specific information. Meanwhile, the original evolutionary features of the peptide sequence are also fed to the predictor of the second task to recover information lost in the shared layers. Experiments on the independent test dataset demonstrate that our method outperforms the single-task model by 4.28% in Matthews correlation coefficient (MCC) on the first task and achieves an average MCC of 0.2627 across five activities on the second task, higher than the single-task model and two existing methods. To understand whether features derived from the convolutional layers capture the differences between target classes, we visualize these high-dimensional features by projecting them into 3D space. In addition, we show that our predictor can identify peptides with activity against Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). We hope that our proposed method offers new insights into the discovery of novel antiviral peptide drugs.
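The described architecture reduces to a shared convolutional trunk with two task-specific heads, the second of which also re-receives the evolutionary features. A compact PyTorch sketch under assumed dimensions (the paper's exact layer sizes and input encoding are not given in the abstract):

```python
import torch
import torch.nn as nn

class MultiTaskAMP(nn.Module):
    """Shared trunk + two heads, in the spirit of DMAMP (sizes illustrative)."""

    def __init__(self, in_dim=20, n_activities=5):
        super().__init__()
        self.trunk = nn.Sequential(                           # shared hidden features
            nn.Conv1d(in_dim, 64, 7, padding=3), nn.ReLU(),
            nn.Conv1d(64, 64, 7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.amp_head = nn.Linear(64, 1)                      # task 1: AMP or not
        self.act_head = nn.Linear(64 + in_dim, n_activities)  # task 2: activities

    def forward(self, x, evo):
        # x: (B, in_dim, seq_len) encoded peptides; evo: (B, in_dim, seq_len)
        # evolutionary features, re-fed to head 2 to recover lost information.
        h = self.trunk(x)
        return self.amp_head(h), self.act_head(torch.cat([h, evo.mean(-1)], dim=1))
```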
PMID:39106141 | DOI:10.1109/TCBB.2024.3439541
Artificial Intelligence for Early Detection of Pediatric Eye Diseases Using Mobile Photos
JAMA Netw Open. 2024 Aug 1;7(8):e2425124. doi: 10.1001/jamanetworkopen.2024.25124.
ABSTRACT
IMPORTANCE: Identifying pediatric eye diseases at an early stage is a worldwide challenge. Traditional screening procedures depend on hospitals and ophthalmologists and are expensive and time-consuming. Using artificial intelligence (AI) to assess children's eye conditions from mobile photographs could facilitate convenient and early identification of eye disorders in a home setting.
OBJECTIVE: To develop an AI model to identify myopia, strabismus, and ptosis using mobile photographs.
DESIGN, SETTING, AND PARTICIPANTS: This cross-sectional study was conducted at the Department of Ophthalmology of Shanghai Ninth People's Hospital from October 1, 2022, to September 30, 2023, and included children who were diagnosed with myopia, strabismus, or ptosis.
MAIN OUTCOMES AND MEASURES: A deep learning-based model was developed to identify myopia, strabismus, and ptosis. The performance of the model was assessed using sensitivity, specificity, accuracy, the area under the curve (AUC), positive predictive values (PPV), negative predictive values (NPV), positive likelihood ratios (P-LR), negative likelihood ratios (N-LR), and the F1-score. GradCAM++ was utilized to visually and analytically assess the impact of each region on the model. A sex subgroup analysis and an age subgroup analysis were performed to validate the model's generalizability.
RESULTS: A total of 1419 images obtained from 476 patients (225 female [47.27%]; 299 [62.82%] aged between 6 and 12 years) were used to build the model. Among them, 946 monocular images were used to identify myopia and ptosis, and 473 binocular images were used to identify strabismus. The model demonstrated good sensitivity in detecting myopia (0.84 [95% CI, 0.82-0.87]), strabismus (0.73 [95% CI, 0.70-0.77]), and ptosis (0.85 [95% CI, 0.82-0.87]). The model showed comparable performance in identifying eye disorders in both female and male children during sex subgroup analysis. There were differences in identifying eye disorders among different age subgroups.
CONCLUSIONS AND RELEVANCE: In this cross-sectional study, the AI model demonstrated strong performance in accurately identifying myopia, strabismus, and ptosis using only smartphone images. These results suggest that such a model could facilitate the early detection of pediatric eye diseases in a convenient manner at home.
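Every measure named above follows from a single 2×2 confusion table, so the reported values can be reproduced from counts alone. A self-contained helper:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard screening metrics from a 2x2 confusion table."""
    sens = tp / (tp + fn)                  # sensitivity (recall)
    spec = tn / (tn + fp)                  # specificity
    return {
        "sensitivity": sens,
        "specificity": spec,
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "PPV": tp / (tp + fp),             # positive predictive value
        "NPV": tn / (tn + fn),             # negative predictive value
        "P-LR": sens / (1 - spec),         # positive likelihood ratio
        "N-LR": (1 - sens) / spec,         # negative likelihood ratio
        "F1": 2 * tp / (2 * tp + fp + fn),
    }
```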
PMID:39106068 | DOI:10.1001/jamanetworkopen.2024.25124
Application of image recognition technology in pathological diagnosis of blood smears
Clin Exp Med. 2024 Aug 6;24(1):181. doi: 10.1007/s10238-024-01379-z.
ABSTRACT
Traditional manual blood smear diagnosis methods are time-consuming and prone to errors, often relying heavily on the experience of clinical laboratory analysts for accuracy. As breakthroughs in key technologies such as neural networks and deep learning continue to drive digital transformation in the medical field, image recognition technology is increasingly being leveraged to enhance existing medical processes. In recent years, advancements in computer technology have led to improved efficiency in the identification of blood cells in blood smears through the use of image recognition technology. This paper provides a comprehensive summary of the methods and steps involved in utilizing image recognition algorithms for diagnosing diseases in blood smears, with a focus on malaria and leukemia. Furthermore, it offers a forward-looking research direction for the development of a comprehensive blood cell pathological detection system.
PMID:39105953 | DOI:10.1007/s10238-024-01379-z
Deep learning approaches for the detection of scar presence from cine cardiac magnetic resonance adding derived parametric images
Med Biol Eng Comput. 2024 Aug 6. doi: 10.1007/s11517-024-03175-z. Online ahead of print.
ABSTRACT
This work proposes a convolutional neural network (CNN) that utilizes different combinations of parametric images computed from cine cardiac magnetic resonance (CMR) images to classify each slice for possible myocardial scar tissue. The CNN's performance with respect to expert interpretation of CMR with late gadolinium enhancement (LGE) images, used as ground truth (GT), was evaluated on 206 patients (158 scar, 48 control) from Centro Cardiologico Monzino (Milan, Italy) at both slice and patient levels. Left ventricle dynamic features were extracted from non-enhanced cine images using parametric images based on both Fourier and monogenic signal analyses. The CNN, fed with cine images and Fourier-based parametric images, achieved an area under the ROC curve of 0.86 (accuracy 0.79, F1 0.81, sensitivity 0.9, specificity 0.65, and negative (NPV) and positive (PPV) predictive values 0.83 and 0.77, respectively) for individual slice classification. Remarkably, it exhibited 1.0 prediction accuracy (F1 0.98, sensitivity 1.0, specificity 0.9, NPV 1.0, and PPV 0.97) in classifying patients as control or pathologic. The proposed approach represents a first step toward scar detection in contrast-free CMR images. Patient-level results suggest its preliminary potential as a screening tool to guide decisions regarding LGE-CMR prescription, particularly in cases where the indication is uncertain.
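The abstract does not define its Fourier-based parametric images, but a classic instantiation maps each pixel's intensity curve over the cardiac cycle to the magnitude and phase of its first temporal harmonic, a standard descriptor of regional wall motion. A sketch under that assumption:

```python
import numpy as np

def first_harmonic_maps(cine: np.ndarray):
    """Per-pixel Fourier parametric images from a cine stack.

    cine: (T, H, W) frames covering one cardiac cycle. Returns the first
    temporal harmonic's amplitude and phase maps (each H x W).
    """
    spectrum = np.fft.fft(cine, axis=0)    # per-pixel temporal spectrum
    h1 = spectrum[1]                       # first harmonic coefficient
    return np.abs(h1), np.angle(h1)
```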
PMID:39105884 | DOI:10.1007/s11517-024-03175-z
IEA-Net: Internal and External Dual-Attention Medical Segmentation Network with High-Performance Convolutional Blocks
J Imaging Inform Med. 2024 Aug 6. doi: 10.1007/s10278-024-01217-4. Online ahead of print.
ABSTRACT
Currently, deep learning is developing rapidly in the field of image segmentation, and medical image segmentation is one of its key applications. Conventional CNNs have achieved great success in general medical image segmentation tasks, but they lose features during feature extraction and lack the ability to explicitly model long-range dependencies, which makes them difficult to adapt to human organ segmentation. Although methods incorporating attention mechanisms have made good progress in semantic segmentation, most current attention mechanisms are limited to a single sample; since human organ images are available in large numbers, ignoring the correlations between samples hampers segmentation. To solve these problems, an internal and external dual-attention segmentation network (IEA-Net) is proposed in this paper, in which the ICSwR (interleaved convolutional system with residual) module and the IEAM module are designed. The ICSwR contains interleaved convolutions and skip connections, used for initial feature extraction in the encoder. The IEAM module (internal and external dual-attention module) consists of the LGGW-SA (local-global Gaussian-weighted self-attention) module and the EA module, arranged in tandem. The LGGW-SA module focuses on learning local-global feature correlations within individual samples for efficient feature extraction, while the EA module is designed to capture inter-sample connections, addressing multi-sample complexities. Additionally, skip connections are incorporated into each IEAM module in both the encoder and the decoder to reduce feature loss. We tested our method on the Synapse multi-organ segmentation dataset and the ACDC cardiac segmentation dataset, and the experimental results show that the proposed method achieves better performance than other state-of-the-art methods.
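Capturing inter-sample connections is the hallmark of external attention, which replaces the sample-specific keys and values of self-attention with small memory units learned from, and shared across, the whole dataset. The sketch below follows the standard formulation (Guo et al., 2021); IEA-Net's exact EA variant may differ.

```python
import torch
import torch.nn as nn

class ExternalAttention(nn.Module):
    """External attention with two shared memory units."""

    def __init__(self, dim: int, mem: int = 64):
        super().__init__()
        self.mk = nn.Linear(dim, mem, bias=False)   # shared key memory
        self.mv = nn.Linear(mem, dim, bias=False)   # shared value memory

    def forward(self, x):                            # x: (B, N, dim) tokens
        attn = torch.softmax(self.mk(x), dim=1)      # normalize over tokens
        attn = attn / (attn.sum(dim=-1, keepdim=True) + 1e-9)  # double norm
        return self.mv(attn)
```

Because the memory units persist across inputs, they implicitly encode correlations between samples, which per-sample self-attention cannot.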
PMID:39105850 | DOI:10.1007/s10278-024-01217-4
Principles of artificial intelligence in radiooncology
Strahlenther Onkol. 2024 Aug 6. doi: 10.1007/s00066-024-02272-0. Online ahead of print.
ABSTRACT
PURPOSE: In the rapidly expanding field of artificial intelligence (AI) there is a wealth of literature detailing the myriad applications of AI, particularly in the realm of deep learning. However, a review that elucidates the technical principles of deep learning as relevant to radiation oncology in an easily understandable manner is still notably lacking. This paper aims to fill this gap by providing a comprehensive guide to the principles of deep learning that is specifically tailored toward radiation oncology.
METHODS: In light of the extensive variety of AI methodologies, this review selectively concentrates on the specific domain of deep learning. It emphasizes the principal categories of deep learning models and delineates the methodologies for training these models effectively.
RESULTS: This review initially delineates the distinctions between AI and deep learning as well as between supervised and unsupervised learning. Subsequently, it elucidates the fundamental principles of major deep learning models, encompassing multilayer perceptrons (MLPs), convolutional neural networks (CNNs), recurrent neural networks (RNNs), transformers, generative adversarial networks (GANs), diffusion-based generative models, and reinforcement learning. For each category, it presents representative networks alongside their specific applications in radiation oncology. Moreover, the review outlines critical factors essential for training deep learning models, such as data preprocessing, loss functions, optimizers, and other pivotal training parameters including learning rate and batch size.
CONCLUSION: This review provides a comprehensive overview of deep learning principles tailored toward radiation oncology. It aims to enhance the understanding of AI-based research and software applications, thereby bridging the gap between complex technological concepts and clinical practice in radiation oncology.
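The training factors listed in the results (loss function, optimizer, learning rate, batch size) all surface in a few lines of a standard training loop. The toy example below uses random data and a small MLP purely for illustration:

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

data = TensorDataset(torch.randn(256, 16), torch.randint(0, 2, (256,)))
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))  # an MLP

loader = DataLoader(data, batch_size=32, shuffle=True)    # batch size
criterion = nn.CrossEntropyLoss()                         # loss function
optimizer = optim.Adam(model.parameters(), lr=1e-3)       # optimizer, learning rate

for epoch in range(5):
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)   # forward pass and loss
        loss.backward()                 # backpropagation
        optimizer.step()                # parameter update
```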
PMID:39105746 | DOI:10.1007/s00066-024-02272-0
Deep learning for autosegmentation for radiotherapy treatment planning: State-of-the-art and novel perspectives
Strahlenther Onkol. 2024 Aug 6. doi: 10.1007/s00066-024-02262-2. Online ahead of print.
ABSTRACT
Artificial intelligence (AI) has developed rapidly and gained importance, with many tools already entering our daily lives. The medical field of radiation oncology is also subject to this development, with AI entering all steps of the patient journey. In this review article, we summarize contemporary AI techniques and explore the clinical applications of AI-based automated segmentation models in radiotherapy planning, focusing on delineation of organs at risk (OARs), the gross tumor volume (GTV), and the clinical target volume (CTV). Emphasizing the need for precise and individualized plans, we review various commercial and freeware segmentation tools as well as state-of-the-art approaches. Through our own findings and based on the literature, we demonstrate improved efficiency and consistency as well as time savings in different clinical scenarios. Despite challenges in clinical implementation, such as domain shifts, the potential benefits for personalized treatment planning are substantial. The integration of mathematical tumor growth models and AI-based tumor detection further enhances the possibilities for refining target volumes. As advancements continue, the prospect of one-stop-shop segmentation and radiotherapy planning represents an exciting frontier in radiotherapy, potentially enabling fast treatment with enhanced precision and individualization.
PMID:39105745 | DOI:10.1007/s00066-024-02262-2
Fully Automated Deep Learning Model to Detect Clinically Significant Prostate Cancer at MRI
Radiology. 2024 Aug;312(2):e232635. doi: 10.1148/radiol.232635.
ABSTRACT
BACKGROUND: Multiparametric MRI can help identify clinically significant prostate cancer (csPCa) (Gleason score ≥7) but is limited by reader experience and interobserver variability. In contrast, deep learning (DL) produces deterministic outputs.
PURPOSE: To develop a DL model to predict the presence of csPCa by using patient-level labels without information about tumor location and to compare its performance with that of radiologists.
MATERIALS AND METHODS: Data from patients without known csPCa who underwent MRI from January 2017 to December 2019 at one of multiple sites of a single academic institution were retrospectively reviewed. A convolutional neural network was trained to predict csPCa from T2-weighted images, diffusion-weighted images, apparent diffusion coefficient maps, and T1-weighted contrast-enhanced images. The reference standard was pathologic diagnosis. Radiologist performance was evaluated as follows: radiology reports were used for the internal test set, and four radiologists' PI-RADS ratings were used for the external (ProstateX) test set. Performance was compared using areas under the receiver operating characteristic curves (AUCs) and the DeLong test. Gradient-weighted class activation maps (Grad-CAMs) were used to show tumor localization.
RESULTS: Among 5735 examinations in 5215 patients (mean age, 66 years ± 8 [SD]; all male), 1514 examinations (1454 patients) showed csPCa. In the internal test set (400 examinations), the AUC was 0.89 for both the DL classifier and the radiologists (P = .88). In the external test set (204 examinations), the AUC was 0.86 for the DL classifier and 0.84 for the radiologists (P = .68). The DL classifier plus radiologists had an AUC of 0.89 (P < .001). Grad-CAMs demonstrated activation over the csPCa lesion in 35 of 38 and 56 of 58 true-positive examinations in the internal and external test sets, respectively.
CONCLUSION: The performance of the DL model was not different from that of radiologists in the detection of csPCa at MRI, and Grad-CAMs localized the tumor. © RSNA, 2024. Supplemental material is available for this article. See also the editorial by Johnson and Chandarana in this issue.
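Grad-CAM, used above for tumor localization, weights a convolutional layer's feature maps by the spatially averaged gradient of the target score. A generic hook-based sketch follows; the single-logit output layout is an assumption, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, image, target_layer):
    """Gradient-weighted class activation map (Selvaraju et al., 2017)."""
    acts, grads = {}, {}
    h1 = target_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))
    model(image)[:, 0].sum().backward()                    # assumed csPCa logit
    h1.remove(); h2.remove()
    weights = grads["g"].mean(dim=(2, 3), keepdim=True)    # GAP over space
    cam = F.relu((weights * acts["a"]).sum(dim=1))         # (B, H, W)
    return cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-9)
```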
PMID:39105640 | DOI:10.1148/radiol.232635
Corrigendum: A novel approach for sports injury risk prediction: based on time-series image encoding and deep learning
Front Physiol. 2024 Jul 22;15:1441107. doi: 10.3389/fphys.2024.1441107. eCollection 2024.
ABSTRACT
[This corrects the article DOI: 10.3389/fphys.2023.1174525.].
PMID:39105083 | PMC:PMC11298417 | DOI:10.3389/fphys.2024.1441107
Genome composition-based deep learning predicts oncogenic potential of HPVs
Front Cell Infect Microbiol. 2024 Jul 22;14:1430424. doi: 10.3389/fcimb.2024.1430424. eCollection 2024.
ABSTRACT
Human papillomaviruses (HPVs) account for more than 30% of cancer cases, and the oncogenic role of the viral E6 and E7 genes is well established. However, the identification of high-risk HPV genotypes has largely relied on lagging biological exploration and clinical observation, leaving many HPVs unclassified and their oncogenicity unknown. In the present study, we retrieved and cleaned high-quality HPV sequence records and analyzed their genomic compositional traits of dinucleotide (DNT) and DNT representation (DCR) to survey the distribution differences among various types of HPVs. Then, a deep learning model was built to predict the oncogenic potential of all HPVs based on the E6 and E7 genes. Our results showed that the three main groups of Alpha, Beta, and Gamma HPVs were clearly separated between/among types in the DCR trait for either the E6 or E7 coding sequence (CDS) and were clustered within the same group. Moreover, the DCR data of either E6 or E7 were learnable with a convolutional neural network (CNN) model, and either CNN classifier accurately predicted the oncogenicity label of high- and low-oncogenicity HPVs. In summary, the compositional traits of the HPV oncogenicity-related genes E6 and E7 differ markedly between high- and low-oncogenicity HPVs, and the DCR-based deep learning classifier accurately predicted the oncogenic phenotype of HPVs. The trained predictor will facilitate the identification of HPV oncogenicity, particularly for HPVs without a clear genotype or phenotype.
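One plausible reading of the DCR trait is a 16-dimensional dinucleotide frequency vector per coding sequence; the paper's exact representation (for example, a position-resolved or odds-ratio variant) may differ. A minimal sketch:

```python
from itertools import product

DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]  # 16 dinucleotides

def dinucleotide_composition(cds: str) -> list:
    """Frequency of each overlapping dinucleotide in a coding sequence."""
    cds = cds.upper()
    counts = dict.fromkeys(DINUCS, 0)
    for i in range(len(cds) - 1):
        pair = cds[i:i + 2]
        if pair in counts:                 # skips ambiguous bases such as N
            counts[pair] += 1
    total = max(sum(counts.values()), 1)
    return [counts[d] / total for d in DINUCS]
```

Vectors computed this way for E6 and E7 CDSs could then feed a small CNN classifier as described.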
PMID:39104853 | PMC:PMC11298479 | DOI:10.3389/fcimb.2024.1430424
Strengths and limitations of web servers for the modeling of TCRpMHC complexes
Comput Struct Biotechnol J. 2024 Jul 1;23:2938-2948. doi: 10.1016/j.csbj.2024.06.028. eCollection 2024 Dec.
ABSTRACT
Cellular immunity relies on the ability of a T-cell receptor (TCR) to recognize a peptide (p) presented by a class I major histocompatibility complex (MHC) receptor on the surface of a cell. The TCR-peptide-MHC (TCRpMHC) interaction is a crucial step in activating T-cells, and the structural characteristics of these molecules play a significant role in determining the specificity and affinity of this interaction. Hence, obtaining 3D structures of TCRpMHC complexes offers valuable insights into various aspects of cellular immunity and can facilitate the development of T-cell-based immunotherapies. Here, we aimed to compare three popular web servers for modeling the structures of TCRpMHC complexes, namely ImmuneScape (IS), TCRpMHCmodels, and TCRmodel2, to examine their strengths and limitations. Each method employs a different modeling strategy, including docking, homology modeling, and deep learning. The accuracy of each method was evaluated by reproducing the 3D structures of a dataset of 87 TCRpMHC complexes with experimentally determined crystal structures available on the Protein Data Bank (PDB). All selected structures were limited to human MHC alleles, presenting a diverse set of peptide ligands. A detailed analysis of produced models was conducted using multiple metrics, including Root Mean Square Deviation (RMSD) and standardized assessments from CAPRI and DockQ. Special attention was given to the complementarity-determining region (CDR) loops of the TCRs and to the peptide ligands, which define most of the unique features and specificity of a given TCRpMHC interaction. Our study provides an optimistic view of the current state-of-the-art for TCRpMHC modeling but highlights some remaining challenges that must be addressed in order to support the future application of these tools for TCR engineering and computer-aided design of TCR-based immunotherapies.
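The workhorse metric here, RMSD after optimal superposition, is computed with the Kabsch algorithm; CAPRI and DockQ layer interface-specific criteria on top of it. A self-contained NumPy sketch of the core computation:

```python
import numpy as np

def kabsch_rmsd(P: np.ndarray, Q: np.ndarray) -> float:
    """RMSD between two (N, 3) coordinate sets after optimal rigid alignment."""
    P = P - P.mean(axis=0)                      # center both point clouds
    Q = Q - Q.mean(axis=0)
    U, _, Vt = np.linalg.svd(P.T @ Q)           # covariance SVD
    d = np.sign(np.linalg.det(U @ Vt))          # avoid improper rotation
    R = U @ np.diag([1.0, 1.0, d]) @ Vt         # optimal rotation
    return float(np.sqrt(np.mean(np.sum((P @ R - Q) ** 2, axis=1))))
```

For CDR loops or peptide ligands, P and Q would hold the matched atoms of the model and the crystal structure, respectively.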
PMID:39104710 | PMC:PMC11298609 | DOI:10.1016/j.csbj.2024.06.028
Deep learning networks based decision fusion model of EEG and fNIRS for classification of cognitive tasks
Cogn Neurodyn. 2024 Aug;18(4):1489-1506. doi: 10.1007/s11571-023-09986-4. Epub 2023 Jun 30.
ABSTRACT
The detection of the cognitive tasks performed by a subject during data acquisition with a neuroimaging method has a wide range of applications: operation of brain-computer interfaces (BCIs), detection of neuronal disorders, neurorehabilitation for disabled patients, and many others. Recent studies show that the combination or fusion of electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) improves classification and detection performance compared with EEG or fNIRS alone. Deep learning (DL) networks are suitable for the classification of large-volume time-series data like EEG and fNIRS. This study performs decision fusion of EEG and fNIRS. The classification of EEG, fNIRS, and decision-fused EEG-fNIRS into cognitive task labels is performed by DL networks. Two different open-source datasets of simultaneously recorded EEG and fNIRS are examined in this study. Dataset 01 comprises 26 subjects performing 3 cognitive tasks: n-back, discrimination or selection response (DSR), and word generation (WG). After data acquisition, fNIRS is converted to oxygenated hemoglobin (HbO2) and deoxygenated hemoglobin (HbR) in Dataset 01. Dataset 02 comprises 29 subjects who performed 2 tasks: motor imagery and mental arithmetic. The classification of EEG and fNIRS (or HbO2, HbR) is carried out by 7 DL classifiers: convolutional neural network (CNN), long short-term memory network (LSTM), gated recurrent unit (GRU), CNN-LSTM, CNN-GRU, LSTM-GRU, and CNN-LSTM-GRU. After the classification of the single modalities, their prediction scores or decisions are combined to obtain the decision-fused modality. Classification performance is measured by overall accuracy and area under the ROC curve (AUC). The highest accuracy and AUC recorded in Dataset 01 are 96% and 100%, respectively, both achieved by the decision-fused modality using CNN-LSTM-GRU. For Dataset 02, the highest accuracy and AUC are 82.76% and 90.44%, respectively, both achieved by the decision-fused modality using CNN-LSTM. The experimental results show that decision-fused EEG-HbO2-HbR and EEG-fNIRS deliver higher performance than their constituent unimodalities in most cases. Among the DL classifiers, CNN-LSTM-GRU in Dataset 01 and CNN-LSTM in Dataset 02 yield the highest performance.
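Decision fusion here means combining the unimodal classifiers' outputs after prediction. A minimal sketch using a weighted average of softmax scores follows; the combination rule and the weight are assumptions, since the abstract does not specify them.

```python
import torch

@torch.no_grad()
def decision_fusion(eeg_model, fnirs_model, eeg_x, fnirs_x, w=0.5):
    """Late fusion: average class probabilities of the two unimodal models."""
    p_eeg = torch.softmax(eeg_model(eeg_x), dim=1)
    p_fnirs = torch.softmax(fnirs_model(fnirs_x), dim=1)
    fused = w * p_eeg + (1 - w) * p_fnirs       # decision-fused modality
    return fused.argmax(dim=1)                  # predicted task label per trial
```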
PMID:39104699 | PMC:PMC11297873 | DOI:10.1007/s11571-023-09986-4
ADHD/CD-NET: automated EEG-based characterization of ADHD and CD using explainable deep neural network technique
Cogn Neurodyn. 2024 Aug;18(4):1609-1625. doi: 10.1007/s11571-023-10028-2. Epub 2023 Nov 28.
ABSTRACT
In this study, attention deficit hyperactivity disorder (ADHD), a childhood neurodevelopmental disorder, is studied alongside its comorbidity, conduct disorder (CD), a behavioral disorder. Because ADHD and CD share commonalities, distinguishing them is difficult, which increases the risk of misdiagnosis. It is crucial that these two conditions are not mistaken for one another, because the treatment plan differs depending on whether the patient has CD or ADHD. Hence, this study proposes an electroencephalogram (EEG)-based deep learning system, ADHD/CD-NET, that is capable of objectively distinguishing ADHD, ADHD + CD, and CD. The 12-channel EEG signals were first segmented and converted into channel-wise continuous wavelet transform (CWT) correlation matrices. The resulting matrices were then used to train the convolutional neural network (CNN) model, and the model's performance was evaluated using 10-fold cross-validation. Gradient-weighted class activation mapping (Grad-CAM) was used to provide explanations for the predictions made by the 'black box' CNN model. An internal private dataset (45 ADHD, 62 ADHD + CD, and 16 CD) and an external public dataset (61 ADHD and 60 healthy controls) were used to evaluate ADHD/CD-NET. ADHD/CD-NET achieved classification accuracy, sensitivity, specificity, and precision of 93.70%, 90.83%, 95.35%, and 91.85% in the internal evaluation, and 98.19%, 98.36%, 98.03%, and 98.06% in the external evaluation. Grad-CAM also identified significant channels that contributed to the diagnosis outcome. ADHD/CD-NET can therefore perform temporal localization and select significant EEG channels for diagnosis, providing objective analysis for mental health professionals and clinicians to consider when making a diagnosis.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s11571-023-10028-2.
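The input construction, a channel-wise CWT followed by channel-to-channel correlation, can be sketched with PyWavelets; the Morlet wavelet, scale range, and sampling rate below are assumptions rather than the paper's settings.

```python
import numpy as np
import pywt

def cwt_correlation_matrix(eeg, fs=128.0, scales=np.arange(1, 32)):
    """eeg: (n_channels, n_samples) segment -> (n_channels, n_channels) matrix.

    Each channel's scalogram magnitude is flattened and Pearson-correlated
    with every other channel's, yielding the 12 x 12 images fed to the CNN.
    """
    scalos = [np.abs(pywt.cwt(ch, scales, "morl", sampling_period=1 / fs)[0]).ravel()
              for ch in eeg]
    return np.corrcoef(np.vstack(scalos))
```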
PMID:39104684 | PMC:PMC11297883 | DOI:10.1007/s11571-023-10028-2
Cognitive workload estimation using physiological measures: a review
Cogn Neurodyn. 2024 Aug;18(4):1445-1465. doi: 10.1007/s11571-023-10051-3. Epub 2023 Dec 26.
ABSTRACT
Estimating cognitive workload levels is an emerging research topic in cognitive neuroscience, as participants' performance is strongly affected by cognitive overload or underload. Different physiological measures such as electroencephalography (EEG), functional magnetic resonance imaging, functional near-infrared spectroscopy, respiratory activity, and eye activity are used to estimate workload levels with the help of machine learning or deep learning techniques. Some reviews focus only on EEG-based workload estimation using machine learning classifiers or on multimodal fusion of different physiological measures, but a detailed analysis covering all physiological measures for estimating cognitive workload levels is still lacking. This survey therefore provides an in-depth analysis of all the physiological measures used for assessing cognitive workload. It covers the basics of cognitive workload, open-access datasets, the experimental paradigms of cognitive tasks, and the different measures for estimating workload levels. Lastly, we highlight the significant findings from this review, identify the open challenges, and outline future directions for researchers to overcome them.
PMID:39104683 | PMC:PMC11297869 | DOI:10.1007/s11571-023-10051-3
An EEG-based marker of functional connectivity: detection of major depressive disorder
Cogn Neurodyn. 2024 Aug;18(4):1671-1687. doi: 10.1007/s11571-023-10041-5. Epub 2023 Dec 1.
ABSTRACT
Major depressive disorder (MDD) is a prevalent psychiatric disorder globally. There are many assays for MDD, but rapid and reliable detection remains a pressing challenge. In this study, we present a fusion feature, called P-MSWC, as a novel marker for constructing brain functional connectivity matrices, and we utilize a convolutional neural network (CNN) to identify MDD from electroencephalogram (EEG) signals. First, we combine the synchrosqueezed wavelet transform with coherence theory to obtain synchrosqueezed wavelet coherence. Then, we build the fusion feature by combining the synchrosqueezed wavelet coherence value with the phase-locking value; it outperforms conventional functional connectivity markers by comprehensively capturing the information in the original EEG signal and demonstrating notable noise resistance. Finally, we propose a lightweight CNN model that effectively utilizes the high-dimensional brain connectivity matrix, constructed with our novel marker, to enable more accurate and efficient detection of MDD. The proposed method achieves 99.92% accuracy on a single dataset and 97.86% accuracy on a combined dataset. Moreover, comparison experiments show that the proposed method outperforms traditional machine learning methods. Furthermore, visualization experiments reveal differences in the distribution of brain connectivity between MDD patients and healthy subjects, including decreased connectivity at the T7, O1, F8, and C3 channels in the gamma band. These results indicate that the fusion feature can serve as a new marker for constructing functional brain connectivity, and that combining deep learning with functional connectivity matrices can further aid the detection of MDD.
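Of the two ingredients fused into P-MSWC, the phase-locking value is the simpler and is sketched below via the Hilbert transform; the synchrosqueezed wavelet coherence component is considerably more involved and is omitted here.

```python
import numpy as np
from scipy.signal import hilbert

def phase_locking_value(x: np.ndarray, y: np.ndarray) -> float:
    """Phase-locking value between two (band-filtered) EEG channels in [0, 1]."""
    dphi = np.angle(hilbert(x)) - np.angle(hilbert(y))   # instantaneous phase gap
    return float(np.abs(np.exp(1j * dphi).mean()))       # 1 = perfect locking
```

Computing this for every channel pair fills one functional connectivity matrix, to be fused with the coherence-based one.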
PMID:39104678 | PMC:PMC11297863 | DOI:10.1007/s11571-023-10041-5
Deep learning and remote photoplethysmography powered advancements in contactless physiological measurement
Front Bioeng Biotechnol. 2024 Jul 17;12:1420100. doi: 10.3389/fbioe.2024.1420100. eCollection 2024.
ABSTRACT
In recent decades, there has been ongoing development in the application of computer vision (CV) in the medical field. As conventional contact-based physiological measurement techniques often restrict a patient's mobility in the clinical environment, achieving continuous, comfortable, and convenient monitoring is a topic of interest to researchers. One such CV application is remote imaging photoplethysmography (rPPG), which can estimate vital signs from a video or image. While contactless physiological measurement techniques hold excellent application prospects, the lack of uniformity or standardization of contactless vital-sign monitoring methods limits their application in remote healthcare/telehealth settings. Several methods have been developed to address this limitation and to handle the heterogeneity of video signals caused by movement, lighting, and equipment. The fundamental algorithms range from optimized traditional approaches to emerging deep learning (DL) algorithms. This article provides an in-depth review of current artificial intelligence (AI) methods using CV and DL for contactless physiological measurement, and a comprehensive summary of the latest developments in contactless measurement of skin perfusion, respiratory rate, blood oxygen saturation, heart rate, heart rate variability, and blood pressure.
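For context on what the DL methods improve upon, the textbook rPPG baseline averages the green channel over a skin region, band-pass filters to the cardiac band, and reads off the dominant frequency. This generic sketch is not a method from the review:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def heart_rate_from_roi(frames: np.ndarray, fs: float) -> float:
    """frames: (T, H, W, 3) RGB crops of a skin ROI; fs: frame rate (Hz)."""
    green = frames[..., 1].mean(axis=(1, 2))            # raw pulse trace
    b, a = butter(3, [0.7 / (fs / 2), 4.0 / (fs / 2)], btype="band")
    pulse = filtfilt(b, a, green - green.mean())        # 0.7-4 Hz cardiac band
    freqs = np.fft.rfftfreq(len(pulse), d=1 / fs)
    spectrum = np.abs(np.fft.rfft(pulse))
    band = (freqs >= 0.7) & (freqs <= 4.0)              # restrict to cardiac band
    return float(freqs[band][spectrum[band].argmax()] * 60.0)  # beats per minute
```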
PMID:39104628 | PMC:PMC11298756 | DOI:10.3389/fbioe.2024.1420100