Deep learning
A narrative review of deep learning in thyroid imaging: current progress and future prospects
Quant Imaging Med Surg. 2024 Feb 1;14(2):2069-2088. doi: 10.21037/qims-23-908. Epub 2024 Jan 19.
ABSTRACT
BACKGROUND AND OBJECTIVE: Deep learning (DL) has contributed substantially to the evolution of image analysis by leveraging ever-increasing data and computational power. DL algorithms have also facilitated the growing adoption of precision medicine, particularly in diagnosis and therapy. Thyroid disease is a global health problem involving both structural and functional changes, and thyroid imaging, as a routine means of screening large populations for thyroid disease, provides a massive data source for the development of DL models. The objective of this study was to evaluate the general rules and future directions of DL networks in thyroid medical image analysis through a review of original articles published between 2018 and 2023.
METHODS: We searched for English-language articles published between April 2018 and September 2023 in the databases of PubMed, Web of Science, and Google Scholar. The keywords used in the search included artificial intelligence or DL, thyroid diseases, and thyroid nodule or thyroid carcinoma.
KEY CONTENT AND FINDINGS: The computer vision tasks of DL in thyroid imaging included classification, segmentation, and detection. The current applications of DL in clinical workflow were found to mainly include management of thyroid nodules/carcinoma, risk evaluation of thyroid cancer metastasis, and discrimination of functional thyroid diseases.
CONCLUSIONS: DL is expected to enhance the quality of thyroid images and provide greater precision in the assessment of thyroid images. Specifically, DL can increase the diagnostic accuracy of thyroid diseases and better inform clinical decision-making.
PMID:38415152 | PMC:PMC10895129 | DOI:10.21037/qims-23-908
PXPermute reveals staining importance in multichannel imaging flow cytometry
Cell Rep Methods. 2024 Feb 26;4(2):100715. doi: 10.1016/j.crmeth.2024.100715.
ABSTRACT
Imaging flow cytometry (IFC) allows rapid acquisition of numerous single-cell images per second, capturing information from multiple fluorescent channels. However, the traditional process of staining cells with fluorescently labeled conjugated antibodies for IFC analysis is time consuming, expensive, and potentially harmful to cell viability. To streamline experimental workflows and reduce costs, it is crucial to identify the most relevant channels for downstream analysis. In this study, we introduce PXPermute, a user-friendly and powerful method for assessing the significance of IFC channels, particularly for cell profiling. Our approach evaluates channel importance by permuting pixel values within each channel and analyzing the resulting impact on machine learning or deep learning models. Through rigorous evaluation of three multichannel IFC image datasets, we demonstrate PXPermute's potential in accurately identifying the most informative channels, aligning with established biological knowledge. PXPermute can assist biologists with systematic channel analysis, experimental design optimization, and biomarker identification.
PMID:38412831 | DOI:10.1016/j.crmeth.2024.100715
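The abstract above describes channel importance as the impact on a trained model after permuting a channel's pixel values. Below is a minimal sketch of that idea with a generic scikit-learn classifier on toy multichannel images; the permutation scheme (within-image pixel shuffling), the model, and the data are assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

def channel_importance(model, images, labels, n_repeats=3, seed=0):
    """Score each channel by the accuracy drop after shuffling its pixel values
    within every image (images are channel-last: N x H x W x C)."""
    rng = np.random.default_rng(seed)
    n, h, w, c = images.shape
    base = accuracy_score(labels, model.predict(images.reshape(n, -1)))
    drops = np.zeros(c)
    for ch in range(c):
        for _ in range(n_repeats):
            shuffled = images.copy()
            for i in range(n):
                pixels = shuffled[i, :, :, ch].ravel()
                shuffled[i, :, :, ch] = rng.permutation(pixels).reshape(h, w)
            acc = accuracy_score(labels, model.predict(shuffled.reshape(n, -1)))
            drops[ch] += (base - acc) / n_repeats
    return drops

# toy data: 200 "cells", 16x16 pixels, 3 channels; only channel 0 carries a spatial signal
rng = np.random.default_rng(1)
X = rng.random((200, 16, 16, 3))
y = (X[:, :8, :, 0].mean(axis=(1, 2)) > X[:, 8:, :, 0].mean(axis=(1, 2))).astype(int)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X.reshape(200, -1), y)
print(channel_importance(clf, X, y))  # the drop for channel 0 should dominate
```

The same loop works with any model exposing a `predict` method, including a deep network wrapped in that interface.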
Rapid classification of coffee origin by combining mass spectrometry analysis of coffee aroma with deep learning
Food Chem. 2024 Feb 22;446:138811. doi: 10.1016/j.foodchem.2024.138811. Online ahead of print.
ABSTRACT
Mislabeling the geographical origin of coffee is a prevalent form of fraud. In this study, a rapid, nondestructive, and high-throughput method combining mass spectrometry (MS) analysis with intelligent algorithms was developed to classify coffee origin. Specifically, volatile compounds in coffee aroma were detected using self-aspiration corona discharge ionization mass spectrometry (SACDI-MS), and the acquired MS data were processed using a customized deep learning algorithm to perform origin authentication automatically. To facilitate high-throughput analysis, an air curtain sampling device was designed and coupled with SACDI-MS to prevent volatile mixing and signal overlap. An accuracy of 99.78% was achieved in the classification of coffee samples from six origins at a throughput of 1 s per sample. The proposed approach may be effective in preventing coffee fraud owing to its straightforward operation, rapidity, and high accuracy, and may thus benefit consumers.
PMID:38412809 | DOI:10.1016/j.foodchem.2024.138811
Mode combinability: Exploring convex combinations of permutation aligned models
Neural Netw. 2024 Feb 23;173:106204. doi: 10.1016/j.neunet.2024.106204. Online ahead of print.
ABSTRACT
We explore element-wise convex combinations of two permutation-aligned neural network parameter vectors Θ_A and Θ_B of size d. We conduct extensive experiments by examining various distributions of such model combinations parametrized by elements of the hypercube [0,1]^d and its vicinity. Our findings reveal that broad regions of the hypercube form surfaces of low loss values, indicating that the notion of linear mode connectivity extends to a more general phenomenon which we call mode combinability. We also make several novel observations regarding linear mode connectivity and model re-basin. We demonstrate a transitivity property (two models re-based to a common third model are also linearly mode connected) and a robustness property (even with significant perturbations of the neuron matchings, the resulting combinations continue to form a working model). Moreover, we analyze the functional and weight similarity of model combinations and show that such combinations are non-vacuous in the sense that there are significant functional differences between the resulting models.
PMID:38412738 | DOI:10.1016/j.neunet.2024.106204
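For readers unfamiliar with the setup, the toy sketch below forms element-wise convex combinations λ ⊙ Θ_A + (1 − λ) ⊙ Θ_B of two (assumed already permutation-aligned) parameter vectors and probes the loss along them. The tiny linear model and random data are placeholders for the much larger networks studied in the paper.

```python
import torch
import torch.nn as nn

def combine(theta_a, theta_b, lam):
    """lam is a tensor in [0,1]^d (one coefficient per parameter entry);
    lam = 0.5 * ones(d) recovers the midpoint used in linear mode connectivity."""
    return lam * theta_a + (1.0 - lam) * theta_b

def flatten_params(model):
    return torch.cat([p.detach().reshape(-1) for p in model.parameters()])

def load_flat_params(model, flat):
    i = 0
    for p in model.parameters():
        n = p.numel()
        p.data.copy_(flat[i:i + n].reshape(p.shape))
        i += n

# two copies of the same architecture, assumed already permutation-aligned
net_a, net_b = nn.Linear(10, 2), nn.Linear(10, 2)
theta_a, theta_b = flatten_params(net_a), flatten_params(net_b)

x, y = torch.randn(64, 10), torch.randint(0, 2, (64,))
probe = nn.Linear(10, 2)
for lam_scalar in (0.0, 0.25, 0.5, 0.75, 1.0):
    lam = torch.full_like(theta_a, lam_scalar)   # or sample lam uniformly from [0,1]^d
    load_flat_params(probe, combine(theta_a, theta_b, lam))
    loss = nn.functional.cross_entropy(probe(x), y)
    print(f"lambda = {lam_scalar:.2f}  loss = {loss.item():.3f}")
```

Sampling `lam` entrywise from distributions over [0,1]^d, rather than using a single scalar, is what distinguishes the combinations studied here from plain linear interpolation.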
Noncompact uniform universal approximation
Neural Netw. 2024 Feb 15;173:106181. doi: 10.1016/j.neunet.2024.106181. Online ahead of print.
ABSTRACT
The universal approximation theorem is generalised to uniform convergence on the (noncompact) input space R^n. All continuous functions that vanish at infinity can be uniformly approximated by neural networks with one hidden layer, for all activation functions φ that are continuous, nonpolynomial, and asymptotically polynomial at ±∞. When φ is moreover bounded, we exactly determine which functions can be uniformly approximated by neural networks, with the following unexpected results. Let N_φ^l(R^n) denote the vector space of functions that are uniformly approximable by neural networks with l hidden layers and n inputs. For all n and all l ≥ 2, N_φ^l(R^n) turns out to be an algebra under the pointwise product. If the left limit of φ differs from its right limit (for instance, when φ is sigmoidal), the algebra N_φ^l(R^n) (l ≥ 2) is independent of φ and l, and equals the closed span of products of sigmoids composed with one-dimensional projections. If the left limit of φ equals its right limit, N_φ^l(R^n) (l ≥ 1) equals the (real part of the) commutative resolvent algebra, a C*-algebra which is used in mathematical approaches to quantum theory. In the latter case, the algebra is independent of l ≥ 1, whereas in the former case N_φ^2(R^n) is strictly bigger than N_φ^1(R^n).
PMID:38412737 | DOI:10.1016/j.neunet.2024.106181
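As a reading aid, the space N_φ^l(R^n) used above can be written out in LaTeX: it is the closure, in the sup norm on R^n, of the set of functions realized by networks with l hidden layers, n inputs, and activation φ. This is my transcription of the abstract's notation, not the authors' exact formulation.

```latex
\[
  \mathcal{N}_{\varphi}^{\,l}(\mathbb{R}^n)
  \;:=\;
  \overline{\bigl\{\, f \colon \mathbb{R}^n \to \mathbb{R}
      \;\bigm|\; f \text{ is computed by a network with } l
      \text{ hidden layers and activation } \varphi \,\bigr\}}^{\;\|\cdot\|_{\infty}}
\]
\[
  \text{Claimed structure: for bounded } \varphi \text{ and } l \ge 2,\qquad
  f,\, g \in \mathcal{N}_{\varphi}^{\,l}(\mathbb{R}^n)
  \;\Longrightarrow\; f\, g \in \mathcal{N}_{\varphi}^{\,l}(\mathbb{R}^n).
\]
```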
The added value of temporal data and the best way to handle it: A use-case for atrial fibrillation using general practitioner data
Comput Biol Med. 2024 Feb 12;171:108097. doi: 10.1016/j.compbiomed.2024.108097. Online ahead of print.
ABSTRACT
INTRODUCTION: Temporal data poses numerous challenges for deep learning, such as irregular sampling. New algorithms are being developed that can handle these temporal challenges better. However, it is unclear how performance compares across the range from classical non-temporal models to newly developed algorithms. Therefore, this study compares different non-temporal and temporal algorithms for a relevant use case: the prediction of atrial fibrillation (AF) using general practitioner (GP) data.
METHODS: Three datasets with a 365-day observation window and prediction windows of 14, 180, and 360 days were used. The data consisted of medication, laboratory, symptom, and chronic disease codes registered by the GP. The benchmark discarded temporality and applied logistic regression, XGBoost models, and neural networks to the presence of codes over the whole year. The pattern approach extracted common patterns of GP codes and tested them using the same algorithms. LSTM and CKConv models were trained as models incorporating temporality.
RESULTS: Algorithms incorporating temporality (LSTM and CKConv; max AUC 0.734 at the 360-day prediction window) outperformed both the benchmark and pattern algorithms (max AUC 0.723), with a significant improvement at the 360-day prediction window (p = 0.04). The difference between the benchmark and the LSTM or CKConv algorithm decreased with smaller prediction windows, indicating that temporality is more important for longer prediction windows. The CKConv and LSTM algorithms performed similarly, possibly due to the limited sequence length.
CONCLUSION: Temporal models outperformed non-temporal models for the prediction of AF. Among the temporal models, CKConv is a promising algorithm for GP data because it can handle irregularly sampled data.
PMID:38412689 | DOI:10.1016/j.compbiomed.2024.108097
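To make the comparison concrete, here is a small sketch (with hypothetical GP codes and field names) of the two input representations contrasted in the study: a presence-of-code vector for the non-temporal benchmark versus a padded, day-stamped sequence for a temporal model such as an LSTM.

```python
import numpy as np
import torch
from torch.nn.utils.rnn import pad_sequence

vocab = {"B01AA": 0, "C07AB": 1, "K78": 2, "T90": 3}          # example GP codes
patients = [
    [("C07AB", 12), ("K78", 40), ("C07AB", 300)],             # (code, day in window)
    [("T90", 5), ("B01AA", 200)],
]

# (a) non-temporal benchmark: presence of each code in the 365-day window
X_flat = np.zeros((len(patients), len(vocab)))
for i, events in enumerate(patients):
    for code, _day in events:
        X_flat[i, vocab[code]] = 1.0                          # feed to LR / XGBoost

# (b) temporal model: chronological sequence of (one-hot code, normalized day)
seqs = []
for events in patients:
    rows = []
    for code, day in sorted(events, key=lambda e: e[1]):
        onehot = np.eye(len(vocab))[vocab[code]]
        rows.append(np.concatenate([onehot, [day / 365.0]]))
    seqs.append(torch.tensor(np.stack(rows), dtype=torch.float32))
X_seq = pad_sequence(seqs, batch_first=True)                  # feed to LSTM / CKConv
print(X_flat.shape, X_seq.shape)                              # (2, 4) and (2, 3, 5)
```

The explicit day stamp in representation (b) is what lets sequence models exploit irregular sampling, which representation (a) throws away.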
Metabolic heterogeneity in clear cell renal cell carcinoma revealed by single-cell RNA sequencing and spatial transcriptomics
J Transl Med. 2024 Feb 27;22(1):210. doi: 10.1186/s12967-024-04848-x.
ABSTRACT
BACKGROUND: Clear cell renal cell carcinoma is a prototypical tumor characterized by metabolic reprogramming, which extends beyond tumor cells to encompass diverse cell types within the tumor microenvironment. Nonetheless, current research on metabolic reprogramming in renal cell carcinoma mostly focuses on either tumor cells alone or conducts analyses of all cells within the tumor microenvironment as a mixture, thereby failing to precisely identify metabolic changes in different cell types within the tumor microenvironment.
METHODS: Nine major single-cell RNA sequencing datasets of clear cell renal cell carcinoma, encompassing 195 samples, were gathered. Spatial transcriptomics data were selected to conduct spatially resolved metabolic activity analysis. The scMet program was developed to convert RNA-seq data into scRNA-seq data for downstream analysis.
RESULTS: Diverse cellular entities within the tumor microenvironment exhibit distinct infiltration preferences across varying histological grades and tissue origins. Higher-grade tumors manifest pronounced immunosuppressive traits. The identification of tumor cells in the RNA splicing state reveals an association between the enrichment of this particular cellular population and an unfavorable prognostic outcome. The energy metabolism of CD8+ T cells is pivotal not only for their cytotoxic effector functions but also as a marker of impending cellular exhaustion. Sphingolipid metabolism evinces a correlation with diverse macrophage-specific traits, particularly M2 polarization. The tumor epicenter is characterized by heightened metabolic activity, prominently marked by elevated tricarboxylic acid cycle activity and glycolysis, while the pericapsular milieu showcases a conspicuous enrichment of attributes associated with vasculogenesis, inflammatory responses, and epithelial-mesenchymal transition. scMet facilitates the transformation of RNA sequencing datasets sourced from TCGA into scRNA sequencing data, maintaining a substantial degree of correlation.
CONCLUSIONS: The tumor microenvironment of clear cell renal cell carcinoma demonstrates significant metabolic heterogeneity across various cell types and spatial dimensions. scMet exhibits a notable capability to transform RNA sequencing data into scRNA sequencing data with a high degree of correlation.
PMID:38414015 | DOI:10.1186/s12967-024-04848-x
Artificial intelligence-based model for predicting pulmonary arterial hypertension on chest x-ray images
BMC Pulm Med. 2024 Feb 27;24(1):101. doi: 10.1186/s12890-024-02891-4.
ABSTRACT
BACKGROUND: Pulmonary arterial hypertension is a serious medical condition. However, the condition is often misdiagnosed, or a rather long delay occurs from symptom onset to diagnosis, which is associated with decreased 5-year survival. In this study, we developed and tested a deep-learning algorithm to detect pulmonary arterial hypertension using chest X-ray (CXR) images.
METHODS: From the image archive of Chiba University Hospital, 259 CXR images from 145 patients with pulmonary arterial hypertension and 260 CXR images from 260 control patients were identified; 418 images were used for training and 101 for testing. For each image in the testing dataset, the algorithm output a numerical value from 0 to 1 (the probability of pulmonary arterial hypertension). The training process employed a binary cross-entropy loss function with stochastic gradient descent optimization (learning rate α = 0.01). In addition, using the same testing dataset, the algorithm's ability to identify pulmonary arterial hypertension was compared with that of experienced doctors.
RESULTS: The area under the receiver operating characteristic curve (AUC) for the detection ability of the algorithm was 0.988. Using a threshold of 0.69 on the output probability score, the sensitivity and specificity of the algorithm were 0.933 and 0.982, respectively. The AUC of the algorithm's detection ability was superior to that of the doctors.
CONCLUSION: The CXR image-derived deep-learning algorithm had superior pulmonary arterial hypertension detection capability compared with that of experienced doctors.
PMID:38413932 | DOI:10.1186/s12890-024-02891-4
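The training recipe mentioned in the methods (binary cross-entropy with SGD at a learning rate of 0.01, producing a 0-1 probability score) corresponds to a standard setup like the sketch below. The backbone, image size, and preprocessing are assumptions for illustration, not details taken from the paper.

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=None)           # assumed backbone, trained from scratch
model.fc = nn.Linear(model.fc.in_features, 1)   # single logit -> PAH probability
criterion = nn.BCEWithLogitsLoss()              # binary cross-entropy on the logit
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

def train_step(images, labels):
    """images: (B, 3, 224, 224) float tensor; labels: (B,) with 1 = PAH."""
    optimizer.zero_grad()
    logits = model(images).squeeze(1)
    loss = criterion(logits, labels.float())
    loss.backward()
    optimizer.step()
    return loss.item()

# toy batch; at inference, torch.sigmoid(model(x)) yields the 0-1 score that is
# then thresholded (the paper reports an operating point of 0.69).
print(train_step(torch.randn(4, 3, 224, 224), torch.randint(0, 2, (4,))))
```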
Effectiveness of opportunistic osteoporosis screening on chest CT using the DCNN model
BMC Musculoskelet Disord. 2024 Feb 27;25(1):176. doi: 10.1186/s12891-024-07297-1.
ABSTRACT
OBJECTIVE: To develop and evaluate a deep learning model based on chest CT that achieves favorable performance in opportunistic osteoporosis screening using fusion feature images of the lumbar 1 and lumbar 2 vertebral bodies, and to explore the feasibility and effectiveness of a model based on the lumbar 1 vertebral body alone.
MATERIALS AND METHODS: Chest CT images of 1048 health check subjects from January 2021 to June were retrospectively collected as the internal dataset (segmentation model: 548 for training, 100 for tuning, and 400 for testing; classification model: 530 for training, 100 for validation, and 418 for testing). The subjects were divided into three categories according to quantitative CT measurements: normal, osteopenia, and osteoporosis. First, a deep learning-based segmentation model was constructed, and the Dice similarity coefficient (DSC) was used to compare the consistency between the model and manual labelling. Then, two classification models were established: (i) model 1 (fusion feature construction from lumbar vertebral bodies 1 and 2) and (ii) model 2 (feature construction from lumbar 1 alone). Receiver operating characteristic curves were used to evaluate the diagnostic efficacy of the models, and the DeLong test was used to compare the areas under the curve (AUCs).
RESULTS: When the number of images in the training set was 300, the DSC value in the test set was 0.951 ± 0.030. Model 1 achieved AUCs of 0.990, 0.952, and 0.980 for diagnosing normal, osteopenia, and osteoporosis, respectively; model 2 achieved AUCs of 0.983, 0.940, and 0.978. The DeLong test showed no significant difference in AUC between the two models for the osteopenia and osteoporosis categories (P = 0.210 and 0.546), whereas for the normal category the AUC of model 1 was higher than that of model 2 (0.990 vs. 0.983, P = 0.033).
CONCLUSION: This study proposed a chest CT deep learning model that achieves favorable performance in opportunistic osteoporosis screening using fusion feature images of the lumbar 1 and lumbar 2 vertebral bodies. We further constructed a comparable model based on the lumbar 1 vertebra alone, which can shorten the scan length, reduce the radiation dose received by patients, and reduce the training cost for technologists.
PMID:38413868 | DOI:10.1186/s12891-024-07297-1
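The Dice similarity coefficient used to validate the segmentation model has a standard definition; the short sketch below computes it for binary masks.

```python
import numpy as np

def dice(pred, truth, eps=1e-7):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|) for boolean masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return (2.0 * intersection + eps) / (pred.sum() + truth.sum() + eps)

# two shifted 32x32 squares as stand-ins for predicted and manual vertebra masks
a = np.zeros((64, 64), dtype=bool); a[16:48, 16:48] = True
b = np.zeros((64, 64), dtype=bool); b[20:52, 16:48] = True
print(round(dice(a, b), 3))   # 0.875 for this overlap
```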
GADNN: a revolutionary hybrid deep learning neural network for age and sex determination utilizing cone beam computed tomography images of maxillary and frontal sinuses
BMC Med Res Methodol. 2024 Feb 27;24(1):50. doi: 10.1186/s12874-024-02183-9.
ABSTRACT
INTRODUCTION: The determination of identity factors such as age and sex has gained significance in both criminal and civil cases. Paranasal sinuses, such as the frontal and maxillary sinuses, are resistant to trauma and can aid profiling. We developed a deep learning (DL) model optimized by an evolutionary algorithm (a genetic algorithm, GA) to determine sex and age using paranasal sinus parameters based on cone-beam computed tomography (CBCT).
METHODS: Two hundred and forty CBCT images (including 129 females and 111 males, aged 18-52) were included in this study. CBCT images were captured using the Newtom3G device with specific exposure parameters. These images were then analyzed in ITK-SNAP 3.6.0 beta software to extract four paranasal sinus parameters: height, width, length, and volume for both the frontal and maxillary sinuses. A hybrid model, Genetic Algorithm-Deep Neural Network (GADNN), was proposed for feature selection and classification. Traditional statistical methods and machine learning models, including logistic regression (LR), random forest (RF), multilayer perceptron neural network (MLP), and deep learning (DL) were evaluated for their performance. The synthetic minority oversampling technique was used to deal with the unbalanced data.
RESULTS: GADNN showed superior accuracy in both sex determination (accuracy of 86%) and age determination (accuracy of 68%), outperforming the other models. DL and RF were the second- and third-best methods for sex determination (accuracies of 78% and 71%, respectively) and age determination (accuracies of 92% and 57%).
CONCLUSIONS: The study introduces a novel approach combining DL and a GA to enhance the accuracy of sex and age determination. The potential of DL in forensic dentistry is highlighted, demonstrating its efficiency in improving the accuracy of sex and age determination. The study contributes to the burgeoning field of DL in dentistry and the forensic sciences.
PMID:38413856 | DOI:10.1186/s12874-024-02183-9
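As an illustration of the hybrid idea (a genetic algorithm selecting input features for a neural-network classifier), the toy sketch below evolves binary feature masks scored by cross-validated accuracy. The GA operators, population size, classifier, and data are illustrative and far simpler than GADNN.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(240, 8))                       # 8 sinus measurements (toy data)
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=240) > 0).astype(int)

def fitness(mask):
    """Cross-validated accuracy of a small neural net on the selected features."""
    if mask.sum() == 0:
        return 0.0
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=300, random_state=0)
    return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=3).mean()

pop = rng.integers(0, 2, size=(10, X.shape[1]))     # population of feature masks
for gen in range(6):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-5:]]          # selection: keep the best half
    children = []
    for _ in range(len(pop) - len(parents)):
        a, b = parents[rng.integers(5)], parents[rng.integers(5)]
        cut = rng.integers(1, X.shape[1])           # one-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        flip = rng.random(X.shape[1]) < 0.1         # mutation
        child[flip] = 1 - child[flip]
        children.append(child)
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(m) for m in pop])]
print("selected features:", np.flatnonzero(best))   # indices 0 and 3 should appear
```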
Revolutionizing core muscle analysis in female sexual dysfunction based on machine learning
Sci Rep. 2024 Feb 27;14(1):4795. doi: 10.1038/s41598-024-54967-0.
ABSTRACT
The purpose of this study is to investigate the role of core muscles in female sexual dysfunction (FSD) and develop comprehensive rehabilitation programs to address this issue. We aim to answer the following research questions: what are the roles of core muscles in FSD, and how can machine and deep learning models accurately predict changes in core muscles during FSD? FSD is a common condition that affects women of all ages, characterized by symptoms such as decreased libido, difficulty achieving orgasm, and pain during intercourse. We conducted a comprehensive analysis of changes in core muscles during FSD using machine and deep learning. We evaluated the performance of multiple models, including multi-layer perceptron (MLP), long short-term memory (LSTM), convolutional neural network (CNN), recurrent neural network (RNN), ElasticNetCV, random forest regressor, SVR, and Bagging regressor. The models were evaluated based on mean squared error (MSE), mean absolute error (MAE), and R-squared (R2) score. Our results show that CNN and random forest regressor are the most accurate models for predicting changes in core muscles during FSD. CNN achieved the lowest MSE (0.002) and the highest R2 score (0.988), while random forest regressor also performed well with an MSE of 0.0021 and an R2 score of 0.9905. Our study demonstrates that machine and deep learning models can accurately predict changes in core muscles during FSD. The neglected core muscles play a significant role in FSD, highlighting the need for comprehensive rehabilitation programs that address these muscles. By developing these programs, we can improve the quality of life for women with FSD and help them achieve optimal sexual health.
PMID:38413786 | DOI:10.1038/s41598-024-54967-0
Deep learning algorithm applied to plain CT images to identify superior mesenteric artery abnormalities
Eur J Radiol. 2024 Feb 23;173:111388. doi: 10.1016/j.ejrad.2024.111388. Online ahead of print.
ABSTRACT
OBJECTIVES: Atypical presentations, lack of biomarkers, and low sensitivity of plain CT can delay the diagnosis of superior mesenteric artery (SMA) abnormalities, resulting in poor clinical outcomes. Our study aims to develop a deep learning (DL) model for detecting SMA abnormalities in plain CT and evaluate its performance in comparison with a clinical model and radiologist assessment.
MATERIALS AND METHODS: A total of 1048 patients comprised the internal (474 patients with SMA abnormalities, 474 controls) and external testing (50 patients with SMA abnormalities, 50 controls) cohorts. The internal cohort was divided into the training cohort (n = 776), validation cohort (n = 86), and internal testing cohort (n = 86). A total of 5 You Only Look Once version 8 (YOLOv8)-based DL submodels were developed, and the performance of the optimal submodel was compared with that of a clinical model and of experienced radiologists.
RESULTS: Of the submodels, YOLOv8x had the best performance. The area under the curve (AUC) of the YOLOv8x submodel was higher than that of the clinical model (internal test set: 0.990 vs 0.878, P =.002; external test set: 0.967 vs 0.912, P =.140) and that of all radiologists (P <.001). The YOLOv8x submodel, when compared with radiologist assessment, demonstrated higher sensitivity (internal test set: 100.0 % vs 70.7 %, P =.002; external test set: 96.0 % vs 68.8 %, P <.001) and specificity (internal test set: 90.7 % vs 66.0 %, P =.025; external test set: 88.0 % vs 66.0 %, P <.001).
CONCLUSION: Using plain CT images, YOLOv8x was able to efficiently identify cases of SMA abnormalities. This could potentially improve early diagnosis accuracy and thus improve clinical outcomes.
PMID:38412582 | DOI:10.1016/j.ejrad.2024.111388
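For orientation, the YOLOv8 family referenced here is typically trained and applied through the ultralytics package along the following lines; the dataset YAML, epochs, image size, and confidence threshold below are placeholders rather than the study's settings.

```python
from ultralytics import YOLO

# "x" is the largest YOLOv8 variant, the one reported as best-performing here
model = YOLO("yolov8x.pt")
model.train(data="sma_plain_ct.yaml",   # hypothetical dataset config (images + boxes)
            epochs=100, imgsz=640)

results = model.predict("plain_ct_slice.png", conf=0.25)
for r in results:
    # candidate SMA-abnormality boxes and their confidence scores
    print(r.boxes.xyxy, r.boxes.conf)
```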
A deep learning based holistic diagnosis system for immunohistochemistry interpretation and molecular subtyping
Neoplasia. 2024 Feb 26;50:100976. doi: 10.1016/j.neo.2024.100976. Online ahead of print.
ABSTRACT
BACKGROUND: Breast cancers of different molecular subtypes, which are determined by the overexpression rates of human epidermal growth factor receptor 2 (HER2), estrogen receptor (ER), progesterone receptor (PR), and Ki67, exhibit distinct symptom characteristics and sensitivities to different treatments. The immunohistochemical method, one of the most common detection tools for tumour markers, relies heavily on manual judgment in clinical practice and has inherent limitations in interpretation stability and operating efficiency. Here, a holistic intelligent breast tumour diagnosis system has been developed for tumour-markeromic analysis, combining automatic interpretation with clinical suggestions.
METHODS: The holistic intelligent breast tumour diagnosis system comprised two main modules. The interpreting modules were constructed based on convolutional neural networks to comprehensively extract and analyze the multiple features of immunostaining. Referring to clinical classification criteria, the interpretation results were encoded in a low-dimensional feature representation in the subtyping module to efficiently output a holistic detection result for the critical tumour-markeromic, with diagnostic suggestions on molecular subtypes.
RESULTS: The overexpression rates of HER2, ER, PR, and Ki67, as well as an effective determination of molecular subtypes, were successfully obtained by this diagnosis system, with an average sensitivity of 97.6 % and an average specificity of 96.1 %; among these, the sensitivity and specificity for interpreting HER2 were up to 99.8 % and 96.9 %.
CONCLUSION: The holistic intelligent breast tumour diagnosis system shows performance in the interpretation of immunohistochemical images that exceeds pathologist level and can be expected to overcome the limitations of conventional manual interpretation in efficiency, precision, and repeatability.
PMID:38412576 | DOI:10.1016/j.neo.2024.100976
Comparison of clinical geneticist and computer visual attention in assessing genetic conditions
PLoS Genet. 2024 Feb 27;20(2):e1011168. doi: 10.1371/journal.pgen.1011168. Online ahead of print.
ABSTRACT
The use of artificial intelligence (AI) for facial diagnostics is increasingly common in the genetics clinic to evaluate patients with potential genetic conditions. Current approaches focus on one type of AI called deep learning (DL). While DL-based facial diagnostic platforms have a high accuracy rate for many conditions, less is understood about how this technology assesses and classifies (categorizes) images and how this compares to humans. To compare human and computer attention, we performed eye-tracking analyses of clinical geneticists (n = 22) and non-clinicians (n = 22) who viewed images of people with 10 different genetic conditions, as well as images of unaffected individuals. We calculated the intersection-over-union (IoU) and Kullback-Leibler divergence (KL) to compare the visual attention of the two participant groups, and then compared the clinician group against the saliency maps of our deep learning classifier. We found that human visual attention differs greatly from the DL model's saliency results. Averaged over all the test images, the IoU and KL metrics for the successful (accurate) clinicians' visual attention versus the saliency maps were 0.15 and 11.15, respectively. Individuals also tend to have a specific pattern of image inspection, and clinicians demonstrate different visual attention patterns than non-clinicians (IoU and KL of clinicians versus non-clinicians were 0.47 and 2.73, respectively). This study shows that humans (at different levels of expertise) and computer vision models examine images differently. Understanding these differences can improve the design and use of AI tools and lead to more meaningful interactions between clinicians and AI technologies.
PMID:38412177 | DOI:10.1371/journal.pgen.1011168
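The two metrics used to compare attention maps can be computed roughly as follows. The thresholding rule for IoU (keeping the top 10% of attention values) is my assumption, since the abstract does not specify how the maps were binarized.

```python
import numpy as np
from scipy.special import rel_entr

def iou(map_a, map_b, q=0.9):
    """Intersection-over-union of the most-attended regions of two maps."""
    a = map_a >= np.quantile(map_a, q)       # keep the top 10% most-attended pixels
    b = map_b >= np.quantile(map_b, q)
    return np.logical_and(a, b).sum() / max(np.logical_or(a, b).sum(), 1)

def kl(map_p, map_q, eps=1e-8):
    """KL divergence D(p || q) between maps treated as probability distributions."""
    p = map_p.ravel() + eps; p /= p.sum()
    q = map_q.ravel() + eps; q /= q.sum()
    return rel_entr(p, q).sum()

gaze = np.random.rand(64, 64)                # stand-in for a clinician fixation map
saliency = np.random.rand(64, 64)            # stand-in for the classifier's saliency map
print(iou(gaze, saliency), kl(gaze, saliency))
```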
Source-Free Domain Adaptation With Domain Generalized Pretraining for Face Anti-Spoofing
IEEE Trans Pattern Anal Mach Intell. 2024 Feb 27;PP. doi: 10.1109/TPAMI.2024.3370721. Online ahead of print.
ABSTRACT
Source-free domain adaptation (SFDA) shows potential for improving the generalizability of deep learning-based face anti-spoofing (FAS) while preserving the privacy and security of sensitive human faces. However, existing SFDA methods are significantly degraded without access to source data because they cannot mitigate the domain and identity bias in FAS. In this paper, we propose a novel source-free domain adaptation framework for FAS (SDA-FAS) that systematically addresses the challenges of source model pre-training, source knowledge adaptation, and target data exploration under the source-free setting. Specifically, we develop a generalized method for source model pre-training that leverages a causality-inspired PatchMix data augmentation to diminish domain bias and designs a patch-wise contrastive loss to alleviate identity bias. For source knowledge adaptation, we propose a contrastive domain alignment module to align the conditional distribution across domains, with a theoretical equivalence to adaptation based on source data. Furthermore, target data exploration is achieved via self-supervised learning with patch shuffle augmentation to identify unseen attack types, which is ignored in existing SFDA methods. To the best of our knowledge, this paper provides the first full-stack privacy-preserving framework to address the generalization problem in FAS. Extensive experiments on nineteen cross-dataset scenarios show that our framework considerably outperforms state-of-the-art methods.
PMID:38412088 | DOI:10.1109/TPAMI.2024.3370721
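One plausible reading of the PatchMix augmentation named above is patch-level mixing between face images from different domains, sketched below. The patch grid, mixing ratio, and any causality-inspired weighting used by the authors are not specified here; this is a generic illustration, not the paper's exact augmentation.

```python
import torch

def patch_mix(img_a, img_b, patch=32, ratio=0.3, generator=None):
    """img_*: (C, H, W) tensors with H, W divisible by `patch`.
    Replaces a random `ratio` of img_a's patches with the corresponding patches of img_b."""
    c, h, w = img_a.shape
    out = img_a.clone()
    gh, gw = h // patch, w // patch
    n_swap = int(ratio * gh * gw)
    idx = torch.randperm(gh * gw, generator=generator)[:n_swap]
    for k in idx.tolist():
        i, j = (k // gw) * patch, (k % gw) * patch
        out[:, i:i + patch, j:j + patch] = img_b[:, i:i + patch, j:j + patch]
    return out

# toy usage with two random "face" tensors from different source domains
mixed = patch_mix(torch.rand(3, 256, 256), torch.rand(3, 256, 256))
print(mixed.shape)
```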
Scalable Moment Propagation and Analysis of Variational Distributions for Practical Bayesian Deep Learning
IEEE Trans Neural Netw Learn Syst. 2024 Feb 27;PP. doi: 10.1109/TNNLS.2024.3367363. Online ahead of print.
ABSTRACT
Bayesian deep learning is one of the key frameworks for handling predictive uncertainty. Variational inference (VI), an extensively used inference method, derives predictive distributions by Monte Carlo (MC) sampling. The drawback of MC sampling is its extremely high computational cost compared with that of ordinary deep learning. In contrast, the moment propagation (MP)-based approach propagates the output moments of each layer to derive predictive distributions instead of relying on MC sampling. Because of this computational property, it is expected to realize faster inference than MC-based approaches. However, the applicability of the MP-based method to deep models has not been explored sufficiently, as existing studies have demonstrated the effectiveness of MP only in small toy models. One reason is that it is difficult to train deep models by MP because of the large variance in activations. To realize MP in deep models, some normalization layers are required but have not yet been studied. In addition, it is still difficult to design well-calibrated MP-based models, because the effectiveness of MP-based methods under various variational distributions has not been investigated. In this study, we propose a fast and reliable MP-based Bayesian deep-learning method. First, to train deep-learning models using MP, we introduce a batch normalization layer extended to random variables to prevent increases in the variance of activations. Second, to identify the appropriate variational distribution in MP, we investigate the treatment of the moments of several variational distributions and evaluate the quality of their predictive uncertainty. Experiments with regression tasks demonstrate that the MP-based method provides qualitatively and quantitatively equivalent predictive performance to MC-based methods regardless of the variational distribution. In the classification tasks, we show that MP-based deep models can be trained with the extended batch normalization. We also show that the MP-based approach realizes 2.0-2.8 times faster inference than the MC-based approach while maintaining predictive performance. The results of this study can help realize a fast and well-calibrated uncertainty estimation method that can be deployed in a wider range of reliability-aware applications.
PMID:38412086 | DOI:10.1109/TNNLS.2024.3367363
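The contrast with Monte Carlo sampling is easiest to see on a single affine layer, where the output mean and (diagonal) variance can be propagated in closed form. The sketch below assumes deterministic weights and independent inputs; the paper additionally treats random weights, nonlinearities, and normalization layers.

```python
import numpy as np

def linear_mp(W, b, mean_in, var_in):
    """Propagate mean and diagonal variance through y = W x + b."""
    mean_out = W @ mean_in + b
    var_out = (W ** 2) @ var_in          # Var(sum w_i x_i) = sum w_i^2 Var(x_i)
    return mean_out, var_out

rng = np.random.default_rng(0)
W, b = rng.normal(size=(3, 5)), np.zeros(3)
mu, var = rng.normal(size=5), np.full(5, 0.1)

# sanity check against Monte Carlo sampling of the same layer
samples = rng.normal(mu, np.sqrt(var), size=(100_000, 5)) @ W.T + b
print(linear_mp(W, b, mu, var)[1])       # closed-form output variances
print(samples.var(axis=0))               # MC estimate, should match closely
```

A single closed-form pass replaces the many forward passes an MC estimate would need, which is the source of the reported speedup.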
Leveraging Brain Modularity Prior for Interpretable Representation Learning of fMRI
IEEE Trans Biomed Eng. 2024 Feb 27;PP. doi: 10.1109/TBME.2024.3370415. Online ahead of print.
ABSTRACT
Resting-state functional magnetic resonance imaging (rs-fMRI) can reflect spontaneous neural activity in the brain and is widely used for brain disorder analysis. Previous studies focus on extracting fMRI representations using machine/deep learning methods, but these features typically lack biological interpretability. The human brain exhibits a remarkable modular structure in spontaneous brain functional networks, with each module composed of functionally interconnected brain regions of interest (ROIs). However, existing learning-based methods cannot adequately utilize such a brain modularity prior. In this paper, we propose a brain modularity-constrained dynamic representation learning framework for interpretable fMRI analysis, consisting of dynamic graph construction, dynamic graph learning via a novel modularity-constrained graph neural network (MGNN), and prediction and biomarker detection. The designed MGNN is constrained by three core neurocognitive modules (i.e., the salience network, central executive network, and default mode network), encouraging ROIs within the same module to share similar representations. To further enhance the discriminative ability of the learned features, we encourage the MGNN to preserve the network topology of input graphs via a graph topology reconstruction constraint. Experimental results on 534 subjects with rs-fMRI scans from two datasets validate the effectiveness of the proposed method. The identified discriminative brain ROIs and functional connectivities can be regarded as potential fMRI biomarkers to aid clinical diagnosis.
PMID:38412079 | DOI:10.1109/TBME.2024.3370415
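A minimal sketch of what a modularity constraint can look like is given below: a penalty that pulls the learned embeddings of ROIs belonging to the same predefined module toward each other. The loss form and module assignment are illustrative, not the MGNN's actual constraint.

```python
import torch

def modularity_penalty(roi_embeddings, module_ids):
    """roi_embeddings: (n_rois, dim) learned features; module_ids: (n_rois,) int labels."""
    loss = roi_embeddings.new_zeros(())
    for m in module_ids.unique():
        emb = roi_embeddings[module_ids == m]
        center = emb.mean(dim=0, keepdim=True)
        loss = loss + ((emb - center) ** 2).sum(dim=1).mean()  # pull module members together
    return loss / len(module_ids.unique())

emb = torch.randn(116, 32, requires_grad=True)     # e.g. one embedding per atlas ROI
modules = torch.randint(0, 3, (116,))              # 3 neurocognitive modules (toy assignment)
print(modularity_penalty(emb, modules))            # add to the task loss with a weight
```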
Real-Time Non-Invasive Imaging and Detection of Spreading Depolarizations through EEG: An Ultra-Light Explainable Deep Learning Approach
IEEE J Biomed Health Inform. 2024 Feb 27;PP. doi: 10.1109/JBHI.2024.3370502. Online ahead of print.
ABSTRACT
A core aim of neurocritical care is to prevent secondary brain injury. Spreading depolarizations (SDs) have been identified as an important independent cause of secondary brain injury. SDs are usually detected using invasive electrocorticography recorded at a high sampling frequency. Recent pilot studies suggest a possible utility of scalp-electrode electroencephalography (EEG) for non-invasive SD detection. However, the noise and attenuation of EEG signals make this detection task extremely challenging. Previous methods focus on detecting temporal power changes of EEG over a fixed high-density map of scalp electrodes, which is not always clinically feasible. By using a specialized spectrogram as the input to the automatic SD detection model, this study is the first to transform SD identification from a detection task on a 1-D time-series wave into a task on sequential 2-D rendered images. This study presents a novel, ultra-lightweight, multi-modal deep-learning network that fuses EEG spectrogram images and temporal power vectors to enhance SD identification accuracy over each single electrode, allowing flexible EEG maps and paving the way for SD detection on ultra-low-density EEG with variable electrode positioning. Our proposed model has an ultra-fast processing speed (<0.3 s). Compared with conventional methods (2 hours), this is a major advance toward early SD detection and instant brain injury prognosis. Viewing SDs through an additional dimension, frequency on the spectrogram, we demonstrated that this extra dimension can improve SD detection accuracy, providing preliminary evidence to support the hypothesis that SDs may show implicit features in the frequency profile.
PMID:38412076 | DOI:10.1109/JBHI.2024.3370502
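The spectrogram input described above can be produced with standard tooling. The sketch below converts one EEG channel into a 2-D time-frequency image; the sampling rate and window length are illustrative (long windows are chosen here simply to resolve low frequencies), not the study's parameters.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 256                                        # Hz, assumed sampling rate
t = np.arange(0, 600, 1 / fs)                   # 10 minutes of one EEG channel
eeg = np.random.randn(t.size)                   # stand-in for a real recording

# 30-second windows with 50% overlap -> one 2-D image (frequency x time) per electrode
f, times, Sxx = spectrogram(eeg, fs=fs, nperseg=fs * 30, noverlap=fs * 15)
power_image = 10 * np.log10(Sxx + 1e-12)        # dB-scaled spectrogram image
print(power_image.shape)
```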
Revealing the Denoising Principle of Zero-Shot N2N-Based Algorithm from 1D Spectrum to 2D Image
Anal Chem. 2024 Feb 27. doi: 10.1021/acs.analchem.3c04608. Online ahead of print.
ABSTRACT
Denoising is a necessary step in image analysis to extract weak signals, especially those hardly identifiable by the naked eye. Unlike data-driven deep-learning denoising algorithms that rely on a clean image as the reference, Noise2Noise (N2N) can denoise a noisy image provided sufficiently many noisy images of the same subject with randomly distributed noise are available. Further, by introducing data augmentation to create a big dataset and regularization to prevent model overfitting, zero-shot N2N-based denoising was proposed, in which only a single noisy image is needed. Although various N2N-based denoising algorithms have been developed with high performance, their complicated black-box operation has prevented lightweight implementations. Therefore, to reveal the working function of the zero-shot N2N-based algorithm, we proposed a lightweight Peak2Peak (P2P) algorithm and qualitatively and quantitatively analyzed its denoising behavior on 1D spectra and 2D images. We found that the high-performance denoising originates from the trade-off between the loss function and regularization in the denoising module, where regularization is the switch for denoising. Meanwhile, the signal extraction mainly derives from the self-supervised characteristic learning in the data augmentation module. Further, the lightweight P2P improved the denoising speed by at least ten times with little performance loss compared with current N2N-based algorithms. In general, the visualization of P2P provides a reference for revealing the working function of zero-shot N2N-based algorithms, which would pave the way for applying these algorithms to real-time (in situ, in vivo, and operando) research, improving both temporal and spatial resolution. P2P is open source at https://github.com/3331822w/Peak2Peak and will be accessible online at https://ramancloud.xmu.edu.cn/tutorial.
PMID:38412039 | DOI:10.1021/acs.analchem.3c04608
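To illustrate the zero-shot N2N principle analyzed here, the sketch below builds two half-resolution noisy "siblings" from a single noisy image via a published pair-downsampling trick; a small network trained to map one sibling to the other (plus a consistency or regularization term) then serves as the denoiser. This is a generic illustration of the zero-shot idea, not the P2P algorithm itself.

```python
import numpy as np

def noisy_pair(img):
    """img: (H, W) with even H, W. Average the two diagonals of each 2x2 block,
    producing two downsampled images that share the signal but not the noise."""
    a = img[0::2, 0::2]; b = img[1::2, 1::2]     # one diagonal of each block
    c = img[0::2, 1::2]; d = img[1::2, 0::2]     # the other diagonal
    return (a + b) / 2, (c + d) / 2

clean = np.outer(np.sin(np.linspace(0, 3, 128)), np.cos(np.linspace(0, 3, 128)))
noisy = clean + 0.3 * np.random.randn(128, 128)
y1, y2 = noisy_pair(noisy)
# A small network f is then fit to minimize ||f(y1) - y2||^2 and applied to the
# full-resolution noisy image; no clean reference is ever required.
print(y1.shape, y2.shape, float(np.mean((y1 - y2) ** 2)))
```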
Automated Machine Learning versus Expert-Designed Models in Ocular Toxoplasmosis: Detection and Lesion Localization Using Fundus Images
Ocul Immunol Inflamm. 2024 Feb 27:1-7. doi: 10.1080/09273948.2024.2319281. Online ahead of print.
ABSTRACT
PURPOSE: Automated machine learning (AutoML) allows clinicians without coding experience to build their own deep learning (DL) models. This study assesses the performance of AutoML in detecting and localizing ocular toxoplasmosis (OT) lesions in fundus images and compares it to expert-designed models.
METHODS: Ophthalmology trainees without coding experience designed AutoML models using 304 labelled fundus images. We designed a binary model to differentiate OT from normal and an object detection model to visually identify OT lesions.
RESULTS: The AutoML model had an area under the precision-recall curve (AuPRC) of 0.945, sensitivity of 100%, specificity of 83% and accuracy of 93.5% (vs. 94%, 86% and 91% for the bespoke models). The AutoML object detection model had an AuPRC of 0.600 with a precision of 93.3% and recall of 56%. Using a diversified external validation dataset, our model correctly labeled 15 normal fundus images (100%) and 15 OT fundus images (100%), with a mean confidence score of 0.965 and 0.963, respectively.
CONCLUSION: AutoML models created by ophthalmologists without coding experience were comparable or better than expert-designed bespoke models trained on the same dataset. By creatively using AutoML to identify OT lesions on fundus images, our approach brings the whole spectrum of DL model design into the hands of clinicians.
PMID:38411944 | DOI:10.1080/09273948.2024.2319281