Deep learning
Discrimination of multiple sclerosis using scanning laser ophthalmoscopy images with autoencoder-based feature extraction
Mult Scler Relat Disord. 2024 Jun 21;88:105743. doi: 10.1016/j.msard.2024.105743. Online ahead of print.
ABSTRACT
OBJECTIVE: Optical coherence tomography (OCT) investigations have revealed that the thickness of inner retinal layers is decreased in multiple sclerosis (MS) patients compared to healthy control (HC) individuals. To date, a number of studies have applied machine learning to OCT thickness measurements, aiming to enable accurate and automated diagnosis of the disease. However, there has been much less emphasis on other, less common retinal imaging modalities, such as infrared scanning laser ophthalmoscopy (IR-SLO), for classifying MS. IR-SLO uses laser light to capture high-resolution fundus images and is often performed in conjunction with OCT to lock B-scans at a fixed position.
METHODS: We incorporated two independent datasets of IR-SLO images from the Isfahan and Johns Hopkins centers, consisting of 164 MS and 150 HC images. A subject-wise data-splitting approach was employed to ensure that there was no leakage between the training and test datasets. Several state-of-the-art convolutional neural networks (CNNs), including VGG-16, VGG-19, ResNet-50, and InceptionV3, as well as a CNN with a custom architecture, were employed. In the next step, we designed a convolutional autoencoder (CAE) to extract semantic features that were subsequently given as inputs to four conventional machine learning (ML) classifiers: support vector machine (SVM), k-nearest neighbor (K-NN), random forest (RF), and multi-layer perceptron (MLP).
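For readers who want to see the shape of such a pipeline, the following minimal Python sketch (PyTorch plus scikit-learn; all shapes, layer sizes, and hyperparameters are illustrative assumptions, not the authors' code) shows a subject-wise split, convolutional-autoencoder feature extraction, and an MLP classifier:

    # Hypothetical sketch: subject-wise split + CAE features + MLP classifier
    import numpy as np
    import torch
    import torch.nn as nn
    from sklearn.model_selection import GroupShuffleSplit
    from sklearn.neural_network import MLPClassifier

    # Dummy data: (n_images, 1, 128, 128) IR-SLO images, labels, and subject IDs
    X = np.random.rand(314, 1, 128, 128).astype("float32")
    y = np.random.randint(0, 2, 314)
    groups = np.random.randint(0, 100, 314)   # one ID per subject prevents leakage

    # Subject-wise split: no subject appears in both train and test sets
    train_idx, test_idx = next(GroupShuffleSplit(test_size=0.2, random_state=0)
                               .split(X, y, groups))

    class CAE(nn.Module):
        def __init__(self):
            super().__init__()
            self.enc = nn.Sequential(
                nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
            self.dec = nn.Sequential(
                nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
                nn.ConvTranspose2d(16, 1, 2, stride=2), nn.Sigmoid())
        def forward(self, x):
            z = self.enc(x)
            return self.dec(z), z

    cae = CAE()
    opt = torch.optim.Adam(cae.parameters(), lr=1e-3)
    xb = torch.from_numpy(X[train_idx])
    for _ in range(5):                          # reconstruction training, shortened
        recon, _ = cae(xb)
        loss = nn.functional.mse_loss(recon, xb)
        opt.zero_grad(); loss.backward(); opt.step()

    with torch.no_grad():                       # bottleneck activations as features
        feats = cae(torch.from_numpy(X))[1].flatten(1).numpy()
    clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
    clf.fit(feats[train_idx], y[train_idx])
    print("test accuracy:", clf.score(feats[test_idx], y[test_idx]))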
RESULTS: The custom CNN (85 % accuracy, 85 % sensitivity, 87 % specificity, 93 % area under the receiver operating characteristic curve [AUROC], and 94 % area under the precision-recall curve [AUPRC]) outperformed the state-of-the-art models (84 % accuracy, 83 % sensitivity, 87 % specificity, 92 % AUROC, and 94 % AUPRC); however, the combination of the CAE and MLP yielded even better results (88 % accuracy, 86 % sensitivity, 91 % specificity, 94 % AUROC, and 95 % AUPRC).
CONCLUSIONS: We utilized IR-SLO images to differentiate between MS and HC eyes, with promising results achieved using a combination of CAE and MLP. Future multi-center studies involving more heterogeneous data are necessary to assess the feasibility of integrating IR-SLO images into routine clinical practice.
PMID:38945032 | DOI:10.1016/j.msard.2024.105743
Multi-level structural damage characterization using sparse acoustic sensor networks and knowledge transferred deep learning
Ultrasonics. 2024 Jun 22;142:107390. doi: 10.1016/j.ultras.2024.107390. Online ahead of print.
ABSTRACT
Standard structural health monitoring techniques face well-known difficulties in achieving comprehensive defect diagnosis in real-world structures with structural, material, or geometric complexity. This motivates the exploration of machine-learning-based structural health monitoring methods in complex structures. However, creating sufficient training datasets with various defects is an ongoing challenge for data-driven machine (deep) learning algorithms. The ability to transfer the knowledge of a trained neural network from one component to another, or to other sections of the same component, would drastically reduce the required training dataset. It would also facilitate computationally inexpensive machine-learning-based inspection systems. In this work, a machine-learning-based multi-level damage characterization is demonstrated with the ability to transfer trained knowledge within the sparse sensor network. A novel network spatial assistance and an adaptive convolution technique are proposed for efficient knowledge transfer within the deep learning algorithm. The proposed structural health monitoring method is experimentally evaluated on an aluminum plate with artificially induced defects. It was observed that the method improves the performance of knowledge-transferred damage characterization by 50 % during localization and 24 % during severity assessment. Further, experiments using time windows with and without multiple edge reflections are studied. The results reveal that multiply scattered waves contain rich and deterministic defect signatures that can be mined using deep learning neural networks, improving the accuracy of both identification and quantification. In the case of a fixed sensor network, using multiply scattered waves shows 100 % prediction accuracy at all levels of damage characterization.
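The knowledge-transfer idea, reusing a network trained on one region of a component for another, is in the spirit of standard fine-tuning. A minimal, purely illustrative PyTorch sketch (architecture and class count are assumptions, not the authors' method) is:

    # Hypothetical fine-tuning sketch: reuse a trained backbone, retrain only the head
    import torch
    import torch.nn as nn

    backbone = nn.Sequential(              # assume trained on source sensor paths
        nn.Conv1d(1, 16, 7, padding=3), nn.ReLU(),
        nn.AdaptiveAvgPool1d(32), nn.Flatten())
    head = nn.Linear(16 * 32, 4)           # e.g., 4 damage-severity classes (assumed)

    for p in backbone.parameters():        # freeze the transferred knowledge
        p.requires_grad = False
    opt = torch.optim.Adam(head.parameters(), lr=1e-3)

    x = torch.randn(8, 1, 1024)            # dummy guided-wave signals, target region
    y = torch.randint(0, 4, (8,))
    loss = nn.functional.cross_entropy(head(backbone(x)), y)
    opt.zero_grad(); loss.backward(); opt.step()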
PMID:38945018 | DOI:10.1016/j.ultras.2024.107390
PepExplainer: An explainable deep learning model for selection-based macrocyclic peptide bioactivity prediction and optimization
Eur J Med Chem. 2024 Jun 25;275:116628. doi: 10.1016/j.ejmech.2024.116628. Online ahead of print.
ABSTRACT
Macrocyclic peptides possess unique features, making them highly promising as a drug modality. However, evaluating their bioactivity through wet lab experiments is generally resource-intensive and time-consuming. Despite advancements in artificial intelligence (AI) for bioactivity prediction, challenges remain due to limited data availability and interpretability issues in deep learning models, often leading to less-than-ideal predictions. To address these challenges, we developed PepExplainer, an explainable graph neural network based on substructure mask explanation (SME). This model excels at deciphering amino acid substructures, translating macrocyclic peptides into detailed molecular graphs at the atomic level, and efficiently handling non-canonical amino acids and complex macrocyclic peptide structures. PepExplainer's effectiveness is enhanced by utilizing the correlation between peptide enrichment data from a selection-based focused library and bioactivity data, and by employing transfer learning to improve bioactivity predictions of macrocyclic peptides against the IL-17C/IL-17RE interaction. Additionally, PepExplainer underwent further validation for bioactivity prediction using a set of thirteen newly synthesized macrocyclic peptides. Moreover, it enabled the optimization of the IC50 of a macrocyclic peptide, reducing it from 15 nM to 5.6 nM based on the contribution score provided by PepExplainer. This achievement underscores PepExplainer's ability to decipher complex molecular patterns, highlighting its potential to accelerate the discovery and optimization of macrocyclic peptides.
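Substructure mask explanation scores a fragment by how much the prediction changes when that fragment is masked out. The toy Python sketch below illustrates only this masking idea; the stand-in predictor and feature layout are assumptions, not PepExplainer's implementation:

    # Hypothetical substructure-mask attribution: score_i = f(x) - f(x with part i masked)
    import numpy as np

    def predict(features):                  # stand-in for a trained GNN's bioactivity head
        w = np.linspace(0.1, 1.0, features.size)
        return float(w @ features)

    x = np.random.rand(10)                  # pooled per-substructure features (assumed)
    baseline = predict(x)
    contributions = []
    for i in range(x.size):                 # mask one amino-acid substructure at a time
        masked = x.copy()
        masked[i] = 0.0
        contributions.append(baseline - predict(masked))
    print("most influential substructure:", int(np.argmax(contributions)))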
PMID:38944933 | DOI:10.1016/j.ejmech.2024.116628
Automatic quantification of scapular and glenoid morphology from CT scans using deep learning
Eur J Radiol. 2024 Jun 25;177:111588. doi: 10.1016/j.ejrad.2024.111588. Online ahead of print.
ABSTRACT
OBJECTIVES: To develop and validate an open-source deep learning model for automatically quantifying scapular and glenoid morphology using CT images of normal subjects and patients with glenohumeral osteoarthritis.
MATERIALS AND METHODS: First, we used deep learning to segment the scapula from CT images and then to identify the location of 13 landmarks on the scapula, 9 of them to establish a coordinate system unaffected by osteoarthritis-related changes, and the remaining 4 landmarks on the glenoid cavity to determine the glenoid size and orientation in this scapular coordinate system. The glenoid version, glenoid inclination, critical shoulder angle, glenopolar angle, glenoid height, and glenoid width were subsequently measured in this coordinate system. A 5-fold cross-validation was performed to evaluate the performance of this approach on 60 normal/non-osteoarthritic and 56 pathological/osteoarthritic scapulae.
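As a purely illustrative example of such landmark-based measurement (the axis definitions below are simplifying assumptions, not the paper's exact definitions), an angle can be derived from landmark coordinates with basic vector algebra:

    # Hypothetical sketch: an angle of the glenoid axis in a landmark-defined frame
    import numpy as np

    def unit(v):
        return v / np.linalg.norm(v)

    # Three scapular landmarks (dummy coordinates) define the coordinate system
    trigonum = np.array([0.0, 0.0, 0.0])
    inf_angle = np.array([0.0, -10.0, 0.0])
    acromion = np.array([12.0, 2.0, 3.0])
    z = unit(inf_angle - trigonum)                 # long axis of the scapula
    x = unit(np.cross(z, acromion - trigonum))     # normal to the scapular plane
    y = np.cross(z, x)

    # Two glenoid landmarks (dummy) define the supero-inferior glenoid axis
    sup_glenoid = np.array([10.0, 1.0, 1.0])
    inf_glenoid = np.array([10.0, -3.0, 2.0])
    g = unit(sup_glenoid - inf_glenoid)
    angle = np.degrees(np.arctan2(g @ y, g @ z))   # illustrative inclination-like angle
    print(round(angle, 1), "deg")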
RESULTS: The Dice similarity coefficient between manual and automatic scapular segmentations exceeded 0.97 in both normal and pathological cases. The average error in automatic scapular and glenoid landmark positioning ranged between 1 and 2.5 mm and was comparable between the automatic method and human raters. The automatic method provided acceptable estimates of glenoid version (R2 = 0.95), glenoid inclination (R2 = 0.93), critical shoulder angle (R2 = 0.95), glenopolar angle (R2 = 0.90), glenoid height (R2 = 0.88) and width (R2 = 0.94). However, a significant difference was found for glenoid inclination between manual and automatic measurements (p < 0.001).
CONCLUSIONS: This open-source deep learning model enables the automatic quantification of scapular and glenoid morphology from CT scans of patients with glenohumeral osteoarthritis, with sufficient accuracy for clinical use.
PMID:38944907 | DOI:10.1016/j.ejrad.2024.111588
VmmScore: An umami peptide prediction and receptor matching program based on a deep learning approach
Comput Biol Med. 2024 Jun 29;179:108814. doi: 10.1016/j.compbiomed.2024.108814. Online ahead of print.
ABSTRACT
Peptides, with recognized physiological and medical implications, such as the ability to lower blood pressure and lipid levels, are central to our research on umami taste perception. This study introduces a computational strategy to tackle the challenge of identifying optimal umami receptors for these peptides. Our VmmScore algorithm includes two integral components: Mlp4Umami, a predictive module that evaluates the umami taste potential of peptides, and mm-Score, which enhances the receptor matching process through a machine learning-optimized molecular docking and scoring system. This system encompasses the optimization of docking structures, clustering of umami peptides, and a comparative analysis of docking energies across peptide clusters, streamlining the receptor identification process. Employing machine learning, our method offers a strategic approach to the intricate task of umami receptor determination. We undertook virtual screening of peptides derived from Lateolabrax japonicus, experimentally verifying the umami taste of three identified peptides and determining their corresponding receptors. This work not only advances our understanding of the mechanisms behind umami taste perception but also provides a rapid and cost-effective method for peptide screening. The source code is publicly accessible at https://github.com/heyigacu/mlp4umami/, encouraging further scientific exploration and collaborative efforts within the research community.
PMID:38944902 | DOI:10.1016/j.compbiomed.2024.108814
Advanced Techniques for MR Neuroimaging
Magn Reson Med Sci. 2024;23(3):249-251. doi: 10.2463/mrms.e.2024-1000.
ABSTRACT
This special issue of Magnetic Resonance in Medical Sciences is dedicated to "Advanced Techniques for MR Neuroimaging," featuring nine review articles authored by leading experts. The reviews cover advancements in reproducible research practices, diffusion tensor imaging along the perivascular space, myelin imaging using magnetic susceptibility source separation, spinal cord quantitative MRI analysis, tractometry of visual white matter pathways, deep learning-based image enhancement, arterial spin labeling, the potential of radiomics, and MRI-based quantification of brain oxygen metabolism. These articles provide a comprehensive update on cutting-edge technologies and their applications in clinical and research settings, highlighting their impact on improving diagnostic accuracy and understanding of neurological disorders.
PMID:38945942 | DOI:10.2463/mrms.e.2024-1000
Direct Positron Emission Imaging Using Ultrafast Timing Performance Detectors
Igaku Butsuri. 2024;44(2):29-35. doi: 10.11323/jjmp.44.2_29.
ABSTRACT
This is an explanatory paper on Sun Il Kwon et al., Nat. Photon. 15: 914-918, 2021, and some parts of this manuscript are translated from that paper. Medical imaging modalities such as X-ray computed tomography, magnetic resonance imaging, positron emission tomography (PET), and single photon emission computed tomography require image reconstruction processes, consequently constraining them to cylindrical geometries. However, among them, only PET can use additional information, so-called time-of-flight information, on an event-by-event basis. If the coincidence time resolution (CTR) of PET detectors were improved to 30 ps, which corresponds to a spatial resolution of 4.5 mm, the electron-positron annihilation point could be localized directly, allowing us to circumvent image reconstruction and freeing us from the geometric constraint. We call this concept direct positron emission imaging (dPEI). We have developed ultrafast radiation detectors by focusing on Cherenkov photon detection. Furthermore, a CTR of 32 ps, equivalent to a spatial resolution of 4.8 mm, was achieved by combining deep learning-based signal processing with the detectors. In this article, we explain how we developed the detectors and demonstrated the first dPEI using different types of phantoms, the limitations that must be addressed to make dPEI more practical, and how dPEI may emerge as an imaging modality in nuclear medicine.
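The quoted equivalences between timing and spatial resolution follow from the position uncertainty along the line of response, Δx = c·Δt/2, which can be checked in a few lines of Python:

    # Position uncertainty along the line of response: dx = c * CTR / 2
    c = 2.998e8                                   # speed of light, m/s
    for ctr_ps in (30, 32):
        dx_mm = c * ctr_ps * 1e-12 / 2 * 1e3
        print(ctr_ps, "ps ->", round(dx_mm, 1), "mm")   # 30 ps -> 4.5 mm, 32 ps -> 4.8 mm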
PMID:38945880 | DOI:10.11323/jjmp.44.2_29
DeepFace: Deep-learning-based framework to contextualize orofacial-cleft-related variants during human embryonic craniofacial development
HGG Adv. 2024 Jun 29;5(3):100322. doi: 10.1016/j.xhgg.2024.100322. Online ahead of print.
NO ABSTRACT
PMID:38944832 | DOI:10.1016/j.xhgg.2024.100322
Low muscle quality on a procedural computed tomography scan assessed with deep learning as a practical useful predictor of mortality in patients with severe aortic valve stenosis
Clin Nutr ESPEN. 2024 Jun 17;63:142-147. doi: 10.1016/j.clnesp.2024.06.013. Online ahead of print.
ABSTRACT
BACKGROUND & AIMS: Accurate diagnosis of sarcopenia requires evaluation of muscle quality, which refers to the amount of fat infiltration in muscle tissue. In this study, we aim to investigate whether we can independently predict mortality risk in transcatheter aortic valve implantation (TAVI) patients, using automatic deep learning algorithms to assess muscle quality on procedural computed tomography (CT) scans.
METHODS: This study included 1199 patients with severe aortic stenosis who underwent TAVI between January 2010 and January 2020. A procedural CT scan was performed as part of the preprocedural TAVI evaluation, and the scans were analyzed using deep-learning-based software to automatically determine skeletal muscle density (SMD) and intermuscular adipose tissue (IMAT). The associations of SMD and IMAT with all-cause mortality were analyzed using a Cox regression model, adjusted for other known mortality predictors, including muscle mass.
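A minimal sketch of this kind of survival analysis, using the lifelines library on dummy data (column names, values, and the covariate set are assumptions, not the study's dataset):

    # Hypothetical Cox model: mortality vs lowest SMD tertile, adjusted for covariates
    import pandas as pd
    from lifelines import CoxPHFitter

    df = pd.DataFrame({
        "time_days": [1084, 420, 900, 300, 1500, 60, 700, 1100],
        "died": [0, 1, 1, 0, 0, 1, 1, 0],
        "smd_lowest_tertile": [0, 1, 0, 1, 0, 1, 1, 0],
        "age": [78, 84, 80, 86, 75, 88, 81, 77],
        "muscle_mass": [42.0, 35.5, 40.1, 33.2, 45.3, 30.8, 38.0, 43.5],
    })
    cph = CoxPHFitter()
    cph.fit(df, duration_col="time_days", event_col="died")
    cph.print_summary()                     # hazard ratios with 95% confidence intervals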
RESULTS: The mean age of the participants was 80 ± 7 years, 53% were female. The median observation time was 1084 days, and the overall mortality rate was 39%. We found that the lowest tertile of muscle quality, as determined by SMD, was associated with an increased risk of mortality (HR 1.40 [95%CI: 1.15-1.70], p < 0.01). Similarly, low muscle quality as defined by high IMAT in the lowest tertile was also associated with increased mortality risk (HR 1.24 [95%CI: 1.01-1.52], p = 0.04).
CONCLUSIONS: Our findings suggest that deep learning-assessed low muscle quality, as indicated by fat infiltration in muscle tissue, is a practical, useful and independent predictor of mortality after TAVI.
PMID:38944828 | DOI:10.1016/j.clnesp.2024.06.013
A novel artificial intelligence model for measuring fetal intracranial markers during the first trimester based on two-dimensional ultrasound image
Int J Gynaecol Obstet. 2024 Jun 30. doi: 10.1002/ijgo.15762. Online ahead of print.
ABSTRACT
OBJECTIVE: To establish reference ranges of fetal intracranial markers during the first trimester and develop the first novel artificial intelligence (AI) model to measure key markers automatically.
METHODS: This retrospective study used two-dimensional (2D) ultrasound images from 4233 singleton normal fetuses scanned at 11+0-13+6 weeks of gestation at the Affiliated Suzhou Hospital of Nanjing Medical University from January 2018 to July 2022. We analyzed 10 key markers in three important planes of the fetal head. Based on these, reference ranges of 10 fetal intracranial markers were established and an AI model was developed for automated marker measurement. AI and manual measurements were compared to evaluate differences, correlations, consistency, and time consumption based on mean error, Pearson correlation analysis, intraclass correlation coefficients (ICCs), and average measurement time.
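As an illustration of the agreement statistics mentioned here, Pearson's r and ICCs can be computed with scipy and pingouin; the values below are dummy data, not the study's measurements:

    # Hypothetical agreement check between AI and manual measurements
    import pandas as pd
    import pingouin as pg
    from scipy.stats import pearsonr

    ai = [2.10, 3.42, 1.98, 4.05, 2.77]          # marker values in mm (dummy)
    manual = [2.04, 3.55, 2.10, 3.93, 2.65]
    r, p = pearsonr(ai, manual)

    long = pd.DataFrame({
        "fetus": list(range(5)) * 2,
        "rater": ["AI"] * 5 + ["manual"] * 5,
        "value": ai + manual,
    })
    icc = pg.intraclass_corr(data=long, targets="fetus", raters="rater", ratings="value")
    print(round(r, 3), round(p, 4))
    print(icc[["Type", "ICC"]])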
RESULTS: The AI and manual measurements showed strong consistency and correlation (all ICC values >0.75, all r values >0.75, and all P values <0.001). The average absolute error ranged from only 0.124 to 0.178 mm. The AI achieved a 100% detection rate for abnormal cases. Additionally, the average measurement time of the AI was only 0.49 s, more than 65 times faster than manual measurement.
CONCLUSION: This study is the first to establish normal reference ranges of fetal intracranial markers based on a large Chinese population dataset. Furthermore, the proposed AI model demonstrated its capability to measure multiple fetal intracranial markers automatically, serving as a highly effective tool to streamline sonographer tasks and mitigate manual measurement errors, and it can be generalized to first-trimester scanning.
PMID:38944698 | DOI:10.1002/ijgo.15762
A Comparison of CT-Based Pancreatic Segmentation Deep Learning Models
Acad Radiol. 2024 Jun 28:S1076-6332(24)00373-8. doi: 10.1016/j.acra.2024.06.015. Online ahead of print.
ABSTRACT
RATIONALE AND OBJECTIVES: Pancreas segmentation accuracy at CT is critical for the identification of pancreatic pathologies and is essential for the development of imaging biomarkers. Our objective was to benchmark the performance of five high-performing pancreas segmentation models across multiple metrics stratified by scan and patient/pancreatic characteristics that may affect segmentation performance.
MATERIALS AND METHODS: In this retrospective study, PubMed and ArXiv searches were conducted to identify pancreas segmentation models which were then evaluated on a set of annotated imaging datasets. Results (Dice score, Hausdorff distance [HD], average surface distance [ASD]) were stratified by contrast status and quartiles of peri-pancreatic attenuation (5 mm region around pancreas). Multivariate regression was performed to identify imaging characteristics and biomarkers (n = 9) that were significantly associated with Dice score.
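For reference, two of these metrics can be computed from binary masks in a few lines of Python (dummy 2D masks; a real evaluation would work in 3D and convert voxel indices to millimetres using the scan spacing):

    # Hypothetical computation of Dice score and Hausdorff distance on binary masks
    import numpy as np
    from scipy.spatial.distance import directed_hausdorff

    pred = np.zeros((64, 64), bool); pred[20:40, 20:40] = True   # dummy prediction
    gt = np.zeros((64, 64), bool); gt[22:42, 18:38] = True       # dummy ground truth

    dice = 2 * np.logical_and(pred, gt).sum() / (pred.sum() + gt.sum())

    p_pts, g_pts = np.argwhere(pred), np.argwhere(gt)
    hd = max(directed_hausdorff(p_pts, g_pts)[0],
             directed_hausdorff(g_pts, p_pts)[0])                # symmetric Hausdorff
    print(round(float(dice), 3), round(hd, 2))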
RESULTS: Five pancreas segmentation models were identified: Abdomen Atlas [AAUNet, AASwin, trained on 8448 scans], TotalSegmentator [TS, 1204 scans], nnUNetv1 [MSD-nnUNet, 282 scans], and a U-Net based model for predicting diabetes [DM-UNet, 427 scans]. These were evaluated on 352 CT scans (30 females, 25 males, 297 sex unknown; age 58 ± 7 years [± 1 SD], 327 age unknown) from 2000-2023. Overall, TS, AAUNet, and AASwin were the best performers, with Dice scores of 80 ± 11%, 79 ± 16%, and 77 ± 18%, respectively (pairwise Šidák tests: not significantly different). AASwin and MSD-nnUNet performed worse (on all metrics) on non-contrast scans (vs contrast, P < .001). The worst performer was DM-UNet (Dice = 67 ± 16%). All algorithms except TS showed lower Dice scores with increasing peri-pancreatic attenuation (P < .01). Multivariate regression showed that non-contrast scans (P < .001; MSD-nnUNet), smaller pancreatic length (P = .005, MSD-nnUNet), and height (P = .003, DM-UNet) were associated with lower Dice scores.
CONCLUSION: The convolutional neural network-based models trained on a diverse set of scans performed best (TS, AAUNet, and AASwin). TS performed equivalently to AAUNet and AASwin with only 13% of the training set size (1204 vs 8448 scans). Though trained on the same dataset, the transformer network (AASwin) had poorer performance on non-contrast scans, whereas its convolutional counterpart (AAUNet) did not. This study highlights how the aggregate assessment metrics of pancreatic segmentation algorithms reported elsewhere in the literature are not enough to capture differential performance across common patient and scanning characteristics in clinical populations.
PMID:38944630 | DOI:10.1016/j.acra.2024.06.015
Optical Imaging for Diabetic Retinopathy Diagnosis And Detection Using Ensemble Models
Photodiagnosis Photodyn Ther. 2024 Jun 27:104259. doi: 10.1016/j.pdpdt.2024.104259. Online ahead of print.
ABSTRACT
Diabetes, characterized by elevated blood sugar levels, can lead to diabetic retinopathy (DR), which adversely impacts the eyes by damaging the retinal blood vessels. DR is thought to be the most common cause of blindness in people with diabetes, particularly among working-age individuals in developing nations. People with type 1 or type 2 diabetes may develop this condition, and the risk rises with the duration of diabetes and inadequate blood sugar management. Traditional approaches to the early identification of DR have limitations. To diagnose DR, this research applies a model based on a convolutional neural network (CNN) in a novel way. The proposed model uses several deep learning (DL) models, namely VGG19, ResNet50, and InceptionV3, to extract features. After concatenation, these features are passed to the CNN for classification. By combining the advantages of several models, ensemble approaches can be effective tools for detecting DR and can increase overall performance and resilience. Ensemble approaches such as the combination of VGG19, InceptionV3, and ResNet50 can achieve high accuracy on tasks such as classification and image recognition. The proposed model is evaluated on a publicly accessible collection of fundus images. VGG19, ResNet50, and InceptionV3 differ in their neural network architectures, feature extraction capabilities, object detection methods, and approaches to retinal delineation. VGG19 may excel at capturing fine details, ResNet50 at recognizing complex patterns, and InceptionV3 at efficiently capturing multi-scale features. Their combined use in an ensemble can provide a comprehensive analysis of retinal images, aiding the delineation of retinal regions and the identification of abnormalities associated with DR. For instance, microaneurysms, the earliest signs of DR, often require precise detection of subtle vascular abnormalities. VGG19's proficiency in capturing fine details allows identification of these minute changes in retinal morphology. ResNet50's strength lies in recognizing intricate patterns, making it effective at detecting neovascularization and complex hemorrhagic lesions. Meanwhile, InceptionV3's multi-scale feature extraction enables the comprehensive analysis crucial for assessing macular edema and ischemic changes across different retinal layers.
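A minimal Keras sketch of the described feature-concatenation ensemble (the input size, head width, and five-class output are assumptions, not the paper's configuration):

    # Hypothetical ensemble: concatenate pooled features from three frozen backbones
    from tensorflow.keras import layers, Model
    from tensorflow.keras.applications import VGG19, ResNet50, InceptionV3

    inp = layers.Input(shape=(299, 299, 3))
    feats = []
    for Backbone in (VGG19, ResNet50, InceptionV3):
        base = Backbone(weights="imagenet", include_top=False, input_tensor=inp)
        base.trainable = False                       # use as fixed feature extractors
        feats.append(layers.GlobalAveragePooling2D()(base.output))
    x = layers.Concatenate()(feats)                  # fused 4608-dim feature vector
    x = layers.Dense(256, activation="relu")(x)
    out = layers.Dense(5, activation="softmax")(x)   # e.g., 5 DR severity grades (assumed)
    model = Model(inp, out)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])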
PMID:38944405 | DOI:10.1016/j.pdpdt.2024.104259
Machine learning for the advancement of genome-scale metabolic modeling
Biotechnol Adv. 2024 Jun 27:108400. doi: 10.1016/j.biotechadv.2024.108400. Online ahead of print.
ABSTRACT
Constraint-based modeling (CBM) has evolved as the core systems biology tool to map the interrelations between genotype, phenotype, and external environment. The recent advancement of high-throughput experimental approaches and multi-omics strategies has generated a plethora of new and precise information from wide-ranging biological domains. On the other hand, the continuously growing field of machine learning (ML) and its specialized branch of deep learning (DL) provide essential computational architectures for decoding complex and heterogeneous biological data. In recent years, both multi-omics and ML have assisted in the escalation of CBM. Condition-specific omics data, such as transcriptomics and proteomics, helped contextualize the model prediction while analyzing a particular phenotypic signature. At the same time, the advanced ML tools have eased the model reconstruction and analysis to increase the accuracy and prediction power. However, the development of these multi-disciplinary methodological frameworks mainly occurs independently, which limits the concatenation of biological knowledge from different domains. Hence, we have reviewed the potential of integrating multi-disciplinary tools and strategies from various fields, such as synthetic biology, CBM, omics, and ML, to explore the biochemical phenomenon beyond the conventional biological dogma. How the integrative knowledge of these intersected domains has improved bioengineering and biomedical applications has also been highlighted. We categorically explained the conventional genome-scale metabolic model (GEM) reconstruction tools and their improvement strategies through ML paradigms. Further, the crucial role of ML and DL in omics data restructuring for GEM development has also been briefly discussed. Finally, the case-study-based assessment of the state-of-the-art method for improving biomedical and metabolic engineering strategies has been elaborated. Therefore, this review demonstrates how integrating experimental and in silico strategies can help map the ever-expanding knowledge of biological systems driven by condition-specific cellular information. This multiview approach will elevate the application of ML-based CBM in the biomedical and bioengineering fields for the betterment of society and the environment.
PMID:38944218 | DOI:10.1016/j.biotechadv.2024.108400
Measurement of the Acetabular Cup Orientation after Total Hip Arthroplasty Based on Three-Dimensional Reconstruction from a Single X-ray Image Using Generative Adversarial Networks
J Arthroplasty. 2024 Jun 27:S0883-5403(24)00680-6. doi: 10.1016/j.arth.2024.06.059. Online ahead of print.
ABSTRACT
BACKGROUND: The purpose of this study was to reconstruct three-dimensional (3D) computed tomography (CT) images from single anteroposterior (AP) postoperative total hip arthroplasty (THA) X-ray images using a deep learning algorithm known as generative adversarial networks (GANs) and to validate the accuracy of cup angle measurement on GAN-generated CT.
METHODS: We used two GAN-based models, CycleGAN and X2CT-GAN, to generate 3D CT images from X-ray images of 386 patients who underwent primary THAs using a cementless cup. The training dataset consisted of 522 CT images and 2,282 X-ray images. The image quality was validated using the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM). The cup anteversion and inclination measurements on the GAN-generated CT images were compared with the actual CT measurements. Statistical analyses of absolute measurement errors were performed using Mann-Whitney U tests and nonlinear regression analyses.
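The two image-quality metrics used here are available in scikit-image; a minimal sketch on dummy slices (not the study's data):

    # Hypothetical image-quality check between a real slice and a GAN-generated slice
    import numpy as np
    from skimage.metrics import peak_signal_noise_ratio, structural_similarity

    real = np.random.rand(128, 128).astype("float32")                 # dummy CT slice
    fake = np.clip(real + np.random.normal(0, 0.02, real.shape), 0, 1).astype("float32")

    psnr = peak_signal_noise_ratio(real, fake, data_range=1.0)
    ssim = structural_similarity(real, fake, data_range=1.0)
    print(round(psnr, 2), "dB,", round(ssim, 3))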
RESULTS: The study successfully achieved 3D reconstruction from single AP postoperative THA X-ray images using GANs, with excellent PSNR (37.40) and SSIM (0.74). The median absolute differences in radiographic anteversion (RA) and radiographic inclination (RI) were 3.45° and 3.25°, respectively. Absolute measurement errors tended to be larger in cases with cup malposition than in those with optimal cup orientation.
CONCLUSION: This study demonstrates the potential of GANs for 3D reconstruction from single AP postoperative THA X-ray images to evaluate cup orientation. Further investigation and refinement of this model are required to improve its performance.
PMID:38944061 | DOI:10.1016/j.arth.2024.06.059
Image-based deep learning model using DNA methylation data predicts the origin of cancer of unknown primary
Neoplasia. 2024 Jun 28;55:101021. doi: 10.1016/j.neo.2024.101021. Online ahead of print.
ABSTRACT
Cancer of unknown primary (CUP) is a rare type of metastatic cancer in which the origin of the tumor is unknown. Since the treatment strategy for patients with metastatic tumors depends on knowing the primary site, accurate identification of the origin is important. Here, we developed an image-based deep-learning model that utilizes a vision transformer algorithm for predicting the origin of CUP. Using a DNA methylation dataset of 8,233 primary tumors from The Cancer Genome Atlas (TCGA), we categorized 29 cancer types into 18 organ classes and extracted 2,312 differentially methylated CpG sites (DMCs) from the non-squamous cancer group and 420 DMCs from the squamous cell cancer group. Using these DMCs, we created organ-specific DNA methylation images and used them for model training and testing. Model performance was evaluated using 394 metastatic cancer samples from TCGA (TCGA-meta) and 995 samples (693 primary and 302 metastatic cancers) obtained from 20 independent external studies. We found that the DNA methylation images reveal distinct patterns based on the origin of the cancer. Our model achieved an overall accuracy of 96.95 % on the TCGA-meta dataset. On the external validation datasets, our classifier achieved overall accuracies of 96.39 % and 94.37 % for primary and metastatic tumors, respectively. Notably, the overall accuracies for both primary and metastatic samples of non-squamous cell cancer were exceptionally high, at 96.79 % and 96.85 %, respectively.
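A minimal sketch of how per-sample CpG values can be arranged into such a methylation image (the square grid and zero padding are assumptions; the abstract does not specify the layout):

    # Hypothetical sketch: arrange per-sample CpG beta values into a "methylation image"
    import numpy as np

    n_cpg = 2312                              # DMCs for the non-squamous group (from abstract)
    side = int(np.ceil(np.sqrt(n_cpg)))       # 49 x 49 grid
    betas = np.random.rand(n_cpg)             # dummy beta values in [0, 1]
    img = np.zeros(side * side, dtype="float32")
    img[:n_cpg] = betas                       # pad the remainder with zeros
    img = img.reshape(side, side)             # 2D input for the vision transformer
    print(img.shape)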
PMID:38943996 | DOI:10.1016/j.neo.2024.101021
A topological description of loss surfaces based on Betti Numbers
Neural Netw. 2024 Jun 14;178:106465. doi: 10.1016/j.neunet.2024.106465. Online ahead of print.
ABSTRACT
In the context of deep learning models, attention has recently been paid to studying the surface of the loss function in order to better understand training with gradient-descent-based methods. This search for an appropriate description, both analytical and topological, has led to numerous efforts to identify spurious minima and characterize gradient dynamics. Our work aims to contribute to this field by providing a topological measure for evaluating loss complexity in the case of multilayer neural networks. We compare deep and shallow architectures with common sigmoidal activation functions by deriving upper and lower bounds for the complexity of their respective loss functions and revealing how that complexity is influenced by the number of hidden units, training models, and the activation function used. Additionally, we found that certain variations in the loss function or model architecture, such as adding an ℓ2 regularization term or implementing skip connections in a feedforward network, do not affect loss topology in specific cases.
PMID:38943863 | DOI:10.1016/j.neunet.2024.106465
Leveraging temporal dependency for cross-subject-MI BCIs by contrastive learning and self-attention
Neural Netw. 2024 Jun 17;178:106470. doi: 10.1016/j.neunet.2024.106470. Online ahead of print.
ABSTRACT
Brain-computer interfaces (BCIs) built on the motor imagery (MI) paradigm have found extensive use in motor rehabilitation and the control of assistive applications. However, traditional MI-BCI systems often exhibit suboptimal classification performance and require significant time for new users to collect subject-specific training data. This limitation diminishes the user-friendliness of BCIs and presents significant challenges in developing effective subject-independent models. In response to these challenges, we propose a novel subject-independent framework that learns temporal dependency for motor imagery BCIs by Contrastive Learning and Self-attention (CLS). In the CLS model, we incorporate a self-attention mechanism and supervised contrastive learning into a deep neural network to extract important information from electroencephalography (EEG) signals as features. We evaluate the CLS model on two large public datasets encompassing numerous subjects under a subject-independent experimental condition. The results demonstrate that CLS outperforms six baseline algorithms, achieving mean classification accuracy improvements of 1.3 % and 4.71 % over the best baseline algorithm on the Giga and OpenBMI datasets, respectively. Our findings demonstrate that CLS can effectively learn invariant discriminative features from training data obtained from non-target subjects, showcasing its potential for building models for new users without the need for calibration.
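For orientation, a compact PyTorch implementation of a supervised contrastive loss in the style of Khosla et al. (2020) is sketched below; the feature dimensions and labels are dummies, and this is not necessarily the exact loss used in CLS:

    # Hypothetical supervised contrastive loss, minimal form
    import torch
    import torch.nn.functional as F

    def supcon_loss(z, labels, tau=0.1):
        z = F.normalize(z, dim=1)                       # project features to unit sphere
        sim = z @ z.T / tau                             # pairwise scaled similarities
        mask_self = torch.eye(len(z), dtype=torch.bool)
        sim.masked_fill_(mask_self, -1e9)               # exclude self-pairs
        log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
        pos = (labels[:, None] == labels[None, :]) & ~mask_self
        return -(log_prob * pos).sum(1).div(pos.sum(1).clamp(min=1)).mean()

    z = torch.randn(16, 64)                             # dummy EEG encoder features
    labels = torch.randint(0, 2, (16,))                 # e.g., left/right-hand MI classes
    print(supcon_loss(z, labels).item())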
PMID:38943861 | DOI:10.1016/j.neunet.2024.106470
Reducing annotating load: Active learning with synthetic images in surgical instrument segmentation
Med Image Anal. 2024 Jun 22;97:103246. doi: 10.1016/j.media.2024.103246. Online ahead of print.
ABSTRACT
Accurate instrument segmentation in the endoscopic vision of minimally invasive surgery is challenging due to complex instruments and environments. Deep learning techniques have shown competitive performance in recent years. However, deep learning usually requires a large amount of labeled data to achieve accurate prediction, which poses a significant workload. To alleviate this workload, we propose an active learning-based framework to generate synthetic images for efficient neural network training. In each active learning iteration, a small number of informative unlabeled images are first queried by active learning and manually labeled. Next, synthetic images are generated based on these selected images. The instruments and backgrounds are cropped out and randomly combined with blending and fusion near the boundary. The proposed method leverages the advantage of both active learning and synthetic images. The effectiveness of the proposed method is validated on two sinus surgery datasets and one intraabdominal surgery dataset. The results indicate a considerable performance improvement, especially when the size of the annotated dataset is small. All the code is open-sourced at: https://github.com/HaonanPeng/active_syn_generator.
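A minimal sketch of the cut-and-blend step with a softened boundary (OpenCV/NumPy; shapes, colors, and kernel size are illustrative assumptions, not the authors' implementation):

    # Hypothetical sketch: paste an instrument crop onto a background with a soft alpha
    import cv2
    import numpy as np

    bg = np.full((256, 256, 3), 120, np.uint8)                  # dummy background frame
    inst = np.zeros_like(bg)                                    # dummy instrument layer
    cv2.rectangle(inst, (60, 100), (200, 140), (200, 200, 210), -1)
    mask = np.zeros((256, 256), np.float32)
    mask[100:140, 60:200] = 1.0

    alpha = cv2.GaussianBlur(mask, (21, 21), 0)[..., None]      # soften the boundary
    synth = (alpha * inst + (1 - alpha) * bg).astype(np.uint8)  # fuse near the edge
    seg_label = (mask > 0.5).astype(np.uint8)                   # label follows the crop
    print(synth.shape, int(seg_label.sum()))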
PMID:38943835 | DOI:10.1016/j.media.2024.103246
Quantification of litter in cities using a smartphone application and citizen science in conjunction with deep learning-based image processing
Waste Manag. 2024 Jun 28;186:271-279. doi: 10.1016/j.wasman.2024.06.026. Online ahead of print.
ABSTRACT
Cities are a major source of litter pollution. Determining the abundance and composition of plastic litter in cities is imperative for effective pollution management, environmental protection, and sustainable urban development. Therefore, a multidisciplinary approach to quantifying and classifying the abundance of litter in urban environments is proposed here. In the present study, litter data were collected via the Pirika smartphone application and analyzed using deep learning-based image processing. Pirika was launched in May 2018 and, to date, has collected approximately one million images. Visual classification revealed that the most common types of litter were cans, plastic bags, plastic bottles, cigarette butts, cigarette boxes, and sanitary masks, in that order. The top six categories accounted for approximately 80 % of the total, and the top three categories accounted for more than 60 % of the total imaged litter. A deep-learning image processing algorithm was developed to automatically identify the top six litter categories. Both the precision and recall of the model exceeded 75 %, enabling proper litter categorization. The quantity of litter derived from automated image processing was also plotted on a map using location data acquired concurrently with the images by the smartphone application. In conclusion, this study demonstrates that citizen science supported by smartphone applications and deep learning-based image processing can enable the visualization, quantification, and characterization of street litter in cities.
PMID:38943818 | DOI:10.1016/j.wasman.2024.06.026
Utilizing improved YOLOv8 based on SPD-BRSA-AFPN for ultrasonic phased array non-destructive testing
Ultrasonics. 2024 Jun 26;142:107382. doi: 10.1016/j.ultras.2024.107382. Online ahead of print.
ABSTRACT
Non-destructive testing (NDT) is a technique for inspecting materials and their defects without causing damage to the tested components. Phased array ultrasonic testing (PAUT) has emerged as a hot topic in industrial NDT applications. Currently, the collection of ultrasound data is mostly automated, while the analysis of the data is still predominantly carried out manually. Manual analysis of scan-image defects is inefficient and prone to instability, prompting the need for computer-based solutions. Deep learning-based object detection methods have recently shown promise in addressing such challenges. However, this approach typically demands a substantial amount of high-resolution, well-annotated training data, which is difficult to obtain in NDT. Consequently, it becomes difficult to detect defects in low-resolution images and defects of varying positions and sizes. This work proposes improvements based on the state-of-the-art YOLOv8 algorithm to enhance the accuracy and efficiency of defect detection in phased-array ultrasonic testing. Space-to-depth convolution (SPD-Conv) is introduced to replace strided convolution, mitigating information loss during convolution operations and improving detection performance on low-resolution images. Additionally, this paper constructs and incorporates a bi-level routing and spatial attention (BRSA) module into the backbone, generating multiscale feature maps with richer details. In the neck section, the original structure is replaced by the asymptotic feature pyramid network (AFPN) to reduce model parameters and computational complexity. In tests on public datasets, compared to YOLOv8 (the baseline), the algorithm achieves high-quality detection of flat bottom holes (FBH) and aluminium blocks on the simulated dataset. More importantly, for the challenging-to-detect side-drilled hole (SDH) defects, it achieves an F1 score (the weighted average of precision and recall) of 82.50% and an intersection over union (IOU) of 65.96%, improvements of 17.56% and 0.43%, respectively. On the experimental dataset, the F1 score and IOU for FBH reach 75.68% (an increase of 9.01%) and 83.79%, respectively. The proposed algorithm also demonstrates robust performance in the presence of external noise while maintaining exceptionally high computational efficiency and inference speed. These experimental results validate the high detection performance of the proposed intelligent defect detection algorithm for ultrasonic images, contributing to the advancement of smart industry.
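For concreteness, an SPD-Conv-style block, space-to-depth rearrangement followed by a non-strided convolution, can be sketched in PyTorch as follows (channel counts are illustrative; this is a generic sketch, not the paper's exact module):

    # Hypothetical SPD-Conv block: lossless downsampling instead of stride-2 convolution
    import torch
    import torch.nn as nn

    class SPDConv(nn.Module):
        def __init__(self, c_in, c_out, scale=2):
            super().__init__()
            self.unshuffle = nn.PixelUnshuffle(scale)   # HxW -> H/2 x W/2, channels x4
            self.conv = nn.Conv2d(c_in * scale * scale, c_out, 3, stride=1, padding=1)
        def forward(self, x):
            return self.conv(self.unshuffle(x))         # downsample without discarding pixels

    x = torch.randn(1, 64, 80, 80)                      # dummy feature map
    print(SPDConv(64, 128)(x).shape)                    # -> torch.Size([1, 128, 40, 40])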
PMID:38943732 | DOI:10.1016/j.ultras.2024.107382