Deep learning
Challenges and solutions of deep learning-based automated liver segmentation: A systematic review
Comput Biol Med. 2024 Dec 5;185:109459. doi: 10.1016/j.compbiomed.2024.109459. Online ahead of print.
ABSTRACT
The liver is one of the vital organs in the body. Precise liver segmentation in medical images is essential for liver disease treatment. The deep learning-based liver segmentation process faces several challenges. This research aims to analyze the challenges of liver segmentation in prior studies and identify the modifications made to network models and other enhancements implemented by researchers to tackle each challenge. In total, 88 articles from Scopus and ScienceDirect databases published between January 2016 and January 2022 have been studied. The liver segmentation challenges are classified into five main categories, each containing some subcategories. For each challenge, the proposed technique to overcome the challenge is investigated. The provided report details the authors, publication years, dataset types, imaging technologies, and evaluation metrics of all references for comparison. Additionally, a summary table outlines the challenges and solutions.
PMID:39642700 | DOI:10.1016/j.compbiomed.2024.109459
Learning soft tissue deformation from incremental simulations
Med Phys. 2024 Dec 6. doi: 10.1002/mp.17554. Online ahead of print.
ABSTRACT
BACKGROUND: Surgical planning for orthognathic procedures demands swift and accurate biomechanical modeling of facial soft tissues. Efficient simulations are vital in the clinical pipeline, as surgeons may iterate through multiple plans. Biomechanical simulations typically use the finite element method (FEM). Prior works divide FEM simulations into increments to enhance convergence and accuracy. However, this practice elongates simulation time, thereby impeding clinical integration. To accelerate simulations, deep learning (DL) models have been explored. Yet, previous efforts either perform simulations in a single step or neglect the temporal aspects in incremental simulations.
PURPOSE: This study investigates the use of spatiotemporal incremental modeling for biomechanics simulations of facial soft tissue.
METHODS: We implement the method using a graph neural network. Our method synergizes spatial features with temporal aggregation using DL networks trained on incremental FEM simulations from 17 subjects that underwent orthognathic surgery.
RESULTS: Our proposed spatiotemporal incremental method achieved a mean accuracy of 0.37 mm with a mean computation time of 1.52 s. In comparison, a spatial-only incremental method yielded a mean accuracy of 0.44 mm and a mean computation time of 1.60 s, while a spatial-only single-step method yielded a mean accuracy of 0.41 mm and a mean computation time of 0.05 s.
CONCLUSIONS: Statistical analysis demonstrated that the spatiotemporal incremental method reduced mean errors compared to the spatial-only incremental method, emphasizing the importance of incorporating temporal information in incremental simulations. Overall, we successfully implemented spatiotemporal incremental learning tailored to simulate soft tissue deformation while substantially reducing simulation time compared to FEM.
PMID:39642013 | DOI:10.1002/mp.17554
Magnetic resonance image denoising for Rician noise using a novel hybrid transformer-CNN network (HTC-net) and self-supervised pretraining
Med Phys. 2024 Dec 6. doi: 10.1002/mp.17562. Online ahead of print.
ABSTRACT
BACKGROUND: Magnetic resonance imaging (MRI) is a crucial technique for both scientific research and clinical diagnosis. However, noise generated during MR data acquisition degrades image quality, particularly in hyperpolarized (HP) gas MRI. While deep learning (DL) methods have shown promise for MR image denoising, most of them fail to adequately utilize the long-range information which is important to improve denoising performance. Furthermore, the sample size of paired noisy and noise-free MR images also limits denoising performance.
PURPOSE: To develop an effective DL method that enhances denoising performance and reduces the requirement of paired MR images by utilizing the long-range information and pretraining.
METHODS: In this work, a hybrid Transformer-convolutional neural network (CNN) network (HTC-net) and a self-supervised pretraining strategy are proposed, which effectively enhance the denoising performance. In HTC-net, a CNN branch is exploited to extract the local features. Then a Transformer-CNN branch with two parallel encoders is designed to capture the long-range information. Within this branch, a residual fusion block (RFB) with a residual feature processing module and a feature fusion module is proposed to aggregate features at different resolutions extracted by two parallel encoders. After that, HTC-net exploits the comprehensive features from the CNN branch and the Transformer-CNN branch to accurately predict noise-free MR images through a reconstruction module. To further enhance the performance on limited MRI datasets, a self-supervised pretraining strategy is proposed. This strategy employs self-supervised denoising to equip the HTC-net with denoising capabilities during pretraining, and then the pre-trained parameters are transferred to facilitate subsequent supervised training.
RESULTS: Experimental results on the pulmonary HP 129Xe MRI dataset (1059 images) and IXI dataset (5000 images) all demonstrate the proposed method outperforms the state-of-the-art methods, exhibiting superior preservation of edges and structures. Quantitatively, on the pulmonary HP 129Xe MRI dataset, the proposed method outperforms the state-of-the-art methods by 0.254-0.597 dB in PSNR and 0.007-0.013 in SSIM. On the IXI dataset, the proposed method outperforms the state-of-the-art methods by 0.3-0.927 dB in PSNR and 0.003-0.016 in SSIM.
CONCLUSIONS: The proposed method can effectively enhance the quality of MR images, which helps improve diagnostic accuracy in clinical practice.
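The noise model and the PSNR metric named in this abstract can be illustrated with a minimal sketch. It assumes the standard construction of Rician noise (magnitude of a signal with independent Gaussian noise on the real and imaginary channels) and the usual PSNR definition; the function names, the synthetic ramp image, and the noise levels are hypothetical and are not taken from the paper.

```python
import numpy as np


def add_rician_noise(image, sigma, rng):
    """Corrupt a magnitude image with Rician noise of scale sigma.

    Gaussian noise is added independently to the real and imaginary
    channels, and the magnitude of the complex result is returned.
    """
    real = image + rng.normal(0.0, sigma, image.shape)
    imag = rng.normal(0.0, sigma, image.shape)
    return np.sqrt(real ** 2 + imag ** 2)


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to data_range."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)


# Hypothetical test image: a smooth intensity ramp in [0, 1].
rng = np.random.default_rng(0)
clean = np.linspace(0.0, 1.0, 64 * 64).reshape(64, 64)
noisy_low = add_rician_noise(clean, 0.02, rng)
noisy_high = add_rician_noise(clean, 0.10, rng)

print("PSNR at sigma=0.02:", psnr(clean, noisy_low))
print("PSNR at sigma=0.10:", psnr(clean, noisy_high))
```

Because the magnitude operation makes the noise non-negative and signal-dependent, a denoiser trained for additive Gaussian noise is not guaranteed to transfer; the sketch only shows how such data and the reported metric could be produced.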
PMID:39641989 | DOI:10.1002/mp.17562
Estimation of fatty acid composition in mammary adipose tissue using deep neural network with unsupervised training
Magn Reson Med. 2024 Dec 6. doi: 10.1002/mrm.30401. Online ahead of print.
ABSTRACT
PURPOSE: To develop a deep learning-based method for robust and rapid estimation of the fatty acid composition (FAC) in mammary adipose tissue.
METHODS: A physics-based unsupervised deep learning network for estimation of fatty acid composition-network (FAC-Net) is proposed to estimate the number of double bonds and number of methylene-interrupted double bonds from multi-echo bipolar gradient-echo data, which are subsequently converted to saturated, mono-unsaturated, and poly-unsaturated fatty acids. The loss function was based on a 10 fat peak signal model. The proposed network was tested with a phantom containing eight oils with different FAC and on post-menopausal women scanned using a whole-body 3T MRI system between February 2022 and January 2024. The post-menopausal women included a control group (n = 8) with average risk for breast cancer and a cancer group (n = 7) with biopsy-proven breast cancer.
RESULTS: The FAC values of eight oils in the phantom showed strong correlations between the measured and reference values (R2 > 0.9 except chain length). The FAC values measured from scan and rescan data of the control group showed no significant difference between the two scans. The FAC measurements of the cancer group conducted before contrast and after contrast showed a significant difference in saturated fatty acid and mono-unsaturated fatty acid. The cancer group has higher saturated fatty acid than the control group, although not statistically significant.
CONCLUSION: The results in this study suggest that the proposed FAC-Net can be used to measure the FAC of mammary adipose tissue from gradient-echo MRI data of the breast.
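The conversion step mentioned in METHODS (from the number of double bonds and methylene-interrupted double bonds to saturated, mono-unsaturated, and poly-unsaturated fractions) can be sketched as follows. The abstract does not give the formulas, so this uses a commonly cited approximation as an assumption: values are per triglyceride (three chains), and every poly-unsaturated chain is treated as di-unsaturated. The function name and the example values are hypothetical.

```python
def fac_fractions(ndb, nmidb):
    """Convert per-triglyceride ndb and nmidb to (SFA, MUFA, PUFA) fractions.

    Assumption (not confirmed by the abstract): each unsaturated chain with
    k double bonds has k - 1 methylene-interrupted double bonds, so
    ndb - nmidb counts unsaturated chains, and each poly-unsaturated chain
    is approximated as having exactly two double bonds.
    """
    if nmidb < 0 or ndb < 2 * nmidb or ndb - nmidb > 3:
        raise ValueError("ndb/nmidb outside the range this model can represent")
    unsaturated = (ndb - nmidb) / 3.0   # fraction of chains with >= 1 double bond
    pufa = nmidb / 3.0                  # fraction of di-unsaturated chains
    mufa = unsaturated - pufa           # remaining unsaturated chains
    sfa = 1.0 - unsaturated             # chains with no double bond
    return sfa, mufa, pufa


# Hypothetical example values, chosen only to make the arithmetic easy to follow.
sfa, mufa, pufa = fac_fractions(ndb=2.7, nmidb=0.6)
print(f"SFA={sfa:.2f} MUFA={mufa:.2f} PUFA={pufa:.2f}")
```

The three fractions sum to one by construction, which is a convenient sanity check on any estimated pair of values.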
PMID:39641987 | DOI:10.1002/mrm.30401
[PSI]-CIC: A Deep-Learning Pipeline for the Annotation of Sectored Saccharomyces cerevisiae Colonies
Bull Math Biol. 2024 Dec 6;87(1):12. doi: 10.1007/s11538-024-01379-w.
ABSTRACT
The [PSI+] prion phenotype in yeast manifests as a white, pink, or red color pigment. Experimental manipulations destabilize prion phenotypes, and allow colonies to exhibit [psi-] (red) sectored phenotypes within otherwise completely white colonies. Further investigation of the size and frequency of sectors that emerge as a result of experimental manipulation is capable of providing critical information on mechanisms of prion curing, but we lack a way to reliably extract this information. Images of experimental colonies exhibiting sectored phenotypes offer an abundance of data to help uncover molecular mechanisms of sectoring, yet the structure of sectored colonies is ignored in traditional biological pipelines. In this study, we present [PSI]-CIC, the first computational pipeline designed to identify and characterize features of sectored yeast colonies. To overcome the barrier of a lack of manually annotated data of colonies, we develop a neural network architecture that we train on synthetic images of colonies and apply to real images of [PSI+], [psi-], and sectored colonies. In hand-annotated experimental images, our pipeline correctly predicts the state of approximately 95% of colonies detected and the frequency of sectors in approximately 89.5% of colonies detected. The scope of our pipeline could be extended to categorizing colonies grown under different experimental conditions, allowing for more meaningful and detailed comparisons between experiments. Our approach streamlines the analysis of sectored yeast colonies, provides a rich set of quantitative metrics, and offers insight into the mechanisms driving the curing of prion phenotypes.
PMID:39641894 | DOI:10.1007/s11538-024-01379-w
Leveraging a Vision Transformer Model to Improve Diagnostic Accuracy of Cardiac Amyloidosis With Cardiac Magnetic Resonance
JACC Cardiovasc Imaging. 2024 Nov 22:S1936-878X(24)00417-0. doi: 10.1016/j.jcmg.2024.09.010. Online ahead of print.
ABSTRACT
BACKGROUND: Cardiac magnetic resonance (CMR) imaging is an important diagnostic tool for diagnosis of cardiac amyloidosis (CA). However, discrimination of CA from other etiologies of myocardial disease can be challenging.
OBJECTIVES: The aim of this study was to develop and rigorously validate a deep learning (DL) algorithm to aid in the discrimination of CA using cine and late gadolinium enhancement CMR imaging.
METHODS: A DL model using a retrospective cohort of 807 patients who were referred for CMR for suspicion of infiltrative disease or hypertrophic cardiomyopathy (HCM) was developed. Confirmed definitive diagnosis was as follows: 252 patients with CA, 290 patients with HCM, and 265 with neither CA nor HCM (other). This cohort was split 70/30 into training and test sets. A vision transformer (ViT) model was trained primarily to identify CA. The model was validated in an external cohort of 157 patients also referred for CMR for suspicion of infiltrative disease or HCM (51 CA, 49 HCM, 57 other).
RESULTS: The ViT model achieved a diagnostic accuracy of 84.1% and an area under the curve of 0.954 in the internal testing data set. The ViT model further demonstrated an accuracy of 82.8% and an area under the curve of 0.957 in the external testing set. The ViT model achieved an accuracy of 90% (n = 55 of 61) among studies with clinical reports with moderate/high confidence diagnosis of CA, and 61.1% (n = 22 of 36) among studies with reported uncertain, missing, or incorrect diagnosis of CA in the internal cohort. DL accuracy of this cohort increased to 79.1% when studies with poor image quality, dual pathologies, or ambiguity of clinically significant CA diagnosis were removed.
CONCLUSIONS: A ViT model using only cine and late gadolinium enhancement CMR images can achieve high accuracy in differentiating CA from other underlying etiologies of suspected cardiomyopathy, especially in cases when reported human diagnostic confidence was uncertain in both a large single state health system and in an external CA cohort.
PMID:39641685 | DOI:10.1016/j.jcmg.2024.09.010
Reveal the potent antidepressant effects of Zhi-Zi-Hou-Pu Decoction based on integrated network pharmacology and DDI analysis by deep learning
Heliyon. 2024 Oct 3;10(22):e38726. doi: 10.1016/j.heliyon.2024.e38726. eCollection 2024 Nov 30.
ABSTRACT
BACKGROUND AND OBJECTIVE: The multi-target, multi-component nature of Traditional Chinese medicine (TCM) matches the complex pathogenesis of depression. Zhi-Zi-Hou-Pu Decoction (ZZHPD) has been used in clinical practice for centuries with good antidepressant effects, but its underlying mechanisms have not been addressed systematically. This study explored its active ingredients, pharmacological mechanisms, and drug-drug interactions (DDIs) to gain a more comprehensive and deeper understanding of this complex TCM formula in treatment.
METHODS: This research utilized network pharmacology combined with molecular docking to identify pharmacological targets and molecular interactions between ZZHPD and depression. Verification of major active compounds was conducted through UPLC-Q-TOF-MS/MS and assays on LPS-induced neuroblastoma cells. Additionally, the DDIMDL model, a deep learning-based approach, was used to predict DDIs, focusing on serum concentration, metabolism, effectiveness, and adverse reactions.
RESULTS: The antidepressant mechanisms of ZZHPD involve the serotonergic synapse, neuroactive ligand-receptor interaction, and dopaminergic synapse signaling pathways. Eighteen active compounds were identified, with honokiol and eriocitrin significantly modulating neuronal inflammation and promoting differentiation of neuroimmune cells through genes like COMT, PI3KCA, PTPN11, and MAPK1. DDI predictions indicated that eriocitrin's serum concentration increases when combined with hesperidin, while hesperetin's metabolism decreases with certain flavonoids. These findings provide crucial insights into the nervous system's effectiveness and potential cardiovascular or nervous system adverse reactions from core compound combinations.
CONCLUSIONS: This study provides insights into the TCM interpretation, drug compatibility or combined medication for further clinical application or potential drug pairs with a cost-effective method of integrated network pharmacology and deep learning.
PMID:39641032 | PMC:PMC11617927 | DOI:10.1016/j.heliyon.2024.e38726
Development and validation of a deep learning pipeline to diagnose ovarian masses using ultrasound screening: a retrospective multicenter study
EClinicalMedicine. 2024 Nov 19;78:102923. doi: 10.1016/j.eclinm.2024.102923. eCollection 2024 Dec.
ABSTRACT
BACKGROUND: Ovarian cancer has the highest mortality rate among gynaecological malignancies and is initially screened using ultrasound. Owing to the high complexity of ultrasound images of ovarian masses and the anatomical characteristics of the deep pelvic cavity, subjective assessment requires extensive experience and skill. Therefore, detecting the ovaries and ovarian masses and diagnosing ovarian cancer are challenging. In the present study, we aimed to develop an automated deep learning framework, the Ovarian Multi-Task Attention Network (OvaMTA), for ovary and ovarian mass detection, segmentation, and classification, as well as further diagnosis of ovarian masses based on ultrasound screening.
METHODS: Between June 2020 and May 2022, the OvaMTA model was trained, validated and tested on a training and validation cohort including 6938 images and an internal testing cohort including 1584 images which were recruited from 21 hospitals involving women who underwent ultrasound examinations for ovarian masses. Subsequently, we recruited two external test cohorts from another two hospitals. We obtained 1896 images between February 2024 and April 2024 as image-based external test dataset, and further obtained 159 videos for the video-based external test dataset between April 2024 and May 2024. We developed an artificial intelligence (AI) system (termed OvaMTA) to diagnose ovarian masses using ultrasound screening. It includes two models: an entire image-based segmentation model, OvaMTA-Seg, for ovary detection and a diagnosis model, OvaMTA-Diagnosis, for predicting the pathological type of ovarian mass using image patches cropped by OvaMTA-Seg. The performance of the system was evaluated in one internal and two external validation cohorts, and compared with doctors' assessments in real-world testing. We recruited eight physicians to assess the real-world data. The value of the system in assisting doctors with diagnosis was also evaluated.
FINDINGS: In terms of segmentation, OvaMTA-Seg achieved an average Dice score of 0.887 on the internal test set and 0.819 on the image-based external test set. OvaMTA-Seg also performed well in ovarian mass detection from test images, including healthy ovaries and masses (internal test area under the curve [AUC]: 0.970; external test AUC: 0.877). In terms of classification diagnosis prediction, OvaMTA-Diagnosis demonstrated high performance on image-based internal (AUC: 0.941) and external test sets (AUC: 0.941). In video-based external testing, OvaMTA recognised 159 videos with ovarian masses with AUC of 0.911, and is comparable to the performance of senior radiologists (ACC: 86.2 vs. 88.1, p = 0.50; SEN: 81.8 vs. 88.6, p = 0.16; SPE: 89.2 vs. 87.6, p = 0.68). There was a significant improvement in junior and intermediate radiologists who were assisted by AI compared to those who were not assisted by AI (ACC: 80.8 vs. 75.3, p = 0.00015; SEN: 79.5 vs. 74.6, p = 0.029; SPE: 81.7 vs. 75.8, p = 0.0032). General practitioners assisted by AI achieved an average performance of radiologists (ACC: 82.7 vs. 81.8, p = 0.80; SEN: 84.8 vs. 82.6, p = 0.72; SPE: 81.2 vs. 81.2, p > 0.99).
INTERPRETATION: The OvaMTA system based on ultrasound imaging is a simple and practical auxiliary tool for screening for ovarian cancer, with a diagnostic performance comparable to that of senior radiologists. This provides a potential tool for screening ovarian cancer.
FUNDING: This work was supported by the National Natural Science Foundation of China (Grant Nos. 12090020, 82071929, and 12090025) and the R&D project of the Pazhou Lab (Huangpu) (Grant No. 2023K0605).
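The Dice score used in FINDINGS to evaluate OvaMTA-Seg measures the overlap between a predicted and a reference mask. A minimal sketch under the standard definition (twice the intersection divided by the sum of the two mask sizes) is shown here; the function name and the toy masks are hypothetical and unrelated to the study's data.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks are empty: treat as perfect agreement by convention.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return 2.0 * intersection / total


# Toy example: two 4x4 squares offset by one row inside an 8x8 image.
target = np.zeros((8, 8), dtype=bool)
target[2:6, 2:6] = True   # 16 pixels
pred = np.zeros((8, 8), dtype=bool)
pred[3:7, 2:6] = True     # 16 pixels, 12 of them shared with target

print(dice_score(pred, target))  # 2 * 12 / (16 + 16) = 0.75
```

A one-pixel shift of a small mask already costs a quarter of the score, which is one reason Dice values on small structures tend to be lower than on large ones.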
PMID:39640935 | PMC:PMC11617315 | DOI:10.1016/j.eclinm.2024.102923
Development and validation of the MRI-based deep learning classifier for distinguishing perianal fistulizing Crohn's disease from cryptoglandular fistula: a multicenter cohort study
EClinicalMedicine. 2024 Nov 22;78:102940. doi: 10.1016/j.eclinm.2024.102940. eCollection 2024 Dec.
ABSTRACT
BACKGROUND: A singular reliable modality for early distinguishing perianal fistulizing Crohn's disease (PFCD) from cryptoglandular fistula (CGF) is currently lacking. We aimed to develop and validate an MRI-based deep learning classifier to effectively discriminate between them.
METHODS: The present study retrospectively enrolled 1054 patients with PFCD or CGF from three Chinese tertiary referral hospitals between January 1, 2015, and December 31, 2021. The patients were divided into four cohorts: training cohort (n = 800), validation cohort (n = 100), internal test cohort (n = 100) and external test cohort (n = 54). Two deep convolutional neural networks (DCNN), namely MobileNetV2 and ResNet50, were respectively trained using the transfer learning strategy on a dataset consisting of 44871 MR images. The performance of the DCNN models was compared to that of radiologists using various metrics, including receiver operating characteristic curve (ROC) analysis, accuracy, sensitivity, and specificity. Delong testing was employed for comparing the area under curves (AUCs). Univariate and multivariate analyses were conducted to explore potential factors associated with classifier performance.
FINDINGS: A total of 532 PFCD and 522 CGF patients were included. Both pre-trained DCNN classifiers achieved encouraging performance in the internal test cohort (MobileNetV2 AUC: 0.962, 95% CI 0.903-0.990; ResNet50 AUC: 0.963, 95% CI 0.905-0.990), as well as in the external test cohort (MobileNetV2 AUC: 0.885, 95% CI 0.769-0.956; ResNet50 AUC: 0.874, 95% CI 0.756-0.949). They had greater AUCs than the radiologists (all p ≤ 0.001), while their AUCs were comparable to each other (p = 0.83 and p = 0.60 in the two test cohorts). None of the potential characteristics had a significant impact on the performance of the pre-trained MobileNetV2 classifier in etiologic diagnosis. Previous fistula surgery influenced the performance of the pre-trained ResNet50 classifier in the internal test cohort (OR 0.157, 95% CI 0.025-0.997, p = 0.05).
INTERPRETATION: The developed DCNN classifiers exhibited superior robustness in distinguishing PFCD from CGF compared to visual assessment by radiologists, showing their potential for assisting in early detection of PFCD. Our findings highlight the promising generalized performance of MobileNetV2 over ResNet50, rendering it suitable for deployment on mobile terminals.
FUNDING: National Natural Science Foundation of China.
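The AUC values reported in FINDINGS can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes AUC directly from that pairwise definition (ties count one half). It is an illustration only: the label coding (1 = PFCD, 0 = CGF), the function name, and the scores are hypothetical, and the study's DeLong comparison is not reproduced here.

```python
import numpy as np


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative case")
    # Compare every positive score with every negative score.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)


# Hypothetical classifier outputs for six patients.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(roc_auc(labels, scores))  # 8 of 9 positive/negative pairs are ordered correctly
```

The pairwise form is quadratic in the number of cases, which is fine for cohorts of this size; rank-based implementations give the same value more efficiently.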
PMID:39640934 | PMC:PMC11618046 | DOI:10.1016/j.eclinm.2024.102940
Towards multi-agent system for learning object recommendation
Heliyon. 2024 Oct 11;10(20):e39088. doi: 10.1016/j.heliyon.2024.e39088. eCollection 2024 Oct 30.
ABSTRACT
The rapid increase of online educational content has made it harder for students to find specific information. E-learning recommender systems help students easily find the learning objects they require, improving the learning experience. The effectiveness of these systems is further improved by integrating deep learning with multi-agent systems. Multi-agent systems facilitate adaptable interactions within the system's various parts, and deep learning processes extensive data to understand learners' preferences. This collaboration results in custom-made suggestions that cater to individual learners. Our research introduces a multi-agent system tailored for suggesting learning objects in line with learners' knowledge levels and learning styles. This system uniquely comprises four agents: the learner agent, the tutor agent, the learning object agent, and the recommendation agent. It applies the Felder and Silverman model to pinpoint various student learning styles and organizes educational content based on the newest IEEE Learning Object Metadata standard. The system uses advanced techniques, such as Convolutional Neural Networks (CNN) and Multilayer Perceptrons (MLP), to propose learning objects. In terms of creating personalized learning experiences, this system is a considerable step forward. It effectively suggests learning objects that closely match each learner's personal profile, greatly enhancing student engagement and making the learning process more efficient.
PMID:39640789 | PMC:PMC11620102 | DOI:10.1016/j.heliyon.2024.e39088
Comprehensive analysis of computational approaches in plant transcription factors binding regions discovery
Heliyon. 2024 Oct 10;10(20):e39140. doi: 10.1016/j.heliyon.2024.e39140. eCollection 2024 Oct 30.
ABSTRACT
Transcription factors (TFs) are regulatory proteins that bind to specific DNA regions, known as transcription factor binding regions (TFBRs), to regulate the rate of transcription. The identification of TFBRs has been made possible by a number of experimental and computational techniques established during the past few years. The process of TFBR identification involves peak identification in the binding data, followed by the identification of motif characteristics. Using the same binding data, attempts have been made to build computational models that identify such binding regions, which could save the time and resources spent on binding experiments. These computational approaches depend heavily on what they learn and how they learn it. Existing computational approaches are heavily skewed toward human TFBR discovery, while plants have a drastically different genomic setup for regulation, which these approaches have largely ignored. Here, we provide a comprehensive study of the current state of plant-specific TF discovery algorithms. While doing so, we encountered several issues in the software tools that rendered them unusable to researchers. We fixed these issues and provide the corrected scripts for such tools. We expect this study to serve as a guide for better understanding the approaches of software tools for plant-specific TFBR discovery and the care to be taken while applying them, especially in cross-species applications. The corrected scripts of these software tools are made available at https://github.com/SCBB-LAB/Comparative-analysis-of-plant-TFBS-software.
PMID:39640721 | PMC:PMC11620080 | DOI:10.1016/j.heliyon.2024.e39140
State-of-health estimation and classification of series-connected batteries by using deep learning based hybrid decision approach
Heliyon. 2024 Oct 9;10(20):e39121. doi: 10.1016/j.heliyon.2024.e39121. eCollection 2024 Oct 30.
ABSTRACT
In rechargeable battery control and operation, safety is one of the primary concerns, and battery degradation is a significant contributing factor. In recent years, state-of-health assessment of lithium-ion batteries has therefore become an important issue. However, ensuring robustness and generalization is challenging because most state-of-health assessment techniques are implemented for a specific characteristic, operating condition, and battery material system. In most studies, the health status of single-cell batteries is assessed using analytical or computer-aided deep learning methods. With advances in technology and usage, especially in electric vehicles, the state-of-health characteristics of series-connected battery systems also deserve attention. This study presents a data-driven, deep learning-based hybrid decision approach for predicting the state-of-health of series-connected lithium-ion batteries with different characteristics. A series-connected battery degradation dataset is generated from several widely used datasets. By employing deep learning-based networks together with hybrid classification aided by performance metrics, it is shown that the state-of-health can be estimated and predicted not only with standalone deep learning algorithms but also with hybrid classification techniques. The results demonstrate the high accuracy and simplicity of the proposed approach on datasets from Oxford University and the Calce battery group. The best mean squared error, root mean square error, and mean absolute percentage error values do not exceed 0.0500, 0.2236, and 0.7065, respectively, which demonstrates the method's efficiency in terms of both accuracy and error indicators. The results show that the proposed approach can be implemented in offline or online systems, with a best average accuracy of 98.33 % and a classification time of 58 ms per sample.
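The three error indicators quoted in this abstract can be computed as in the sketch below, which assumes their standard definitions. The function name and the state-of-health values are hypothetical. One detail can be checked from the abstract alone: the reported root mean square error of 0.2236 is the square root of the reported mean squared error of 0.0500.

```python
import math

import numpy as np


def regression_errors(y_true, y_pred):
    """Return (MSE, RMSE, MAPE in percent) for paired arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mse = float(np.mean(err ** 2))
    rmse = math.sqrt(mse)
    # MAPE is undefined when a true value is zero; SOH values are assumed positive.
    mape = float(100.0 * np.mean(np.abs(err / y_true)))
    return mse, rmse, mape


# Hypothetical state-of-health values in percent.
soh_true = [100.0, 95.0, 90.0, 85.0]
soh_pred = [99.0, 96.0, 90.0, 84.0]
mse, rmse, mape = regression_errors(soh_true, soh_pred)
print(f"MSE={mse:.4f} RMSE={rmse:.4f} MAPE={mape:.4f}%")

# Consistency check on the figures quoted in the abstract.
print(round(math.sqrt(0.0500), 4))  # 0.2236
```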
PMID:39640714 | PMC:PMC11620055 | DOI:10.1016/j.heliyon.2024.e39121
Deep learning neural network-assisted badminton movement recognition and physical fitness training optimization
Heliyon. 2024 Oct 2;10(20):e38865. doi: 10.1016/j.heliyon.2024.e38865. eCollection 2024 Oct 30.
ABSTRACT
This work aims to solve the problem of low accuracy in recognizing the trajectory of badminton movement. This work focuses on the visual system in badminton robots and conducts side detection and tracking of flying badminton in two-dimensional image plane video streams. Then, the cropped video images are input into a convolutional neural network frame by frame. By adding an attention mechanism, it helps identify the badminton movement trajectory. Finally, to address the detection challenge of flying badminton as a small target in video streams, the deep learning one-stage detection network, Tiny YOLOv2, is improved from both the loss function and network structure perspectives. Moreover, it is combined with the Unscented Kalman Filter algorithm to predict the trajectory of badminton movement. Simulation results show that the improved algorithm performs excellently in tracking and predicting badminton trajectories compared with the existing algorithms. The average accuracy of the proposed method for tracking badminton trajectories is 91.40 %, and the recall rate is 84.60 %. The average precision, recall, and frame rate of the measured trajectories in four simple and complex scenarios of badminton flight video streams are 96.7 %, 95.7 %, and 29.2 frames/second, respectively. They are all superior to other classic algorithms. It is evident that the proposed method can provide powerful support for badminton trajectory recognition and help improve the accuracy of badminton movement recognition.
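The abstract combines detections with an Unscented Kalman Filter to predict the trajectory. As a simplified stand-in, the sketch below uses an ordinary linear Kalman filter with a constant-velocity model in the image plane; it is not the paper's UKF, and it ignores gravity and drag. The frame interval, noise variances, velocities, and function names are all hypothetical.

```python
import numpy as np


def make_cv_model(dt, process_var, meas_var):
    """Constant-velocity model with state [x, y, vx, vy] and position measurements."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)
    Q = process_var * np.eye(4)
    R = meas_var * np.eye(2)
    return F, H, Q, R


def kalman_step(x, P, z, F, H, Q, R):
    """One predict-then-update cycle of a linear Kalman filter."""
    # Predict the state and covariance one frame ahead.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Correct the prediction with the detected position z.
    innovation = z - H @ x_pred
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ innovation
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new


dt = 1.0 / 30.0                      # assumed frame interval
F, H, Q, R = make_cv_model(dt, process_var=1e-4, meas_var=1e-2)
x = np.zeros(4)                      # unknown initial position and velocity
P = 10.0 * np.eye(4)                 # large initial uncertainty
true_velocity = np.array([3.0, -1.5])

# Feed 60 noise-free detections of an object moving at constant velocity.
for k in range(1, 61):
    z = true_velocity * (k * dt)
    x, P = kalman_step(x, P, z, F, H, Q, R)

predicted_next = (F @ x)[:2]         # predicted position in frame 61
print(predicted_next, true_velocity * 61 * dt)
```

A UKF replaces the linear `F` and `H` with nonlinear functions evaluated at sigma points, which is what allows a curved flight model; the predict-then-update structure stays the same.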
PMID:39640697 | PMC:PMC11620146 | DOI:10.1016/j.heliyon.2024.e38865
Introducing a novel dataset for facial emotion recognition and demonstrating significant enhancements in deep learning performance through pre-processing techniques
Heliyon. 2024 Oct 4;10(20):e38913. doi: 10.1016/j.heliyon.2024.e38913. eCollection 2024 Oct 30.
ABSTRACT
Facial expression recognition (FER) plays a pivotal role in various applications, ranging from human-computer interaction to psychoanalysis. To improve the accuracy of FER models, this study focuses on enhancing and augmenting FER datasets. It comprehensively analyzes the Facial Emotion Recognition dataset (FER13) to identify defects and correct misclassifications. The FER13 dataset represents a crucial resource for researchers developing Deep Learning (DL) models aimed at recognizing emotions based on facial features. Subsequently, this article develops a new facial dataset by expanding upon the original FER13 dataset. Similar to the FER+ dataset, the expanded dataset incorporates a wider range of emotions while maintaining data accuracy. To further improve the dataset, it is integrated with the extended Cohn-Kanade (CK+) dataset. This paper investigates the application of modern DL models to enhance emotion recognition in human faces. By training on the new dataset, the study demonstrates significant performance gains compared with its counterparts. Furthermore, the article examines recent advances in FER technology and identifies critical requirements for DL models to overcome the inherent challenges of this task effectively. The study explores several DL architectures for emotion recognition in facial image datasets, with a particular focus on convolutional neural networks (CNNs). Our findings indicate that complex architectures, such as EfficientNetB7, outperform other DL architectures, achieving a test accuracy of 78.9 %. Notably, the model surpassed the EfficientNet-XGBoost model, especially when used with the new dataset. Our approach leverages EfficientNetB7 as a backbone to build a model capable of efficiently recognizing emotions from facial images. Our proposed model, EfficientNetB7-CNN, achieved a peak accuracy of 81 % on the test set despite facing challenges such as GPU memory limitations. This demonstrates the model's robustness in handling complex facial expressions. Furthermore, to enhance feature extraction and attention mechanisms, we propose a new hybrid model, CBAM-4CNN, which integrates the convolutional block attention module (CBAM) with a custom 4-layer CNN architecture. The results showed that the CBAM-4CNN model outperformed existing models, achieving higher accuracy, precision, and recall metrics across multiple emotion classes. The results highlight the critical role of comprehensive and diverse data in enhancing model performance for facial emotion recognition.
PMID:39640693 | PMC:PMC11620061 | DOI:10.1016/j.heliyon.2024.e38913
3D microstructure reconstruction and characterization of porous materials using a cross-sectional SEM image and deep learning
Heliyon. 2024 Oct 10;10(20):e39185. doi: 10.1016/j.heliyon.2024.e39185. eCollection 2024 Oct 30.
ABSTRACT
Accurate assessment of the three-dimensional (3D) pore characteristics within porous materials and devices holds significant importance. Compared to high-cost experimental approaches, this study introduces an alternative method: utilizing a generative adversarial network (GAN) to reconstruct a 3D pore microstructure. Unlike some existing GAN models that require 3D images as training data, the proposed model only requires a single cross-sectional image for 3D reconstruction. Using porous ceramic electrode materials as a case study, a comparison between the GAN-generated microstructures and those reconstructed through focused ion beam-scanning electron microscopy (FIB-SEM) reveals promising consistency. The GAN-based reconstruction technique demonstrates its effectiveness by successfully characterizing pore attributes in porous ceramics, with measurements of porosity, pore size, and tortuosity factor exhibiting notable agreement with the results obtained from mercury intrusion porosimetry.
PMID:39640653 | PMC:PMC11620251 | DOI:10.1016/j.heliyon.2024.e39185
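As a rough illustration of one of the pore attributes mentioned above, porosity can be read off a binarized 3D volume as the fraction of pore voxels. The sketch below is not the authors' pipeline; the function name, the nested-list layout, and the encoding (1 = pore, 0 = solid) are assumptions made for the example.

```python
def porosity(volume):
    """Return the pore fraction of a 3D binary volume.

    Assumed layout: nested lists indexed as volume[z][y][x],
    with 1 marking a pore voxel and 0 marking solid material.
    """
    total = 0
    pores = 0
    for slab in volume:
        for row in slab:
            for voxel in row:
                total += 1
                if voxel == 1:
                    pores += 1
    if total == 0:
        raise ValueError("volume contains no voxels")
    return pores / total


# Hypothetical 2 x 2 x 2 toy volume with 3 pore voxels out of 8.
toy_volume = [
    [[1, 0], [0, 0]],
    [[1, 1], [0, 0]],
]
print(porosity(toy_volume))
```

Pore size and tortuosity factor need more involved analysis (for example, distance transforms or diffusion simulations) and are not covered by this sketch.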
Transformer-based models for chemical SMILES representation: A comprehensive literature review
Heliyon. 2024 Oct 9;10(20):e39038. doi: 10.1016/j.heliyon.2024.e39038. eCollection 2024 Oct 30.
ABSTRACT
Pre-trained chemical language models (CLMs) have attracted increasing attention within the domains of cheminformatics and bioinformatics, inspired by the remarkable success of pre-trained language models in natural language processing (NLP) tasks such as speech recognition, text analysis, translation, and other objectives associated with language. Furthermore, the vast amount of unlabeled data associated with chemical compounds or molecules has emerged as a crucial research focus, prompting the need for CLMs with reasoning capabilities over such data. Molecular graphs and molecular descriptors are the predominant approaches to representing molecules for property prediction in machine learning (ML). However, Transformer-based LMs have recently emerged as de facto powerful tools in deep learning (DL), showcasing outstanding performance across various NLP downstream tasks, particularly in text analysis. Pre-trained Transformer-based LMs such as BERT (and its variants) and GPT (and its variants) have been extensively explored in the chemical informatics domain. Learning tasks in cheminformatics that, like text analysis, require handling chemical SMILES data, which contains intricate relations among elements or atoms, have become increasingly prevalent. Whether the objective is predicting molecular reactions or molecular properties, there is a growing demand for LMs capable of learning molecular contextual information within SMILES sequences or strings from text inputs (i.e., SMILES). This review provides an overview of the current state of the art of chemical language Transformer-based LMs in chemical informatics for de novo design, and analyses current limitations, challenges, and advantages. Finally, a perspective on future opportunities is provided in this evolving field.
PMID:39640612 | PMC:PMC11620068 | DOI:10.1016/j.heliyon.2024.e39038
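To make the idea of treating SMILES strings as text input more concrete, the sketch below splits a SMILES string into tokens of the kind a Transformer-based LM could consume. The regular expression is a simplified, illustrative pattern chosen for this example (it is not taken from the review and does not cover the full SMILES grammar), and `tokenize_smiles` is a hypothetical helper name.

```python
import re

# Simplified token pattern (an assumption for illustration only).
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]"          # bracket atoms such as [NH4+] or [C@@H]
    r"|Br|Cl"              # two-letter organic-subset atoms
    r"|[BCNOSPFI]"         # one-letter aliphatic atoms
    r"|[bcnosp]"           # aromatic atoms
    r"|%\d{2}|\d"          # ring-closure labels
    r"|[()=#\-+\\/:.@~*]"  # branches, bonds, and other symbols
)


def tokenize_smiles(smiles):
    """Split a SMILES string into tokens; fail loudly on unknown characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    # findall silently skips unmatched characters, so verify full coverage.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


print(tokenize_smiles("CC(=O)Cl"))   # acetyl chloride
print(tokenize_smiles("c1ccccc1"))   # benzene
```

In practice, the resulting tokens would be mapped to integer IDs through a vocabulary before being passed to a model; subword schemes such as byte-pair encoding are another common choice.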
Deep learning-based overall survival prediction in patients with glioblastoma: An automatic end-to-end workflow using pre-resection basic structural multiparametric MRIs
Comput Biol Med. 2024 Dec 4;185:109436. doi: 10.1016/j.compbiomed.2024.109436. Online ahead of print.
ABSTRACT
PURPOSE: Accurate and automated early survival prediction is critical for patients with glioblastoma (GBM) as their poor prognosis requires timely treatment decision-making. To address this need, we developed a deep learning (DL)-based end-to-end workflow for GBM overall survival (OS) prediction using pre-resection basic structural multiparametric magnetic resonance images (Bas-mpMRI) with a multi-institutional public dataset and evaluated it with an independent dataset of patients on a prospective institutional clinical trial.
MATERIALS AND METHODS: The proposed end-to-end workflow includes a skull-stripping model, a GBM sub-region segmentation model and an ensemble learning-based OS prediction model. The segmentation model utilizes skull-stripped Bas-mpMRIs to segment three GBM sub-regions. The segmented GBM is fed into the contrastive learning-based OS prediction model to classify the patients into different survival groups. Our datasets include both a multi-institutional public dataset from the Medical Image Computing and Computer Assisted Intervention (MICCAI) Brain Tumor Segmentation (BraTS) challenge 2020 with 235 patients, and an institutional dataset from a 5-fraction SRS clinical trial with 19 GBM patients. Each data entry consists of pre-operative Bas-mpMRIs, survival days and patient ages. Basic clinical characteristics are also available for SRS clinical trial data. The multi-institutional public dataset was used for establishing the workflow (90% of data) and for initial validation (10% of data). The validated workflow was then evaluated on the institutional clinical trial data.
RESULTS: Our proposed OS prediction workflow achieved an area under the curve (AUC) of 0.86 on the public dataset and 0.72 on the institutional clinical trial dataset to classify patients into 2 OS classes as long-survivors (>12 months) and short-survivors (<12 months), despite the large variation in Bas-mpMRI protocols. In addition, as part of the intermediate results, the proposed workflow can also provide detailed GBM sub-regions auto-segmentation with a whole tumor Dice score of 0.91.
CONCLUSION: Our study demonstrates the feasibility of employing this DL-based end-to-end workflow to predict the OS of patients with GBM using only the pre-resection Bas-mpMRIs. This DL-based workflow can be potentially applied to assist timely clinical decision-making.
PMID:39637462 | DOI:10.1016/j.compbiomed.2024.109436
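The whole tumor Dice score reported above measures the overlap between a predicted and a reference segmentation mask, Dice = 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that formula on flattened binary masks; the function name, the flat-list representation, and the convention of returning 1.0 when both masks are empty are assumptions for the example, not details from the study.

```python
def dice_score(pred_mask, true_mask):
    """Dice coefficient between two flattened binary masks (lists of 0/1)."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(p * t for p, t in zip(pred_mask, true_mask))
    size_sum = sum(pred_mask) + sum(true_mask)
    if size_sum == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / size_sum


# Hypothetical toy masks: one overlapping voxel, two voxels in each mask.
pred = [1, 1, 0, 0]
truth = [1, 0, 1, 0]
print(dice_score(pred, truth))
```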
Progress on the development of prediction tools for detecting disease causing mutations in proteins
Comput Biol Med. 2024 Dec 4;185:109510. doi: 10.1016/j.compbiomed.2024.109510. Online ahead of print.
ABSTRACT
Proteins are involved in a variety of functions in living organisms. The mutation of amino acid residues in a protein alters its structure, stability, binding, and function, with some mutations leading to diseases. Understanding the influence of mutations on protein structure and function helps to gain deep insights into the molecular mechanisms of diseases and to devise therapeutic strategies. Hence, several generic and disease-specific methods have been proposed to reveal the pathogenic effects of mutations. In this review, we focus on the development of prediction methods for identifying disease-causing mutations in proteins. We briefly outline the existing databases for disease-causing mutations, followed by a discussion on sequence- and structure-based features used for prediction. Further, we discuss computational tools based on machine learning, deep learning and large language models for detecting disease-causing mutations. Specifically, we emphasize the advances in predicting hotspots and mutations for targets involved in cancer, neurodegenerative and infectious diseases as well as in membrane proteins. The computational resources, including databases and algorithms for understanding/predicting the effect of mutations, will be listed. Moreover, limitations of existing methods and possible improvements will be discussed.
PMID:39637461 | DOI:10.1016/j.compbiomed.2024.109510
Predicting cancer content in tiles of lung squamous cell carcinoma tumours with validation against pathologist labels
Comput Biol Med. 2024 Dec 4;185:109489. doi: 10.1016/j.compbiomed.2024.109489. Online ahead of print.
ABSTRACT
BACKGROUND: A growing body of research is using deep learning to explore the relationship between treatment biomarkers for lung cancer patients and cancer tissue morphology on digitized whole slide images (WSIs) of tumour resections. However, these WSIs typically contain non-cancer tissue, introducing noise during model training. As digital pathology models typically start with splitting WSIs into tiles, we propose a model that can be used to exclude non-cancer tiles from the WSIs of lung squamous cell carcinoma (SqCC) tumours.
METHODS: We obtained 116 WSIs of tumours from 35 different centres from the Cancer Genome Atlas. A pathologist completed or reviewed cancer contours in four regions of interest (ROIs) within each WSI. We then split the ROIs into tiles labelled with the percentage of cancer tissue within them, trained VGG16 to predict this value, and calculated the regression error. To measure classification performance and visualize the classification results, we thresholded the predictions and calculated the area under the receiver operating characteristic curve (AUC).
RESULTS: The model's median regression error was 4% with a standard deviation of 35%. At a cancer threshold of 50%, the model had an AUC of 0.83. False positives tended to be in tissues that surround cancer, tiles with <50% cancer, and areas with high immune activity. False negatives tended to be microtomy defects.
CONCLUSIONS: With further validation for each specific research application, the model we describe in this paper could facilitate the development of more effective research pipelines for predicting treatment biomarkers for lung SqCC.
PMID:39637460 | DOI:10.1016/j.compbiomed.2024.109489
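The step from a per-tile regression output to a classification metric can be sketched as follows. This is one plausible reading rather than the authors' code: it assumes the pathologist-derived cancer percentage is binarized at the 50% threshold (inclusive) to form labels, and that the predicted percentage serves as the ranking score. The AUC is computed directly from its pairwise definition; all names and toy values are hypothetical.

```python
def binarize(percentages, threshold=50.0):
    """Label a tile as cancer (1) if its cancer percentage meets the threshold."""
    return [1 if p >= threshold else 0 for p in percentages]


def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative tile")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true and predicted cancer percentages for four tiles.
true_percent = [80.0, 60.0, 20.0, 10.0]
pred_percent = [70.0, 40.0, 45.0, 5.0]
labels = binarize(true_percent)
print(labels, auc(labels, pred_percent))
```

The pairwise loop is quadratic in the number of tiles; for realistic tile counts a rank-based implementation from a statistics library would be the more practical choice.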
Artificial intelligence for identification of candidates for device-aided therapy in Parkinson's disease: DELIST-PD study
Comput Biol Med. 2024 Dec 4;185:109504. doi: 10.1016/j.compbiomed.2024.109504. Online ahead of print.
ABSTRACT
INTRODUCTION: In Parkinson's Disease (PD), despite available treatments focusing on symptom alleviation, the effectiveness of conventional therapies decreases over time. This study aims to enhance the identification of candidates for device-aided therapies (DAT) using artificial intelligence (AI), addressing the need for improved treatment selection in advanced PD stages.
METHODS: This national, multicenter, cross-sectional, observational study involved 1086 PD patients across Spain. Machine learning (ML) algorithms, including CatBoost, support vector machine (SVM), and logistic regression (LR), were evaluated for their ability to identify potential DAT candidates based on clinical and demographic data.
RESULTS: The CatBoost algorithm demonstrated superior performance in identifying DAT candidates, with an area under the curve (AUC) of 0.95, sensitivity of 0.91, and specificity of 0.88. It outperformed other ML models in balanced accuracy and negative predictive value. The model identified 23 key features as predictors for suitability for DAT, highlighting the importance of daily "off" time, doses of oral levodopa/day, and PD duration. Considering the 5-2-1 criteria, the algorithm identified a decision threshold for DAT candidates of oral levodopa taken > 4 times daily and/or ≥ 1.8 h of daily "off" time.
CONCLUSION: The study developed a highly discriminative CatBoost model for identifying PD patients who are candidates for DAT, potentially improving timely and accurate treatment selection. This AI approach offers a promising tool for neurologists, particularly those less experienced with DAT, to optimize referral to Movement Disorder Units.
PMID:39637457 | DOI:10.1016/j.compbiomed.2024.109504
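For readers less familiar with the metrics named in the results, the sketch below shows how sensitivity, specificity, balanced accuracy, and negative predictive value follow from confusion-matrix counts, alongside the reported decision threshold written as a simple rule. The counts are hypothetical values picked only to reproduce a sensitivity of 0.91 and a specificity of 0.88; they are not the study's data, and the rule is not a substitute for the CatBoost model, which uses 23 features.

```python
def screening_metrics(tp, fp, tn, fn):
    """Derive common screening metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "npv": tn / (tn + fn),
    }


def meets_reported_threshold(levodopa_doses_per_day, off_hours_per_day):
    """Rule from the abstract: > 4 daily levodopa doses and/or >= 1.8 h daily 'off' time."""
    return levodopa_doses_per_day > 4 or off_hours_per_day >= 1.8


# Hypothetical counts (not from the study): 100 candidates, 100 non-candidates.
metrics = screening_metrics(tp=91, fp=12, tn=88, fn=9)
print(metrics)
print(meets_reported_threshold(5, 0.5), meets_reported_threshold(4, 1.0))
```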