Deep learning
Whole-cell multi-target single-molecule super-resolution imaging in 3D with microfluidics and a single-objective tilted light sheet
Nat Commun. 2024 Nov 24;15(1):10187. doi: 10.1038/s41467-024-54609-z.
ABSTRACT
Multi-target single-molecule super-resolution fluorescence microscopy offers a powerful means of understanding the distributions and interplay between multiple subcellular structures at the nanoscale. However, single-molecule super-resolution imaging of whole mammalian cells is often hampered by high fluorescence background and slow acquisition speeds, especially when imaging multiple targets in 3D. In this work, we have mitigated these issues by developing a steerable, dithered, single-objective tilted light sheet for optical sectioning to reduce fluorescence background and a pipeline for 3D nanoprinting microfluidic systems for reflection of the light sheet into the sample. This easily adaptable microfluidic fabrication pipeline allows for the incorporation of reflective optics into microfluidic channels without disrupting efficient and automated solution exchange. We combine these innovations with point spread function engineering for nanoscale localization of individual molecules in 3D, deep learning for analysis of overlapping emitters, active 3D stabilization for drift correction and long-term imaging, and Exchange-PAINT for sequential multi-target imaging without chromatic offsets. We then demonstrate that this platform, termed soTILT3D, enables whole-cell multi-target 3D single-molecule super-resolution imaging with improved precision and imaging speed.
PMID:39582043 | DOI:10.1038/s41467-024-54609-z
VGAE-CCI: variational graph autoencoder-based construction of 3D spatial cell-cell communication network
Brief Bioinform. 2024 Nov 22;26(1):bbae619. doi: 10.1093/bib/bbae619.
ABSTRACT
Cell-cell communication plays a critical role in maintaining normal biological functions, regulating development and differentiation, and controlling immune responses. The rapid development of single-cell RNA sequencing and spatial transcriptomics sequencing (ST-seq) technologies provides essential data support for in-depth and comprehensive analysis of cell-cell communication. However, ST-seq data often suffer from missing values and systematic biases, which may reduce the accuracy and reliability of cell-cell communication prediction. Furthermore, existing methods for analyzing cell-cell communication mainly focus on individual tissue sections, neglecting communication across multiple tissue layers, and fail to comprehensively elucidate cell-cell communication networks within three-dimensional tissues. To address these issues, we propose VGAE-CCI, a deep learning framework based on the Variational Graph Autoencoder, capable of identifying cell-cell communication across multiple tissue layers. Additionally, this model can be applied to spatial transcriptomics data with missing or partially incomplete entries and can cluster cells at single-cell resolution based on spatial encoding information within complex tissues, thereby enabling more accurate inference of cell-cell communication. Finally, we tested our method on six datasets and compared it with other state-of-the-art methods for predicting cell-cell communication. Our method outperformed other methods across multiple metrics, demonstrating its efficiency and reliability in predicting cell-cell communication.
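To make the variational graph autoencoder idea concrete, the sketch below encodes a toy cell graph into latent means and log-variances and reconstructs edges with an inner-product decoder; it is a generic PyTorch illustration with assumed layer sizes, not the published VGAE-CCI implementation.

```python
# Minimal variational graph autoencoder sketch (assumed architecture, not the
# published VGAE-CCI model): GCN-style encoder -> latent z -> inner-product decoder.
import torch
import torch.nn as nn
import torch.nn.functional as F

def normalize_adj(adj):
    # Symmetric normalization D^-1/2 (A + I) D^-1/2 used by GCN layers.
    a = adj + torch.eye(adj.size(0))
    d_inv_sqrt = torch.diag(a.sum(dim=1).pow(-0.5))
    return d_inv_sqrt @ a @ d_inv_sqrt

class VGAE(nn.Module):
    def __init__(self, in_dim, hid_dim=64, lat_dim=16):
        super().__init__()
        self.w0 = nn.Linear(in_dim, hid_dim, bias=False)
        self.w_mu = nn.Linear(hid_dim, lat_dim, bias=False)
        self.w_logvar = nn.Linear(hid_dim, lat_dim, bias=False)

    def encode(self, x, a_norm):
        h = F.relu(a_norm @ self.w0(x))          # first graph convolution
        return a_norm @ self.w_mu(h), a_norm @ self.w_logvar(h)

    def forward(self, x, a_norm):
        mu, logvar = self.encode(x, a_norm)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        adj_rec = torch.sigmoid(z @ z.t())       # inner-product edge decoder
        return adj_rec, mu, logvar

# Toy usage: 50 cells, 30 expression features, random symmetric spatial adjacency.
x = torch.randn(50, 30)
adj = (torch.rand(50, 50) < 0.1).float()
adj = ((adj + adj.t()) > 0).float()
a_norm = normalize_adj(adj)
adj_rec, mu, logvar = VGAE(in_dim=30)(x, a_norm)
recon = F.binary_cross_entropy(adj_rec, adj)
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = recon + kl
```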
PMID:39581873 | DOI:10.1093/bib/bbae619
RNADiffFold: generative RNA secondary structure prediction using discrete diffusion models
Brief Bioinform. 2024 Nov 22;26(1):bbae618. doi: 10.1093/bib/bbae618.
ABSTRACT
Ribonucleic acid (RNA) molecules are essential macromolecules that perform diverse biological functions in living beings. Precise prediction of RNA secondary structures is instrumental in deciphering their complex three-dimensional architecture and functionality. Traditional methodologies for RNA structure prediction, including energy-based and learning-based approaches, often depict RNA secondary structures from a static perspective and rely on stringent a priori constraints. Inspired by the success of diffusion models, in this work we introduce RNADiffFold, an innovative generative approach to RNA secondary structure prediction based on multinomial diffusion. We reconceptualize contact map prediction as a pixel-wise segmentation task and accordingly train a denoising model that progressively refines contact maps from a noise-infused state. We also devise a potent conditioning mechanism that harnesses features extracted from RNA sequences to steer the model toward generating an accurate secondary structure. These features encompass one-hot encoded sequences, probabilistic maps generated from a pre-trained scoring network, and embeddings and attention maps derived from an RNA foundation model. Experimental results on both within- and cross-family datasets demonstrate RNADiffFold's competitive performance compared with current state-of-the-art methods. Additionally, RNADiffFold has shown a notable proficiency in capturing the dynamic aspects of RNA structures, a claim corroborated by its performance on datasets comprising multiple conformations.
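The multinomial (categorical) diffusion forward process at the core of this approach can be sketched as follows: each entry of a binary contact map is kept with probability alpha_bar_t or resampled uniformly otherwise. The snippet is a generic illustration with an assumed noise schedule, not RNADiffFold's code.

```python
# Generic multinomial (categorical) diffusion forward step on a binary contact
# map: with probability (1 - alpha_bar_t) each entry is resampled uniformly.
import torch

def q_sample(x0, alpha_bar_t, num_classes=2):
    """Sample x_t ~ q(x_t | x_0) for categorical diffusion."""
    probs = alpha_bar_t * torch.nn.functional.one_hot(x0, num_classes).float() \
            + (1.0 - alpha_bar_t) / num_classes
    return torch.distributions.Categorical(probs=probs).sample()

L = 16                                   # toy sequence length
x0 = (torch.rand(L, L) < 0.1).long()     # toy binary contact map
x0 = ((x0 + x0.t()) > 0).long()          # keep it symmetric

T = 100
betas = torch.linspace(1e-4, 0.05, T)    # assumed noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

x_t = q_sample(x0, alpha_bar[50])        # noised map at t = 50
# A denoising network (e.g. a U-Net over the L x L map, conditioned on sequence
# features) would then be trained to recover x_0 from x_t and t.
```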
PMID:39581872 | DOI:10.1093/bib/bbae618
Automatic Segmentation of Quadriceps Femoris Cross-Sectional Area in Ultrasound Images: Development and Validation of Convolutional Neural Networks in People With Anterior Cruciate Ligament Injury and Surgery
Ultrasound Med Biol. 2024 Nov 23:S0301-5629(24)00431-9. doi: 10.1016/j.ultrasmedbio.2024.11.004. Online ahead of print.
ABSTRACT
OBJECTIVE: Deep learning approaches such as DeepACSA enable automated segmentation of muscle ultrasound cross-sectional area (CSA). Although they provide fast and accurate results, most are developed using data from healthy populations. The changes in muscle size and quality following anterior cruciate ligament (ACL) injury challenge the validity of these automated approaches in the ACL population. Quadriceps muscle CSA is an important outcome following ACL injury; therefore, our aim was to validate DeepACSA, a convolutional neural network (CNN) approach, in people with ACL injury.
METHODS: Quadriceps panoramic CSA ultrasound images (vastus lateralis [VL] n = 430, rectus femoris [RF] n = 349, and vastus medialis [VM] n = 723) from 124 participants with an ACL injury (age 22.8 ± 7.9 y, 61 females) were used to train CNN models. For VL and RF, combined models included additional images from healthy participants (n = 153, mean age 38.2 y, range 13-78) from which DeepACSA was originally developed. All models were tested on unseen external validation images (n = 100) from ACL-injured participants. Model-predicted CSA results were compared to manual segmentation results.
RESULTS: All models showed good comparability to manual segmentation (ICC > 0.81, standard error of measurement < 14.1%, mean differences < 1.56 cm2). Removal of erroneous predictions resulted in excellent comparability (ICC > 0.94, standard error of measurement < 7.40%, mean differences < 0.57 cm2). Erroneous predictions occurred in 17% of images for the combined VL model, 11% for the combined RF model, and 20% for the ACL-only VM model.
CONCLUSION: The new CNN models provided can be used in ACL-injured populations to measure CSA of VL, RF, and VM muscles automatically. The models yield high comparability to manual segmentation results and reduce the burden of manual segmentation.
PMID:39581823 | DOI:10.1016/j.ultrasmedbio.2024.11.004
Pulmonary 129Xe MRI: CNN Registration and Segmentation to Generate Ventilation Defect Percent with Multi-center Validation
Acad Radiol. 2024 Nov 23:S1076-6332(24)00789-X. doi: 10.1016/j.acra.2024.10.029. Online ahead of print.
ABSTRACT
RATIONALE AND OBJECTIVES: Hyperpolarized 129Xe MRI quantifies ventilation-defect-percent (VDP), the ratio of 129Xe signal-void to the anatomic 1H MRI thoracic-cavity-volume. VDP is associated with airway inflammation and disease control and serves as a treatable trait in therapy studies. Semi-automated VDP pipelines require time-intensive observer interactions. Current convolutional neural network (CNN) approaches for quantifying VDP lack external validation, which limits multicenter utilization. Our objective was to develop an automated and externally validated deep-learning pipeline to quantify pulmonary 129Xe MRI VDP.
MATERIALS AND METHODS: 1H and 129Xe MRI data from the primary site (Site1) were used to train and test a CNN segmentation and registration pipeline, while two independent sites (Site2 and Site3) provided external validation. Semi-automated and CNN-based registration error was measured using mean-absolute-error (MAE) while segmentation error was measured using generalized-Dice-similarity coefficient (gDSC). CNN and semi-automated VDP were compared using linear regression and Bland-Altman analysis.
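For readers unfamiliar with the reported quantities, the sketch below computes a Dice similarity coefficient between binary masks and a ventilation defect percent from ventilation and thoracic-cavity masks, using common definitions on toy numpy arrays; it is not the authors' pipeline.

```python
# Generic illustrations of Dice similarity and ventilation defect percent (VDP),
# assuming binary numpy masks already registered to the same grid.
import numpy as np

def dice(a, b):
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def ventilation_defect_percent(ventilation_mask, thoracic_cavity_mask):
    # VDP: fraction of the thoracic cavity showing no 129Xe signal, in percent.
    cavity = thoracic_cavity_mask.astype(bool)
    ventilated = np.logical_and(ventilation_mask.astype(bool), cavity)
    return 100.0 * (1.0 - ventilated.sum() / cavity.sum())

# Toy example on a 3D grid: compare two ventilation segmentations and report VDP.
rng = np.random.default_rng(0)
cavity = np.ones((32, 32, 32), dtype=bool)
vent_cnn = rng.random((32, 32, 32)) > 0.10
vent_semi = rng.random((32, 32, 32)) > 0.12
print(dice(vent_cnn, vent_semi), ventilation_defect_percent(vent_cnn, cavity))
```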
RESULTS: Training/testing used data from 205 participants (healthy volunteers, asthma, COPD, long-COVID; mean age = 54 ± 16 y; 119 females) from Site1. External validation used data from 71 participants. CNN and semi-automated 1H and 129Xe registrations agreed (MAE = 0.3°, R2 = 0.95 for rotation; 1.1%, R2 = 0.79 for scaling; 0.2/0.5 px, R2 = 0.96/0.95 for x/y-translation; all p < .001). Thoracic-cavity and ventilation segmentations also showed strong spatial correspondence (gDSC = 0.92 and 0.88, respectively). CNN VDP correlated with semi-automated VDP (Site1 R2/ρ = .97/.95, bias = -0.5%; Site2 R2/ρ = .85/.93, bias = -0.9%; Site3 R2/ρ = .95/.89, bias = -0.8%; all p < .001).
CONCLUSION: An externally validated CNN registration/segmentation model demonstrated strong agreement and low error compared with the semi-automated method. CNN and semi-automated registrations were highly correlated, and thoracic-cavity-volume and ventilation-volume segmentations showed high gDSC across all datasets.
PMID:39581785 | DOI:10.1016/j.acra.2024.10.029
A Multicenter Evaluation of the Impact of Therapies on Deep Learning-based Electrocardiographic Hypertrophic Cardiomyopathy Markers
Am J Cardiol. 2024 Nov 22:S0002-9149(24)00828-2. doi: 10.1016/j.amjcard.2024.11.028. Online ahead of print.
ABSTRACT
Artificial intelligence-enhanced electrocardiography (AI-ECG) can identify hypertrophic cardiomyopathy (HCM) on 12-lead ECGs and offers a novel way to monitor treatment response. While the surgical or percutaneous reduction of the interventricular septum (SRT) represented initial HCM therapies, mavacamten offers an oral alternative. We aimed to assess the use of AI-ECG as a strategy to evaluate biological response to SRT and mavacamten. We applied an AI-ECG model for HCM detection to ECG images from patients who underwent SRT across 3 sites: Yale New Haven Health System (YNHHS), Cleveland Clinic Foundation (CCF), and Atlantic Health System (AHS); and to ECG images from patients receiving mavacamten at YNHHS. A total of 70 patients underwent SRT at YNHHS, 100 at CCF, and 145 at AHS. At YNHHS, there was no significant change in the AI-ECG HCM score before versus after SRT (pre-SRT: median 0.55 [IQR 0.24-0.77] vs post-SRT: 0.59 [0.40-0.75]). The AI-ECG HCM scores also did not improve post SRT at CCF (0.61 [0.32-0.79] vs 0.69 [0.52-0.79]) and AHS (0.52 [0.35-0.69] vs 0.61 [0.49-0.70]). Among 36 YNHHS patients on mavacamten therapy, the median AI-ECG score before starting mavacamten was 0.41 (0.22-0.77), which decreased significantly to 0.28 (0.11-0.50, p <0.001 by Wilcoxon signed-rank test) at the end of a median follow-up period of 237 days. In conclusion, we observed a lack of improvement in AI-based HCM score with SRT, in contrast to a significant decrease with mavacamten. Our approach suggests the potential role of AI-ECG for serial point-of-care monitoring of pathophysiological improvement following medical therapy in HCM using ECG images.
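The paired pre/post comparison described here uses a Wilcoxon signed-rank test; the snippet below sketches that test on synthetic scores (not the study data) for orientation.

```python
# Paired comparison of AI-ECG HCM scores before and after therapy using the
# Wilcoxon signed-rank test (synthetic scores for illustration only).
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(1)
pre = rng.uniform(0.1, 0.9, size=36)                     # scores before therapy
post = np.clip(pre - rng.normal(0.15, 0.1, 36), 0, 1)    # scores at follow-up

stat, p_value = wilcoxon(pre, post)
print(f"median pre={np.median(pre):.2f}, post={np.median(post):.2f}, p={p_value:.4f}")
```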
PMID:39581517 | DOI:10.1016/j.amjcard.2024.11.028
Deep learning-based multiple-CT optimization: An adaptive treatment planning approach to account for anatomical changes in intensity-modulated proton therapy for head and neck cancers
Radiother Oncol. 2024 Nov 22:110650. doi: 10.1016/j.radonc.2024.110650. Online ahead of print.
ABSTRACT
BACKGROUND: Intensity-modulated proton therapy (IMPT) is particularly susceptible to range and setup uncertainties, as well as anatomical changes.
PURPOSE: We present a framework for IMPT planning that employs a deep learning method for dose prediction based on multiple-CT (MCT). The extra CTs are created from cone-beam CT (CBCT) using deformable registration with the primary planning CT (PCT). Our method also includes a dose mimicking algorithm.
METHODS: The MCT IMPT planning pipeline involves prediction of robust dose from input images using a deep learning model with a U-net architecture. Deliverable plans may then be created by solving a dose mimicking problem with the predictions as the reference dose. Model training, dose prediction and plan generation were performed using a dataset of 55 patients with head and neck cancer in this retrospective study. Among them, 38 patients were used as the training set, 7 as the validation set, and 10 were reserved as the test set for final evaluation.
RESULTS: We demonstrated that the deliverable plans generated through subsequent MCT dose mimicking exhibited greater robustness than the robust plans produced by the PCT, as well as enhanced dose sparing for organs at risk. MCT plans had lower D2% (76.1 Gy vs. 82.4 Gy), better homogeneity index (7.7 % vs. 16.4 %) of CTV1 and better conformity index (70.5 % vs. 61.5 %) of CTV2 than the robust plans produced by the primary planning CT for all test patients.
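The plan metrics quoted in these results admit several definitions in the literature; the sketch below computes D2%, a homogeneity index, and a Paddick-style conformity index from a dose grid and a CTV mask under commonly used, assumed formulas rather than the study's exact ones.

```python
# Common (assumed) definitions of DVH-style plan metrics from a dose grid and
# binary structure masks; not the exact formulas used in the study.
import numpy as np

def d_percent(dose, mask, pct):
    """Dose received by the hottest `pct` % of the structure volume (e.g. D2%)."""
    return np.percentile(dose[mask.astype(bool)], 100 - pct)

def homogeneity_index(dose, ctv_mask):
    d2, d98, d50 = (d_percent(dose, ctv_mask, p) for p in (2, 98, 50))
    return 100.0 * (d2 - d98) / d50            # ICRU 83-style HI, in percent

def conformity_index(dose, ctv_mask, prescription):
    """Paddick-style CI: (TV covered by prescription)^2 / (TV * prescription volume)."""
    tv = ctv_mask.astype(bool)
    piv = dose >= prescription
    tv_piv = np.logical_and(tv, piv).sum()
    return tv_piv ** 2 / (tv.sum() * piv.sum()) if piv.sum() else 0.0

# Toy example: roughly uniform dose around a cubic CTV.
dose = np.random.default_rng(2).normal(70.0, 1.5, size=(40, 40, 40))
ctv = np.zeros_like(dose, dtype=bool); ctv[10:30, 10:30, 10:30] = True
print(d_percent(dose, ctv, 2), homogeneity_index(dose, ctv),
      conformity_index(dose, ctv, prescription=68.0))
```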
CONCLUSIONS: We demonstrated the feasibility and advantages of incorporating daily CBCT images into MCT optimization. This approach improves plan robustness against anatomical changes and may reduce the need for plan adaptations in head and neck cancer treatments.
PMID:39581351 | DOI:10.1016/j.radonc.2024.110650
In silico identification of Histone Deacetylase inhibitors using Streamlined Masked Transformer-based Pretrained features
Methods. 2024 Nov 22:S1046-2023(24)00246-9. doi: 10.1016/j.ymeth.2024.11.009. Online ahead of print.
ABSTRACT
Histone Deacetylases (HDACs) are enzymes that regulate gene expression by removing acetyl groups from histones. They are involved in various diseases, including neurodegenerative, cardiovascular, inflammatory, and metabolic disorders, as well as fibrosis in the liver, lungs, and kidneys. Successfully identifying potent HDAC inhibitors may offer a promising approach to treating these diseases. In addition to experimental techniques, researchers have introduced several in silico methods for identifying HDAC inhibitors. However, these existing computer-aided methods have shortcomings in their modeling stages, which limit their applications. In our study, we present a Streamlined Masked Transformer-based Pretrained (SMTP) encoder, which can be used to generate features for downstream tasks. The training process of the SMTP encoder was directed by masked attention-based learning, enhancing the model's generalizability in encoding molecules. The SMTP features were used to develop 11 classification models identifying 11 HDAC isoforms. We trained SMTP, a lightweight encoder, with only 1.9 million molecules, a smaller number than other known molecular encoders, yet its discriminant ability remains competitive. The results revealed that machine learning models developed using the SMTP feature set outperformed those developed using other feature sets in 8 out of 11 classification tasks. Additionally, chemical diversity analysis confirmed the encoder's effectiveness in distinguishing between two classes of molecules.
PMID:39581247 | DOI:10.1016/j.ymeth.2024.11.009
MoAGL-SA: a multi-omics adaptive integration method with graph learning and self attention for cancer subtype classification
BMC Bioinformatics. 2024 Nov 23;25(1):364. doi: 10.1186/s12859-024-05989-y.
ABSTRACT
BACKGROUND: The integration of multi-omics data through deep learning has greatly improved cancer subtype classification, particularly in feature learning and multi-omics data integration. However, key challenges remain in embedding sample structure information into the feature space and designing flexible integration strategies.
RESULTS: We propose MoAGL-SA, an adaptive multi-omics integration method based on graph learning and self-attention, to address these challenges. First, patient relationship graphs are generated from each omics dataset using graph learning. Next, three-layer graph convolutional networks are employed to extract omic-specific graph embeddings. Self-attention is then used to focus on the most relevant omics, adaptively assigning weights to different graph embeddings for multi-omics integration. Finally, cancer subtypes are classified using a softmax classifier.
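The adaptive-weighting step can be pictured with the small sketch below, which applies a learned softmax attention over three omics-specific graph embeddings before classification; dimensions and layer names are assumptions, not the authors' code.

```python
# Assumed sketch of attention-weighted fusion of omics-specific graph embeddings
# (e.g. mRNA, methylation, miRNA) followed by a softmax classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionFusion(nn.Module):
    def __init__(self, emb_dim, n_classes):
        super().__init__()
        self.score = nn.Linear(emb_dim, 1)       # scores each omics embedding
        self.classifier = nn.Linear(emb_dim, n_classes)

    def forward(self, embeddings):               # embeddings: (n_omics, n_samples, emb_dim)
        scores = self.score(embeddings).squeeze(-1)         # (n_omics, n_samples)
        weights = F.softmax(scores, dim=0).unsqueeze(-1)    # weight per omics, per sample
        fused = (weights * embeddings).sum(dim=0)           # (n_samples, emb_dim)
        return self.classifier(fused)

# Toy usage: three omics embeddings (e.g. from GCNs) for 100 patients, 64 dims each.
embs = torch.randn(3, 100, 64)
logits = AttentionFusion(emb_dim=64, n_classes=5)(embs)
probs = F.softmax(logits, dim=-1)                 # subtype probabilities per patient
```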
CONCLUSIONS: Experimental results show that MoAGL-SA outperforms several popular algorithms on datasets for breast invasive carcinoma, kidney renal papillary cell carcinoma, and kidney renal clear cell carcinoma. Additionally, MoAGL-SA successfully identifies key biomarkers for breast invasive carcinoma.
PMID:39580382 | DOI:10.1186/s12859-024-05989-y
Accelerated spine MRI with deep learning based image reconstruction: a prospective comparison with standard MRI
Acad Radiol. 2024 Nov 22:S1076-6332(24)00850-X. doi: 10.1016/j.acra.2024.11.004. Online ahead of print.
ABSTRACT
RATIONALE AND OBJECTIVES: To evaluate the performance of deep learning (DL) reconstructed MRI in terms of image acquisition time, overall image quality and diagnostic interchangeability compared to standard-of-care (SOC) MRI.
MATERIALS AND METHODS: This prospective study recruited participants with spinal discomfort between July 2023 and August 2023. All participants underwent two separate MRI examinations (standard and accelerated scanning). Signal-to-noise ratios (SNR), contrast-to-noise ratios (CNR) and similarity metrics were calculated for quantitative evaluation. Four radiologists performed subjective quality and lesion characteristic assessment. The Wilcoxon test was used to assess differences in SNR, CNR and subjective image quality between DL and SOC. Various spinal lesions were also tested for interchangeability using the individual equivalence index. Interreader and intrareader agreement and concordance (κ, Kendall τ and W statistics) were computed, and McNemar tests were performed for comprehensive evaluation.
RESULTS: 200 participants (107 male, mean age 46.56 ± 17.07 years) were included. Compared with SOC, DL reduced scan time by approximately 40%. The SNR and CNR of DL were significantly higher than those of SOC (P < 0.001). DL showed varying degrees of improvement (0-0.35) in each of the similarity metrics. All absolute individual equivalence indexes were less than 4%, indicating interchangeability between SOC and DL. Kappa and Kendall statistics showed good to near-perfect agreement, in the range of 0.72-0.98. There was no difference between SOC and DL in subjective scoring or frequency of lesion detection.
CONCLUSION: Compared to SOC, DL provided high-quality image for diagnosis and reduced examination time for patients. DL was found to be interchangeable with SOC in detecting various spinal abnormalities.
PMID:39580249 | DOI:10.1016/j.acra.2024.11.004
Enhancing decision confidence in AI using Monte Carlo dropout for Raman spectra classification
Anal Chim Acta. 2024 Dec 15;1332:343346. doi: 10.1016/j.aca.2024.343346. Epub 2024 Oct 16.
ABSTRACT
BACKGROUND: Machine learning algorithms for bacterial strain identification using Raman spectroscopy have been widely used in microbiology. During the training phase, existing datasets are augmented and used to optimize model architecture and hyperparameters. After training, it is presumed that the models have reached their peak performance and are used for inference without further enhancement. Our methodology combines Monte Carlo Dropout (MCD) with convolutional neural networks (CNNs) by utilizing dropout during the inference phase, which enables measurement of model uncertainty, a critical but often ignored aspect of deep learning models.
RESULTS: We categorize unseen input data into two subsets based on the uncertainty of their prediction by employing MCD and defining the threshold using the Gaussian Mixture Model (GMM). The final prediction is obtained on the subset of testing data that exhibits lower model uncertainty, thereby enhancing the reliability of the results. To validate our method, we applied it to two Raman spectra datasets. As a result, we have observed an increase in accuracy of 9 % for Dataset 1 (from 83.10 % to 92.10 %) and 12.82 % for Dataset 2 (from 83.86 % to 96.68 %). These improvements were observed within specific subsets of the data: 826 out of 1206 spectra in Dataset 1 and 1700 out of 3000 spectra in Dataset 2. This demonstrates the effectiveness of our approach in improving prediction accuracy by focusing on data with lower uncertainty.
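A minimal version of this uncertainty-gating procedure might look like the sketch below: dropout stays active at inference, predictive uncertainty is estimated from repeated stochastic forward passes, and a two-component Gaussian mixture separates low- and high-uncertainty spectra. The classifier is a placeholder, not the published CNN.

```python
# Sketch of Monte Carlo Dropout inference with a GMM-based uncertainty threshold
# (placeholder model; not the published architecture).
import torch
import torch.nn as nn
from sklearn.mixture import GaussianMixture

model = nn.Sequential(                       # stand-in classifier for Raman spectra
    nn.Linear(1000, 256), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(256, 30),
)

def mc_dropout_predict(model, x, n_passes=50):
    model.train()                            # keep dropout active at inference
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(n_passes)])
    mean_probs = probs.mean(dim=0)
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
    return mean_probs, entropy               # predictive mean and uncertainty

spectra = torch.randn(1206, 1000)            # toy "unseen" spectra
mean_probs, uncertainty = mc_dropout_predict(model, spectra)

unc = uncertainty.numpy().reshape(-1, 1)
gmm = GaussianMixture(n_components=2, random_state=0).fit(unc)
low_unc_component = gmm.means_.argmin()      # component with lower mean uncertainty
confident = torch.from_numpy(gmm.predict(unc) == low_unc_component)
final_preds = mean_probs[confident].argmax(dim=-1)   # report only the confident subset
```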
SIGNIFICANCE: Unlike routine prediction based on mere probabilities, we believe uncertainty-guided prediction is more effective at ensuring a high prediction rate than predicting on the entire dataset. By guiding a model's decision-making toward higher-confidence subsets, our methodology can enhance classification accuracy in critical areas like disease diagnosis and safety monitoring. This targeted approach advances microbial identification and produces more trustworthy predictions.
PMID:39580162 | DOI:10.1016/j.aca.2024.343346
Rapid and accurate bacteria identification through deep-learning-based two-dimensional Raman spectroscopy
Anal Chim Acta. 2024 Dec 15;1332:343376. doi: 10.1016/j.aca.2024.343376. Epub 2024 Oct 29.
ABSTRACT
Surface-enhanced Raman spectroscopy (SERS) offers a distinctive vibrational fingerprint of molecules and has led to widespread applications in medical diagnosis, biochemistry, and virology. With the rapid development of artificial intelligence (AI) technology, AI-enabled Raman spectroscopic techniques, as a promising avenue for biosensing applications, have significantly boosted bacteria identification. By converting spectra into images, the dataset is enriched with more detailed information, allowing AI to identify bacterial isolates with enhanced precision. However, previous studies usually suffer from a trade-off between high-resolution spectrograms for high-accuracy identification and short training time for data processing. Here, we present an efficient bacteria identification strategy that combines deep learning models with a spectrogram encoding algorithm based on wavelet packet transform and Gramian angular field techniques. In contrast to the direct analysis of raw Raman spectra, our approach utilizes wavelet packet transform techniques to compress the spectra by a factor of 1/15, while concurrently maintaining state-of-the-art accuracy by amplifying the subtle differences via Gramian angular field techniques. The results demonstrate that our approach can achieve a 99.64 % and a 90.55 % identification accuracy for two types of bacterial isolates and thirty types of bacterial isolates, respectively, together with a 90 % reduction in training time compared to conventional methods. To verify the model's stability, Gaussian noise was superimposed on the testing dataset, and the model maintained strong generalization ability and performance. This algorithm has the potential for integration into on-site testing protocols and is readily updatable with new bacterial isolates. This study provides profound insights and contributes to the current understanding of spectroscopy, paving the way for accurate and rapid bacteria identification in diverse applications of environment monitoring, food safety, microbiology, and public health.
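The spectrogram-encoding step can be outlined as follows: a wavelet packet transform compresses each spectrum, and a Gramian angular field turns the compressed signal into a 2D image for a CNN. The wavelet choice, decomposition level, and other parameters below are assumptions, not the authors' exact algorithm.

```python
# Sketch: compress a 1D Raman spectrum with a wavelet packet transform, then
# encode it as a Gramian angular (summation) field image. Parameters are assumed.
import numpy as np
import pywt

def wavelet_packet_compress(spectrum, wavelet="db4", level=4):
    wp = pywt.WaveletPacket(data=spectrum, wavelet=wavelet, mode="symmetric",
                            maxlevel=level)
    # Keep only the lowest-frequency node at the chosen level as a compact summary
    # (~1/15 of the original length for this toy setting).
    return wp["a" * level].data

def gramian_angular_field(signal):
    # Rescale to [-1, 1], map to angles, and build the summation field.
    x = 2 * (signal - signal.min()) / (signal.max() - signal.min() + 1e-12) - 1
    phi = np.arccos(np.clip(x, -1, 1))
    return np.cos(phi[:, None] + phi[None, :])

spectrum = np.random.default_rng(3).random(1500)      # toy spectrum of 1500 points
compressed = wavelet_packet_compress(spectrum)         # compressed 1D representation
gaf_image = gramian_angular_field(compressed)          # 2D input for a CNN
print(spectrum.shape, compressed.shape, gaf_image.shape)
```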
PMID:39580159 | DOI:10.1016/j.aca.2024.343376
Contextualizing predictive minds
Neurosci Biobehav Rev. 2024 Nov 21:105948. doi: 10.1016/j.neubiorev.2024.105948. Online ahead of print.
ABSTRACT
The structure of human memory seems to be optimized for efficient prediction, planning, and behavior. We propose that these capacities rely on a tripartite structure of memory that includes concepts, events, and contexts: three layers that constitute the mental world model. We suggest that the mechanism that critically increases adaptivity and flexibility is the tendency to contextualize. This tendency promotes local, context-encoding abstractions, which focus event- and concept-based planning and inference processes on the task and situation at hand. As a result, cognitive contextualization offers a solution to the frame problem: the need to select relevant features of the environment from the rich stream of sensorimotor signals. We draw evidence for our proposal from developmental psychology and neuroscience. Adopting a computational stance, we present evidence from cognitive modeling research which suggests that context sensitivity is a feature that is critical for maximizing the efficiency of cognitive processes. Finally, we turn to recent deep-learning architectures which independently demonstrate how context-sensitive memory can emerge in a self-organized learning system constrained with cognitively-inspired inductive biases.
PMID:39580009 | DOI:10.1016/j.neubiorev.2024.105948
Implementing deep learning on edge devices for snoring detection and reduction
Comput Biol Med. 2024 Nov 22;184:109458. doi: 10.1016/j.compbiomed.2024.109458. Online ahead of print.
ABSTRACT
This study introduces MinSnore, a novel deep learning model tailored for real-time snoring detection and reduction, specifically designed for deployment on low-configuration edge devices. By integrating MobileViTV3 blocks into the Dynamic MobileNetV3 backbone model architecture, MinSnore leverages both Convolutional Neural Networks (CNNs) and transformers to deliver enhanced feature representations with minimal computational overhead. The model was pre-trained on a diverse dataset of 46,349 audio files using the Self-Supervised Learning with Barlow Twins (SSL-BT) method, followed by fine-tuning on 17,355 segmented clips extracted from this dataset. MinSnore represents a significant breakthrough in snoring detection, achieving an accuracy of 96.37 %, precision of 96.31 %, recall of 94.12 %, and an F1-score of 95.02 %. When deployed on a single-board computer like a Raspberry Pi, the system demonstrated a reduction in snoring duration during real-world experiments. These results underscore the importance of this work in addressing sleep-related health issues through an efficient, low-cost, and highly accurate snoring mitigation solution.
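For context on the SSL-BT objective, the snippet below shows a minimal Barlow Twins loss that pushes the cross-correlation matrix of two augmented views' embeddings toward the identity; it is a generic formulation, not the MinSnore training code.

```python
# Minimal Barlow Twins loss: push the cross-correlation matrix of two views'
# embeddings toward the identity (generic formulation, not MinSnore's code).
import torch

def barlow_twins_loss(z1, z2, lam=5e-3):
    n, d = z1.shape
    z1 = (z1 - z1.mean(0)) / (z1.std(0) + 1e-6)      # standardize per dimension
    z2 = (z2 - z2.mean(0)) / (z2.std(0) + 1e-6)
    c = (z1.t() @ z2) / n                             # d x d cross-correlation matrix
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()    # invariance term
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()  # redundancy term
    return on_diag + lam * off_diag

# Toy usage: embeddings of two augmented views of the same audio clips.
z1, z2 = torch.randn(128, 256), torch.randn(128, 256)
loss = barlow_twins_loss(z1, z2)
```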
PMID:39579667 | DOI:10.1016/j.compbiomed.2024.109458
Spatial resolution enhancement using deep learning improves chest disease diagnosis based on thick slice CT
NPJ Digit Med. 2024 Nov 23;7(1):335. doi: 10.1038/s41746-024-01338-8.
ABSTRACT
CT is crucial for diagnosing chest diseases, with image quality affected by spatial resolution. Thick-slice CT remains prevalent in practice due to cost considerations, yet its coarse spatial resolution may hinder accurate diagnoses. Our multicenter study develops a deep learning synthesis model with a Convolutional-Transformer hybrid encoder-decoder architecture for generating thin-slice CT from thick-slice CT at a single center (1576 participants) and assesses the synthetic CT at three cross-regional centers (1228 participants). The qualitative image quality of synthetic and real thin-slice CT is comparable (p = 0.16). Four radiologists' accuracy in diagnosing community-acquired pneumonia using synthetic thin-slice CT surpasses thick-slice CT (p < 0.05) and matches real thin-slice CT (p > 0.99). For lung nodule detection, sensitivity with synthetic thin-slice CT outperforms thick-slice CT (p < 0.001) and is comparable to real thin-slice CT (p > 0.05). These findings indicate the potential of our model to generate high-quality synthetic thin-slice CT as a practical alternative when real thin-slice CT is preferred but unavailable.
PMID:39580609 | DOI:10.1038/s41746-024-01338-8
Improved facial emotion recognition model based on a novel deep convolutional structure
Sci Rep. 2024 Nov 23;14(1):29050. doi: 10.1038/s41598-024-79167-8.
ABSTRACT
Facial Emotion Recognition (FER) is a very challenging task due to the varying nature of facial expressions, occlusions, illumination, pose variations, cultural and gender differences, and many other factors that cause a drastic degradation in the quality of facial images. In this paper, an anti-aliased deep convolutional network (AA-DCN) model is developed and proposed to explore how anti-aliasing can improve the fidelity of facial emotion recognition. The AA-DCN model detects eight distinct emotions from image data, and features have been extracted using the proposed model and several classical deep learning algorithms. The proposed AA-DCN model has been applied to three different datasets to evaluate its performance: on the Extended Cohn-Kanade (CK+) database it achieved an accuracy of 99.26% in 5 min 25 s; on the Japanese Female Facial Expression (JAFFE) dataset it obtained 98% accuracy in 8 min 13 s; and on one of the most challenging FER datasets, the Real-world Affective Faces (RAF) dataset, it reached 82% with a low training time of 12 min 2 s. The experimental results demonstrate that the anti-aliased DCN model significantly improves emotion recognition while mitigating the aliasing artifacts caused by down-sampling layers.
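As one concrete (assumed) form of anti-aliased downsampling, the layer below replaces strided pooling with dense max-pooling followed by a fixed binomial blur and stride-2 subsampling, in the spirit of blur-pooling; it is illustrative only and not the AA-DCN implementation.

```python
# Anti-aliased downsampling in the spirit of blur-pooling: blur with a fixed
# binomial kernel before stride-2 subsampling (illustrative, not the AA-DCN code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class BlurPool2d(nn.Module):
    def __init__(self, channels, stride=2):
        super().__init__()
        k = torch.tensor([1., 2., 1.])
        kernel = k[:, None] * k[None, :]
        kernel = kernel / kernel.sum()
        self.register_buffer("kernel", kernel.repeat(channels, 1, 1, 1))
        self.stride = stride
        self.channels = channels

    def forward(self, x):
        x = F.pad(x, (1, 1, 1, 1), mode="reflect")
        # Depthwise convolution with the fixed blur kernel, then subsample.
        return F.conv2d(x, self.kernel, stride=self.stride, groups=self.channels)

class AADownBlock(nn.Module):
    """Conv -> ReLU -> dense max pool -> anti-aliased stride-2 downsample."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=1)
        self.blurpool = BlurPool2d(out_ch)

    def forward(self, x):
        return self.blurpool(self.pool(F.relu(self.conv(x))))

faces = torch.randn(8, 1, 48, 48)            # toy grayscale face crops
out = AADownBlock(1, 32)(faces)              # -> (8, 32, 24, 24)
```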
PMID:39580589 | DOI:10.1038/s41598-024-79167-8
Medical language model specialized in extracting cardiac knowledge
Sci Rep. 2024 Nov 23;14(1):29059. doi: 10.1038/s41598-024-80165-z.
ABSTRACT
The advent of the Transformer has significantly altered the course of research in Natural Language Processing (NLP) within the domain of deep learning, making Transformer-based studies the mainstream of subsequent NLP research. There has also been considerable advancement in domain-specific NLP research, including the development of specialized language models for medicine. These medical-specific language models were trained on medical data and demonstrated high performance. While these studies have treated the medical field as a single domain, in reality, medicine is divided into multiple departments, each requiring a high level of expertise and treated as a unique domain. Recognizing this, our research focuses on constructing a model specialized for cardiology within the medical sector. Our study encompasses the creation of open-source datasets, training, and model evaluation in this nuanced domain.
PMID:39580531 | DOI:10.1038/s41598-024-80165-z
A 3D dental model dataset with pre/post-orthodontic treatment for automatic tooth alignment
Sci Data. 2024 Nov 23;11(1):1277. doi: 10.1038/s41597-024-04138-7.
ABSTRACT
Traditional orthodontic treatment relies on subjective estimations by orthodontists and iterative communication with technicians to achieve desired tooth alignments. This process is time-consuming, complex, and highly dependent on the orthodontist's experience. With the development of artificial intelligence, there is growing interest in leveraging deep learning methods to automate tooth alignment. However, the absence of publicly available datasets containing pre/post-orthodontic 3D dental models has impeded the advancement of intelligent orthodontic solutions. To address this limitation, this paper proposes the first public 3D orthodontic dental dataset, comprising 1,060 pairs of pre/post-treatment dental models sourced from 435 patients. The proposed dataset encompasses 3D dental models with diverse malocclusions, e.g., tooth crowding, deep overbite, and deep overjet, and comprehensive professional annotations, including tooth segmentation labels, tooth position information, and crown landmarks. We also present technical validations for tooth alignment and orthodontic effect evaluation. The proposed dataset is expected to contribute to improving the efficiency and quality of target tooth position design in clinical orthodontic treatment utilizing deep learning methods.
PMID:39580508 | DOI:10.1038/s41597-024-04138-7
An ultrasonography of thyroid nodules dataset with pathological diagnosis annotation for deep learning
Sci Data. 2024 Nov 23;11(1):1272. doi: 10.1038/s41597-024-04156-5.
ABSTRACT
Ultrasonography (US) of thyroid nodules is often time-consuming and may be inconsistent between observers, with a low positivity rate for malignancy in biopsies. Even after determining the ultrasound Thyroid Imaging Reporting and Data System (TI-RADS) stage, fine needle aspiration biopsy (FNAB) is still required to obtain a definitive diagnosis. Although various deep learning methods have been developed in the medical field, they tend to be trained using TI-RADS reports as image labels. Here, we present a large US dataset with pathological diagnosis annotation for each case, designed for developing deep learning algorithms that directly infer histological status from thyroid ultrasound images. The dataset was collected from two retrospective cohorts and consists of 8508 US images from 842 cases. Additionally, we describe three deep learning models used as validation examples on this dataset.
PMID:39580501 | DOI:10.1038/s41597-024-04156-5
Enhancing advanced cervical cell categorization with cluster-based intelligent systems by a novel integrated CNN approach with skip mechanisms and GAN-based augmentation
Sci Rep. 2024 Nov 23;14(1):29040. doi: 10.1038/s41598-024-80260-1.
ABSTRACT
Cervical cancer is one of the biggest challenges in global health, creating a critical need for early detection technologies that could improve patient prognosis and inform treatment decisions. Early detection increases the chances of successful treatment and survival, as prompt diagnosis offers interventions that can dramatically reduce deaths attributed to this disease. Here, a customized Convolutional Neural Network (CNN) model is proposed for detecting cancerous cervical cells. It includes three convolutional layers with increasing filter sizes and max-pooling layers, followed by dropout and dense layers for improved feature extraction and robust learning. Inspired by ResNet models, the design further incorporates skip connections into the CNN. By enabling direct feature transmission from earlier to later layers, skip connections enhance gradient flow and help preserve important spatial information. This integration boosts feature propagation and increases the model's ability to recognize minute patterns in cervical cell images, hence improving classification accuracy. In our methodology, the SIPaKMeD dataset has been employed, which contains 4049 cervical cell images arranged into five categories. To address class imbalance, Generative Adversarial Networks (GANs) have been applied for data augmentation; the synthetic images created improve the diversity of the dataset and further enhance model robustness. The model is highly accurate in classifying five cervical cell types: koilocytes, superficial-intermediate, parabasal, dyskeratotic, and metaplastic, thus significantly enhancing early detection and diagnosis of cervical cancer. The model performs well, with a validation accuracy of 99.11% and a training accuracy of 99.82%, making it a reliable tool for computer-assisted detection of cancerous cervical cells.
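A simplified convolutional block with a ResNet-style skip connection of the kind described here is sketched below; the architecture and sizes are assumptions, not the published model.

```python
# Assumed sketch of a small CNN with ResNet-style skip connections for cervical
# cell image classification (not the published model).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SkipConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        # 1x1 projection so the identity path matches the output channels.
        self.proj = nn.Conv2d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):
        h = F.relu(self.conv1(x))
        h = self.conv2(h)
        return F.relu(h + self.proj(x))      # skip connection preserves earlier features

class CervicalCellNet(nn.Module):
    def __init__(self, n_classes=5):
        super().__init__()
        self.blocks = nn.Sequential(
            SkipConvBlock(3, 32), nn.MaxPool2d(2),
            SkipConvBlock(32, 64), nn.MaxPool2d(2),
            SkipConvBlock(64, 128), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(nn.Flatten(), nn.Dropout(0.3),
                                  nn.LazyLinear(n_classes))

    def forward(self, x):
        return self.head(self.blocks(x))

logits = CervicalCellNet()(torch.randn(4, 3, 64, 64))   # toy batch of cell crops
```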
PMID:39580498 | DOI:10.1038/s41598-024-80260-1