Deep learning
Deep learning denoising diffusion probabilistic model applied to holographic data synthesis
Opt Lett. 2024 Feb 1;49(3):514-517. doi: 10.1364/OL.504427.
ABSTRACT
In this Letter, we demonstrate for the first time, to our knowledge, holographic data synthesis based on a deep-learning denoising diffusion probabilistic model (DDPM). Several datasets of color images corresponding to different types of objects are converted to complex-valued holographic data through backpropagation. Then, we train a DDPM using the resulting holographic datasets. The diffusion model is composed of a noise scheduler, which gradually adds Gaussian noise to each hologram in the dataset, and a U-Net convolutional neural network that is trained to reverse this process. Once the U-Net is trained, any number of holograms with features similar to those of the datasets can be generated simply by inputting Gaussian random noise to the model. We demonstrate the synthesis of holograms containing color images of 2D characters, vehicles, and 3D scenes with different characters at different propagation distances.
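The noise-scheduler/U-Net loop described above can be sketched in a few lines. This is a generic DDPM illustration, not the authors' code: the linear beta schedule, the step count, and the real/imaginary two-channel encoding of the complex hologram are all assumptions.

```python
import numpy as np

def linear_beta_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Noise scheduler: per-step Gaussian noise variances beta_t."""
    return np.linspace(beta_start, beta_end, T)

def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps  # eps is the regression target for the U-Net

betas = linear_beta_schedule()
alpha_bar = np.cumprod(1.0 - betas)  # signal fraction remaining at step t

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8, 2))  # toy hologram: real/imag as 2 channels
xt, eps = forward_diffuse(x0, 500, alpha_bar, rng)
```

The `eps` returned with `xt` is what the U-Net learns to predict; generation then runs the reverse process starting from pure Gaussian noise.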
PMID:38300047 | DOI:10.1364/OL.504427
Deep Learning Enabled SERS Identification of Gaseous Molecules on Flexible Plasmonic MOF Nanowire Films
ACS Sens. 2024 Feb 1. doi: 10.1021/acssensors.3c02519. Online ahead of print.
ABSTRACT
Through the capture of a target molecule at the metal surface with a highly confined electromagnetic field induced by surface plasmons, surface-enhanced Raman spectroscopy (SERS) has emerged as a spectral analysis technology with high sensitivity. However, accurate SERS identification of a gaseous molecule with low density and high velocity remains a challenge because such molecules are difficult to capture. In this work, a flexible paper-based plasmonic metal-organic framework (MOF) film consisting of Ag nanowires@ZIF-8 (AgNWs@ZIF-8) is fabricated for SERS detection of gaseous molecules. Benefiting from the micro- and nanopores generated by the nanowire network and the ZIF-8 shell, effective capture of gaseous molecules is achieved, and their SERS spectra are obtained in this paper-based flexible plasmonic MOF nanowire film. With optimal structure parameters, spectra of gaseous 4-aminothiophenol, 4-mercaptophenol, and dithiohydroquinone demonstrate that this film has good SERS performance, maintaining clear Raman signals over 30 days of reproducible detection. To realize SERS identification of gaseous molecules, deep learning is performed on the SERS spectra of mixed gaseous analytes obtained in this flexible porous film. The results show that an artificial neural network could identify gaseous aldehydes (a gaseous biomarker of colorectal cancer) in simulated exhaled breath with a high accuracy of 93.7%. The integration of the flexible paper-based film sensors with deep learning offers a promising new approach for noninvasive colorectal cancer screening. Our work explores SERS applications in gaseous analyte detection and has broad potential in clinical medicine, food safety, environmental monitoring, etc.
PMID:38299870 | DOI:10.1021/acssensors.3c02519
Retrospective validation of MetaSystems' deep-learning-based digital microscopy platform with assistance compared to manual fluorescence microscopy for detection of mycobacteria
J Clin Microbiol. 2024 Feb 1:e0106923. doi: 10.1128/jcm.01069-23. Online ahead of print.
ABSTRACT
This study aimed to validate MetaSystems' automated acid-fast bacilli (AFB) smear microscopy scanning and deep-learning-based image analysis module (Neon Metafer) with assistance on respiratory and pleural samples, compared to conventional manual fluorescence microscopy (MM). Analytical parameters were assessed first, followed by a retrospective validation study. In all, 320 archived auramine-O-stained slides selected non-consecutively [85 originally reported as AFB-smear-positive, 235 AFB-smear-negative slides; with an overall mycobacterial culture positivity rate of 24.1% (77/320)] underwent whole-slide imaging and were analyzed by the Metafer Neon AFB Module (version 4.3.130) using a predetermined probability threshold (PT) for AFB detection of 96%. Digital slides were then examined by a trained reviewer blinded to previous AFB smear and culture results, for the final interpretation of assisted digital microscopy (a-DM). Paired results from both microscopic methods were compared to mycobacterial culture. A scanning failure rate of 10.6% (34/320) was observed, leaving 286 slides for analysis. After discrepant analysis, concordance, positive agreement, and negative agreement were 95.5% (95%CI, 92.4%-97.6%), 96.2% (95%CI, 89.2%-99.2%), and 95.2% (95%CI, 91.3%-97.7%), respectively. Using mycobacterial culture as the reference standard, a-DM and MM had comparable sensitivities: 90.7% (95%CI, 81.7%-96.2%) versus 92.0% (95%CI, 83.4%-97.0%) (P-value = 1.00); while their specificities differed: 91.9% (95%CI, 87.4%-95.2%) versus 95.7% (95%CI, 92.1%-98.0%), respectively (P-value = 0.03). Using a PT of 96%, MetaSystems' platform shows acceptable performance.
With a national laboratory staff shortage and a local low mycobacterial infection rate, this instrument, when combined with culture, can reliably triage negative AFB-smear respiratory slides and identify positive slides requiring manual confirmation and semi-quantification.
IMPORTANCE: This manuscript presents a full validation of MetaSystems' automated acid-fast bacilli (AFB) smear microscopy scanning and deep-learning-based image analysis module using a probability threshold of 96%, including accuracy and precision studies and an evaluation of the limit of AFB detection on respiratory samples when the technology is used with assistance. This study is complementary to the conversation started by Tomasello et al. on the use of image-analysis artificial intelligence software in routine mycobacterial diagnostic activities within the context of high-throughput laboratories with a low incidence of tuberculosis.
PMID:38299829 | DOI:10.1128/jcm.01069-23
Prediction of Anti-rheumatoid Arthritis Natural Products of Xanthocerais Lignum Based on LC-MS and Artificial Intelligence
Comb Chem High Throughput Screen. 2024 Jan 30. doi: 10.2174/0113862073282138240116112348. Online ahead of print.
ABSTRACT
AIMS: To employ liquid chromatography-mass spectrometry (LC-MS) in conjunction with artificial intelligence (AI) technology to predict and screen for anti-rheumatoid arthritis (RA) active compounds in Xanthocerais lignum.
BACKGROUND: Natural products have become an important source of new drug discovery. RA is a chronic autoimmune disease characterized by joint inflammation and systemic inflammation. Although there are many drugs available for the treatment of RA, they still have many side effects and limitations. Therefore, finding more effective and safer natural products for the treatment of RA has become an important issue.
METHODS: In this study, a collection of inhibitors targeting RA-related specific targets was gathered. Machine learning models and deep learning models were constructed using these inhibitors. The performance of the models was evaluated using a test set and ten-fold cross-validation, and the best-performing models were selected for integration. A total of five commonly used machine learning algorithms (logistic regression, k-nearest neighbors, support vector machines, random forest, XGBoost) and one deep learning algorithm (GCN) were employed in this research. Subsequently, a Xanthocerais lignum compound library was established through HPLC-Q-Exactive-MS analysis and relevant literature. The integrated model was utilized to predict and screen for anti-RA active compounds in Xanthocerais lignum.
RESULTS: The integrated model exhibited an AUC greater than 0.94 for all target datasets, demonstrating improved stability and accuracy compared to individual models. This enhancement enables better activity prediction for unknown compounds. By employing the integrated model, the activity of 69 identified compounds in Xanthocerais lignum was predicted. The results indicated that isorhamnetin-3-O-glucoside, myricetin, rutinum, cinnamtannin B1, and dihydromyricetin exhibited inhibitory effects on multiple targets. Furthermore, myricetin and dihydromyricetin were found to have relatively higher relative abundances in Xanthocerais lignum, suggesting that they may serve as the primary active components contributing to its anti-RA effects.
CONCLUSION: In this study, we utilized AI technology to learn from a large number of compounds and predict the activity of natural products from Xanthocerais lignum on specific targets. By combining AI technology and the LC-MS approach, rapid screening and prediction of the activity of natural products based on specific targets can be achieved, significantly enhancing the efficiency of discovering new bioactive molecules from medicinal plants.
PMID:38299408 | DOI:10.2174/0113862073282138240116112348
The prediction of single-molecule magnet properties via deep learning
IUCrJ. 2024 Mar 1. doi: 10.1107/S2052252524000770. Online ahead of print.
ABSTRACT
This paper uses deep learning to present a proof of concept for data-driven chemistry in single-molecule magnets (SMMs). Previous discussions within SMM research have proposed links between molecular structures (crystal structures) and single-molecule magnetic properties; however, these have only interpreted existing results. Therefore, this study introduces a data-driven approach to predict the properties of SMM structures using deep learning. The deep-learning model learns the structural features of SMM molecules by extracting the single-molecule magnetic properties from the 3D coordinates presented in this paper. The model determined whether a molecule was a single-molecule magnet with an accuracy of approximately 70%. The deep-learning model found SMMs among 20 000 metal complexes extracted from the Cambridge Structural Database. Using deep-learning models to predict SMM properties and guide the design of novel molecules is promising.
PMID:38299376 | DOI:10.1107/S2052252524000770
Rice pest dataset supports the construction of smart farming systems
Data Brief. 2024 Jan 10;52:110046. doi: 10.1016/j.dib.2024.110046. eCollection 2024 Feb.
ABSTRACT
Rice holds a significant position in the global food supply chain, particularly in Asian, African, and Latin American countries. However, rice pests and diseases cause significant damage to the supply and growth of the rice cultivation industry. Therefore, this article provides a high-quality dataset that has been reviewed by agricultural experts. The dataset is well suited to support the development of automation systems and smart farming practices, and it plays a vital role in facilitating the automatic detection and classification of rice diseases. However, challenges arise from the diversity of the dataset, which was collected from various sources and varies in disease types and image sizes. This necessitates upgrading and enhancing the dataset through various operations in data processing, preprocessing, and statistical analysis. The dataset is provided completely free of charge and has been rigorously evaluated by agricultural experts, making it a reliable resource for system development, research, and communication needs.
PMID:38299106 | PMC:PMC10828557 | DOI:10.1016/j.dib.2024.110046
Railway track surface faults dataset
Data Brief. 2024 Jan 9;52:110050. doi: 10.1016/j.dib.2024.110050. eCollection 2024 Feb.
ABSTRACT
Railway infrastructure maintenance is critical for ensuring safe and efficient transportation networks. Railway track surface defects such as cracks, flakings, joints, spallings, shellings, squats, and grooves pose substantial challenges to the integrity and longevity of the tracks. To address these challenges and facilitate further research, a novel dataset of railway track surface faults is presented in this paper. It was collected using EKENH9R cameras mounted on a railway inspection vehicle. This dataset represents a valuable resource for the railway maintenance and computer-vision-related scientific communities. It includes a diverse range of real-world track surface faults under various environmental conditions and lighting scenarios, making it an important asset for the development and evaluation of Machine Learning (ML), Deep Learning (DL), and image processing algorithms. This paper also provides detailed annotations and metadata for each image class, enabling precise fault classification and severity assessment of the defects. Furthermore, this paper discusses the data collection process, highlights the significance of railway track maintenance, and emphasizes the potential applications of this dataset in fault identification, predictive maintenance, and the development of automated inspection systems. We encourage the research community to utilize this dataset for advancing state-of-the-art research on railway track surface condition monitoring.
PMID:38299101 | PMC:PMC10828558 | DOI:10.1016/j.dib.2024.110050
Laplacian-Former: Overcoming the Limitations of Vision Transformers in Local Texture Detection
Med Image Comput Comput Assist Interv. 2023 Oct;14222:736-746. doi: 10.1007/978-3-031-43898-1_70. Epub 2023 Oct 1.
ABSTRACT
Vision Transformer (ViT) models have demonstrated breakthroughs in a wide range of computer vision tasks. However, compared to Convolutional Neural Network (CNN) models, ViT models struggle to capture high-frequency components of images, which can limit their ability to detect local textures and edge information. As abnormalities in human tissue, such as tumors and lesions, may vary greatly in structure, texture, and shape, high-frequency information such as texture is crucial for effective semantic segmentation. To address this limitation of ViT models, we propose a new technique, Laplacian-Former, that enhances the self-attention map by adaptively re-calibrating the frequency information in a Laplacian pyramid. More specifically, our method employs a dual attention mechanism combining efficient attention and frequency attention: the efficient attention mechanism reduces the complexity of self-attention to linear while producing the same output, and the frequency attention selectively intensifies the contribution of shape and texture features. Furthermore, we introduce a novel efficient enhancement multi-scale bridge that effectively transfers spatial information from the encoder to the decoder while preserving the fundamental features. We demonstrate the efficacy of Laplacian-Former on multi-organ and skin lesion segmentation tasks, with +1.87% and +0.76% Dice score improvements over SOTA approaches, respectively. Our implementation is publicly available on GitHub.
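The frequency decomposition underlying Laplacian-Former can be illustrated with a minimal Laplacian pyramid. This sketch uses average-pool downsampling and nearest-neighbour upsampling for simplicity (the paper's filtering choices are not reproduced here); it shows how each level isolates a high-frequency band that an attention map could then re-weight.

```python
import numpy as np

def downsample(img):
    """2x average pooling (assumes even dimensions)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Nearest-neighbour 2x upsampling."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(img, levels=3):
    """Each level stores the high-frequency detail lost by one
    downsample/upsample round trip; the last entry is the low-pass residual."""
    bands, cur = [], img
    for _ in range(levels):
        low = downsample(cur)
        bands.append(cur - upsample(low))  # high-frequency band
        cur = low
    bands.append(cur)
    return bands

def reconstruct(bands):
    """Summing the bands back recovers the input; a frequency-attention
    map would re-weight individual bands before this step."""
    cur = bands[-1]
    for band in reversed(bands[:-1]):
        cur = upsample(cur) + band
    return cur

img = np.random.default_rng(1).standard_normal((32, 32))
bands = laplacian_pyramid(img)
recon = reconstruct(bands)
```

Because reconstruction is lossless, scaling a single band changes only the targeted frequency content of the output.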
PMID:38299070 | PMC:PMC10830169 | DOI:10.1007/978-3-031-43898-1_70
Retracted: Deep-Learning-Based 3D Reconstruction: A Review and Applications
Appl Bionics Biomech. 2024 Jan 24;2024:9870540. doi: 10.1155/2024/9870540. eCollection 2024.
ABSTRACT
[This retracts the article DOI: 10.1155/2022/3458717.].
PMID:38299053 | PMC:PMC10830202 | DOI:10.1155/2024/9870540
Research progress on ocular complications caused by type 2 diabetes mellitus and the function of tears and blepharons
Open Life Sci. 2024 Jan 27;19(1):20220773. doi: 10.1515/biol-2022-0773. eCollection 2024.
ABSTRACT
The purpose of this study was to explore the research progress on ocular complications (OCs) caused by type 2 diabetes mellitus (T2DM), tear and tarsal function, and the application of deep learning (DL) in the diagnosis of diabetes and the OCs it causes, to provide a reference for the prevention and control of OCs in T2DM patients. This study reviewed the pathogenesis and treatment of diabetic retinopathy, keratopathy, dry eye disease, glaucoma, and cataract; analyzed the relationship between OCs and tear and tarsal function; and discussed the application value of DL in the diagnosis of diabetes and OCs. Diabetic retinopathy is related to hyperglycemia, angiogenic factors, oxidative stress, hypertension, hyperlipidemia, and other factors. Increased water content in the corneal stroma leads to corneal relaxation and loss of transparency and elasticity, and can lead to the occurrence of corneal lesions. Dry eye syndrome is related to abnormal stability of the tear film and imbalance in neural and immune regulation. Elevated intraocular pressure, inflammatory reactions, atrophy of the optic nerve head, and damage to optic nerve fibers are the causes of glaucoma. Cataract, a common eye disease in the elderly, is a visual disorder caused by lens opacity, and oxidative stress is an important factor in its occurrence. In clinical practice, blood sugar control, laser therapy, and drug therapy are used to control the above complications. Tear and tarsal plate function is affected by eye diseases: retinopathy and dry eye disease caused by diabetes lead to tear and tarsal plate dysfunction, which in turn affects patients' ocular function. Furthermore, DL can automatically diagnose and classify eye diseases, automatically analyze fundus images, and accurately diagnose diabetic retinopathy, macular degeneration, and other diseases by analyzing and processing eye images and data.
T2DM is difficult to treat and prone to OCs, which seriously threaten patients' normal lives. The occurrence of OCs is closely related to abnormal tear and tarsal function. Based on DL, clinical diagnosis and treatment of diabetes and its OCs can be carried out, which has positive application value.
PMID:38299009 | PMC:PMC10828665 | DOI:10.1515/biol-2022-0773
ERTNet: an interpretable transformer-based framework for EEG emotion recognition
Front Neurosci. 2024 Jan 17;18:1320645. doi: 10.3389/fnins.2024.1320645. eCollection 2024.
ABSTRACT
BACKGROUND: Emotion recognition using EEG signals enables clinicians to assess patients' emotional states with precision and immediacy. However, the complexity of EEG signal data poses challenges for traditional recognition methods. Deep learning techniques effectively capture the nuanced emotional cues within these signals by leveraging extensive data. Nonetheless, most deep learning techniques lack interpretability while maintaining accuracy.
METHODS: We developed an interpretable end-to-end EEG emotion recognition framework rooted in the hybrid CNN and transformer architecture. Specifically, temporal convolution isolates salient information from EEG signals while filtering out potential high-frequency noise. Spatial convolution discerns the topological connections between channels. Subsequently, the transformer module processes the feature maps to integrate high-level spatiotemporal features, enabling the identification of the prevailing emotional state.
RESULTS: Experimental results demonstrated that our model excels in diverse emotion classification, achieving an accuracy of 74.23% ± 2.59% on the dimensional model (DEAP) and 67.17% ± 1.70% on the discrete model (SEED-V). These results surpass the performance of both CNN- and LSTM-based counterparts. Through interpretive analysis, we ascertained that the beta and gamma bands in the EEG signals exert the most significant impact on emotion recognition performance. Notably, our model can independently tailor a Gaussian-like convolution kernel, effectively filtering high-frequency noise from the input EEG data.
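The "Gaussian-like convolution kernel" the model learns acts as a low-pass temporal filter. A hand-built equivalent is sketched below; the kernel width and sigma are illustrative choices, not values from the paper.

```python
import numpy as np

def gaussian_kernel(size=15, sigma=2.0):
    """Normalized Gaussian low-pass kernel."""
    x = np.arange(size) - size // 2
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def temporal_filter(eeg, kernel):
    """Convolve each EEG channel with the kernel, attenuating
    high-frequency content (roughly what a learned temporal convolution
    with a Gaussian-like kernel does)."""
    return np.array([np.convolve(ch, kernel, mode="same") for ch in eeg])

eeg = np.ones((2, 50))          # 2 channels x 50 samples of a flat signal
out = temporal_filter(eeg, gaussian_kernel())
```

A flat (zero-frequency) signal passes through unchanged away from the edges, while rapid oscillations would be smoothed out.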
DISCUSSION: Given its robust performance and interpretative capabilities, our proposed framework is a promising tool for EEG-driven emotion brain-computer interface.
PMID:38298914 | PMC:PMC10827927 | DOI:10.3389/fnins.2024.1320645
Emergence of number sense through the integration of multimodal information: developmental learning insights from neural network models
Front Neurosci. 2024 Jan 17;18:1330512. doi: 10.3389/fnins.2024.1330512. eCollection 2024.
ABSTRACT
INTRODUCTION: Associating multimodal information is essential for human cognitive abilities including mathematical skills. Multimodal learning has also attracted attention in the field of machine learning, and it has been suggested that the acquisition of better latent representation plays an important role in enhancing task performance. This study aimed to explore the impact of multimodal learning on representation, and to understand the relationship between multimodal representation and the development of mathematical skills.
METHODS: We employed a multimodal deep neural network as the computational model for multimodal associations in the brain. We compared the representations of numerical information, that is, handwritten digits and images containing a variable number of geometric figures learned through single- and multimodal methods. Next, we evaluated whether these representations were beneficial for downstream arithmetic tasks.
RESULTS: Multimodal training produced better latent representation in terms of clustering quality, which is consistent with previous findings on multimodal learning in deep neural networks. Moreover, the representations learned using multimodal information exhibited superior performance in arithmetic tasks.
DISCUSSION: Our novel findings experimentally demonstrate that changes in acquired latent representations through multimodal association learning are directly related to cognitive functions, including mathematical skills. This supports the possibility that multimodal learning using deep neural network models may offer novel insights into higher cognitive functions.
PMID:38298912 | PMC:PMC10828047 | DOI:10.3389/fnins.2024.1330512
Resolution-enhanced multi-core fiber imaging learned on a digital twin for cancer diagnosis
Neurophotonics. 2024 Sep;11(Suppl 1):S11505. doi: 10.1117/1.NPh.11.S1.S11505. Epub 2024 Jan 31.
ABSTRACT
SIGNIFICANCE: Deep learning enables label-free all-optical biopsies and automated tissue classification. Endoscopic systems provide intraoperative diagnostics to deep tissue and speed up treatment without harmful tissue removal. However, conventional multi-core fiber (MCF) endoscopes suffer from low resolution and artifacts, which hinder tumor diagnostics.
AIM: We introduce a method to enable unpixelated, high-resolution tumor imaging through a given MCF with a diameter of around 0.65 mm and arbitrary core arrangement and inhomogeneous transmissivity.
APPROACH: Image reconstruction is based on deep learning and the digital twin concept of the single-reference-based simulation with inhomogeneous optical properties of MCF and transfer learning on a small experimental dataset of biological tissue. The reference provided physical information about the MCF during the training processes.
RESULTS: For the simulated data, hallucination caused by the MCF inhomogeneity was eliminated, and the averaged peak signal-to-noise ratio and structural similarity were increased from 11.2 dB and 0.20 to 23.4 dB and 0.74, respectively. By transfer learning, the metrics of independent test images experimentally acquired on glioblastoma tissue ex vivo can reach up to 31.6 dB and 0.97 with 14 fps computing speed.
CONCLUSIONS: With the proposed approach, a single reference image was required in the pre-training stage and laborious acquisition of training data was bypassed. Validation on glioblastoma cryosections with transfer learning on only 50 image pairs showed the capability for high-resolution deep tissue retrieval and high clinical feasibility.
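The peak signal-to-noise ratio figures quoted above follow the standard definition; a minimal implementation is shown below (structural similarity is usually computed with a library such as scikit-image and is omitted here).

```python
import numpy as np

def psnr(ref, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to data_range."""
    mse = np.mean((ref - test) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

# Toy check: a uniform error of 0.1 on a unit-range image gives 20 dB.
value = psnr(np.zeros((4, 4)), np.full((4, 4), 0.1))
```

Higher values mean the reconstruction is closer to the reference, so the jump from 11.2 dB to 23.4 dB reflects a large error reduction on a log scale.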
PMID:38298866 | PMC:PMC10828892 | DOI:10.1117/1.NPh.11.S1.S11505
ARTIFICIAL INTELLIGENCE IN OPHTHALMOLOGY
Harefuah. 2024 Jan;163(1):37-42.
ABSTRACT
Artificial intelligence (AI) was first introduced in 1956, and effectively represents the fourth industrial revolution in human history. Over time, this medium has evolved to become the preferred method of medical image interpretation. Today, the implementation of AI in the medical field as a whole, and in ophthalmology in particular, is diverse and includes diagnosis, follow-up, and monitoring of the progression of ocular diseases. For example, AI algorithms can identify ectasia and pre-clinical signs of keratoconus using images and information computed from various corneal maps. Machine learning (ML) is a specific technique for implementing AI. It is defined as a series of automated methods that identify patterns and templates in data and leverage these to perform predictions on new data. This technology was first applied in the 1980s. Deep learning is an advanced form of ML inspired by and designed to imitate human brain processes, constructed of layers, each responsible for identifying patterns, thereby successfully modeling complex scenarios. The significant advantage of ML in medicine is its ability to monitor and follow patients efficiently at a low cost. Deep learning is utilized to monitor ocular diseases such as diabetic retinopathy, age-related macular degeneration, glaucoma, cataract, and retinopathy of prematurity. These conditions, as well as others, require frequent follow-up in order to track changes over time. Though computer technology is important for identifying and grading various ocular diseases, it still necessitates additional clinical validation and does not entirely replace human diagnostic skill.
PMID:38297419
O2 supplementation disambiguation in clinical narratives to support retrospective COVID-19 studies
BMC Med Inform Decis Mak. 2024 Jan 31;24(1):29. doi: 10.1186/s12911-024-02425-2.
ABSTRACT
BACKGROUND: Oxygen saturation, a key indicator of COVID-19 severity, poses challenges, especially in cases of silent hypoxemia. Electronic health records (EHRs) often contain supplemental oxygen information within clinical narratives. Streamlining patient identification based on oxygen levels is crucial for COVID-19 research, underscoring the need for automated classifiers in discharge summaries to ease the manual review burden on physicians.
METHOD: We analysed text lines extracted from anonymised COVID-19 patient discharge summaries in German to perform a binary classification task, differentiating patients who received oxygen supplementation from those who did not. Various machine learning (ML) algorithms, ranging from classical ML to deep learning (DL) models, were compared. Classifier decisions were explained using Local Interpretable Model-agnostic Explanations (LIME), which visualize the model decisions.
RESULT: Classical ML and DL models achieved comparable classification performance, with an F-measure varying between 0.942 and 0.955, while the classical ML approaches were faster. Visualisation of embedding representations of the input data reveals notable variations in the encoding patterns between classical and DL encoders. Furthermore, LIME explanations provide insights into the most relevant token-level features that contribute to these observed differences.
CONCLUSION: Despite a general tendency towards deep learning, these use cases show that classical approaches yield comparable results at lower computational cost. Model prediction explanations using LIME in textual and visual layouts provided a qualitative explanation for the model performance.
PMID:38297364 | DOI:10.1186/s12911-024-02425-2
Detection of periodontal bone loss patterns and furcation defects from panoramic radiographs using deep learning algorithm: a retrospective study
BMC Oral Health. 2024 Jan 31;24(1):155. doi: 10.1186/s12903-024-03896-5.
ABSTRACT
BACKGROUND: This retrospective study aimed to develop a deep learning algorithm for the interpretation of panoramic radiographs and to examine the performance of this algorithm in the detection of periodontal bone losses and bone loss patterns.
METHODS: A total of 1121 panoramic radiographs were used in this study. Bone losses in the maxilla and mandibula (total alveolar bone loss) (n = 2251), interdental bone losses (n = 25303), and furcation defects (n = 2815) were labeled using the segmentation method. In addition, interdental bone losses were divided into horizontal (n = 21839) and vertical (n = 3464) bone losses according to the defect patterns. A Convolutional Neural Network (CNN)-based artificial intelligence (AI) system was developed using U-Net architecture. The performance of the deep learning algorithm was statistically evaluated by the confusion matrix and ROC curve analysis.
RESULTS: The system showed the highest diagnostic performance in the detection of total alveolar bone losses (AUC = 0.951) and the lowest in the detection of vertical bone losses (AUC = 0.733). The sensitivity, precision, F1 score, accuracy, and AUC values were, respectively: 1, 0.995, 0.997, 0.994, and 0.951 for total alveolar bone loss; 0.947, 0.939, 0.943, 0.892, and 0.910 for horizontal bone losses; 0.558, 0.846, 0.673, 0.506, and 0.733 for vertical bone losses; and 0.892, 0.933, 0.912, 0.837, and 0.868 for furcation defects.
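The per-class figures above derive from the confusion matrix in the standard way. A small helper showing the definitions (the counts in the usage line are illustrative, not taken from the study):

```python
def seg_metrics(tp, fp, fn, tn):
    """Standard confusion-matrix metrics used to evaluate a detector."""
    sensitivity = tp / (tp + fn)                      # recall
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, precision, f1, accuracy

# Illustrative counts only.
sens, prec, f1, acc = seg_metrics(tp=8, fp=2, fn=2, tn=8)
```

Note how a class with many missed detections (large fn), such as vertical bone losses here, drags down sensitivity and F1 even when precision stays high.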
CONCLUSIONS: AI systems offer promising results in determining periodontal bone loss patterns and furcation defects from dental radiographs. This suggests that CNN algorithms can also be used to provide more detailed information such as automatic determination of periodontal disease severity and treatment planning in various dental radiographs.
PMID:38297288 | DOI:10.1186/s12903-024-03896-5
ResNet incorporating the fusion data of RGB & hyperspectral images improves classification accuracy of vegetable soybean freshness
Sci Rep. 2024 Jan 31;14(1):2568. doi: 10.1038/s41598-024-51668-6.
ABSTRACT
The freshness of vegetable soybean (VS) is an important indicator for quality evaluation. Currently, deep learning-based image recognition technology provides a fast, efficient, and low-cost method for analyzing the freshness of food. RGB (red, green, and blue) image recognition technology is widely used in the study of food appearance evaluation, while hyperspectral images show outstanding performance in predicting the nutrient content of samples. However, there are few reports on classification models based on the fusion of these two image sources. We collected RGB and hyperspectral images at four different storage times of VS. The ENVI software was adopted to extract the hyperspectral information, and the RGB images were reconstructed based on downsampling technology. Then, the one-dimensional hyperspectral data were transformed into a two-dimensional space, which allows them to be overlaid and concatenated with the RGB image data in the channel direction, thereby generating fused data. Compared with four commonly used machine learning models, the deep learning model ResNet18 has higher classification accuracy and computational efficiency. Based on the above results, a novel classification model named ResNet-R&H, which is based on the residual network (ResNet) structure and incorporates the fusion data of RGB and hyperspectral images, was proposed. ResNet-R&H achieves a testing accuracy of 97.6%, a significant enhancement of 4.0% and 7.2% over the distinct use of hyperspectral data and RGB data, respectively. Overall, this research is significant in providing a unique, efficient, and more accurate classification approach for evaluating the freshness of vegetable soybean. The method proposed in this study can provide a theoretical reference for classifying the freshness of fruits and vegetables, improving classification accuracy and reducing human error and variability.
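The channel-direction fusion described above can be sketched as follows. The reshaping of the 1-D hyperspectral vector into 2-D band maps is a simplified stand-in for the ENVI extraction and downsampling pipeline, and all shapes are illustrative.

```python
import numpy as np

def fuse_rgb_hyperspectral(rgb_hwc, hyper_1d, out_hw):
    """Reshape the 1-D hyperspectral vector into 2-D band maps and
    concatenate with the RGB image along the channel axis (CHW layout)."""
    h, w = out_hw
    n_bands = hyper_1d.size // (h * w)
    hyper_chw = hyper_1d.reshape(n_bands, h, w)
    rgb_chw = np.transpose(rgb_hwc, (2, 0, 1))  # HWC -> CHW
    return np.concatenate([rgb_chw, hyper_chw], axis=0)

rgb = np.zeros((4, 4, 3))            # toy 4x4 RGB image
hyper = np.arange(32, dtype=float)   # 2 bands x 16 pixels, flattened
fused = fuse_rgb_hyperspectral(rgb, hyper, (4, 4))
```

The fused tensor simply has 3 + n_bands channels, so a standard ResNet can consume it after widening its first convolution.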
PMID:38297076 | DOI:10.1038/s41598-024-51668-6
Automated deep learning model for estimating intraoperative blood loss using gauze images
Sci Rep. 2024 Jan 31;14(1):2597. doi: 10.1038/s41598-024-52524-3.
ABSTRACT
The intraoperative estimated blood loss (EBL), an essential parameter for perioperative management, has been evaluated by manually weighing blood in gauze and suction bottles, a process both time-consuming and labor-intensive. As the novel EBL prediction platform, we developed an automated deep learning EBL prediction model, utilizing the patch-wise crumpled state (P-W CS) of gauze images with texture analysis. The proposed algorithm was developed using animal data obtained from a porcine experiment and validated on human intraoperative data prospectively collected from 102 laparoscopic gastric cancer surgeries. The EBL prediction model involves gauze area detection and subsequent EBL regression based on the detected areas, with each stage optimized through comparative model performance evaluations. The selected gauze detection model demonstrated a sensitivity of 96.5% and a specificity of 98.0%. Based on this detection model, the performance of EBL regression stage models was compared. Comparative evaluations revealed that our P-W CS-based model outperforms others, including one reliant on convolutional neural networks and another analyzing the gauze's overall crumpled state. The P-W CS-based model achieved a mean absolute error (MAE) of 0.25 g and a mean absolute percentage error (MAPE) of 7.26% in EBL regression. Additionally, per-patient assessment yielded an MAE of 0.58 g, indicating errors < 1 g/patient. In conclusion, our algorithm provides an objective standard and streamlined approach for EBL estimation during surgery without the need for perioperative approximation and additional tasks by humans. The robust performance of the model across varied surgical conditions emphasizes its clinical potential for real-world application.
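The MAE and MAPE figures reported above follow their standard definitions; for reference (the values in the tests are illustrative):

```python
def mae(y_true, y_pred):
    """Mean absolute error, in the units of the target (grams here)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error (%); y_true values must be nonzero."""
    return 100.0 * sum(abs((t - p) / t)
                       for t, p in zip(y_true, y_pred)) / len(y_true)
```

MAE keeps the error in physical units (0.25 g per gauze), while MAPE normalizes it against the true blood mass, which is why both are reported.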
PMID:38297011 | DOI:10.1038/s41598-024-52524-3
Incorporating longitudinal history of risk factors into atherosclerotic cardiovascular disease risk prediction using deep learning
Sci Rep. 2024 Jan 31;14(1):2554. doi: 10.1038/s41598-024-51685-5.
ABSTRACT
It is increasingly clear that longitudinal risk factor levels and trajectories are related to risk for atherosclerotic cardiovascular disease (ASCVD) above and beyond single measures. The Pooled Cohort Equations (PCE), currently used in clinical care, are based on regression methods that predict ASCVD risk from cross-sectional risk factor levels. Deep learning (DL) models have been developed to incorporate longitudinal data for risk prediction, but their benefit for ASCVD risk prediction relative to the traditional PCE remains unknown. Our study included 15,565 participants from four cardiovascular disease cohorts who were free of baseline ASCVD and were followed for adjudicated ASCVD. Ten-year ASCVD risk was calculated in the training set using our benchmark, the PCE, and a longitudinal DL model, Dynamic-DeepHit. Predictors included those incorporated in the PCE: sex, race, age, total cholesterol, high-density lipoprotein cholesterol, systolic and diastolic blood pressure, diabetes, hypertension treatment, and smoking. The discrimination and calibration performance of the two models were evaluated in an overall hold-out testing dataset. Of the 15,565 participants in our dataset, 2170 (13.9%) developed ASCVD. The performance of the longitudinal DL model, which incorporated 8 years of longitudinal risk factor data, improved upon that of the PCE [AUROC: 0.815 (CI 0.782-0.844) vs 0.792 (CI 0.760-0.825)], and the net reclassification index was 0.385. The Brier score for the DL model was 0.0514, compared with 0.0542 for the PCE. Incorporating longitudinal risk factors in ASCVD risk prediction using DL can improve model discrimination and calibration.
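The Brier score used above to compare calibration is simply the mean squared difference between predicted risk and observed outcome (lower is better); the risks and outcomes below are invented for illustration, not cohort data:

```python
import numpy as np

# Hypothetical predicted 10-year ASCVD risks and observed outcomes (1 = event).
risk = np.array([0.05, 0.20, 0.10, 0.40])
event = np.array([0, 1, 0, 0])

# Brier score: mean squared difference between predicted risk and outcome.
# A perfectly calibrated, perfectly discriminating model would score 0.
brier = np.mean((risk - event) ** 2)
```

On this reading, the DL model's 0.0514 versus the PCE's 0.0542 is a small but consistent calibration gain.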
PMID:38296982 | DOI:10.1038/s41598-024-51685-5
Multimodality Risk Assessment of Patients with Ischemic Heart Disease Using Deep Learning Models Applied to Electrocardiograms and Chest X-rays
Int Heart J. 2024;65(1):29-38. doi: 10.1536/ihj.23-402.
ABSTRACT
Comprehensive management approaches for patients with ischemic heart disease (IHD) are important aids for prognostication and treatment planning. While single-modality deep neural networks (DNNs) have shown promising performance for detecting cardiac abnormalities, the potential benefits of using DNNs for multimodality risk assessment in patients with IHD have not been reported. The purpose of this study was to investigate the effectiveness of multimodality risk assessment in patients with IHD using a DNN that utilizes 12-lead electrocardiograms (ECGs) and chest X-rays (CXRs), with the prediction of major adverse cardiovascular events (MACEs) being of particular concern. DNN models were applied to detection of left ventricular systolic dysfunction (LVSD) on ECGs and identification of cardiomegaly findings on CXRs. A total of 2107 patients who underwent elective percutaneous coronary intervention were categorized into 4 groups according to the models' outputs: Dual-modality high-risk (n = 105), ECG high-risk (n = 181), CXR high-risk (n = 392), and No-risk (n = 1429). A total of 342 MACEs were observed. The incidence of a MACE was the highest in the Dual-modality high-risk group (P < 0.001). Multivariate Cox hazards analysis for predicting MACE revealed that the Dual-modality high-risk group had a significantly higher risk of MACE than the No-risk group (hazard ratio (HR): 2.370, P < 0.001), the ECG high-risk group (HR: 1.906, P = 0.010), and the CXR high-risk group (HR: 1.624, P = 0.018), after controlling for confounding factors. The results suggest the usefulness of multimodality risk assessment using DNN models applied to 12-lead ECG and CXR data from patients with IHD.
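The four-group stratification described above reduces to a simple combination of the two model outputs; this sketch assumes each DNN output has already been thresholded into a binary high-risk flag, which is an illustrative simplification:

```python
def risk_group(ecg_high_risk: bool, cxr_high_risk: bool) -> str:
    """Map the two DNN outputs (ECG-based LVSD detection and CXR-based
    cardiomegaly identification) to the study's four risk groups."""
    if ecg_high_risk and cxr_high_risk:
        return "Dual-modality high-risk"
    if ecg_high_risk:
        return "ECG high-risk"
    if cxr_high_risk:
        return "CXR high-risk"
    return "No-risk"
```

The reported hazard ratios then compare each high-risk group against the No-risk reference group.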
PMID:38296576 | DOI:10.1536/ihj.23-402