Deep learning

Orbital angular momentum-mediated machine learning for high-accuracy mode-feature encoding

Wed, 2024-02-14 06:00

Light Sci Appl. 2024 Feb 14;13(1):49. doi: 10.1038/s41377-024-01386-5.

ABSTRACT

Machine learning with optical neural networks offers unique advantages for information processing, including high speed, ultrawide bandwidth, and low energy consumption, because the optical dimensions (time, space, wavelength, and polarization) can be exploited to increase the degrees of freedom. However, because no capability has existed to extract information features in the orbital angular momentum (OAM) domain, the theoretically unlimited OAM states have never been exploited to represent the signals of the input/output nodes in a neural network model. Here, we demonstrate OAM-mediated machine learning with an all-optical convolutional neural network (CNN) based on Laguerre-Gaussian (LG) beam modes with diverse diffraction losses. The proposed CNN architecture is composed of a trainable OAM mode-dispersion impulse serving as a convolutional kernel for feature extraction, and deep-learning diffractive layers serving as a classifier. The resultant OAM mode-dispersion selectivity can be applied to information mode-feature encoding, leading to an accuracy as high as 97.2% on the MNIST database through detection of the energy weighting coefficients of the encoded OAM modes, as well as resistance to eavesdropping in point-to-point free-space transmission. Moreover, by extending the target encoded modes to multiplexed OAM states, we realize all-optical dimension reduction for anomaly detection with an accuracy of 85%. Our work provides deep insight into the mechanism of machine learning on a spatial-mode basis, which can be further utilized to improve the performance of various machine-vision tasks by constructing unsupervised learning-based auto-encoders.
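
For readers unfamiliar with OAM readout, the following is a minimal numerical sketch (not the authors' all-optical implementation) of how the energy weighting coefficients of encoded OAM modes can be obtained by projecting a field onto Laguerre-Gaussian modes; the grid size, beam waist, and example superposition are illustrative assumptions.

```python
import numpy as np

# Sketch: read out the OAM spectrum of a complex field by projecting it onto
# Laguerre-Gaussian (LG) modes with p = 0. Grid, waist, and l-range are
# illustrative assumptions, not the paper's optical parameters.

N, w = 256, 1.0                                  # grid points, beam waist
x = np.linspace(-3, 3, N)
X, Y = np.meshgrid(x, x)
R, PHI = np.hypot(X, Y), np.arctan2(Y, X)
dA = (x[1] - x[0]) ** 2                          # area element

def lg_mode(l, r=R, phi=PHI, w=w):
    """Normalized LG_{p=0,l} mode sampled on the grid."""
    u = (np.sqrt(2) * r / w) ** abs(l) * np.exp(-r**2 / w**2) * np.exp(1j * l * phi)
    return u / np.sqrt(np.sum(np.abs(u) ** 2) * dA)

# Example field: superposition of l = +1 and l = -2 with unequal amplitudes.
field = 0.8 * lg_mode(1) + 0.6 * lg_mode(-2)

# Energy weighting coefficients |c_l|^2 from the overlap integral.
ls = np.arange(-4, 5)
weights = np.array([abs(np.sum(field * np.conj(lg_mode(l))) * dA) ** 2 for l in ls])
weights /= weights.sum()
for l, wgt in zip(ls, weights):
    print(f"l = {l:+d}: {wgt:.3f}")               # ~0.64 at l=+1, ~0.36 at l=-2
```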

PMID:38355566 | DOI:10.1038/s41377-024-01386-5

Categories: Literature Watch

InsightSleepNet: the interpretable and uncertainty-aware deep learning network for sleep staging using continuous Photoplethysmography

Wed, 2024-02-14 06:00

BMC Med Inform Decis Mak. 2024 Feb 14;24(1):50. doi: 10.1186/s12911-024-02437-y.

ABSTRACT

BACKGROUND: This study was conducted to address the existing drawbacks of inconvenience and high costs associated with sleep monitoring. In this research, we performed sleep staging using continuous photoplethysmography (PPG) signals for sleep monitoring with wearable devices. Furthermore, our aim was to develop a more efficient sleep monitoring method by considering both the interpretability and uncertainty of the model's prediction results, with the goal of providing support to medical professionals in their decision-making process.

METHOD: The developed 4-class sleep staging model based on continuous PPG data incorporates several key components: a local attention module, an InceptionTime module, a time-distributed dense layer, a temporal convolutional network (TCN), and a 1D convolutional neural network (CNN). The model prioritizes both interpretability and uncertainty estimation in its prediction results. The local attention module is introduced to provide insight into the impact of each epoch within the continuous PPG data, which it achieves by leveraging the TCN structure. To quantify the uncertainty of prediction results and facilitate selective prediction, an energy score estimation is employed. By enhancing both the performance and interpretability of the model and taking the reliability of its predictions into consideration, we developed InsightSleepNet for accurate sleep staging.
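
As background, an energy score of the kind the abstract mentions is commonly computed as E(x) = -T·logsumexp(f(x)/T) over the logits f(x), with low energy indicating high confidence. Below is a minimal sketch of energy-based selective prediction; the logits, temperature, and abstention threshold are illustrative assumptions, not the authors' trained configuration.

```python
import torch

def energy_score(logits: torch.Tensor, T: float = 1.0) -> torch.Tensor:
    # E(x) = -T * logsumexp(f(x) / T); lower energy ~ more confident.
    return -T * torch.logsumexp(logits / T, dim=-1)

logits = torch.randn(8, 4)                  # 8 PPG epochs, 4 sleep stages (toy)
energy = energy_score(logits)
threshold = energy.median()                 # hypothetical operating point
keep = energy <= threshold                  # predict only on confident epochs
preds = logits.argmax(dim=-1)
print("predicted stages:", preds[keep].tolist(), "| abstained:", int((~keep).sum()))
```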

RESULT: InsightSleepNet was evaluated using three distinct datasets: MESA, CFS, and CAP. Initially, we assessed the model's classification performance both before and after applying an energy score threshold, and we observed a significant improvement in performance with the implementation of the threshold. On the MESA dataset, prior to applying the energy score threshold, the accuracy was 84.2% with a Cohen's kappa of 0.742 and a weighted F1 score of 0.842. After implementing the energy score threshold, the accuracy increased to a range of 84.8-86.1%, Cohen's kappa values ranged from 0.75 to 0.78, and weighted F1 scores ranged from 0.848 to 0.861. On the CFS dataset, we also noted enhanced performance: before the application of the energy score threshold, the accuracy stood at 80.6% with a Cohen's kappa of 0.72 and a weighted F1 score of 0.808; after thresholding, the accuracy improved to a range of 81.9-85.6%, Cohen's kappa values ranged from 0.74 to 0.79, and weighted F1 scores ranged from 0.821 to 0.857. Similarly, on the CAP dataset, the initial accuracy was 80.6%, accompanied by a Cohen's kappa of 0.73 and a weighted F1 score of 0.805; following the application of the threshold, the accuracy increased to a range of 81.4-84.3%, Cohen's kappa values ranged from 0.74 to 0.79, and weighted F1 scores ranged from 0.813 to 0.842. Additionally, by interpreting the model's predictions, we obtained results indicating a correlation between the peaks of the PPG signal and sleep stage classification.

CONCLUSION: InsightSleepNet is a 4-class sleep staging model that utilizes continuous PPG data and serves the purpose of continuous sleep monitoring with wearable devices. Beyond its primary function, it may facilitate in-depth sleep analysis by medical professionals and empower them with interpretability for intervention-based predictions. This capability can also support well-informed clinical decision-making, providing valuable insights and serving as a reliable second opinion in medical settings.

PMID:38355559 | DOI:10.1186/s12911-024-02437-y

Categories: Literature Watch

Enhancing surgical performance in cardiothoracic surgery with innovations from computer vision and artificial intelligence: a narrative review

Wed, 2024-02-14 06:00

J Cardiothorac Surg. 2024 Feb 14;19(1):94. doi: 10.1186/s13019-024-02558-5.

ABSTRACT

When technical requirements are high, and patient outcomes are critical, opportunities for monitoring and improving surgical skills via objective motion analysis feedback may be particularly beneficial. This narrative review synthesises work on technical and non-technical surgical skills, collaborative task performance, and pose estimation to illustrate new opportunities to advance cardiothoracic surgical performance with innovations from computer vision and artificial intelligence. These technological innovations are critically evaluated in terms of the benefits they could offer the cardiothoracic surgical community, and any barriers to the uptake of the technology are elaborated upon. Like some other specialities, cardiothoracic surgery has relatively few opportunities to benefit from tools with data capture technology embedded within them (as is possible with robotic-assisted laparoscopic surgery, for example). In such cases, pose estimation techniques that allow for movement tracking across a conventional operating field without using specialist equipment or markers offer considerable potential. With video data from either simulated or real surgical procedures, these tools can (1) provide insight into the development of expertise and surgical performance over a surgeon's career, (2) provide feedback to trainee surgeons regarding areas for improvement, (3) provide the opportunity to investigate what aspects of skill may be linked to patient outcomes which can (4) inform the aspects of surgical skill which should be focused on within training or mentoring programmes. Classifier or assessment algorithms that use artificial intelligence to 'learn' what expertise is from expert surgical evaluators could further assist educators in determining if trainees meet competency thresholds. With collaborative efforts between surgical teams, medical institutions, computer scientists and researchers to ensure this technology is developed with usability and ethics in mind, the developed feedback tools could improve cardiothoracic surgical practice in a data-driven way.

PMID:38355499 | DOI:10.1186/s13019-024-02558-5

Categories: Literature Watch

Inflamed immune phenotype predicts favorable clinical outcomes of immune checkpoint inhibitor therapy across multiple cancer types

Wed, 2024-02-14 06:00

J Immunother Cancer. 2024 Feb 14;12(2):e008339. doi: 10.1136/jitc-2023-008339.

ABSTRACT

BACKGROUND: The inflamed immune phenotype (IIP), defined by enrichment of tumor-infiltrating lymphocytes (TILs) within intratumoral areas, is a promising tumor-agnostic biomarker of response to immune checkpoint inhibitor (ICI) therapy. However, it is challenging to define the IIP in an objective and reproducible manner during manual histopathologic examination. Here, we investigate artificial intelligence (AI)-based immune phenotypes capable of predicting ICI clinical outcomes in multiple solid tumor types.

METHODS: Lunit SCOPE IO is a deep learning model which determines the immune phenotype of the tumor microenvironment based on TIL analysis. We evaluated the correlation between the IIP and ICI treatment outcomes in terms of objective response rates (ORR), progression-free survival (PFS), and overall survival (OS) in a cohort of 1,806 ICI-treated patients representing over 27 solid tumor types retrospectively collected from multiple institutions.

RESULTS: We observed an overall IIP prevalence of 35.2% and significantly more favorable ORRs (26.3% vs 15.8%), PFS (median 5.3 vs 3.1 months, HR 0.68, 95% CI 0.61 to 0.76), and OS (median 25.3 vs 13.6 months, HR 0.66, 95% CI 0.57 to 0.75) after ICI therapy in IIP compared with non-IIP patients, respectively (p<0.001 for all comparisons). On subgroup analysis, the IIP was generally prognostic of favorable PFS across major patient subgroups, with the exception of the microsatellite unstable/mismatch repair deficient subgroup.

CONCLUSION: The AI-based IIP may represent a practical, affordable, clinically actionable, and tumor-agnostic biomarker prognostic of ICI therapy response across diverse tumor types.

PMID:38355279 | DOI:10.1136/jitc-2023-008339

Categories: Literature Watch

Integrating artificial intelligence into lung cancer screening: a randomised controlled trial protocol

Wed, 2024-02-14 06:00

BMJ Open. 2024 Feb 13;14(2):e074680. doi: 10.1136/bmjopen-2023-074680.

ABSTRACT

INTRODUCTION: Lung cancer (LC) is the most common cause of cancer-related deaths worldwide. Its early detection can be achieved with a CT scan. Two large randomised trials proved the efficacy of low-dose CT (LDCT)-based lung cancer screening (LCS) in high-risk populations, with a decrease in specific mortality of 20%-25%. Nonetheless, implementing LCS on a large scale faces obstacles: the low number of thoracic radiologists and CT scans available for the eligible population, the high frequency of false-positive screening results, and the long period of nodule indeterminacy, which can reach up to 24 months and is a source of prolonged anxiety and of multiple costly examinations with possible side effects. Deep learning, an artificial intelligence solution, has shown promising results in retrospective trials for detecting and characterising lung nodules. However, until now, no prospective study has demonstrated its value in a real-life setting.

METHODS AND ANALYSIS: This open-label randomised controlled study focuses on LCS for patients aged 50-80 years who have smoked more than 20 pack-years, whether they are active smokers or quit less than 15 years ago. Its objective is to determine whether assisting a multidisciplinary team (MDT) with a 3D convolutional network-based analysis of screening chest CT scans accelerates the definitive classification of nodules into malignant or benign. A total of 2722 patients will be included, with the aim of demonstrating a 3-month reduction in the delay between lung nodule detection and its definitive classification as benign or malignant.

ETHICS AND DISSEMINATION: The sponsor of this study is the University Hospital of Nice. The study was approved for France by the ethical committee CPP (Comités de Protection des Personnes) Sud-Ouest et outre-mer III (No. 2022-A01543-40) and the Agence Nationale du Medicament et des produits de Santé (Ministry of Health) in December 2023. The findings of the trial will be disseminated through peer-reviewed journals and national and international conference presentations.

TRIAL REGISTRATION NUMBER: NCT05704920.

PMID:38355174 | DOI:10.1136/bmjopen-2023-074680

Categories: Literature Watch

High-Precision Microscale Particulate Matter Prediction in Diverse Environments Using a Long Short-Term Memory Neural Network and Street View Imagery

Wed, 2024-02-14 06:00

Environ Sci Technol. 2024 Feb 14. doi: 10.1021/acs.est.3c06511. Online ahead of print.

ABSTRACT

In this study, we propose a novel long short-term memory (LSTM) neural network model that leverages color features (HSV: hue, saturation, value) extracted from street images to estimate air quality in terms of particulate matter (PM) in four typical European environments: urban, suburban, village, and harbor. To evaluate its performance, we utilize concentration data for eight ambient PM parameters (PM1.0, PM2.5, and PM10, particle number concentration, lung-deposited surface area, and equivalent mass concentrations of ultraviolet PM, black carbon, and brown carbon) collected from a mobile monitoring platform during the nonheating season in downtown Augsburg, Germany, along with synchronized street view images. Experimental comparisons were conducted between the LSTM model and other deep learning models (a recurrent neural network and a gated recurrent unit). The results clearly demonstrate better performance of the LSTM model compared with other statistically based models. The LSTM-HSV model achieved impressive interpretability rates above 80% for the eight PM metrics mentioned above, indicating the expected performance of the proposed model. Moreover, the successful application of the LSTM-HSV model in other seasons in Augsburg and in various environments (suburbs, villages, and harbor cities) demonstrates its satisfactory generalization capability in both temporal and spatial dimensions. These results underscore the model's potential as a versatile tool for estimating air pollution after presampling of the studied area, with broad implications for urban planning and public health initiatives.
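
To make the pipeline concrete, here is a minimal sketch, under stated assumptions, of mapping per-frame HSV statistics from street images to PM targets with an LSTM; the feature set (channel means and standard deviations), layer sizes, sequence length, and eight-target head are illustrative, not the authors' LSTM-HSV configuration.

```python
import cv2
import numpy as np
import torch
import torch.nn as nn

def hsv_features(bgr_image: np.ndarray) -> np.ndarray:
    """Mean and standard deviation of H, S, V -> 6 features per frame."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    return np.concatenate([hsv.reshape(-1, 3).mean(0), hsv.reshape(-1, 3).std(0)])

class PMRegressor(nn.Module):
    def __init__(self, n_features=6, hidden=64, n_targets=8):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_targets)   # 8 PM metrics (assumed head)

    def forward(self, x):                          # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])               # predict from the last time step

# Toy sequence of 10 street-view frames standing in for real imagery.
frames = [np.random.randint(0, 255, (64, 64, 3), np.uint8) for _ in range(10)]
seq = torch.tensor(np.stack([hsv_features(f) for f in frames]), dtype=torch.float32)
print(PMRegressor()(seq.unsqueeze(0)).shape)       # torch.Size([1, 8])
```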

PMID:38355131 | DOI:10.1021/acs.est.3c06511

Categories: Literature Watch

Global domain adaptation attention with data-dependent regulator for scene segmentation

Wed, 2024-02-14 06:00

PLoS One. 2024 Feb 14;19(2):e0295263. doi: 10.1371/journal.pone.0295263. eCollection 2024.

ABSTRACT

Most semantic segmentation works obtain accurate segmentation results by exploring contextual dependencies. However, several major limitations need further investigation. For example, most approaches rarely distinguish different types of contextual dependencies, which may contaminate scene understanding. Moreover, local convolutions are commonly used in deep learning models to learn attention and capture local patterns in the data. These convolutions operate on a small neighborhood of the input, focusing on nearby information and disregarding global structural patterns. To address these concerns, we propose a Global Domain Adaptation Attention with Data-Dependent Regulator (GDAAR) method to explore contextual dependencies. Specifically, to effectively capture both global distribution information and local appearance details, we suggest a stacked relation approach. This involves incorporating each feature node itself together with its pairwise affinities with all other feature nodes within the network, arranged in raster scan order, so that a global domain adaptation attention mechanism can be learned. Meanwhile, to improve the similarity of features belonging to the same segment region while keeping the discriminative power of features belonging to different segments, we design a data-dependent regulator that adjusts the global domain adaptation attention on the feature map during inference. Extensive ablation studies demonstrate that our GDAAR better captures global distribution information for contextual dependencies and achieves state-of-the-art performance on several popular benchmarks.
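
A minimal sketch of the stacked-relation idea as we read it from the abstract: each spatial position is described by its own feature vector concatenated with its softmax-normalized affinities to all other positions in raster scan order. Tensor shapes are illustrative; this is not the full GDAAR module.

```python
import torch
import torch.nn.functional as F

def stacked_relation(feat: torch.Tensor) -> torch.Tensor:
    """feat: (b, c, h, w) -> (b, h*w, c + h*w) stacked node/affinity features."""
    b, c, h, w = feat.shape
    nodes = feat.flatten(2).transpose(1, 2)                       # (b, hw, c), raster order
    affinity = F.softmax(nodes @ nodes.transpose(1, 2), dim=-1)   # (b, hw, hw) global affinities
    # Concatenate each node with its affinity vector to every other node.
    return torch.cat([nodes, affinity], dim=-1)

x = torch.randn(2, 16, 8, 8)
print(stacked_relation(x).shape)                                  # torch.Size([2, 64, 80])
```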

PMID:38354116 | DOI:10.1371/journal.pone.0295263

Categories: Literature Watch

Using spatio-temporal graph neural networks to estimate fleet-wide photovoltaic performance degradation patterns

Wed, 2024-02-14 06:00

PLoS One. 2024 Feb 14;19(2):e0297445. doi: 10.1371/journal.pone.0297445. eCollection 2024.

ABSTRACT

Accurate estimation of photovoltaic (PV) system performance is crucial for determining its feasibility as a power generation technology and financial asset. PV-based energy solutions offer a viable alternative to traditional energy resources due to their superior Levelized Cost of Energy (LCOE). A significant challenge in assessing the LCOE of PV systems lies in understanding the Performance Loss Rate (PLR) for large fleets of PV systems. Estimating the PLR of PV systems becomes increasingly important in the rapidly growing PV industry. Precise PLR estimation benefits PV users by providing real-time monitoring of PV module performance, while explainable PLR estimation assists PV manufacturers in studying and enhancing the performance of their products. However, traditional PLR estimation methods based on statistical models have notable drawbacks. Firstly, they require user knowledge and decision-making. Secondly, they fail to leverage spatial coherence for fleet-level analysis. Additionally, these methods inherently assume the linearity of degradation, which is not representative of real-world degradation. To overcome these challenges, we propose a novel graph deep learning-based decomposition method called the Spatio-Temporal Graph Neural Network for fleet-level PLR estimation (PV-stGNN-PLR). PV-stGNN-PLR decomposes the power time-series data into aging and fluctuation components, utilizing the aging component to estimate PLR. PV-stGNN-PLR exploits spatial and temporal coherence to derive PLR estimates for all systems in a fleet and imposes flatness and smoothness regularization in the loss function to ensure the successful disentanglement of aging and fluctuation. We have evaluated PV-stGNN-PLR on three simulated PV datasets consisting of 100 inverters from 5 sites. Experimental results show that PV-stGNN-PLR obtains reductions of 33.9% and 35.1% on average in Mean Absolute Percent Error (MAPE) and Euclidean Distance (ED), respectively, in PLR degradation pattern estimation compared with state-of-the-art PLR estimation methods.
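
A minimal sketch of the decomposition objective described above, under stated assumptions: reconstruct the power time series as aging plus fluctuation, penalize roughness of the aging component (smoothness), and penalize a nonzero mean of the fluctuation component (flatness). The penalty forms and weights are illustrative, not the authors' exact regularizers.

```python
import torch

def decomposition_loss(power, aging, fluctuation, lam_s=1.0, lam_f=1.0):
    recon = torch.mean((power - (aging + fluctuation)) ** 2)
    smooth = torch.mean(torch.diff(aging, dim=-1) ** 2)    # aging trend should be smooth
    flat = torch.mean(fluctuation.mean(dim=-1) ** 2)       # fluctuation should average to zero
    return recon + lam_s * smooth + lam_f * flat

# Toy example: one year of daily output with a 5%/year linear degradation.
t = torch.linspace(0, 1, 365)
power = 1.0 - 0.05 * t + 0.02 * torch.randn(365)
aging = 1.0 - 0.05 * t
fluct = torch.zeros(365)
print(decomposition_loss(power.unsqueeze(0), aging.unsqueeze(0), fluct.unsqueeze(0)))
```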

PMID:38354115 | DOI:10.1371/journal.pone.0297445

Categories: Literature Watch

Sequential Point Clouds: A Survey

Wed, 2024-02-14 06:00

IEEE Trans Pattern Anal Mach Intell. 2024 Feb 14;PP. doi: 10.1109/TPAMI.2024.3365970. Online ahead of print.

ABSTRACT

Point clouds have garnered increasing research attention and found numerous practical applications. However, many of these applications, such as autonomous driving and robotic manipulation, rely on sequential point clouds, which essentially add a temporal dimension to the data (i.e., four dimensions), because the information that static point cloud data can provide is still limited. Recent research efforts have been directed towards enhancing the understanding and utilization of sequential point clouds. This paper offers a comprehensive review of deep learning methods applied to sequential point cloud research, encompassing dynamic flow estimation, object detection & tracking, point cloud segmentation, and point cloud forecasting. It further summarizes and compares the quantitative results of the reviewed methods on public benchmark datasets. Finally, the paper concludes by addressing the challenges in current sequential point cloud research and pointing towards promising avenues for future work.

PMID:38354073 | DOI:10.1109/TPAMI.2024.3365970

Categories: Literature Watch

Genetic and Clinical Correlates of AI-Based Brain Aging Patterns in Cognitively Unimpaired Individuals

Wed, 2024-02-14 06:00

JAMA Psychiatry. 2024 Feb 14. doi: 10.1001/jamapsychiatry.2023.5599. Online ahead of print.

ABSTRACT

IMPORTANCE: Brain aging elicits complex neuroanatomical changes influenced by multiple age-related pathologies. Understanding the heterogeneity of structural brain changes in aging may provide insights into preclinical stages of neurodegenerative diseases.

OBJECTIVE: To derive subgroups with common patterns of variation in participants without diagnosed cognitive impairment (WODCI) in a data-driven manner and relate them to genetics, biomedical measures, and cognitive decline trajectories.

DESIGN, SETTING, AND PARTICIPANTS: Data acquisition for this cohort study was performed from 1999 to 2020. Data consolidation and harmonization were conducted from July 2017 to July 2021. Age-specific subgroups of structural brain measures were modeled in 4 decade-long intervals spanning ages 45 to 85 years using a deep learning, semisupervised clustering method leveraging generative adversarial networks. Data were analyzed from July 2021 to February 2023 and were drawn from the Imaging-Based Coordinate System for Aging and Neurodegenerative Diseases (iSTAGING) international consortium. Individuals WODCI at baseline spanning ages 45 to 85 years were included, with greater than 50 000 data time points.

EXPOSURES: Individuals WODCI at baseline scan.

MAIN OUTCOMES AND MEASURES: Three subgroups, consistent across decades, were identified within the WODCI population. Associations with genetics, cardiovascular risk factors (CVRFs), amyloid β (Aβ), and future cognitive decline were assessed.

RESULTS: In a sample of 27 402 individuals (mean [SD] age, 63.0 [8.3] years; 15 146 female [55%]) WODCI, 3 subgroups were identified in contrast with the reference group: a typical aging subgroup, A1, with a specific pattern of modest atrophy and white matter hyperintensity (WMH) load, and 2 accelerated aging subgroups, A2 and A3, with characteristics that were more distinct at age 65 years and older. A2 was associated with hypertension, WMH, and vascular disease-related genetic variants and was enriched for Aβ positivity (ages ≥65 years) and apolipoprotein E (APOE) ε4 carriers. A3 showed severe, widespread atrophy, moderate presence of CVRFs, and greater cognitive decline. Genetic variants associated with A1 were protective for WMH (rs7209235: mean [SD] B = -0.07 [0.01]; P value = 2.31 × 10⁻⁹) and Alzheimer disease (rs72932727: mean [SD] B = 0.1 [0.02]; P value = 6.49 × 10⁻⁹), whereas the converse was observed for A2 (rs7209235: mean [SD] B = 0.1 [0.01]; P value = 1.73 × 10⁻¹⁵ and rs72932727: mean [SD] B = -0.09 [0.02]; P value = 4.05 × 10⁻⁷, respectively); variants in A3 were associated with regional atrophy (rs167684: mean [SD] B = 0.08 [0.01]; P value = 7.22 × 10⁻¹²) and white matter integrity measures (rs1636250: mean [SD] B = 0.06 [0.01]; P value = 4.90 × 10⁻⁷).

CONCLUSIONS AND RELEVANCE: The 3 subgroups showed distinct associations with CVRFs, genetics, and subsequent cognitive decline. These subgroups likely reflect multiple underlying neuropathologic processes and affect susceptibility to Alzheimer disease, paving pathways toward patient stratification at early asymptomatic stages and promoting precision medicine in clinical trials and health care.

PMID:38353984 | DOI:10.1001/jamapsychiatry.2023.5599

Categories: Literature Watch

A Q-transform-based deep learning model for the classification of atrial fibrillation types

Wed, 2024-02-14 06:00

Phys Eng Sci Med. 2024 Feb 14. doi: 10.1007/s13246-024-01391-3. Online ahead of print.

ABSTRACT

According to the World Health Organization (WHO), atrial fibrillation (AF) is emerging as a global epidemic, which has resulted in a need for techniques to accurately diagnose AF and its various subtypes. While the classification of cardiac arrhythmias with AF is common, distinguishing between AF subtypes is not. Accurate classification of AF subtypes is important for making better clinical decisions and for timely management of the disease. AI techniques are increasingly being considered for image classification and detection in various ailments, as they have shown promising results in improving diagnosis and treatment outcomes. This paper reports the development of a custom six-layer 2D convolutional neural network (CNN) model that automatically differentiates non-atrial fibrillation (Non-AF) rhythm from paroxysmal atrial fibrillation (PAF) and persistent atrial fibrillation (PsAF) rhythms in ECG images. ECG signals were obtained from a publicly available database and segmented into 10-second segments. Applying the constant-Q transform (CQT) to the segmented ECG signals created time-frequency depictions, yielding 98,966 images for Non-AF, 16,497 images for PAF, and 52,861 images for PsAF. Due to class imbalance in the PAF and PsAF classes, data augmentation techniques were utilized to increase the number of PAF and PsAF images to match the count of Non-AF images. The training, validation, and testing ratios were 0.7, 0.15, and 0.15, respectively: the training set consisted of 207,828 images, whereas the testing and validation sets consisted of 44,538 and 44,532 images, respectively. The proposed model achieved accuracy, precision, sensitivity, specificity, and F1 score values of 0.98, 0.98, 0.98, 0.97, and 0.98, respectively. This model has the potential to assist physicians in selecting personalized AF treatment and in reducing misdiagnosis.
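
For illustration, a minimal sketch of converting a 10-second ECG segment into a CQT image with librosa follows; the sampling rate, fmin, bin counts, and hop length are illustrative assumptions chosen so that the lowest-frequency CQT filter still fits inside the segment.

```python
import numpy as np
import librosa
import matplotlib.pyplot as plt

fs = 250                                    # assumed ECG sampling rate (Hz)
t = np.arange(10 * fs) / fs                 # 10-second segment
ecg = np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.random.randn(t.size)  # toy signal

# Constant-Q transform: 40 bins, 8 bins/octave from 2 Hz (~2-64 Hz coverage).
C = librosa.cqt(ecg, sr=fs, fmin=2.0, n_bins=40, bins_per_octave=8,
                hop_length=64)
img = librosa.amplitude_to_db(np.abs(C), ref=np.max)

plt.imshow(img, aspect="auto", origin="lower")
plt.xlabel("time frame"); plt.ylabel("CQT bin")
plt.savefig("cqt_segment.png")              # image of the kind fed to a 2D CNN
```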

PMID:38353927 | DOI:10.1007/s13246-024-01391-3

Categories: Literature Watch

Triticale field phenotyping using RGB camera for ear counting and yield estimation

Wed, 2024-02-14 06:00

J Appl Genet. 2024 Feb 14. doi: 10.1007/s13353-024-00835-6. Online ahead of print.

ABSTRACT

Triticale (X Triticosecale Wittmack), a wheat-rye small grain hybrid crop, combines wheat and rye attributes in one hexaploid genome. It is characterized by high adaptability to adverse environmental conditions: drought, soil acidity, salinity, heavy metal ions, poorer soil quality, and waterlogging, so its cultivation is promising in a changing climate. Here, we describe on-ground RGB phenotyping of eighteen field-grown, market-available triticale cultivars, performed under naturally changing light conditions in two consecutive winter cereal growing seasons: 2018-2019 and 2019-2020. The number of ears was counted on top-down images with an accuracy of 95% and a mean average precision (mAP) of 0.71 using the advanced object detection algorithm YOLOv4, with ensemble modeling of field imagery captured under two different illumination conditions. A correlation between the number of ears and yield was found at a statistical significance level of 0.16 for the 2019 data. The results are discussed from the perspective of modern breeding and the phenotyping bottleneck.
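
As an illustration of the counting step, a minimal sketch using a trained YOLOv4 model through OpenCV's DNN module follows; the cfg/weights file names, input size, and thresholds are hypothetical, and the paper's ensemble over two illumination conditions would merge detections from two such models.

```python
import cv2

# Hypothetical paths to a YOLOv4 model trained to detect triticale ears.
net = cv2.dnn.readNetFromDarknet("yolov4-ears.cfg", "yolov4-ears.weights")
model = cv2.dnn_DetectionModel(net)
model.setInputParams(size=(608, 608), scale=1 / 255.0, swapRB=True)

image = cv2.imread("plot_topdown.jpg")      # top-down image of one plot
class_ids, scores, boxes = model.detect(image, confThreshold=0.4,
                                        nmsThreshold=0.5)
print(f"ears counted: {len(boxes)}")        # ear count used for yield estimation
```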

PMID:38353850 | DOI:10.1007/s13353-024-00835-6

Categories: Literature Watch

Comparing ARIMA and various deep learning models for long-term water quality index forecasting in Dez River, Iran

Wed, 2024-02-14 06:00

Environ Sci Pollut Res Int. 2024 Feb 14. doi: 10.1007/s11356-024-32228-x. Online ahead of print.

ABSTRACT

Water scarcity poses a significant global challenge, particularly in developing nations like Iran. Consequently, there is a pressing requirement for ongoing monitoring and prediction of water quality, utilizing advanced techniques characterized by low implementation costs, shorter timeframes, and high accuracy. In the present study, the investigation and forecasting of the monthly time series of a single-variable river water quality index is addressed using ten water quality parameters. Daily monitoring data from four stations on the Dez River from 2010 to 2020 were utilized to compute the river water quality index from the dataset. The Shannon entropy method was employed to assign weights to each water quality parameter. Utilizing the autoregressive integrated moving average (ARIMA) model, which ranks among the most extensively employed models for time series forecasting, and five deep learning models, including Simple_RNN, LSTM, CNN, GRU, and MLP, the water quality index for the following year is predicted. The performance of the prediction models is evaluated using RMSE, MAE, MSE, and MAPE as evaluation metrics. The results indicate that the ARIMA model performs worse than the deep learning models, with MSE, RMSE, MAE, and MAPE values of 81.66, 9.037, 6.376, and 6.749, respectively. The deep learning models show results close to each other, demonstrating similar values of the statistical indices. The outcomes of this study can assist relevant decision-makers in planning and implementing the actions needed to enhance water quality, particularly for freshwater resources in rivers.
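
The Shannon entropy weighting mentioned above has a standard closed form: parameters whose values vary more across samples receive larger weights. A minimal sketch follows; the toy data matrix is an illustrative assumption.

```python
import numpy as np

def entropy_weights(X: np.ndarray) -> np.ndarray:
    """Entropy weight method for a (samples x parameters) matrix of positives."""
    P = X / X.sum(axis=0, keepdims=True)              # column-wise proportions p_ij
    n = X.shape[0]
    e = -np.nansum(P * np.log(P), axis=0) / np.log(n) # entropy per parameter in [0, 1]
    d = 1.0 - e                                       # degree of diversification
    return d / d.sum()                                # weights sum to 1

X = np.random.rand(120, 10) + 1e-9                    # 120 monthly samples, 10 WQ parameters
w = entropy_weights(X)
wqi = X @ w                                           # weighted index per sample
print(np.round(w, 3), wqi.shape)
```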

PMID:38353815 | DOI:10.1007/s11356-024-32228-x

Categories: Literature Watch

A deep learning approach for projection and body-side classification in musculoskeletal radiographs

Wed, 2024-02-14 06:00

Eur Radiol Exp. 2024 Feb 14;8(1):23. doi: 10.1186/s41747-023-00417-x.

ABSTRACT

BACKGROUND: The growing prevalence of musculoskeletal diseases increases radiologic workload, highlighting the need for optimized workflow management and automated metadata classification systems. We developed a large-scale, well-characterized dataset of musculoskeletal radiographs and trained deep learning neural networks to classify radiographic projection and body side.

METHODS: In this IRB-approved retrospective single-center study, a dataset of musculoskeletal radiographs from 2011 to 2019 was retrieved and manually labeled for one of 45 possible radiographic projections and the depicted body side. Two classification networks were trained for the respective tasks using the Xception architecture with a custom network top and pretrained weights. Performance was evaluated on a hold-out test sample, and gradient-weighted class activation mapping (Grad-CAM) heatmaps were computed to visualize the influential image regions for network predictions.
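
A minimal sketch of such a classification network, assuming a Keras-style setup: an ImageNet-pretrained Xception backbone with a custom top ending in the 45 projection classes named in the text; the head width, dropout rate, and optimizer are illustrative assumptions, not the study's exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Xception backbone with pretrained weights and no original classifier head.
base = tf.keras.applications.Xception(weights="imagenet", include_top=False,
                                      input_shape=(299, 299, 3))

# Custom network top: pooling, a dense layer, dropout, and a 45-way softmax.
x = layers.GlobalAveragePooling2D()(base.output)
x = layers.Dense(256, activation="relu")(x)
x = layers.Dropout(0.3)(x)
out = layers.Dense(45, activation="softmax")(x)   # 45 radiographic projections

model = models.Model(base.input, out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```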

RESULTS: A total of 13,098 studies comprising 23,663 radiographs were included with a patient-level dataset split, resulting in 19,183 training, 2,145 validation, and 2,335 test images. Focusing on paired body regions, training for side detection included 16,319 radiographs (13,284 training, 1,443 validation, and 1,592 test images). The models achieved an overall accuracy of 0.975 for projection and 0.976 for body-side classification on the respective hold-out test sample. Errors were primarily observed in projections with seamless anatomical transitions or non-orthograde adjustment techniques.

CONCLUSIONS: The deep learning neural networks demonstrated excellent performance in classifying radiographic projection and body side across a wide range of musculoskeletal radiographs. These networks have the potential to serve as presorting algorithms, optimizing radiologic workflow and enhancing patient care.

RELEVANCE STATEMENT: The developed networks excel at classifying musculoskeletal radiographs, providing valuable tools for research data extraction, standardized image sorting, and minimizing misclassifications in artificial intelligence systems, ultimately enhancing radiology workflow efficiency and patient care.

KEY POINTS: • A large-scale, well-characterized dataset was developed, covering a broad spectrum of musculoskeletal radiographs. • Deep learning neural networks achieved high accuracy in classifying radiographic projection and body side. • Grad-CAM heatmaps provided insight into network decisions, contributing to their interpretability and trustworthiness. • The trained models can help optimize radiologic workflow and manage large amounts of data.

PMID:38353812 | DOI:10.1186/s41747-023-00417-x

Categories: Literature Watch

Radiation reduction for interventional radiology imaging: a video frame interpolation solution

Wed, 2024-02-14 06:00

Insights Imaging. 2024 Feb 14;15(1):42. doi: 10.1186/s13244-024-01620-z.

ABSTRACT

PURPOSE: The aim of this study was to diminish radiation exposure in interventional radiology (IR) imaging while maintaining image quality. This was achieved by decreasing the acquisition frame rate and employing a deep neural network to interpolate the reduced frames.

METHODS: This retrospective study involved the analysis of 1634 IR sequences from 167 pediatric patients (March 2014 to January 2022). The dataset underwent a random split into training and validation subsets (at a 9:1 ratio) for model training and evaluation. Our approach proficiently synthesized absent frames in simulated low-frame-rate sequences by excluding intermediate frames from the validation subset. Accuracy assessments encompassed both objective experiments and subjective evaluations conducted by nine radiologists.

RESULTS: The deep learning model adeptly interpolated the eliminated frames within IR sequences, demonstrating encouraging peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) results. The average PSNR values for angiographic, subtraction, and fluoroscopic modes were 44.94 dB, 34.84 dB, and 33.82 dB, respectively, while the corresponding SSIM values were 0.9840, 0.9194, and 0.7752. Subjective experiments conducted with experienced interventional radiologists revealed minimal discernible differences between interpolated and authentic sequences.
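
For reference, figures of this kind can be computed per frame with standard scikit-image metrics, comparing each interpolated frame against the withheld ground-truth frame; the toy frames below are illustrative assumptions.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

truth = np.random.rand(512, 512).astype(np.float32)                 # withheld frame
interp = np.clip(truth + 0.01 * np.random.randn(512, 512).astype(np.float32), 0, 1)

psnr = peak_signal_noise_ratio(truth, interp, data_range=1.0)
ssim = structural_similarity(truth, interp, data_range=1.0)
print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.4f}")
```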

CONCLUSION: Our method, which interpolates low-frame-rate IR sequences, has shown the capability to produce high-quality IR images. Additionally, the model exhibits potential for reducing the frame rate during IR image acquisition, consequently mitigating radiation exposure.

CRITICAL RELEVANCE STATEMENT: This study presents a critical advancement in clinical radiology by demonstrating the effectiveness of a deep neural network in reducing radiation exposure during pediatric interventional radiology while maintaining image quality, offering a potential solution to enhance patient safety.

KEY POINTS: • Reducing radiation: cutting the IR acquisition frame rate reduces radiation exposure. • Accurate frame interpolation: our model effectively interpolates the missing frames. • Interpolated frames show high visual quality in terms of PSNR and SSIM, making IR procedures safer without sacrificing quality.

PMID:38353771 | DOI:10.1186/s13244-024-01620-z

Categories: Literature Watch

Deep convolutional-neural-network-based metal artifact reduction for CT-guided interventional oncology procedures (MARIO)

Wed, 2024-02-14 06:00

Med Phys. 2024 Feb 14. doi: 10.1002/mp.16980. Online ahead of print.

ABSTRACT

BACKGROUND: Computed tomography (CT) is routinely used to guide cryoablation procedures. Notably, CT-guidance provides 3D localization of cryoprobes and can be used to delineate frozen tissue during ablation. However, metal-induced artifacts from ablation probes can make accurate probe placement challenging and degrade the ice ball conspicuity, which in combination could lead to undertreatment of potentially curable lesions.

PURPOSE: In this work, we propose an image-based convolutional neural network (CNN) model for metal artifact reduction in CT-guided interventional procedures.

METHODS: An image domain metal artifact simulation framework was developed and validated for deep-learning-based metal artifact reduction for interventional oncology (MARIO). CT scans were acquired for 19 different cryoablation probe configurations, which varied in the number of probes and their relative orientations. A combination of intensity thresholding and masking based on maximum intensity projections (MIPs) was used to segment both the probes only and the probes + artifact in each phantom image. Each of the probe and probe + artifact images was then inserted into 19 unique patient exams, in the image domain, to simulate metal artifact appearance for CT-guided interventional oncology procedures. The resulting 361 pairs of simulated image volumes were partitioned into disjoint training and test datasets of 304 and 57 volumes, respectively. From the training partition, 116,600 image patches with a shape of 128 × 128 × 5 pixels were randomly extracted as training data. The input images consisted of a superposition of the patient and probe + artifact images; the target images consisted of a superposition of the patient and probe only images. This dataset was used to optimize a U-Net type model. The trained model was then applied to 50 independent, previously unseen CT images obtained during renal cryoablations. Three board-certified radiologists with experience in CT-guided ablations performed a blinded review of the MARIO images. A total of 100 images (50 original, 50 MARIO processed) were assessed across different aspects of image quality on a 4-point Likert-type scale. Statistical analyses were performed using the Wilcoxon signed-rank test for paired samples.
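
As a sketch of the probe segmentation step, the following illustrates intensity thresholding combined with a maximum intensity projection on a toy volume; the threshold values and the simulated volume are illustrative assumptions, not the study's calibrated settings.

```python
import numpy as np

volume = np.random.normal(0, 30, (64, 256, 256))      # toy phantom CT volume (HU)
volume[:, 120:136, 120:136] = 3000                    # simulated metal probe

mip = volume.max(axis=0)                              # axial maximum intensity projection
probe_footprint = mip > 2500                          # 2D mask where metal projects
probe_mask = volume > 2500                            # probe-only voxels (very bright)
artifact_mask = (volume > 150) & ~probe_mask          # bright non-probe voxels (probe + artifact candidates)

print("probe voxels:", int(probe_mask.sum()),
      "| artifact candidates:", int(artifact_mask.sum()),
      "| MIP footprint px:", int(probe_footprint.sum()))
```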

RESULTS: Reader scores were significantly higher for MARIO processed images compared to the original images across all metrics (all p < 0.001). The average scores for overall image quality, iceball conspicuity, overall metal artifact, needle tip visualization, target region confidence, and worst metal artifact improved by 34.91%, 36.29%, 39.94%, 34.17%, 35.13%, and 45.70%, respectively.

CONCLUSIONS: The proposed method of image-based metal artifact simulation can be used to train a MARIO algorithm to effectively reduce probe-related metal artifacts in CT-guided cryoablation procedures.

PMID:38353644 | DOI:10.1002/mp.16980

Categories: Literature Watch

Multicenter Evaluation of a Weakly Supervised Deep Learning Model for Lymph Node Diagnosis in Rectal Cancer on MRI

Wed, 2024-02-14 06:00

Radiol Artif Intell. 2024 Feb 14:e230152. doi: 10.1148/ryai.230152. Online ahead of print.

ABSTRACT

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. Purpose To develop a Weakly supervISed model DevelOpment fraMework (WISDOM) to construct a lymph node (LN) diagnosis model for patients with rectal cancer (RC) that uses preoperative MRI data coupled with postoperative patient-level pathologic information. Materials and Methods In this retrospective study, the WISDOM model was built using MRI (T2-weighted and diffusion-weighted imaging) and patient-level pathologic information (the number of postoperative-confirmed metastatic LNs and resected LNs) based on the data of patients with RC between January 2016 and November 2017. The incremental value of the model in assisting radiologists was investigated. The performances in binary and ternary N-staging were evaluated using area under the receiver operating curve (AUC) and the concordance index (C-index), respectively. Results A total of 1014 patients (median age, 62 years; IQR, 54-68 years; 590 male) were analyzed, including the training cohort (n = 589) and internal test cohort (n = 146) from center I, and two external test cohorts (cohort 1: n = 117; cohort 2: n = 162) from centers II and III, respectively. The WISDOM model yielded an overall AUC of 0.81 and C-index of 0.765, significantly outperforming junior radiologists (AUC = 0.69, P < .001; C-index = 0.689, P < .001), and performing comparably with senior radiologists (AUC = 0.79, P = .21; C-index = 0.788, P = .22). Moreover, the model significantly improved the performance of junior radiologists (AUC = 0.80, P < .001; C-index = 0.798, P < .001) and senior radiologists (AUC = 0.88, P < .001; C-index = 0.869, P < .001). Conclusion This study demonstrates the potential of WISDOM as a useful LN diagnosis method using routine rectal MRI data. The improved radiologist performance observed with model assistance highlights the potential clinical utility of WISDOM in practice. Published under a CC BY 4.0 license.

PMID:38353633 | DOI:10.1148/ryai.230152

Categories: Literature Watch

Training of a deep learning based digital subtraction angiography method using synthetic data

Wed, 2024-02-14 06:00

Med Phys. 2024 Feb 14. doi: 10.1002/mp.16973. Online ahead of print.

ABSTRACT

BACKGROUND: Digital subtraction angiography (DSA) is a fluoroscopy method primarily used for the diagnosis of cardiovascular diseases (CVDs). Deep learning-based DSA (DDSA) has been developed to extract DSA-like images directly from fluoroscopic images, which helps save dose while improving image quality. It can also be applied where C-arm or patient motion is present and conventional DSA cannot be used. However, due to the lack of clinical training data and unavoidable artifacts in DSA targets, current DDSA models still cannot satisfactorily display specific structures, nor can they predict noise-free images.

PURPOSE: In this study, we propose a strategy for producing abundant synthetic DSA image pairs in which synthetic DSA targets are free of typical artifacts and noise commonly found in conventional DSA targets for DDSA model training.

METHODS: More than 7,000 forward-projected computed tomography (CT) images and more than 25,000 synthetic vascular projection images were employed to create contrast-enhanced fluoroscopic images and corresponding DSA images, which were utilized as DSA image pairs for training of the DDSA networks. The CT projection images and vascular projection images were generated from eight whole-body CT scans and 1,584 3D vascular skeletons, respectively. All vessel skeletons were generated with stochastic Lindenmayer systems. We trained DDSA models on this synthetic dataset and compared them to the trainings on a clinical DSA dataset, which contains nearly 4,000 fluoroscopic x-ray images obtained from different models of C-arms.
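
To illustrate the skeleton-generation step, a minimal stochastic Lindenmayer system sketch follows; the axiom, rewriting rules, and probabilities are illustrative assumptions, and interpreting the resulting string with a 3D turtle would trace out a branching, vessel-like skeleton.

```python
import random

# Stochastic rewriting rules: "F" either branches or extends, chosen by weight.
rules = {
    "F": [("F[+F]F", 0.5), ("F[-F][+F]", 0.3), ("FF", 0.2)],
}

def rewrite(axiom: str, iterations: int, seed: int = 0) -> str:
    random.seed(seed)
    s = axiom
    for _ in range(iterations):
        out = []
        for ch in s:
            if ch in rules:
                variants, weights = zip(*rules[ch])
                out.append(random.choices(variants, weights=weights)[0])
            else:
                out.append(ch)                    # brackets and turns pass through
        s = "".join(out)
    return s

skeleton_string = rewrite("F", iterations=4)
print(skeleton_string[:80], "... length:", len(skeleton_string))
```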

RESULTS: We evaluated DDSA models on clinical fluoroscopic data of different anatomies, including the leg, abdomen, and heart. On the leg data, models trained on synthetic data performed similarly to, and sometimes outperformed, models trained on clinical data across the different methods. On the abdomen and cardiac data, models trained on synthetic data extracted clearer DSA-like images than conventional DSA and than models trained on clinical data. The models trained on synthetic data consistently outperformed their clinical-data counterparts, achieving higher scores in the quantitative evaluation of peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) for DDSA images, as well as higher accuracy, precision, and Dice scores for segmentation of the DDSA images.

CONCLUSIONS: We proposed an approach to train DDSA networks with synthetic DSA image pairs and extract DSA-like images from contrast-enhanced x-ray images directly. This is a potential tool to aid in diagnosis.

PMID:38353632 | DOI:10.1002/mp.16973

Categories: Literature Watch

UDRSNet: An unsupervised deformable registration module based on image structure similarity

Wed, 2024-02-14 06:00

Med Phys. 2024 Feb 14. doi: 10.1002/mp.16986. Online ahead of print.

ABSTRACT

BACKGROUND: Image registration is a challenging problem in many clinical tasks, but deep learning has made significant progress in this area over the past few years. Real-time and robust registration has been made possible by supervised transformation estimation. However, the quality of registration in this framework depends on the quality of ground truth labels such as the displacement field.

PURPOSE: To propose a simple and reliable method for registering medical images based on image structure similarity in a completely unsupervised manner.

METHODS: We proposed a deep cascaded unsupervised deformable registration approach to align images without reliable clinical data labels. Our basic network is composed of a displacement estimation module (ResUnet) and a deformation module (spatial transformer layers). We adopted the l2-norm to regularize the deformation field instead of the traditional l1-norm regularization. Additionally, we utilized structural similarity (SSIM) estimation during the training stage to enhance the structural consistency between the deformed images and the reference images.
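
A minimal sketch of such an objective, under stated assumptions: one minus a simplified SSIM between the deformed and reference images, plus an l2 penalty on the displacement field. The SSIM variant here is a global (unwindowed) simplification, and the weight lambda is an illustrative assumption.

```python
import torch

def global_ssim(x, y, c1=1e-4, c2=9e-4):
    """Simplified single-window SSIM over the whole image."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))

def registration_loss(warped, reference, displacement, lam=0.01):
    ssim_term = 1.0 - global_ssim(warped, reference)   # structural consistency
    reg = torch.mean(displacement ** 2)                # l2-norm field regularization
    return ssim_term + lam * reg

warped = torch.rand(1, 1, 64, 64)
reference = torch.rand(1, 1, 64, 64)
disp = torch.zeros(1, 2, 64, 64)                       # (dx, dy) per pixel
print(registration_loss(warped, reference, disp))
```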

RESULTS: Experimental results indicated that by incorporating the SSIM loss, our cascaded method not only achieved a higher Dice score of 0.9873, an SSIM of 0.9559, a normalized cross-correlation (NCC) of 0.9950, and a lower relative sum of squared differences (SSD) error of 0.0313 on CT images, but also outperformed the comparative methods on an ultrasound dataset. The statistical t-test results also showed that these improvements are statistically significant.

CONCLUSIONS: In this study, the promising results based on diverse evaluation metrics have demonstrated that our model is simple and effective in deformable image registration (DIR). The generalization ability of the model was also verified through experiments on liver CT images and cardiac ultrasound images.

PMID:38353628 | DOI:10.1002/mp.16986

Categories: Literature Watch

Large language models assisted multi-effect variants mining on cerebral cavernous malformation familial whole genome sequencing

Wed, 2024-02-14 06:00

Comput Struct Biotechnol J. 2024 Feb 1;23:843-858. doi: 10.1016/j.csbj.2024.01.014. eCollection 2024 Dec.

ABSTRACT

Cerebral cavernous malformation (CCM) is a polygenic disease with intricate genetic interactions contributing to quantitative pathogenesis across multiple factors. The principal pathogenic genes of CCM, specifically KRIT1, CCM2, and PDCD10, have been reported, accompanied by a growing wealth of genetic data related to mutations. Furthermore, numerous other molecules associated with CCM have been unearthed. However, tackling such massive volumes of unstructured data remained challenging until the advent of advanced large language models. In this study, we developed an automated analytical pipeline specialized for single nucleotide variant (SNV)-related biomedical text analysis, called BRLM. To facilitate this, BioBERT was employed to vectorize the rich information of SNVs, while a deep residual network was used to discriminate SNV classes. BRLM was initially constructed on mutations from 12 different types of TCGA cancers, achieving an accuracy exceeding 99%. It was further applied to CCM mutations in a familial sequencing data analysis, highlighting an upstream master regulator gene, fibroblast growth factor 1 (FGF1). With multi-omics characterization and validation of biological function, FGF1 was demonstrated to play a significant role in the development of CCMs, proving the effectiveness of our model. The BRLM web server is available at http://1.117.230.196.
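
As an illustration of the vectorization step, a minimal sketch of embedding SNV-related text with a public BioBERT checkpoint follows; the checkpoint name, example sentence, and [CLS] pooling are assumptions, not necessarily the BRLM configuration.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Public BioBERT checkpoint (assumed stand-in for the model used by BRLM).
tokenizer = AutoTokenizer.from_pretrained("dmis-lab/biobert-base-cased-v1.1")
model = AutoModel.from_pretrained("dmis-lab/biobert-base-cased-v1.1")

text = "KRIT1 c.1255C>T (p.Gln419Ter) is a loss-of-function variant."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    cls_vec = model(**inputs).last_hidden_state[:, 0]   # (1, 768) [CLS] embedding
print(cls_vec.shape)   # vector that a downstream residual classifier would consume
```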

PMID:38352937 | PMC:PMC10861960 | DOI:10.1016/j.csbj.2024.01.014

Categories: Literature Watch
