Deep learning
Using natural language processing for automated classification of disease and to identify misclassified ICD codes in cardiac disease
Eur Heart J Digit Health. 2024 Feb 9;5(3):229-234. doi: 10.1093/ehjdh/ztae008. eCollection 2024 May.
ABSTRACT
AIMS: ICD codes are used for the classification of hospitalizations. The codes serve administrative, financial, and research purposes. It is known, however, that errors occur. Natural language processing (NLP) offers promising solutions for optimizing the process. This study aimed to investigate methods for automatic classification of disease in unstructured medical records using NLP and to compare these with conventional ICD coding.
METHODS AND RESULTS: Two datasets were used: the open-source Medical Information Mart for Intensive Care (MIMIC)-III dataset (n = 55,177) and a dataset from a hospital in Belgium (n = 12,706). Automated searches using NLP algorithms were performed for the diagnoses 'atrial fibrillation (AF)' and 'heart failure (HF)'. Four methods were used: rule-based search, logistic regression on a term frequency-inverse document frequency (TF-IDF) matrix, Extreme Gradient Boosting (XGBoost) on TF-IDF, and Bio-Bidirectional Encoder Representations from Transformers (BioBERT). All algorithms were developed on the MIMIC-III dataset. The best performing algorithm was then deployed on the Belgian dataset. After preprocessing, a total of 1438 reports were retained in the Belgian dataset. XGBoost on the TF-IDF matrix resulted in an accuracy of 0.94 and 0.92 for AF and HF, respectively. There were 211 mismatches between the algorithm and the ICD codes. One hundred and three were due to a difference in data availability or differing definitions. Of the remaining 108 mismatches, 70% were due to incorrect labelling by the algorithm and 30% were due to erroneous ICD coding (2% of total hospitalizations).
CONCLUSION: A newly developed NLP algorithm attained a high accuracy for classifying disease in medical records. XGBoost outperformed the deep learning technique BioBERT. NLP algorithms could be used to identify ICD-coding errors and optimize and support the ICD-coding process.
PMID:38774372 | PMC:PMC11104467 | DOI:10.1093/ehjdh/ztae008
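As a rough illustration of the TF-IDF representation the study feeds into XGBoost, here is a minimal pure-Python sketch; the toy clinical phrases are invented for illustration, and real pipelines would add tokenization, stemming, and stop-word handling.

```python
import math
from collections import Counter

def tfidf_matrix(docs):
    """Compute a term frequency-inverse document frequency matrix.

    tf(t, d) = count(t in d) / len(d); idf(t) = ln(N / df(t))."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    df = Counter(tok for doc in tokenized for tok in set(doc))
    n = len(docs)
    matrix = []
    for doc in tokenized:
        tf = Counter(doc)
        # terms appearing in fewer documents get larger idf weights
        matrix.append([tf[t] / len(doc) * math.log(n / df[t]) for t in vocab])
    return vocab, matrix

vocab, m = tfidf_matrix(["atrial fibrillation", "heart failure", "atrial flutter"])
```

Each row of `m` is one report's feature vector; a gradient-boosted classifier is then trained on these rows against the disease labels.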
Predicting heart failure outcomes by integrating breath-by-breath measurements from cardiopulmonary exercise testing and clinical data through a deep learning survival neural network
Eur Heart J Digit Health. 2024 Jan 31;5(3):324-334. doi: 10.1093/ehjdh/ztae005. eCollection 2024 May.
ABSTRACT
AIMS: Mathematical models previously developed to predict outcomes in patients with heart failure (HF) generally have limited performance and have yet to integrate complex data derived from cardiopulmonary exercise testing (CPET), including breath-by-breath data. We aimed to develop and validate a time-to-event prediction model using a deep learning framework using the DeepSurv algorithm to predict outcomes of HF.
METHODS AND RESULTS: An inception cohort of 2490 adult patients with high-risk cardiac conditions or HF underwent CPET with breath-by-breath measurements. Potential predictive features included known clinical indicators, standard summary statistics from CPETs, and mathematical features extracted from the breath-by-breath time series of 13 measurements. The primary outcome was a composite of death, heart transplant, or mechanical circulatory support, treated as a time-to-event outcome. Predictive features ranked as most important included many of the features engineered from the breath-by-breath data in addition to traditional clinical risk factors. The prediction model showed excellent performance in predicting the composite outcome with an area under the curve of 0.93 in the training and 0.87 in the validation data sets. Both the predicted vs. actual freedom from the composite outcome and the calibration of the prediction model were excellent. Model performance remained stable in multiple subgroups of patients.
CONCLUSION: Using a combined deep learning and survival algorithm, integrating breath-by-breath data from CPETs resulted in improved predictive accuracy for long-term (up to 10 years) outcomes in HF. DeepSurv opens the door for future prediction models that are both highly performing and can more fully use the large and complex quantity of data generated during the care of patients with HF.
PMID:38774366 | PMC:PMC11104469 | DOI:10.1093/ehjdh/ztae005
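The DeepSurv objective is the Cox negative log partial likelihood applied to a neural network's risk scores. A minimal sketch of that loss (ignoring tied event times, with hypothetical inputs):

```python
import math

def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Negative log partial likelihood of the Cox model, the loss
    DeepSurv minimizes over network outputs h(x).

    risk_scores: model outputs, one per patient
    times: follow-up times; events: 1 if the event occurred, else censored."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    loss = 0.0
    for idx, i in enumerate(order):
        if events[i]:
            # risk set: all patients still under observation at times[i]
            log_sum = math.log(sum(math.exp(risk_scores[j]) for j in order[idx:]))
            loss -= risk_scores[i] - log_sum
    return loss
```

With equal risk scores and one event among three patients, the loss reduces to log(3), which makes the formula easy to sanity-check by hand.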
Artificial intelligence-assisted evaluation of cardiac function by oncology staff in chemotherapy patients
Eur Heart J Digit Health. 2024 Feb 27;5(3):278-287. doi: 10.1093/ehjdh/ztae017. eCollection 2024 May.
ABSTRACT
AIMS: Left ventricular ejection fraction (LVEF) calculation by echocardiography is pivotal in evaluating cancer patients' cardiac function. Artificial intelligence (AI) can facilitate the acquisition of optimal images and automated LVEF (autoEF) calculation. We sought to evaluate the feasibility and accuracy of LVEF calculation by oncology staff using an AI-enabled handheld ultrasound device (HUD).
METHODS AND RESULTS: We studied 115 patients referred for echocardiographic LVEF estimation. All patients were scanned by a cardiologist using standard echocardiography (SE), and biplane Simpson's LVEF was the reference standard. Hands-on training using the Kosmos HUD was provided to the oncology staff before the study. Each patient was scanned by a cardiologist, a senior oncologist, an oncology resident, and a nurse using the TRIO AI and KOSMOS EF deep learning algorithms to obtain autoEF. The correlation between autoEF and SE-ejection fraction (EF) was excellent for the cardiologist (r = 0.90), the junior oncologist (r = 0.82), and the nurse (r = 0.84), and good for the senior oncologist (r = 0.79). The Bland-Altman analysis showed a small underestimation by autoEF compared with SE-EF. Detection of impaired LVEF < 50% was feasible with a sensitivity of 95% and specificity of 94% for the cardiologist; sensitivity of 86% and specificity of 93% for the senior oncologist; sensitivity of 95% and specificity of 91% for the junior oncologist; and sensitivity of 94% and specificity of 87% for the nurse.
CONCLUSION: Automated LVEF calculation by oncology staff was feasible using AI-enabled HUD in a selected patient population. Detection of LVEF < 50% was possible with good accuracy. These findings show the potential to expedite the clinical workflow of cancer patients and speed up a referral when necessary.
PMID:38774364 | PMC:PMC11104473 | DOI:10.1093/ehjdh/ztae017
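For reference, LVEF is derived from end-diastolic and end-systolic volumes, and the study's detection task flags values below 50%. A sketch of that arithmetic (volume figures here are illustrative, not from the study):

```python
def lvef_percent(edv_ml, esv_ml):
    """Left ventricular ejection fraction (%) from end-diastolic (EDV)
    and end-systolic (ESV) volumes: (EDV - ESV) / EDV * 100."""
    return (edv_ml - esv_ml) / edv_ml * 100.0

def impaired(edv_ml, esv_ml, cutoff=50.0):
    """Flag impaired systolic function as in the study (LVEF < 50%)."""
    return lvef_percent(edv_ml, esv_ml) < cutoff
```

The AI-enabled device automates the volume tracing; the EF formula itself is the same one used with biplane Simpson's measurements.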
Colour fusion effect on deep learning classification of uveal melanoma
Eye (Lond). 2024 May 21. doi: 10.1038/s41433-024-03148-4. Online ahead of print.
ABSTRACT
BACKGROUND: Reliable differentiation of uveal melanoma and choroidal nevi is crucial to guide appropriate treatment, preventing unnecessary procedures for benign lesions and ensuring timely treatment for potentially malignant cases. The purpose of this study is to validate deep learning classification of uveal melanoma and choroidal nevi, and to evaluate the effect of colour fusion options on the classification performance.
METHODS: A total of 798 ultra-widefield retinal images of 438 patients were included in this retrospective study, comprising 157 patients diagnosed with uveal melanoma and 281 patients diagnosed with choroidal nevus. Colour fusion options, including early fusion, intermediate fusion and late fusion, were tested for deep learning image classification with a convolutional neural network (CNN). F1-score, accuracy and the area under the curve (AUC) of a receiver operating characteristic (ROC) were used to evaluate the classification performance.
RESULTS: Colour fusion options were observed to affect the deep learning performance significantly. For single-colour learning, the red colour image was observed to have superior performance compared to green and blue channels. For multi-colour learning, the intermediate fusion is better than early and late fusion options.
CONCLUSION: Deep learning is a promising approach for automated classification of uveal melanoma and choroidal nevi. Colour fusion options can significantly affect the classification performance.
PMID:38773261 | DOI:10.1038/s41433-024-03148-4
Stacked neural network for predicting polygenic risk score
Sci Rep. 2024 May 21;14(1):11632. doi: 10.1038/s41598-024-62513-1.
ABSTRACT
In recent years, the utility of polygenic risk scores (PRS) in forecasting disease susceptibility from genome-wide association studies (GWAS) results has been widely recognised. Yet, these models face limitations due to overfitting and the potential overestimation of effect sizes in correlated variants. To surmount these obstacles, we devised the Stacked Neural Network Polygenic Risk Score (SNPRS). This novel approach synthesises outputs from multiple neural network models, each calibrated using genetic variants chosen based on diverse p-value thresholds. By doing so, SNPRS captures a broader array of genetic variants, enabling a more nuanced interpretation of the combined effects of these variants. We assessed the efficacy of SNPRS using the UK Biobank data, focusing on the genetic risks associated with breast and prostate cancers, as well as quantitative traits like height and BMI. We also extended our analysis to the Korea Genome and Epidemiology Study (KoGES) dataset. Impressively, our results indicate that SNPRS surpasses traditional PRS models and an isolated deep neural network in terms of accuracy, highlighting its promise in refining the efficacy and relevance of PRS in genetic studies.
PMID:38773257 | DOI:10.1038/s41598-024-62513-1
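The stacking idea, separate models per p-value threshold whose outputs are then combined, can be sketched in pure Python. The betas and p-values below are invented, and a simple average stands in for the trained second-stage network:

```python
def prs(genotypes, betas, pvalues, threshold):
    """Polygenic risk score over variants passing a p-value threshold:
    sum of (allele count * effect size)."""
    return sum(g * b for g, b, p in zip(genotypes, betas, pvalues) if p < threshold)

def stacked_prs(genotypes, betas, pvalues, thresholds):
    """First stage: one score per p-value threshold.
    Second stage: combine the stage-one outputs (here, a plain mean
    stands in for SNPRS's neural combiner)."""
    scores = [prs(genotypes, betas, pvalues, t) for t in thresholds]
    return sum(scores) / len(scores)

# hypothetical individual: allele counts, GWAS effect sizes, GWAS p-values
score = stacked_prs([2, 1, 0], [0.5, -0.2, 0.1], [1e-8, 1e-4, 0.5], [1e-6, 1e-2])
```

Each threshold admits a different slice of variants, so the stage-one scores differ; the combiner then weighs those views against each other.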
Smart diabetic foot ulcer scoring system
Sci Rep. 2024 May 21;14(1):11588. doi: 10.1038/s41598-024-62076-1.
ABSTRACT
Current assessment methods for diabetic foot ulcers (DFUs) lack objectivity and consistency, posing a significant risk to diabetes patients, including the potential for amputations, highlighting the urgent need for improved diagnostic tools and care standards in the field. To address this issue, the objective of this study was to develop and evaluate the Smart Diabetic Foot Ulcer Scoring System, ScoreDFUNet, which incorporates artificial intelligence (AI) and image analysis techniques, aiming to enhance the precision and consistency of diabetic foot ulcer assessment. ScoreDFUNet demonstrates precise categorization of DFU images into "ulcer," "infection," "normal," and "gangrene" areas, achieving a noteworthy accuracy rate of 95.34% on the test set, with elevated levels of precision, recall, and F1 scores. Comparative evaluations with dermatologists affirm that our algorithm consistently surpasses the performance of junior and mid-level dermatologists, closely matching the assessments of senior dermatologists, and rigorous analyses including Bland-Altman plots and significance testing validate the robustness and reliability of our algorithm. This innovative AI system presents a valuable tool for healthcare professionals and can significantly improve the care standards in the field of diabetic foot ulcer assessment.
PMID:38773207 | DOI:10.1038/s41598-024-62076-1
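The precision, recall, and F1 figures cited above reduce to simple confusion-matrix arithmetic per class. A sketch with hypothetical labels from the four DFU categories:

```python
def precision_recall_f1(y_true, y_pred, positive):
    """One-vs-rest precision, recall, and F1 for a single class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f1 = precision_recall_f1(
    ["ulcer", "ulcer", "normal", "ulcer"],
    ["ulcer", "normal", "ulcer", "ulcer"],
    positive="ulcer",
)
```

Macro-averaging these per-class values over "ulcer", "infection", "normal", and "gangrene" yields the multi-class summary the abstract reports.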
COVID-19 detection from chest X-ray images using transfer learning
Sci Rep. 2024 May 21;14(1):11639. doi: 10.1038/s41598-024-61693-0.
ABSTRACT
COVID-19 is a disease caused by a coronavirus that emerged in Wuhan, China, in December 2019. The most significant feature of this virus is that it is highly contagious, and infection may lead to death. The standard diagnosis of COVID-19 is based on throat and nose swabs, but their sensitivity is not high enough, so they are prone to errors. Early diagnosis of COVID-19 disease is important to provide the chance of quick isolation of suspected cases and to decrease the opportunity of infection in healthy people. In this research, a framework for chest X-ray image classification tasks based on deep learning is proposed to help in early diagnosis of COVID-19. The proposed framework contains two phases: a pre-processing phase and a classification phase, which uses pre-trained convolutional neural network models based on transfer learning. In the pre-processing phase, different image enhancements were applied to full and segmented X-ray images to improve the classification performance of the CNN models. Two pre-trained CNN models were used for classification: VGG19 and EfficientNetB0. In the experiments, the best model achieved a sensitivity of 0.96, specificity of 0.94, precision of 0.9412, F1 score of 0.9505 and accuracy of 0.95 using enhanced full X-ray images for binary classification of chest X-ray images into COVID-19 or normal with VGG19. The proposed framework is promising and achieved a classification accuracy of 0.935 for 4-class classification.
PMID:38773161 | DOI:10.1038/s41598-024-61693-0
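A typical enhancement applied in pre-processing phases like the one above is histogram equalization, which stretches image contrast before the images reach the CNN. A minimal sketch on a flat grayscale pixel list (the abstract does not specify which enhancements were used, so this is an illustrative stand-in):

```python
def equalize(pixels, levels=256):
    """Histogram equalization for a flat grayscale image with
    intensities in 0..levels-1: remap each pixel through the
    normalized cumulative histogram to spread out intensity values."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    cdf_min = next(c for c in cdf if c)  # first non-zero CDF value
    n = len(pixels)
    return [round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1))
            for p in pixels]
```

A narrow band of mid-gray values gets remapped to span the full intensity range, which often makes lung-field structure more visible to the classifier.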
Deep learning of left atrial structure and function provides link to atrial fibrillation risk
Nat Commun. 2024 May 21;15(1):4304. doi: 10.1038/s41467-024-48229-w.
ABSTRACT
Increased left atrial volume and decreased left atrial function have long been associated with atrial fibrillation. The availability of large-scale cardiac magnetic resonance imaging data paired with genetic data provides a unique opportunity to assess the genetic contributions to left atrial structure and function, and understand their relationship with risk for atrial fibrillation. Here, we use deep learning and surface reconstruction models to measure left atrial minimum volume, maximum volume, stroke volume, and emptying fraction in 40,558 UK Biobank participants. In a genome-wide association study of 35,049 participants without pre-existing cardiovascular disease, we identify 20 common genetic loci associated with left atrial structure and function. We find that polygenic contributions to increased left atrial volume are associated with atrial fibrillation and its downstream consequences, including stroke. Through Mendelian randomization, we find evidence supporting a causal role for left atrial enlargement and dysfunction on atrial fibrillation risk.
PMID:38773065 | DOI:10.1038/s41467-024-48229-w
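The Mendelian randomization step uses genetic variants as instruments for left atrial traits. The simplest estimator combines per-variant Wald ratios by inverse-variance weighting; a sketch with invented effect sizes (the study's actual MR method and estimates are not reproduced here):

```python
def ivw_mr(beta_exposure, beta_outcome, se_outcome):
    """Inverse-variance-weighted Mendelian randomization estimate from
    per-variant GWAS summary statistics.

    Wald ratio per variant: beta_outcome / beta_exposure,
    weighted by (beta_exposure / se_outcome)**2."""
    weights = [(bx / so) ** 2 for bx, so in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
```

If every variant doubles its exposure effect on the outcome scale, the combined estimate is exactly 2, a useful hand check of the weighting.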
Clinical applications of deep learning in neuroinflammatory diseases: A scoping review
Rev Neurol (Paris). 2024 May 20:S0035-3787(24)00522-8. doi: 10.1016/j.neurol.2024.04.004. Online ahead of print.
ABSTRACT
BACKGROUND: Deep learning (DL) is an artificial intelligence technology that has aroused much excitement for predictive medicine due to its ability to process raw data modalities such as images, text, and time series of signals.
OBJECTIVES: Here, we intend to give the clinical reader elements to understand this technology, taking neuroinflammatory diseases as an illustrative use case of clinical translation efforts. We reviewed the scope of this rapidly evolving field to get quantitative insights about which clinical applications concentrate the efforts and which data modalities are most commonly used.
METHODS: We queried the PubMed database for articles reporting DL algorithms for clinical applications in neuroinflammatory diseases and the radiology.healthairegister.com website for commercial algorithms.
RESULTS: The review included 148 articles published between 2018 and 2024 and five commercial algorithms. The clinical applications could be grouped as computer-aided diagnosis, individual prognosis, functional assessment, the segmentation of radiological structures, and the optimization of data acquisition. Our review highlighted important discrepancies in efforts. The segmentation of radiological structures and computer-aided diagnosis currently concentrate most efforts, with an overrepresentation of imaging. The different applications have been addressed with various model architectures, relatively low volumes of data, and diverse data modalities. We report the high-level technical characteristics of the algorithms and synthesize the clinical applications narratively. Predictive performances and some common a priori assumptions on this topic are finally discussed.
CONCLUSION: The currently reported efforts position DL as an information processing technology, enhancing existing modalities of paraclinical investigations and bringing perspectives to make innovative ones actionable for healthcare.
PMID:38772806 | DOI:10.1016/j.neurol.2024.04.004
APEX-pHLA: A novel method for accurate prediction of the binding between exogenous short peptides and HLA class I molecules
Methods. 2024 May 19:S1046-2023(24)00132-4. doi: 10.1016/j.ymeth.2024.05.013. Online ahead of print.
ABSTRACT
Human leukocyte antigen (HLA) molecules play a critical role within the realm of immunotherapy due to their capacity to recognize and bind exogenous antigens such as peptides, subsequently delivering them to immune cells. Predicting the binding between peptides and HLA molecules (pHLA) can expedite the screening of immunogenic peptides and facilitate vaccine design. However, traditional experimental methods are time-consuming and inefficient. In this study, an efficient method based on deep learning was developed for predicting peptide-HLA binding, which treated peptide sequences as linguistic entities. It combined the architectures of textCNN and BiLSTM to create a deep neural network model called APEX-pHLA. This model operated without limitations related to HLA class I allele variants and peptide segment lengths, enabling efficient encoding of sequence features for both HLA and peptide segments. On the independent test set, the model achieved Accuracy, ROC_AUC, F1, and MCC values of 0.9449, 0.9850, 0.9453, and 0.8899, respectively. Similarly, on an external test set, the results were 0.9803, 0.9574, 0.8835, and 0.7863, respectively. These findings outperformed fifteen methods previously reported in the literature. The accurate prediction capability of the APEX-pHLA model in peptide-HLA binding might provide valuable insights for future HLA vaccine design.
PMID:38772499 | DOI:10.1016/j.ymeth.2024.05.013
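Among the metrics reported, the Matthews correlation coefficient (MCC) is the least familiar; unlike accuracy it uses all four confusion-matrix cells and stays informative on imbalanced binding data. A sketch from raw counts:

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.
    Ranges from -1 (total disagreement) through 0 (chance) to +1 (perfect)."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0
```

A perfect binder/non-binder split gives +1, and systematically inverted predictions give -1, which bounds the 0.8899 reported above.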
Interpretable deep learning insights: Unveiling the role of 1 Gy volume on lymphopenia after radiotherapy in breast cancer
Radiother Oncol. 2024 May 19:110333. doi: 10.1016/j.radonc.2024.110333. Online ahead of print.
ABSTRACT
BACKGROUND: Lymphopenia is known for its significance on poor survivals in breast cancer (BC) patients. Considering full dosimetric data, this study aimed to develop and validate predictive models for lymphopenia after radiotherapy (RT) in BC.
MATERIAL AND METHODS: BC patients treated with adjuvant RT were eligible in this multicenter study. The study endpoint was lymphopenia, defined as the reduction in absolute lymphocytes and graded lymphopenia after RT. The dose-volume histogram (DVH) data of related critical structures and clinical factors were taken into account for the development of dense neural network (DNN) predictive models. The developed DNN models were validated using external patient cohorts.
RESULTS: In total, 918 consecutive patients with invasive BC were enrolled. The training, testing, and external validating datasets consisted of 589, 203, and 126 patients, respectively. Treatment volumes at nearly all dose levels of the DVH were significant predictors for lymphopenia following RT, including volumes at the very low dose of 1 Gy (V1) of organs at risk (OARs) including lung, heart and body, especially ipsilateral-lung V1. A final DNN model, combining full DVH dosimetric parameters of OARs and three key clinical factors, achieved a predictive accuracy of 75% or higher.
CONCLUSION: This study demonstrated and externally validated the significance of full dosimetric data, particularly the volume of critical structures receiving doses as low as 1 Gy, on lymphopenia after radiation in BC patients. The significance of V1 deserves special attention, as modern VMAT RT technology often yields a relatively high value of this parameter. Further study is warranted for RT plan optimization.
PMID:38772478 | DOI:10.1016/j.radonc.2024.110333
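A DVH parameter like V1 is simply the fraction of a structure's volume receiving at least the stated dose. A sketch over a hypothetical per-voxel dose array, assuming uniform voxel volumes:

```python
def v_dose(voxel_doses_gy, threshold_gy):
    """Vx: percentage of a structure's volume receiving at least
    threshold_gy, computed from per-voxel doses (uniform voxel volume)."""
    hit = sum(d >= threshold_gy for d in voxel_doses_gy)
    return 100.0 * hit / len(voxel_doses_gy)

# toy ipsilateral-lung doses (Gy) for four voxels
v1 = v_dose([0.5, 1.0, 2.0, 0.2], threshold_gy=1.0)
```

In practice the treatment planning system exports these values directly; the point is that V1 counts low-dose "bath" volume, which VMAT plans tend to inflate.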
Computed tomography machine learning classifier correlates with mortality in interstitial lung disease
Respir Investig. 2024 May 20;62(4):670-676. doi: 10.1016/j.resinv.2024.05.010. Online ahead of print.
ABSTRACT
BACKGROUND: A machine learning classifier system, Fibresolve, was designed and validated as an adjunct to non-invasive diagnosis in idiopathic pulmonary fibrosis (IPF). The system uses a deep learning algorithm to analyze chest computed tomography (CT) imaging. We hypothesized that Fibresolve is a useful predictor of mortality in interstitial lung diseases (ILD).
METHODS: Fibresolve was previously validated in a multi-site >500-patient dataset. In this analysis, we assessed the usefulness of Fibresolve to predict mortality in a subset of 228 patients with IPF and other ILDs in whom follow up data was available. We applied Cox regression analysis adjusting for the Gender, Age, and Physiology (GAP) score and for other known predictors of mortality in IPF. We also analyzed the role of Fibresolve as tertiles adjusting for GAP stages.
RESULTS: During a median follow-up of 2.8 years (range 5 to 3434 days), 89 patients died. After adjusting for GAP score and other mortality risk factors, the Fibresolve score significantly predicted the risk of death (HR: 7.14; 95% CI: 1.31-38.85; p = 0.02) during the follow-up period, as did forced vital capacity and history of lung cancer. After adjusting for GAP stages and other variables, the Fibresolve score split into tertiles significantly predicted the risk of death (p = 0.027 for the model; HR 1.37 for 2nd tertile, 95% CI: 0.77-2.42; HR 2.19 for 3rd tertile, 95% CI: 1.22-3.93).
CONCLUSIONS: The machine learning classifier Fibresolve was shown to be an independent predictor of mortality in ILDs, with prognostic performance equivalent to GAP while based solely on CT images.
PMID:38772191 | DOI:10.1016/j.resinv.2024.05.010
Information extraction of UV-NIR spectral data in waste water based on Large Language Model
Spectrochim Acta A Mol Biomol Spectrosc. 2024 May 17;318:124475. doi: 10.1016/j.saa.2024.124475. Online ahead of print.
ABSTRACT
In recent years, with the rise of various machine learning methods, Ultraviolet and Near Infrared (UV-NIR) spectral analysis has been impressive in the determination of intricate systems. However, UV-NIR spectral analysis based on traditional machine learning requires independent training with tedious parameter tuning for different samples or tasks. As a result, training a high-quality model is often complicated and time-consuming. The large language model (LLM) is one of the cutting-edge achievements in deep learning, with a parameter size on the order of billions. An LLM can extract abstract information from input and use it effectively. Even without any additional training, using only simple natural language prompts, an LLM can accomplish tasks it has never seen before in completely new domains. We look forward to utilizing this capability in spectral analysis to reduce the time consumption and operational difficulties. In this study, we used UV-NIR spectral analysis to predict the concentration of Chemical Oxygen Demand (COD) in three different water samples, including a complex wastewater. By extracting the characteristic bands in the spectrum, we input them into the LLM for concentration prediction. We compared the COD prediction results of different models on water samples and discussed the effects of different experimental settings on the LLM. The results show that even with brief prompts, the LLM's prediction in wastewater achieved the best performance, with R2 and RMSE equal to 0.931 and 10.966, exceeding the best results of traditional models, where R2 and RMSE correspond to 0.920 and 11.854. This result indicates that the LLM, with simpler operation and less time consumption, has the ability to approach or even surpass traditional machine learning models in UV-NIR spectral analysis. In conclusion, our study proposed a new method for UV-NIR spectral analysis based on LLMs and preliminarily demonstrated the potential of LLMs for this application.
PMID:38772179 | DOI:10.1016/j.saa.2024.124475
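The R² and RMSE figures used to compare the LLM against traditional models are standard regression metrics; a pure-Python sketch of both:

```python
import math

def r2_rmse(y_true, y_pred):
    """Coefficient of determination (R^2 = 1 - SS_res / SS_tot)
    and root-mean-square error for a regression prediction."""
    n = len(y_true)
    mean = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot, math.sqrt(ss_res / n)
```

Perfect predictions give R² = 1 and RMSE = 0; predicting the mean everywhere gives R² = 0, which frames the 0.931 vs 0.920 comparison above.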
Cost-utility analysis of prenatal diagnosis of congenital cardiac diseases using deep learning
Cost Eff Resour Alloc. 2024 May 22;22(1):44. doi: 10.1186/s12962-024-00550-3.
ABSTRACT
BACKGROUND: Deep learning (DL) is a new technology that can assist prenatal ultrasound (US) in the detection of congenital heart disease (CHD) at the prenatal stage. Hence, an economic-epidemiologic evaluation (aka Cost-Utility Analysis) is required to assist policymakers in deciding whether to adopt the new technology.
METHODS: The incremental cost-utility ratio (CUR) of adding DL-assisted ultrasound (DL-US) to the current provision of US plus pulse oximetry (POX) was calculated by building a spreadsheet model that integrated demographic, economic, epidemiological, health service utilization, screening performance, survival and lifetime quality of life data, based on the standard formula: CUR = (increase in intervention costs - decrease in treatment costs) / (averted QALY losses) of adding DL to US & POX. US screening data were based on real-world operational routine reports (as opposed to research studies). The DL screening cost of 145 USD was based on Israeli US costs plus 20.54 USD for reading and recording screens.
RESULTS: The addition of DL-assisted US, which is associated with increased sensitivity (95% vs 58.1%), resulted in far fewer undiagnosed infants (16 vs 102 [or 2.9% vs 15.4%] of the 560 and 659 births, respectively). Adoption of DL-US will add 1,204 QALYs, with increased screening costs of 22.5 million USD largely offset by decreased treatment costs (20.4 million USD). Therefore, the new DL-US technology is considered "very cost-effective", costing only 1,720 USD per QALY. For most performance combinations (sensitivity > 80%, specificity > 90%), the adoption of DL-US is either cost effective or very cost effective. For specificities greater than 98% (with sensitivities above 94%), DL-US (& POX) is said to "dominate" US (& POX) by providing more QALYs at a lower cost.
CONCLUSION: Our exploratory CUA calculations indicate the feasibility of DL-US as being at least cost-effective.
PMID:38773527 | DOI:10.1186/s12962-024-00550-3
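Plugging the rounded figures from the abstract into the CUR formula reproduces the reported cost per QALY to within rounding (about 1,744 vs the reported 1,720 USD/QALY, which was presumably computed from unrounded inputs):

```python
def cost_utility_ratio(delta_intervention_cost, treatment_cost_saved, qalys_gained):
    """CUR = (increase in intervention costs - decrease in treatment costs)
             / averted QALY losses."""
    return (delta_intervention_cost - treatment_cost_saved) / qalys_gained

# rounded figures from the abstract: 22.5M USD screening costs,
# 20.4M USD treatment costs averted, 1,204 QALYs gained
cur = cost_utility_ratio(22.5e6, 20.4e6, 1204)
```

A CUR of roughly 1,700-1,750 USD per QALY sits far below typical willingness-to-pay thresholds, which is why the abstract labels DL-US "very cost-effective".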
Convolutional neural networks combined with conventional filtering to semantically segment plant roots in rapidly scanned X-ray computed tomography volumes with high noise levels
Plant Methods. 2024 May 21;20(1):73. doi: 10.1186/s13007-024-01208-0.
ABSTRACT
BACKGROUND: X-ray computed tomography (CT) is a powerful tool for measuring plant root growth in soil. However, a rapid scan with larger pots, which is required for throughput-prioritized crop breeding, results in high noise levels, low resolution, and blurred root segments in the CT volumes. Moreover, while plant root segmentation is essential for root quantification, detailed conditional studies on segmenting noisy root segments are scarce. The present study aimed to investigate the effects of scanning time and deep learning-based restoration of image quality on semantic segmentation of blurry rice (Oryza sativa) root segments in CT volumes.
RESULTS: VoxResNet, a convolutional neural network-based voxel-wise residual network, was used as the segmentation model. The training efficiency of the model was compared using CT volumes obtained at scan times of 33, 66, 150, 300, and 600 s. The learning efficiencies of the samples were similar, except for scan times of 33 and 66 s. In addition, the noise levels of predicted volumes differed among scanning conditions, indicating that the noise level at a scan time ≥ 150 s does not affect the model training efficiency. Conventional filtering methods, such as median filtering and edge detection, increased the training efficiency by approximately 10% under all conditions. However, the training efficiency of the 33 and 66 s-scanned samples remained relatively low. We concluded that the scan time must be at least 150 s to not affect segmentation. Finally, we constructed a semantic segmentation model for 150 s-scanned CT volumes, for which the Dice loss reached 0.093. This model could not predict the lateral roots, which were not included in the training data. This limitation will be addressed by preparing appropriate training data.
CONCLUSIONS: A semantic segmentation model can be constructed even with rapidly scanned CT volumes with high noise levels. Given that scanning times ≥ 150 s did not affect the segmentation results, this technique holds promise for rapid and low-dose scanning. This study offers insights into images other than CT volumes with high noise levels that are challenging to determine when annotating.
PMID:38773503 | DOI:10.1186/s13007-024-01208-0
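The reported Dice loss is one minus the Dice coefficient between the predicted and ground-truth voxel sets; a sketch on flattened binary masks:

```python
def dice_loss(pred, target):
    """1 - Dice coefficient for binary masks given as flat 0/1 lists.
    Dice = 2 * |pred AND target| / (|pred| + |target|)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection / total if total else 1.0)
```

A perfect overlap gives a loss of 0 and fully disjoint masks give 1, so the reported 0.093 corresponds to a Dice coefficient of about 0.907 on root voxels.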
Automatic segmentation of 15 critical anatomical labels and measurements of cardiac axis and cardiothoracic ratio in fetal four chambers using nnU-NetV2
BMC Med Inform Decis Mak. 2024 May 21;24(1):128. doi: 10.1186/s12911-024-02527-x.
ABSTRACT
BACKGROUND: Accurate segmentation of critical anatomical structures in fetal four-chamber view images is essential for the early detection of congenital heart defects. Current prenatal screening methods rely on manual measurements, which are time-consuming and prone to inter-observer variability. This study develops an AI-based model using the state-of-the-art nnU-NetV2 architecture for automatic segmentation and measurement of key anatomical structures in fetal four-chamber view images.
METHODS: A dataset, consisting of 1,083 high-quality fetal four-chamber view images, was annotated with 15 critical anatomical labels and divided into training/validation (867 images) and test (216 images) sets. An AI-based model using the nnU-NetV2 architecture was trained on the annotated images and evaluated using the mean Dice coefficient (mDice) and mean intersection over union (mIoU) metrics. The model's performance in automatically computing the cardiac axis (CAx) and cardiothoracic ratio (CTR) was compared with measurements from sonographers with varying levels of experience.
RESULTS: The AI-based model achieved a mDice coefficient of 87.11% and an mIoU of 77.68% for the segmentation of critical anatomical structures. The model's automated CAx and CTR measurements showed strong agreement with those of experienced sonographers, with respective intraclass correlation coefficients (ICCs) of 0.83 and 0.81. Bland-Altman analysis further confirmed the high agreement between the model and experienced sonographers.
CONCLUSION: We developed an AI-based model using the nnU-NetV2 architecture for accurate segmentation and automated measurement of critical anatomical structures in fetal four-chamber view images. Our model demonstrated high segmentation accuracy and strong agreement with experienced sonographers in computing clinically relevant parameters. This approach has the potential to improve the efficiency and reliability of prenatal cardiac screening, ultimately contributing to the early detection of congenital heart defects.
PMID:38773456 | DOI:10.1186/s12911-024-02527-x
Refining neural network algorithms for accurate brain tumor classification in MRI imagery
BMC Med Imaging. 2024 May 21;24(1):118. doi: 10.1186/s12880-024-01285-6.
ABSTRACT
Brain tumor diagnosis using MRI scans poses significant challenges due to the complex nature of tumor appearances and variations. Traditional methods often require extensive manual intervention and are prone to human error, leading to misdiagnosis and delayed treatment. Current approaches primarily include manual examination by radiologists and conventional machine learning techniques. These methods rely heavily on feature extraction and classification algorithms, which may not capture the intricate patterns present in brain MRI images. Conventional techniques often suffer from limited accuracy and generalizability, mainly due to the high variability in tumor appearance and the subjective nature of manual interpretation. Additionally, traditional machine learning models may struggle with the high-dimensional data inherent in MRI images. To address these limitations, our research introduces a deep learning-based model utilizing convolutional neural networks (CNNs). Our model employs a sequential CNN architecture with multiple convolutional, max-pooling, and dropout layers, followed by dense layers for classification. The proposed model demonstrates a significant improvement in diagnostic accuracy, achieving an overall accuracy of 98% on the test dataset. The precision, recall, and F1-scores ranging from 97 to 98%, with a ROC-AUC ranging from 99 to 100% for each tumor category, further substantiate the model's effectiveness. Additionally, the utilization of Grad-CAM visualizations provides insights into the model's decision-making process, enhancing interpretability. This research addresses the pressing need for enhanced diagnostic accuracy in identifying brain tumors through MRI imaging, tackling challenges such as variability in tumor appearance and the need for rapid, reliable diagnostic tools.
PMID:38773391 | DOI:10.1186/s12880-024-01285-6
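The building blocks named in the abstract above (convolutional and max-pooling layers) can be illustrated with a minimal numpy forward pass for one stage; the image size, kernel, and values here are hypothetical stand-ins, not the paper's actual architecture:

```python
import numpy as np

def conv2d(x, kernel):
    """Valid 2-D cross-correlation of a single-channel image with one kernel."""
    kh, kw = kernel.shape
    h, w = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    """Non-overlapping max-pooling; trims edges that don't fit the window."""
    h, w = (x.shape[0] // size) * size, (x.shape[1] // size) * size
    x = x[:h, :w].reshape(h // size, size, w // size, size)
    return x.max(axis=(1, 3))

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
image = rng.standard_normal((64, 64))    # stand-in for one MRI slice
kernel = rng.standard_normal((3, 3))     # one learned filter

# one conv -> ReLU -> pool stage; a full CNN stacks several of these
feat = max_pool(relu(conv2d(image, kernel)))
print(feat.shape)  # (31, 31): 64 - 3 + 1 = 62, pooled by 2
```

A real sequential model would stack several such stages, add dropout during training, and flatten into dense layers for the final classification.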
Predicting tumour origin with cytology-based deep learning: hype or hope?
Nat Rev Clin Oncol. 2024 May 21. doi: 10.1038/s41571-024-00906-x. Online ahead of print.
NO ABSTRACT
PMID:38773339 | DOI:10.1038/s41571-024-00906-x
Enhanced multi-class pathology lesion detection in gastric neoplasms using deep learning-based approach and validation
Sci Rep. 2024 May 21;14(1):11527. doi: 10.1038/s41598-024-62494-1.
ABSTRACT
This study developed a new convolutional neural network model to detect and classify gastric lesions as malignant, premalignant, and benign. We used 10,181 white-light endoscopy images from 2606 patients, split in an 8:1:1 ratio. Lesions were categorized as early gastric cancer (EGC), advanced gastric cancer (AGC), gastric dysplasia, benign gastric ulcer (BGU), benign polyp, and benign erosion. We assessed the lesion detection and classification model using six-class, cancer versus non-cancer, and neoplasm versus non-neoplasm categories, as well as T-stage estimation in cancer lesions (T1, T2-T4). The lesion detection rate was 95.22% (219/230 patients) on a per-patient basis: 100% for EGC, 97.22% for AGC, 96.49% for dysplasia, 75.00% for BGU, 97.22% for benign polyps, and 80.49% for benign erosion. The six-class category exhibited an accuracy of 73.43%, sensitivity of 80.90%, specificity of 83.32%, positive predictive value (PPV) of 73.68%, and negative predictive value (NPV) of 88.53%. The sensitivity and NPV were 78.62% and 88.57% for the cancer versus non-cancer category, and 83.26% and 89.80% for the neoplasm versus non-neoplasm category, respectively. The T-stage estimation model achieved an accuracy of 85.17%, sensitivity of 88.68%, specificity of 79.81%, PPV of 87.04%, and NPV of 82.18%. The novel CNN-based model accurately detected and classified malignant, premalignant, and benign gastric lesions and estimated gastric cancer T-stages.
PMID:38773274 | DOI:10.1038/s41598-024-62494-1
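The sensitivity, specificity, PPV, and NPV figures reported above all derive from a confusion matrix; a minimal numpy sketch for the binary (e.g. cancer vs. non-cancer) case, with toy labels that are illustrative only:

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, NPV from binary labels (1 = positive class)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)    # true positives
    tn = np.sum(~y_true & ~y_pred)  # true negatives
    fp = np.sum(~y_true & y_pred)   # false positives
    fn = np.sum(y_true & ~y_pred)   # false negatives
    return {
        "sensitivity": tp / (tp + fn),  # recall on positives
        "specificity": tn / (tn + fp),  # recall on negatives
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# toy cancer-vs-non-cancer predictions (hypothetical, not the study's data)
m = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```

The multi-class figures in the abstract are typically macro-averages of the same per-class quantities computed one-vs-rest.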
Exploring UMAP in hybrid models of entropy-based and representativeness sampling for active learning in biomedical segmentation
Comput Biol Med. 2024 May 16;176:108605. doi: 10.1016/j.compbiomed.2024.108605. Online ahead of print.
ABSTRACT
In this work, we study various hybrid models of entropy-based and representativeness sampling techniques in the context of active learning in medical segmentation, in particular examining the role of UMAP (Uniform Manifold Approximation and Projection) as a technique for capturing representativeness. Although UMAP has been shown viable as a general-purpose dimension reduction method in diverse areas, its role in deep learning-based medical segmentation has not yet been extensively explored. Using the cardiac and prostate datasets in the Medical Segmentation Decathlon for validation, we found that a novel hybrid Entropy-UMAP sampling technique achieved a statistically significant Dice score advantage over the random baseline (3.2% for cardiac, 4.5% for prostate), and attained the highest Dice coefficient among the spectrum of 10 distinct active learning methodologies we examined. This provides preliminary evidence that there is an interesting synergy between entropy-based and UMAP methods when the former precedes the latter in a hybrid model of active learning.
PMID:38772054 | DOI:10.1016/j.compbiomed.2024.108605
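The entropy-first hybrid described above (uncertainty preselection, then representativeness in an embedding space) can be sketched in numpy. This is an assumption-laden illustration, not the paper's implementation: the paper embeds with UMAP (via the `umap-learn` package), whereas here a plain 2-D array stands in for that embedding, and greedy farthest-point selection stands in for the representativeness criterion:

```python
import numpy as np

def predictive_entropy(probs):
    """Shannon entropy of per-sample class probabilities (uncertainty score)."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def farthest_point_select(emb, k):
    """Greedy representativeness: pick k mutually distant points in embedding space."""
    first = int(np.argmax(np.linalg.norm(emb - emb.mean(axis=0), axis=1)))
    chosen = [first]
    dists = np.linalg.norm(emb - emb[first], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(dists))  # point farthest from everything chosen so far
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(emb - emb[nxt], axis=1))
    return chosen

def entropy_then_representative(probs, emb, pool_frac=0.3, k=5):
    """Entropy-first hybrid: preselect the most uncertain pool, then pick k
    representative samples from it (the paper uses a UMAP embedding here)."""
    n_pool = max(k, int(len(probs) * pool_frac))
    pool = np.argsort(-predictive_entropy(probs))[:n_pool]
    picks = farthest_point_select(emb[pool], k)
    return pool[picks]

rng = np.random.default_rng(1)
probs = rng.dirichlet(np.ones(3), size=100)   # mock model softmax outputs
emb = rng.standard_normal((100, 2))           # stand-in for a 2-D UMAP embedding
query = entropy_then_representative(probs, emb)  # indices to send for labelling
```

Reversing the order (representativeness first, entropy second) gives a different hybrid; the abstract's finding is that entropy preceding the UMAP-based step performed best.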