Deep learning

Self-Supervised Learning for Improved Optical Coherence Tomography Detection of Macular Telangiectasia Type 2

Thu, 2024-02-08 06:00

JAMA Ophthalmol. 2024 Feb 8. doi: 10.1001/jamaophthalmol.2023.6454. Online ahead of print.

ABSTRACT

IMPORTANCE: Deep learning image analysis often depends on large, labeled datasets, which are difficult to obtain for rare diseases.

OBJECTIVE: To develop a self-supervised approach for automated classification of macular telangiectasia type 2 (MacTel) on optical coherence tomography (OCT) with limited labeled data.

DESIGN, SETTING, AND PARTICIPANTS: This was a retrospective comparative study. OCT images from May 2014 to May 2019 were collected by the Lowy Medical Research Institute, La Jolla, California, and the University of Washington, Seattle, from January 2016 to October 2022. Clinical diagnoses of patients with and without MacTel were confirmed by retina specialists. Data were analyzed from January to September 2023.

EXPOSURES: Two convolutional neural networks were pretrained using the Bootstrap Your Own Latent algorithm on unlabeled training data and fine-tuned with labeled training data to predict MacTel (self-supervised method). ResNet18 and ResNet50 models were also trained using all labeled data (supervised method).

MAIN OUTCOMES AND MEASURES: The ground truth MacTel vs no MacTel diagnosis was determined by retina specialists based on spectral-domain OCT. The models' predictions were compared against human graders using accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), area under the precision-recall curve (AUPRC), and area under the receiver operating characteristic curve (AUROC). Uniform manifold approximation and projection was performed for dimension reduction, and GradCAM visualizations were generated for the supervised and self-supervised methods.
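All of the threshold-based metrics listed above derive from the four confusion-matrix counts. A minimal sketch, using illustrative counts rather than the study's data:

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute the reported classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # recall on disease-positive scans
        "specificity": tn / (tn + fp),   # recall on disease-negative scans
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }

# Hypothetical counts for illustration only (not the study's data):
m = binary_metrics(tp=88, fp=6, tn=94, fn=12)
print({k: round(v, 3) for k, v in m.items()})
```

AUROC and AUPRC, by contrast, are threshold-free and require the full score distribution rather than a single confusion matrix.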

RESULTS: A total of 2636 OCT scans from 780 patients with MacTel and 131 patients without MacTel were included from the MacTel Project (mean [SD] age, 60.8 [11.7] years; 63.8% female), along with another 2564 scans from 1769 patients without MacTel from the University of Washington (mean [SD] age, 61.2 [18.1] years; 53.4% female). The self-supervised approach fine-tuned on 100% of the labeled training data with ResNet50 as the feature extractor performed best, achieving an AUPRC of 0.971 (95% CI, 0.969-0.972), an AUROC of 0.970 (95% CI, 0.970-0.973), accuracy of 0.898, sensitivity of 0.898, specificity of 0.949, PPV of 0.935, and NPV of 0.919. With only 419 OCT volumes (from 185 patients with MacTel; 10% of the labeled training dataset), the ResNet18 self-supervised model achieved comparable performance, with an AUPRC of 0.958 (95% CI, 0.957-0.960), an AUROC of 0.966 (95% CI, 0.964-0.967), and accuracy, sensitivity, specificity, PPV, and NPV of 0.902, 0.884, 0.916, 0.896, and 0.906, respectively. The self-supervised models showed better agreement with the more experienced human expert graders.

CONCLUSIONS AND RELEVANCE: The findings suggest that self-supervised learning may improve the accuracy of automated MacTel vs non-MacTel binary classification on OCT with limited labeled training data, and these approaches may be applicable to other rare diseases, although further research is warranted.

PMID:38329740 | DOI:10.1001/jamaophthalmol.2023.6454

Categories: Literature Watch

Automated detection of fatal cerebral haemorrhage in postmortem CT data

Thu, 2024-02-08 06:00

Int J Legal Med. 2024 Feb 8. doi: 10.1007/s00414-024-03183-6. Online ahead of print.

ABSTRACT

In recent years, the detection of different causes of death based on postmortem imaging findings has become increasingly relevant. Postmortem computed tomography (PMCT) in particular, as a non-invasive, relatively inexpensive, and fast technique, is progressively used as an important imaging tool to support autopsies. Previous work has also shown that deep learning applications yield robust results for in vivo medical imaging interpretation. In this work, we propose a pipeline to identify fatal cerebral haemorrhage on three-dimensional PMCT data. We retrospectively selected 81 PMCT cases from the database of our institute, of which 36 cases suffered a fatal cerebral haemorrhage as confirmed by autopsy. The remaining 45 cases were considered neurologically healthy. Based on these datasets, six machine learning classifiers (k-nearest neighbour, Gaussian naive Bayes, logistic regression, decision tree, linear discriminant analysis, and support vector machine) were evaluated, and two deep learning models, a convolutional neural network (CNN) and a densely connected convolutional network (DenseNet), were trained. For all algorithms, 80% of the data was randomly selected for training and 20% for validation, and five-fold cross-validation was performed. The best-performing classification algorithm for fatal cerebral haemorrhage was the CNN, which achieved an accuracy of 0.94 across all folds. In the future, artificial neural network algorithms may be applied by forensic pathologists as a helpful computer-assisted diagnostic tool supporting PMCT-based evaluation of cause of death.
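The six-classifier comparison described above can be sketched with scikit-learn. The features here are synthetic stand-ins (the study's PMCT-derived data are not public), so the scores are illustrative only:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC

# Synthetic stand-in for features extracted from 81 PMCT cases:
X, y = make_classification(n_samples=81, n_features=20, random_state=0)

classifiers = {
    "kNN": KNeighborsClassifier(),
    "GaussianNB": GaussianNB(),
    "LogReg": LogisticRegression(max_iter=1000),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "LDA": LinearDiscriminantAnalysis(),
    "SVM": SVC(),
}

# Five-fold cross-validated accuracy for each classifier:
scores = {name: cross_val_score(clf, X, y, cv=5).mean()
          for name, clf in classifiers.items()}
for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.2f}")
```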

PMID:38329584 | DOI:10.1007/s00414-024-03183-6

Categories: Literature Watch

Anti-HER2 therapy response assessment for guiding treatment (de-)escalation in early HER2-positive breast cancer using a novel deep learning radiomics model

Thu, 2024-02-08 06:00

Eur Radiol. 2024 Feb 8. doi: 10.1007/s00330-024-10609-7. Online ahead of print.

ABSTRACT

OBJECTIVES: Anti-HER2 targeted therapy significantly reduces the risk of relapse in HER2+ breast cancer. New measures are needed for precise risk stratification to guide (de-)escalation of the anti-HER2 strategy.

METHODS: A total of 726 HER2 + cases who received no/single/dual anti-HER2 targeted therapies were split into three respective cohorts. A deep learning model (DeepTEPP) based on preoperative breast magnetic resonance (MR) was developed. Patients were scored and categorized into low-, moderate-, and high-risk groups. Recurrence-free survival (RFS) was compared in patients with different risk groups according to the anti-HER2 treatment they received, to validate the value of DeepTEPP in predicting treatment efficacy and guiding anti-HER2 strategy.
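The scoring-and-stratification step can be illustrated with a toy thresholding function. The cutoffs below are hypothetical; the abstract does not report DeepTEPP's actual thresholds:

```python
def stratify(score, low_cut=0.3, high_cut=0.7):
    """Map a DeepTEPP-style risk score in [0, 1] to a risk category.

    The cutoffs are illustrative placeholders, not the model's values."""
    if score < low_cut:
        return "low"
    return "moderate" if score < high_cut else "high"

# Hypothetical patient scores:
scores = [0.12, 0.45, 0.81, 0.29, 0.66, 0.93]
groups = [stratify(s) for s in scores]
print(groups)  # ['low', 'moderate', 'high', 'low', 'moderate', 'high']
```

In the study, recurrence-free survival is then compared within each group according to the anti-HER2 regimen received.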

RESULTS: DeepTEPP was capable of risk stratification and guiding anti-HER2 treatment strategy: DeepTEPP-Low patients (60.5%) did not derive significant RFS benefit from trastuzumab (p = 0.144), proposing an anti-HER2 de-escalation. DeepTEPP-Moderate patients (19.8%) significantly benefited from trastuzumab (p = 0.048), but did not obtain additional improvements from pertuzumab (p = 0.125). DeepTEPP-High patients (19.7%) significantly benefited from dual HER2 blockade (p = 0.045), suggesting an anti-HER2 escalation.

CONCLUSIONS: DeepTEPP represents a pioneering MR-based deep learning model that enables the non-invasive prediction of adjuvant anti-HER2 effectiveness, thereby providing valuable guidance for anti-HER2 (de-)escalation strategies. DeepTEPP provides an important reference for choosing the appropriate individualized treatment in HER2 + breast cancer patients, warranting prospective validation.

CLINICAL RELEVANCE STATEMENT: We built an MR-based deep learning model DeepTEPP, which enables the non-invasive prediction of adjuvant anti-HER2 effectiveness, thus guiding anti-HER2 (de-)escalation strategies in early HER2-positive breast cancer patients.

KEY POINTS: • DeepTEPP is able to predict anti-HER2 effectiveness and to guide treatment (de-)escalation. • DeepTEPP demonstrated impressive prognostic efficacy for recurrence-free survival and overall survival. • To our knowledge, this is one of very few studies, and the largest, to test the efficacy of a deep learning model extracted from breast MR images for predicting HER2-positive breast cancer survival and anti-HER2 therapy effectiveness.

PMID:38329503 | DOI:10.1007/s00330-024-10609-7

Categories: Literature Watch

Estimating lung function from computed tomography at the patient and lobe level using machine learning

Thu, 2024-02-08 06:00

Med Phys. 2024 Feb 8. doi: 10.1002/mp.16915. Online ahead of print.

ABSTRACT

BACKGROUND: Automated estimation of pulmonary function test (PFT) results from computed tomography (CT) could advance the use of CT in screening, diagnosis, and staging of restrictive pulmonary diseases. Estimating lung function per lobe, which cannot be done with PFTs, would be helpful for risk assessment for pulmonary resection surgery and bronchoscopic lung volume reduction.

PURPOSE: To automatically estimate PFT results from CT and furthermore disentangle the individual contribution of pulmonary lobes to a patient's lung function.

METHODS: We propose I3Dr, a deep learning architecture for estimating global measures from an image that can also estimate the contributions of individual parts of the image to this global measure. We apply it to estimate the separate contributions of each pulmonary lobe to a patient's total lung function from CT, while requiring only CT scans and patient level lung function measurements for training. I3Dr consists of a lobe-level and a patient-level model. The lobe-level model extracts all anatomical pulmonary lobes from a CT scan and processes them in parallel to produce lobe level lung function estimates that sum up to a patient level estimate. The patient-level model directly estimates patient level lung function from a CT scan and is used to re-scale the output of the lobe-level model to increase performance. After demonstrating the viability of the proposed approach, the I3Dr model is trained and evaluated for PFT result estimation using a large data set of 8 433 CT volumes for training, 1 775 CT volumes for validation, and 1 873 CT volumes for testing.

RESULTS: First, we demonstrate the viability of our approach by showing that a model trained with a collection of digit images to estimate their sum implicitly learns to assign correct values to individual digits. Next, we show that our models can estimate lobe-level quantities, such as COVID-19 severity scores, pulmonary volume (PV), and functional pulmonary volume (FPV), from CT while provided only with patient-level quantities during training. Lastly, we train and evaluate models for producing spirometry and diffusion capacity of carbon monoxide (DLCO) estimates at the patient and lobe level. For forced expiratory volume in one second (FEV1), forced vital capacity (FVC), and DLCO estimates, I3Dr obtains mean absolute errors (MAE) of 0.377 L, 0.297 L, and 2.800 mL/min/mm Hg, respectively. We release the resulting algorithms for lung function estimation to the research community at https://grand-challenge.org/algorithms/lobe-wise-lung-function-estimation/

CONCLUSIONS: I3Dr can estimate global measures from an image, as well as the contributions of individual parts of the image to this global measure. It offers a promising approach for estimating PFT results from CT scans and disentangling the individual contribution of pulmonary lobes to a patient's lung function. The findings presented in this work may advance the use of CT in screening, diagnosis, and staging of restrictive pulmonary diseases, as well as in risk assessment for pulmonary resection surgery and bronchoscopic lung volume reduction.
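The digit-sum demonstration has a simple linear analogue: a model supervised only with sums recovers the value of each part. A sketch with scikit-learn on synthetic digit counts (not the paper's image data):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
# Each sample: how many of each digit 0-9 it contains; target: the total sum.
counts = rng.integers(0, 5, size=(200, 10))
sums = counts @ np.arange(10)

# Train only on patient-level-style totals; per-digit values are never shown.
model = LinearRegression(fit_intercept=False).fit(counts, sums)
print(np.round(model.coef_))  # recovered per-digit values, ≈ [0 1 2 ... 9]
```

Because the supervision is an exact linear function of the parts, the fitted coefficients recover each digit's value, mirroring how I3Dr's lobe-level estimates are constrained to sum to the patient-level lung function measurement.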

PMID:38329315 | DOI:10.1002/mp.16915

Categories: Literature Watch

In silico prediction of ocular toxicity of compounds using explainable machine learning and deep learning approaches

Thu, 2024-02-08 06:00

J Appl Toxicol. 2024 Feb 8. doi: 10.1002/jat.4586. Online ahead of print.

ABSTRACT

The accurate identification of chemicals with ocular toxicity is of paramount importance in health hazard assessment. In contemporary chemical toxicology, there is a growing emphasis on refining, reducing, and replacing animal testing in safety evaluations. Therefore, the development of robust computational tools is crucial for regulatory applications. The performance of predictive models is heavily reliant on the quality and quantity of data. In this investigation, we amalgamated the most extensive dataset (4901 compounds) sourced from governmental GHS-compliant databases and the literature to develop binary classification models of chemical ocular toxicity. We employed 12 molecular representations in conjunction with six machine learning algorithms and two deep learning algorithms to create a series of binary classification models. The findings indicated that the deep learning method GCN outperformed the machine learning models in cross-validation, achieving an impressive AUC of 0.915. However, the top-performing machine learning model (RF-Descriptor) demonstrated excellent performance with an AUC of 0.869 on the test set and was therefore selected as the best model. To enhance model interpretability, we applied the SHAP method and attention-weight analysis. The two approaches offered visual depictions of the relevance of key descriptors and substructures in predicting the ocular toxicity of chemicals. Thus, we successfully struck a delicate balance between data quality and model interpretability, rendering our model valuable for predicting and comprehending potential ocular-toxic compounds in the early stages of drug discovery.
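SHAP itself requires the dedicated `shap` package; as an illustrative stand-in for descriptor-relevance analysis, permutation importance in scikit-learn follows the same idea of attributing predictions to input features (synthetic descriptors, hypothetical setup):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for molecular descriptors and toxicity labels:
X, y = make_classification(n_samples=400, n_features=10, n_informative=3,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# RF-on-descriptors model, analogous in spirit to the paper's RF-Descriptor:
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Score drop when each descriptor is shuffled = its importance to the model:
result = permutation_importance(rf, X_te, y_te, n_repeats=10, random_state=0)
ranked = sorted(enumerate(result.importances_mean), key=lambda kv: -kv[1])
print([i for i, _ in ranked[:3]])  # indices of the most influential descriptors
```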

PMID:38329145 | DOI:10.1002/jat.4586

Categories: Literature Watch

Predicting Hypoxia Using Machine Learning: Systematic Review

Thu, 2024-02-08 06:00

JMIR Med Inform. 2024 Feb 2;12:e50642. doi: 10.2196/50642.

ABSTRACT

BACKGROUND: Hypoxia is an important risk factor and indicator for the declining health of inpatients. Predicting future hypoxic events using machine learning is a prospective area of study to facilitate time-critical interventions to counter patient health deterioration.

OBJECTIVE: This systematic review aims to summarize and compare previous efforts to predict hypoxic events in the hospital setting using machine learning with respect to their methodology, predictive performance, and assessed population.

METHODS: A systematic literature search was performed using Web of Science, Ovid with Embase and MEDLINE, and Google Scholar. Studies that investigated hypoxia or hypoxemia of hospitalized patients using machine learning models were considered. Risk of bias was assessed using the Prediction Model Risk of Bias Assessment Tool.

RESULTS: After screening, a total of 12 papers were eligible for analysis, from which 32 models were extracted. The included studies varied in population, methodology, and outcome definition. Comparability was further limited by an unclear or high risk of bias in most studies (10/12, 83%). Overall predictive performance ranged from moderate to high. Based on classification metrics, deep learning models performed similarly to or outperformed conventional machine learning models within the same studies. Models using only prior peripheral oxygen saturation as a clinical variable showed better performance than models based on multiple variables, with most of these studies (2/3, 67%) using a long short-term memory algorithm.
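The SpO2-only finding can be illustrated with a far simpler stand-in than an LSTM: a windowed logistic regression that predicts whether the next reading falls below a hypoxemia threshold from the preceding readings. The trace is synthetic and the 90% threshold is illustrative, not a clinical recommendation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
# Synthetic SpO2 trace: slow oscillation plus measurement noise, clipped to a
# physiologically plausible range.
t = np.linspace(0, 6 * np.pi, 2000)
spo2 = np.clip(97 - 10 * np.sin(t) + rng.normal(0, 0.3, 2000), 80, 100)

window = 5  # predict the next reading from the previous five
X = np.lib.stride_tricks.sliding_window_view(spo2[:-1], window)
y = (spo2[window:] < 90).astype(int)  # 1 = hypoxemic next reading

model = LogisticRegression(max_iter=1000).fit(X, y)
print(f"training accuracy: {model.score(X, y):.2f}")
```

An LSTM, as used in most of the reviewed SpO2-only studies, additionally models longer-range temporal dependencies that a fixed window cannot capture.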

CONCLUSIONS: Machine learning models provide the potential to accurately predict the occurrence of hypoxic events based on retrospective data. The heterogeneity of the studies and limited generalizability of their results highlight the need for further validation studies to assess their predictive performance.

PMID:38329094 | DOI:10.2196/50642

Categories: Literature Watch

UniSpec: Deep Learning for Predicting the Full Range of Peptide Fragment Ion Series to Enhance the Proteomics Data Analysis Workflow

Thu, 2024-02-08 06:00

Anal Chem. 2024 Feb 8. doi: 10.1021/acs.analchem.3c02321. Online ahead of print.

ABSTRACT

We present UniSpec, an attention-driven deep neural network designed to predict comprehensive collision-induced fragmentation spectra, thereby improving peptide identification in shotgun proteomics. Utilizing a training data set of 1.8 million unique high-quality tandem mass spectra (MS2) from 0.8 million unique peptide ions, UniSpec learned with a peptide fragmentation dictionary encompassing 7919 fragment peaks. Among these, 5712 are neutral loss peaks, with 2310 corresponding to modification-specific neutral losses. Remarkably, UniSpec can predict 73%-77% of fragment intensities based on our NIST reference library spectra, a significant leap from the 35%-45% coverage of only b and y ions. Comparative studies with Prosit elucidate that while both models are strong at predicting their respective fragment ion series, UniSpec particularly shines in generating more complex MS2 spectra with diverse ion annotations. The integration of UniSpec's predictions into shotgun proteomics data analysis boosts the identification rate of tryptic peptides by 48% at a 1% false discovery rate (FDR) and 60% at a more confident 0.1% FDR. Using UniSpec's predicted in-silico spectral library, the search results closely matched those from search engines and experimental spectral libraries used in peptide identification, highlighting its potential as a stand-alone identification tool. The source code and Python scripts are available on GitHub (https://github.com/usnistgov/UniSpec) and Zenodo (https://zenodo.org/records/10452792), and all data sets and analysis results generated in this work were deposited in Zenodo (https://zenodo.org/records/10052268).
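Predicted and observed MS2 spectra are commonly compared with a cosine (normalized dot-product) score over a shared peak grid; a minimal sketch with toy intensities (not UniSpec's scoring code):

```python
import numpy as np

def spectral_cosine(pred, obs):
    """Cosine similarity between predicted and observed intensity vectors,
    a common score for matching a predicted MS2 spectrum to a measured one."""
    pred, obs = np.asarray(pred, float), np.asarray(obs, float)
    return float(pred @ obs / (np.linalg.norm(pred) * np.linalg.norm(obs)))

# Toy intensities aligned on a shared m/z grid (illustrative values only):
predicted = [0.0, 0.8, 0.3, 0.0, 0.5]
observed  = [0.1, 0.7, 0.4, 0.0, 0.6]
print(round(spectral_cosine(predicted, observed), 3))  # → 0.98
```

In an in-silico spectral library search, such a score is computed between each predicted library spectrum and the query spectrum, and the best-scoring peptide wins.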

PMID:38329031 | DOI:10.1021/acs.analchem.3c02321

Categories: Literature Watch

Machine learning-based prediction of hip joint moment in healthy subjects, patients and post-operative subjects

Thu, 2024-02-08 06:00

Comput Methods Biomech Biomed Engin. 2024 Feb 8:1-5. doi: 10.1080/10255842.2024.2310732. Online ahead of print.

ABSTRACT

The application of machine learning in the field of motion capture research is growing rapidly. The purpose of this study was to implement a long short-term memory (LSTM) model able to predict sagittal-plane hip joint moment (HJM) across three distinct cohorts (healthy controls, patients, and post-operative patients) from 3D motion capture and force data. Statistical parametric mapping with paired-samples t-tests was performed to compare HJM values predicted by machine learning with those from inverse dynamics, the latter used as the gold standard. The results demonstrated favorable model performance on each of the three cohorts, showcasing the model's ability to generalize predictions across diverse cohorts.
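A simplified version of the statistical comparison: a pointwise paired t-test across subjects at each instant of the time-normalized gait cycle. The curves below are synthetic, and full SPM additionally corrects the significance threshold via random field theory:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Toy time-normalized HJM curves (101 points of the gait cycle, 20 subjects):
t = np.linspace(0, 1, 101)
inverse_dynamics = np.sin(2 * np.pi * t) + rng.normal(0, 0.1, (20, 101))
lstm_predicted = inverse_dynamics + rng.normal(0, 0.05, (20, 101))

# Paired t-test across subjects at every instant of the cycle:
t_stat, p_val = stats.ttest_rel(lstm_predicted, inverse_dynamics, axis=0)
print(f"fraction of the cycle with p < 0.05: {(p_val < 0.05).mean():.2f}")
```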

PMID:38328932 | DOI:10.1080/10255842.2024.2310732

Categories: Literature Watch

An economical in-class sticker microfluidic activity develops student expertise in microscale physics and device manufacturing

Thu, 2024-02-08 06:00

Lab Chip. 2024 Feb 8. doi: 10.1039/d3lc00912b. Online ahead of print.

ABSTRACT

Educating new students in miniaturization science remains challenging due to the non-intuitive behavior of microscale objects and specialized layer-by-layer assembly approaches. In our analysis of the existing literature, we noted that it remains difficult to find low-cost activities that elicit deep learning. Furthermore, few activities have stated learning goals and measurements of effectiveness. To that end, we created a new educational activity that enables students to build and test microfluidic mixers, valves, and bubble generators in the classroom setting with inexpensive, widely available materials. Although undergraduate and graduate engineering students are able to successfully construct the devices, our activity is unique in that the focus is not on successfully building and operating each device. Instead, it is to gain understanding about miniaturization science, device design, and construction so as to be able to do so independently. Our data show that the activity is appropriate for developing the conceptual understanding of graduate and advanced undergraduate students (n = 57) and makes a lasting impression on the students. We also report on observations related to student patterns of misunderstanding and how miniaturization science provides a unique opportunity for educational researchers to elicit and study misconceptions. More broadly, since this activity teaches participants a viable approach to creating microsystems and can be implemented in nearly any global setting, our work democratizes the education of miniaturization science. Noting the broad potential of point-of-care technologies in the global setting, such an activity could empower local experts to address their needs.

PMID:38328814 | DOI:10.1039/d3lc00912b

Categories: Literature Watch

Deep reinforcement learning classification of sparkling wines based on ICP-MS and DOSY NMR spectra

Thu, 2024-02-08 06:00

Food Chem X. 2024 Jan 28;21:101162. doi: 10.1016/j.fochx.2024.101162. eCollection 2024 Mar 30.

ABSTRACT

An approach that combines NMR spectroscopy and inductively coupled plasma mass spectrometry (ICP-MS) with advanced tensor decomposition algorithms and state-of-the-art deep learning procedures was applied for the classification of Croatian continental sparkling wines by their geographical origin. It has been demonstrated that complex high-dimensional NMR or ICP-MS data cannot be classified by higher-order tensor decomposition alone. Extending the procedure with deep reinforcement learning resulted in an exquisite neural network predictive model for the classification of sparkling wines according to their geographical origin. A network trained on half of the sample set was able to correctly classify 94% of all samples. The model can be particularly useful in cases where the number of samples is limited and simpler statistical methods fail to produce reliable results. The model can further be exploited for the identification and differentiation of sparkling wines, including a high potential for authenticity or quality control.

PMID:38328694 | PMC:PMC10847605 | DOI:10.1016/j.fochx.2024.101162

Categories: Literature Watch

A general dual-pathway network for EEG denoising

Thu, 2024-02-08 06:00

Front Neurosci. 2024 Jan 24;17:1258024. doi: 10.3389/fnins.2023.1258024. eCollection 2023.

ABSTRACT

INTRODUCTION: Scalp electroencephalogram (EEG) analysis and interpretation are crucial for tracking and analyzing brain activity. The collected scalp EEG signals, however, are weak and frequently contaminated with various types of artifacts. Models based on deep learning provide performance comparable to that of traditional techniques. However, current deep learning networks applied to scalp EEG noise reduction are large in scale and suffer from overfitting.

METHODS: Here, we propose a dual-pathway autoencoder modeling framework named DPAE for scalp EEG signal denoising and demonstrate the superiority of the model on multi-layer perceptron (MLP), convolutional neural network (CNN) and recurrent neural network (RNN), respectively. We validate the denoising performance on benchmark scalp EEG artifact datasets.

RESULTS: The experimental results show that our model architecture not only significantly reduces the computational effort but also outperforms existing deep learning denoising algorithms on root relative mean squared error (RRMSE) metrics, in both the time and frequency domains.
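The RRMSE metric itself is simple to state: the RMS of the residual between the denoised signal and the clean reference, divided by the RMS of the reference. A sketch in NumPy with toy signals (for the frequency-domain variant, the same ratio is typically computed on power spectra):

```python
import numpy as np

def rrmse(denoised, clean):
    """Root relative mean squared error between a denoised EEG segment and
    its clean reference: RMS of the residual over RMS of the reference."""
    denoised, clean = np.asarray(denoised, float), np.asarray(clean, float)
    return np.sqrt(np.mean((denoised - clean) ** 2) / np.mean(clean ** 2))

clean = np.sin(np.linspace(0, 4 * np.pi, 512))  # toy clean signal
denoised = clean + 0.1                          # constant residual of 0.1
print(round(rrmse(denoised, clean), 3))
```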

DISCUSSION: The DPAE architecture does not require a priori knowledge of the noise distribution nor is it limited by the network layer structure, which is a general network model oriented toward blind source separation.

PMID:38328554 | PMC:PMC10847297 | DOI:10.3389/fnins.2023.1258024

Categories: Literature Watch

Video dataset of sheep activity for animal behavioral analysis via deep learning

Thu, 2024-02-08 06:00

Data Brief. 2024 Jan 9;52:110027. doi: 10.1016/j.dib.2024.110027. eCollection 2024 Feb.

ABSTRACT

A primary dataset capturing five distinct types of sheep activity in realistic settings was constructed at various resolutions and viewing angles, targeting the expansion of domain knowledge for non-contact virtual fencing approaches. The dataset can be used to develop non-invasive approaches for sheep activity detection, which can prove useful for farming activities including, but not limited to, sheep counting, virtual fencing, behavior-based health monitoring, and effective sheep breeding. Activity classes include grazing, running, sitting, standing, and walking. The activities of individual sheep as well as herds were recorded at different resolutions and angles to provide a dataset of diverse characteristics, as summarized in Table 1. Overall, a total of 149,327 frames from 417 videos (the equivalent of 59 minutes of footage) are presented, with a balanced set for each activity class, which can be utilized for robust non-invasive detection models based on computer vision techniques. Although the original data contain some noise (e.g., segments with no sheep present, multiple sheep in single frames, multiple activities by one or more sheep in single or multiple frames, and segments with sheep alongside other non-sheep objects), we provide the original videos and their frames (with videos and frames containing humans omitted for privacy reasons). The dataset includes diverse sheep activity characteristics and can be useful for robust detection and recognition models, as well as for advanced models of activity as a function of time.

PMID:38328501 | PMC:PMC10847016 | DOI:10.1016/j.dib.2024.110027

Categories: Literature Watch

Artificial neural network models: implementation of functional near-infrared spectroscopy-based spontaneous lie detection in an interactive scenario

Thu, 2024-02-08 06:00

Front Comput Neurosci. 2024 Jan 24;17:1286664. doi: 10.3389/fncom.2023.1286664. eCollection 2023.

ABSTRACT

Deception is an inevitable occurrence in daily life. Various methods have been used to understand the mechanisms underlying deception in the brain. Moreover, numerous efforts have been undertaken to detect deception and truth-telling. Functional near-infrared spectroscopy (fNIRS) has great potential for neurological applications compared with other state-of-the-art methods. Therefore, an fNIRS-based spontaneous lie detection model was used in the present study. We interviewed 10 healthy subjects to identify deception using the fNIRS system. A card game frequently referred to as Bluff or Cheat was introduced; this game was selected because its rules are ideal for testing our hypotheses. The optical probe of the fNIRS system was placed on the subject's forehead, and we acquired optical density signals, which were then converted into oxy-hemoglobin and deoxy-hemoglobin signals using the modified Beer-Lambert law. The oxy-hemoglobin signal was preprocessed to eliminate noise. We proposed three artificial neural networks inspired by the deep learning models AlexNet, ResNet, and GoogLeNet to classify deception and truth-telling. The proposed models achieved accuracies of 88.5%, 88.0%, and 90.0%, respectively. They were compared with other classification models, including k-nearest neighbor, linear support vector machines (SVM), quadratic SVM, cubic SVM, simple decision trees, and complex decision trees. These comparisons showed that the proposed models performed better than the other state-of-the-art methods.
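The optical-density-to-hemoglobin conversion mentioned above boils down to inverting a small linear system at each time point. A sketch with placeholder extinction coefficients (illustrative numbers, not calibrated physiological constants):

```python
import numpy as np

# Illustrative extinction coefficients [HbO2, HbR] at two wavelengths
# (placeholder values, not tabulated constants):
E = np.array([[1.5, 3.8],    # shorter wavelength: HbR absorbs more
              [2.5, 1.8]])   # longer wavelength: HbO2 absorbs more
path_length = 6.0            # source-detector distance x DPF (assumed)

def mbll(delta_od):
    """Modified Beer-Lambert law: solve (E * L) @ [dHbO2, dHbR] = delta_OD."""
    return np.linalg.solve(E * path_length, delta_od)

# Forward-simulate a concentration change, then recover it:
true_hb = np.array([0.02, -0.01])       # assumed [dHbO2, dHbR] change
delta_od = E @ true_hb * path_length    # forward model
print(np.round(mbll(delta_od), 4))      # recovers [0.02, -0.01]
```

With two measurement wavelengths and two chromophores, the system is exactly determined, which is why fNIRS devices typically probe at two (or more) wavelengths.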

PMID:38328471 | PMC:PMC10848249 | DOI:10.3389/fncom.2023.1286664

Categories: Literature Watch

Construction and validation of a progression prediction model for locally advanced rectal cancer patients who received neoadjuvant chemoradiotherapy followed by total mesorectal excision based on machine learning

Thu, 2024-02-08 06:00

Front Oncol. 2024 Jan 24;13:1231508. doi: 10.3389/fonc.2023.1231508. eCollection 2023.

ABSTRACT

BACKGROUND: We attempted to develop a progression prediction model for locally advanced rectal cancer (LARC) patients who received preoperative neoadjuvant chemoradiotherapy (NCRT) and operative treatment, to identify high-risk patients in advance.

METHODS: Data from 272 LARC patients who received NCRT and total mesorectal excision (TME) from 2011 to 2018 at the Fourth Hospital of Hebei Medical University were collected. Data from 161 patients with rectal cancer (each sample with one target variable (progression) and 145 characteristic variables) were included. One-hot encoding was applied to numerically represent some characteristics. The k-nearest neighbor (KNN) imputation method was used to fill in missing values, and SMOTE-Tomek comprehensive sampling was used to address the data imbalance. Eventually, data from 135 patients with 45 characteristic clinical variables were obtained. Random forest, decision tree, support vector machine (SVM), and XGBoost models were used to predict whether patients with rectal cancer would exhibit progression. LASSO regression was used to further filter the variables, and the list was narrowed using a Venn diagram. Eventually, the prediction model was constructed by multivariate logistic regression, and the performance of the model was confirmed in the validation set.
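The LASSO filtering plus multivariate logistic regression pipeline can be sketched with scikit-learn. The data are a synthetic stand-in for the 45 clinical variables, and an L1-penalised logistic model plays the role of LASSO for the classification target:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SelectFromModel

# Synthetic stand-in for 135 patients with 45 clinical variables:
X, y = make_classification(n_samples=135, n_features=45, n_informative=6,
                           random_state=0)

# L1-penalised logistic regression as the LASSO-style variable filter:
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.5,
                           random_state=0).fit(X, y)
selector = SelectFromModel(lasso, prefit=True)
X_sel = selector.transform(X)
print(f"variables kept: {X_sel.shape[1]} of {X.shape[1]}")

# Final multivariable logistic model on the reduced variable set:
final = LogisticRegression(max_iter=1000).fit(X_sel, y)
print(f"training accuracy: {final.score(X_sel, y):.2f}")
```

In the study, the filtered variable list is further intersected with the variables favoured by the four machine learning models before the final logistic model is fitted.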

RESULTS: Data from 135 patients, including 45 clinical characteristic variables, were included in the study and randomly divided in an 8:2 ratio into a training set and a validation set. Area under the curve (AUC) values of 0.72 for the decision tree, 0.97 for the random forest, 0.89 for SVM, and 0.94 for XGBoost were obtained on the training set, with similar results on the validation set. Twenty-three variables were obtained from LASSO regression, and eight variables were obtained by taking the intersection of the variables selected by the four machine learning methods. Furthermore, a multivariate logistic regression model was constructed using the training set; its ROC curve indicated good performance, which was also verified in the validation set.

CONCLUSIONS: We constructed a logistic regression model with good predictive performance, which allowed us to accurately predict whether patients who received NCRT and TME would exhibit disease progression.

PMID:38328435 | PMC:PMC10849061 | DOI:10.3389/fonc.2023.1231508

Categories: Literature Watch

BAU-Insectv2: An agricultural plant insect dataset for deep learning and biomedical image analysis

Thu, 2024-02-08 06:00

Data Brief. 2024 Jan 23;53:110083. doi: 10.1016/j.dib.2024.110083. eCollection 2024 Apr.

ABSTRACT

"BAU-Insectv2" represents a novel agricultural dataset tailored for deep learning applications and biomedical image analysis focused on plant-insect interactions. This dataset encompasses a diverse collection of high-resolution images capturing intricate details of plant-insect interactions across various agricultural settings. Leveraging deep learning methodologies, this study aims to employ convolutional neural networks (CNN) and advanced image analysis techniques for precise insect detection, classification, and understanding of insect-related patterns within agricultural ecosystems. We mainly focus on addressing insect-related issues in South Asian crop cultivation. The dataset's extensive scope and high-quality imagery provide a robust foundation for developing and validating models capable of accurately identifying and analyzing diverse plant insects. The dataset's utility extends to biomedical image analysis, fostering interdisciplinary research avenues across agriculture and biomedical sciences. This dataset holds significant promise for advancing research in agricultural pest management, ecosystem dynamics, and biomedical image analysis techniques.

PMID:38328295 | PMC:PMC10847483 | DOI:10.1016/j.dib.2024.110083

Categories: Literature Watch

Image dataset of healthy and infected fig leaves with Ficus leaf worm

Thu, 2024-02-08 06:00

Data Brief. 2024 Jan 1;53:109958. doi: 10.1016/j.dib.2023.109958. eCollection 2024 Apr.

ABSTRACT

This work presents an extensive dataset comprising images meticulously obtained from diverse geographic locations within Iraq, depicting both healthy and infected fig leaves affected by Ficus leafworm. This particular pest poses a significant threat to economic interests, as its infestations often lead to the defoliation of trees, resulting in reduced fruit production. The dataset comprises two distinct classes: infected and healthy, with the acquisition of images executed with precision during the fruiting season, employing state-of-the-art high-resolution equipment, as detailed in the specifications table. In total, the dataset encompasses a substantial 2,321 images, with 1,350 representing infected leaves and 971 depicting healthy ones. The images were acquired through a random sampling approach, ensuring a harmonious blend of balance and diversity across data emanating from distinct fig trees. The proposed dataset carries substantial potential for impact and utility, featuring essential attributes such as the binary classification of infected and healthy leaves. The presented dataset holds the potential to be a valuable resource for the pest control industry within the domains of agriculture and food production.

PMID:38328293 | PMC:PMC10847847 | DOI:10.1016/j.dib.2023.109958

Categories: Literature Watch

Dataset for detecting and characterizing Arab computational propaganda on X

Thu, 2024-02-08 06:00

Data Brief. 2024 Jan 23;53:110089. doi: 10.1016/j.dib.2024.110089. eCollection 2024 Apr.

ABSTRACT

Arab nations are greatly affected by computational propaganda, and detecting it has become a trending topic in social media research. Despite all the efforts made, there is still no definitive definition of what characterizes propaganda. Additionally, earlier datasets were acquired and labelled for specific studies but neglected thereafter, so researchers are unable to assess whether proposed AI detectors generalize. There is a lack of real ground truth, either to characterize Arab propagandist behaviours or to evaluate newly proposed detectors. The provided dataset aims to close this research gap by characterizing Arab computational propaganda on X (Twitter). It was prepared using a scientific approach to guarantee data quality: the propagandist users' data was requested from the X Transparency center. Although the data released by X identifies propagandist users at the account level, their tweets were not classified as propaganda or not; propagandists usually mix propaganda and non-propaganda tweets to hide their identities. Therefore, three journalist volunteers were employed to label 2100 tweets as propaganda or not, and to label each propagandist tweet according to the propaganda technique used. The dataset covers sports and banking issues. As a result, it consists of 16,355,558 tweets with their metadata from propagandist users in 2019, plus the 2100 labelled propagandist tweets. The dataset helps the research community apply supervised and unsupervised machine learning and deep learning algorithms to classify the credibility of Arab tweets and users. This paper also suggests looking at behaviour rather than content to distinguish propaganda communication; the datasets enable deep non-textual analysis of the main characteristics of Arab computational propaganda on X.

PMID:38328292 | PMC:PMC10847467 | DOI:10.1016/j.dib.2024.110089

Categories: Literature Watch

Multiomics and blood-based biomarkers of moyamoya disease: protocol of Moyamoya Omics Atlas (MOYAOMICS)

Wed, 2024-02-07 06:00

Chin Neurosurg J. 2024 Feb 8;10(1):5. doi: 10.1186/s41016-024-00358-3.

ABSTRACT

BACKGROUND: Moyamoya disease (MMD) is a rare and complex cerebrovascular disorder characterized by the progressive narrowing of the internal carotid arteries and the formation of compensatory collateral vessels. The etiology of MMD remains enigmatic, making diagnosis and management challenging. The MOYAOMICS project was initiated to investigate the molecular underpinnings of MMD and explore potential diagnostic and therapeutic strategies.

METHODS: The MOYAOMICS project employs a multidisciplinary approach, integrating various omics technologies, including genomics, transcriptomics, proteomics, and metabolomics, to comprehensively examine the molecular signatures associated with MMD pathogenesis. Additionally, we will investigate the potential influence of gut microbiota and brain-gut peptides on MMD development, assessing their suitability as targets for therapeutic strategies and dietary interventions. Radiomics, a specialized field in medical imaging, is utilized to analyze neuroimaging data for early detection and characterization of MMD-related brain changes. Deep learning algorithms are employed to differentiate MMD from other conditions, automating the diagnostic process. We also employ single-cell omics and mass cytometry to precisely study cellular heterogeneity in peripheral blood samples from MMD patients.

CONCLUSIONS: The MOYAOMICS project represents a significant step toward comprehending MMD's molecular underpinnings. This multidisciplinary approach has the potential to revolutionize early diagnosis, patient stratification, and the development of targeted therapies for MMD. The identification of blood-based biomarkers and the integration of multiple omics data are critical for improving the clinical management of MMD and enhancing patient outcomes for this complex disease.

PMID:38326922 | DOI:10.1186/s41016-024-00358-3

Categories: Literature Watch

Are better AI algorithms for breast cancer detection also better at predicting risk? A paired case-control study

Wed, 2024-02-07 06:00

Breast Cancer Res. 2024 Feb 7;26(1):25. doi: 10.1186/s13058-024-01775-z.

ABSTRACT

BACKGROUND: There is increasing evidence that artificial intelligence (AI) breast cancer risk evaluation tools using digital mammograms are highly informative for 1-6 years following a negative screening examination. We hypothesized that algorithms that have previously been shown to work well for cancer detection will also work well for risk assessment and that performance of algorithms for detection and risk assessment is correlated.

METHODS: To evaluate our hypothesis, we designed a case-control study using paired mammograms at diagnosis and at the previous screening visit. The study included n = 3386 women from the OPTIMAM registry, which includes mammograms from women diagnosed with breast cancer in the English breast screening program during 2010-2019. Cases were diagnosed with invasive breast cancer or ductal carcinoma in situ at screening and were selected if they had a mammogram available at the screening examination that led to detection, and a paired mammogram from their previous screening visit 3 years prior to detection, when no cancer was detected. Controls without cancer were matched 1:1 to cases based on age (year), screening site, and mammography machine type. Risk assessment was conducted using a deep learning model designed for breast cancer risk assessment (Mirai) and three open-source deep learning algorithms designed for breast cancer detection. Discrimination was assessed using a matched area under the curve (AUC) statistic.
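The abstract reports discrimination via a matched AUC but does not spell out the statistic. One common formulation for 1:1 matched pairs is the proportion of pairs in which the case's score exceeds its matched control's, with ties counted as half; a minimal sketch under that assumption (the scores below are invented):

```python
def matched_auc(case_scores, control_scores):
    """AUC over 1:1 matched case-control pairs: the fraction of pairs in
    which the case outscores its matched control, ties counted as 0.5."""
    assert len(case_scores) == len(control_scores)
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c, k in zip(case_scores, control_scores)
    )
    return wins / len(case_scores)

# Toy example: model risk scores for 4 matched case-control pairs.
cases    = [0.90, 0.40, 0.70, 0.55]
controls = [0.30, 0.60, 0.70, 0.20]
auc = matched_auc(cases, controls)  # (1 + 0 + 0.5 + 1) / 4 = 0.625
```

Conditioning on the matched sets in this way respects the 1:1 design, unlike a pooled AUC over all cases and controls.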

RESULTS: Overall performance using the paired mammograms followed the same order by algorithm for risk assessment (AUC range 0.59-0.67) and detection (AUC 0.81-0.89), with Mirai performing best for both. There was also a correlation in performance for risk and detection within algorithms by cancer size, with much greater accuracy for large cancers (30 mm+, detection AUC: 0.88-0.92; risk AUC: 0.64-0.74) than smaller cancers (0 to < 10 mm, detection AUC: 0.73-0.86, risk AUC: 0.54-0.64). Mirai was relatively strong for risk assessment of smaller cancers (0 to < 10 mm, risk, Mirai AUC: 0.64 (95% CI 0.57 to 0.70); other algorithms AUC 0.54-0.56).

CONCLUSIONS: Improvements in risk assessment could stem from enhanced detection of smaller cancers. Other state-of-the-art AI detection algorithms that perform well on smaller cancers might likewise achieve relatively high performance for risk assessment.

PMID:38326868 | DOI:10.1186/s13058-024-01775-z

Categories: Literature Watch

Exploring the performance and explainability of fine-tuned BERT models for neuroradiology protocol assignment

Wed, 2024-02-07 06:00

BMC Med Inform Decis Mak. 2024 Feb 7;24(1):40. doi: 10.1186/s12911-024-02444-z.

ABSTRACT

BACKGROUND: Deep learning has demonstrated significant advancements across various domains. However, its implementation in specialized areas, such as medical settings, remains approached with caution. In these high-stakes environments, understanding a model's decision-making process is critical. This study assesses the performance of different pretrained Bidirectional Encoder Representations from Transformers (BERT) models and examines their decision-making within the context of medical image protocol assignment.

METHODS: Four different pre-trained BERT models (BERT, BioBERT, ClinicalBERT, RoBERTa) were fine-tuned for the medical image protocol classification task. Word importance was measured by attributing the classification output to every word using a gradient-based method. Subsequently, a trained radiologist reviewed the resulting word importance scores to assess the model's decision-making process relative to human reasoning.
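The gradient-based word-importance idea in the methods can be illustrated on a toy model. The sketch below uses a plain linear classifier over mean-pooled token embeddings (a stand-in for a fine-tuned BERT head, not the paper's actual model), where the gradient of the predicted-class logit with respect to each token embedding is available in closed form; the token names, dimensions, and random weights are all invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 5 tokens with d=8 embeddings, linear classifier over the
# mean-pooled embedding across 3 hypothetical protocol classes.
d, n_classes = 8, 3
tokens = ["mri", "brain", "without", "contrast", "routine"]
E = rng.normal(size=(len(tokens), d))   # per-token embeddings
W = rng.normal(size=(n_classes, d))     # classifier weights
b = np.zeros(n_classes)

logits = W @ E.mean(axis=0) + b
pred = int(np.argmax(logits))

# For this linear model the gradient of the predicted-class logit w.r.t.
# each token embedding is W[pred] / n_tokens; the gradient-times-input
# score then attributes the logit across individual words.
grad = np.tile(W[pred] / len(tokens), (len(tokens), 1))
importance = (grad * E).sum(axis=1)     # one attribution score per word

ranking = [tokens[i] for i in np.argsort(-importance)]
```

With a real BERT model the gradient is obtained by backpropagation to the input embeddings rather than in closed form, but the attribution step is the same; for this linear case the per-word scores sum exactly to the predicted logit.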

RESULTS: The BERT model came close to human performance on our test set. The BERT model successfully identified relevant words indicative of the target protocol. Analysis of important words in misclassifications revealed potential systematic errors in the model.

CONCLUSIONS: The BERT model shows promise in medical image protocol assignment by reaching near human level performance and identifying key words effectively. The detection of systematic errors paves the way for further refinements to enhance its safety and utility in clinical settings.

PMID:38326769 | DOI:10.1186/s12911-024-02444-z

Categories: Literature Watch
