Deep learning
Video-audio neural network ensemble for comprehensive screening of autism spectrum disorder in young children
PLoS One. 2024 Oct 3;19(10):e0308388. doi: 10.1371/journal.pone.0308388. eCollection 2024.
ABSTRACT
A timely diagnosis of autism is paramount to allow early therapeutic intervention in preschoolers. Deep learning tools have been increasingly used to identify specific autistic symptoms, but they also offer opportunities for broad automated detection of autism at an early age. Here, we leverage a multi-modal approach by combining two neural networks trained on video and audio features of semi-standardized social interactions in a sample of 160 children aged 1 to 5 years old. Our ensemble model performs with an accuracy of 82.5% (F1 score: 0.816, Precision: 0.775, Recall: 0.861) for screening Autism Spectrum Disorders (ASD). Additional combinations of our model were developed to achieve higher specificity (92.5%, i.e., few false positives) or higher sensitivity (90%, i.e., few false negatives). Finally, we found a relationship between the neural network modalities and specific audio versus video ASD characteristics, providing evidence that our neural network implementation effectively took into account different features that are currently standardized under the gold-standard ASD assessment.
PMID:39361665 | DOI:10.1371/journal.pone.0308388
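The abstract does not detail the fusion mechanism, so here is a minimal late-fusion sketch under stated assumptions: each modality network emits a single ASD logit over pre-extracted features, and the ensemble averages the resulting probabilities (the names `video_net` and `audio_net` and all dimensions are hypothetical). Moving the decision threshold is one plausible way to obtain the high-specificity or high-sensitivity variants mentioned above.

```python
import torch
import torch.nn as nn

class LateFusionEnsemble(nn.Module):
    """Fuse ASD probabilities from two pre-trained modality networks."""
    def __init__(self, video_net, audio_net, video_weight=0.5):
        super().__init__()
        self.video_net, self.audio_net = video_net, audio_net
        self.w = video_weight

    def forward(self, video_feats, audio_feats):
        p_video = torch.sigmoid(self.video_net(video_feats))  # P(ASD | video)
        p_audio = torch.sigmoid(self.audio_net(audio_feats))  # P(ASD | audio)
        return self.w * p_video + (1 - self.w) * p_audio      # fused P(ASD)

# Toy usage with stand-in linear "networks" over pre-extracted features.
model = LateFusionEnsemble(nn.Linear(128, 1), nn.Linear(64, 1))
p = model(torch.randn(8, 128), torch.randn(8, 64))   # batch of 8 children
screen = (p > 0.5).squeeze(1)  # raising/lowering 0.5 trades sensitivity for specificity
```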
Improved deep learning prediction of antigen-antibody interactions
Proc Natl Acad Sci U S A. 2024 Oct 8;121(41):e2410529121. doi: 10.1073/pnas.2410529121. Epub 2024 Oct 3.
ABSTRACT
Identifying antibodies that neutralize specific antigens is crucial for developing effective immunotherapies, but this task remains challenging for many target antigens. The rise of deep learning-based computational approaches presents a promising avenue to address this challenge. Here, we assess the performance of a deep learning approach through two benchmark tests aimed at predicting antibodies for the receptor-binding domain of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike protein. Three different strategies for constructing input sequence alignments are employed for predicting structural models of antigen-antibody complexes. In our initial testing set, which comprises known experimental structures, these strategies collectively yield a significant top-ranked prediction for 61% of cases and a success rate of 47%. Notably, one strategy that utilizes the sequences of known antigen binders outperforms the other two, achieving a precision of 90% in a subsequent test set of ~1,000 antibodies, balanced between true and control antibodies for the antigen, albeit with a lower recall of 25%. Our results underscore the potential of integrating deep learning methods with single B cell sequencing techniques to enhance the prediction accuracy of antigen-antibody interactions.
PMID:39361651 | DOI:10.1073/pnas.2410529121
Design of image segmentation model based on residual connection and feature fusion
PLoS One. 2024 Oct 3;19(10):e0309434. doi: 10.1371/journal.pone.0309434. eCollection 2024.
ABSTRACT
With the development of deep learning technology, convolutional neural networks have made great progress in the field of image segmentation. However, for complex scenes and multi-scale target images, existing techniques are still unable to achieve effective image segmentation. In view of this, an image segmentation model based on residual connections and feature fusion is proposed. The model makes comprehensive use of the deep feature extraction ability of residual connections and the multi-scale feature integration ability of feature fusion. To address the problems of background complexity and information loss in traditional image segmentation, experiments were carried out on two publicly available datasets. On the ISPRS Vaihingen dataset and the Caltech-UCSD Birds-200 dataset, the average accuracy of FRes-MFDNN peaked at the 56th and 84th iterations, reaching 97.89% and 98.24%, respectively. On the same two datasets, the F1 value of the FRes-MFDNN method was largest at runtimes of 0.20 s and 0.26 s, respectively, approaching 100%. FRes-MFDNN segmented four images in the ISPRS Vaihingen dataset, with segmentation accuracies of 91.44%, 92.12%, 94.02% and 91.41% for images 1-4, respectively. In practical applications, the MSRF-Net, LBN-AA-SPN and ARG-Otsu methods and FRes-MFDNN were used to segment unlabeled bird images. FRes-MFDNN preserved details more completely, and its overall effect was significantly better than that of the other three models. Meanwhile, in ordinary scene images, despite a certain degree of noise and occlusion, the model still accurately recognized and segmented the main bird subjects. The results show that, compared with traditional models, FRes-MFDNN segmentation significantly improves completeness, detail and the spatial continuity of pixels, making it more suitable for complex scenes.
PMID:39361568 | DOI:10.1371/journal.pone.0309434
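The abstract does not publish the FRes-MFDNN architecture, so the sketch below only illustrates its two named ingredients under assumptions of ours: a residual block for deep feature extraction, and a two-scale feature fusion before the per-pixel classification head (all layer sizes are illustrative).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """3x3 conv block with an identity shortcut (residual connection)."""
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)
    def forward(self, x):
        return F.relu(x + self.conv2(F.relu(self.conv1(x))))

class FusionSegNet(nn.Module):
    """Extract features at two scales, then fuse them for per-pixel labels."""
    def __init__(self, in_ch=3, ch=32, n_classes=2):
        super().__init__()
        self.stem = nn.Conv2d(in_ch, ch, 3, padding=1)
        self.block1, self.block2 = ResidualBlock(ch), ResidualBlock(ch)
        self.head = nn.Conv2d(2 * ch, n_classes, 1)
    def forward(self, x):
        f1 = self.block1(self.stem(x))            # full resolution
        f2 = self.block2(F.max_pool2d(f1, 2))     # half resolution
        f2_up = F.interpolate(f2, size=f1.shape[-2:], mode="bilinear",
                              align_corners=False)
        return self.head(torch.cat([f1, f2_up], dim=1))   # feature fusion

logits = FusionSegNet()(torch.randn(1, 3, 64, 64))        # -> (1, 2, 64, 64)
```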
Machine learning and deep learning-based approach to categorize Bengali comments on social networks using fused dataset
PLoS One. 2024 Oct 3;19(10):e0308862. doi: 10.1371/journal.pone.0308862. eCollection 2024.
ABSTRACT
Through the advancement of the contemporary web and the rapid adoption of social media platforms such as YouTube, Twitter, and Facebook, life has become much easier when dealing with certain highly personal problems. The far-reaching consequences of online harassment require immediate preventative steps, via detection at an early stage, to safeguard psychological wellness and scholarly achievement. This paper aims to help eliminate online harassment and create a criticism-free online environment. We used a variety of attributes to evaluate a large number of Bengali comments. Cleansed data were processed with machine learning (ML) methods and natural language processing techniques, using term frequency-inverse document frequency (TF-IDF) with a count vectorizer. In addition, we used tokenization with padding to feed our deep learning (DL) models. With mathematical visualization and natural language processing, online bullying can be detected quickly. Multi-Layer Perceptron (MLP), K-Nearest Neighbors (K-NN), Extreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost), Logistic Regression (LR), Random Forest (RF), Bagging, Stochastic Gradient Descent (SGD), Voting, and Stacking classifiers are employed in this research. We expanded our investigation to include several DL frameworks: Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Convolutional Long Short-Term Memory (C-LSTM), and Bidirectional Long Short-Term Memory (BiLSTM) are all implemented. A large amount of data is required to recognize harassing behavior precisely, so we combined two datasets, producing 94,000 Bengali comments from different points of view. Comparing the ML and DL models, a hybrid model (MLP+SGD+LR) performed most effectively: on the multi-label classification task, its accuracy is 99.34%, precision is 99.34%, recall is 99.33%, and F1 score is 99.34%. For the binary classification task, accuracy reached 99.41%.
PMID:39361557 | DOI:10.1371/journal.pone.0308862
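As a hedged illustration of the winning hybrid, here is a scikit-learn sketch that combines the three named classifiers (MLP+SGD+LR) behind a TF-IDF vectorizer; the actual preprocessing, hyperparameters, and voting scheme are not specified in the abstract, and the two comments are transliterated placeholder strings.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.neural_network import MLPClassifier

# Hybrid of the three classifiers named in the abstract; hard voting because
# SGDClassifier with hinge loss does not expose predict_proba.
hybrid = VotingClassifier(
    estimators=[
        ("mlp", MLPClassifier(hidden_layer_sizes=(128,), max_iter=300)),
        ("sgd", SGDClassifier(loss="hinge")),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    voting="hard",
)

model = Pipeline([
    ("tfidf", TfidfVectorizer(max_features=50_000)),  # TF-IDF over raw comments
    ("clf", hybrid),
])

# Toy usage; real inputs would be the 94,000 fused Bengali comments.
X = ["khub bhalo post", "ei comment ta kharap"]
y = ["not_bullying", "bullying"]
model.fit(X, y)
print(model.predict(["khub bhalo post"]))
```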
Simultaneous Estimation of Digit Tip Forces and Hand Postures in a Simulated Real-Life Condition With High-Density Electromyography and Deep Learning
IEEE J Biomed Health Inform. 2024 Oct;28(10):5708-5717. doi: 10.1109/JBHI.2024.3350239.
ABSTRACT
In myoelectric control, continuous estimation of multiple degrees of freedom has an important role. Most studies have focused on estimating discrete postures or forces of the human hand, but for a practical prosthetic system, both should be considered. In daily life activities, hand postures vary for grasping different objects and the amount of force exerted on each fingertip depends on the shape and weight of the object. This study aims to investigate the feasibility of continuous estimation of multiple degrees of freedom. We proposed a reach and grasp framework to study both absolute fingertip forces and hand movement types using deep learning techniques applied to high-density surface electromyography (HD-sEMG). Four daily life grasp types were examined, and absolute fingertip forces were estimated simultaneously with the grasp types while grasping various objects. We showed that combining a 3-dimensional Convolutional Neural Network (3DCNN) with a Long Short-term Memory (LSTM) can reliably and continuously estimate the digit tip forces and classify different hand postures in human individuals. The mean absolute error (MAE) and Pearson correlation coefficient (PCC) for the force estimation problem across all fingers and subjects were 0.46 ± 0.23 and 0.90 ± 0.03, respectively; for the classification problem, they were 0.04 ± 0.01 and 0.97 ± 0.02. The results demonstrated that both absolute digit tip forces and hand postures can be successfully estimated through deep learning and HD-sEMG.
PMID:39361489 | DOI:10.1109/JBHI.2024.3350239
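A minimal sketch of the named 3DCNN+LSTM combination, assuming HD-sEMG is windowed into small spatio-temporal volumes that a shared 3D-CNN encodes before an LSTM aggregates them over time into separate force and grasp heads (all dimensions are illustrative, not the authors' configuration).

```python
import torch
import torch.nn as nn

class CNN3DLSTM(nn.Module):
    """3D-CNN encoder over HD-sEMG windows + LSTM over time, with two heads."""
    def __init__(self, n_fingers=5, n_grasps=4, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(            # input: (B*T, 1, D, H, W)
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d((2, 4, 4)), nn.Flatten(),  # -> 8*2*4*4 = 256
        )
        self.lstm = nn.LSTM(256, hidden, batch_first=True)
        self.force_head = nn.Linear(hidden, n_fingers)   # absolute fingertip forces
        self.grasp_head = nn.Linear(hidden, n_grasps)    # grasp-type logits

    def forward(self, x):                        # x: (B, T, 1, D, H, W)
        B, T = x.shape[:2]
        z = self.encoder(x.flatten(0, 1)).view(B, T, -1)
        h, _ = self.lstm(z)                      # per-timestep hidden states
        return self.force_head(h), self.grasp_head(h)

forces, grasps = CNN3DLSTM()(torch.randn(2, 10, 1, 4, 8, 8))
```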
Sparse Non-Local CRF With Applications
IEEE Trans Pattern Anal Mach Intell. 2024 Oct 3;PP. doi: 10.1109/TPAMI.2024.3474468. Online ahead of print.
ABSTRACT
CRFs model spatial coherence in classical and deep learning computer vision. The most common CRF is called pairwise, as it connects pixel pairs. There are two types of pairwise CRF: sparse and dense. A sparse CRF connects nearby pixels, leading to a linear number of connections in the image size. A dense CRF connects all pixel pairs, leading to a quadratic number of connections. While dense CRF is a more general model, it is much less efficient than sparse CRF. In fact, only Gaussian-edge dense CRF is used in practice, and even then with approximations. We propose a new pairwise CRF, which we call sparse non-local CRF. Like dense CRF, it has non-local connections, and, therefore, it is more general than sparse CRF. Like sparse CRF, the number of connections is linear, and, therefore, our model is efficient. Besides efficiency, another advantage is that our edge weights are unrestricted. We show that our sparse non-local CRF models properties similar to those of Gaussian dense CRF. We also discuss connections to other CRF models. We demonstrate the usefulness of our model on classical and deep learning applications, for two and multiple labels.
PMID:39361458 | DOI:10.1109/TPAMI.2024.3474468
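For reference, a pairwise CRF minimizes an energy of the standard form below, where the theta_p are unary terms and the theta_pq pairwise terms; the three variants differ only in the edge set: sparse CRF takes nearby pairs (|E| = O(n) in the number of pixels n), dense CRF takes all pairs (|E| = O(n^2)), and the proposed sparse non-local CRF takes a linear-size set of possibly distant pairs with unrestricted weights.

```latex
E(\mathbf{x}) \;=\; \sum_{p} \theta_p(x_p) \;+\; \sum_{(p,q)\in\mathcal{E}} \theta_{pq}(x_p, x_q)
```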
CHiMP: deep-learning tools trained on protein crystallization micrographs to enable automation of experiments
Acta Crystallogr D Struct Biol. 2024 Oct 1;80(Pt 10):744-764. doi: 10.1107/S2059798324009276. Epub 2024 Oct 1.
ABSTRACT
A group of three deep-learning tools, referred to collectively as CHiMP (Crystal Hits in My Plate), were created for analysis of micrographs of protein crystallization experiments at the Diamond Light Source (DLS) synchrotron, UK. The first tool, a classification network, assigns images into categories relating to experimental outcomes. The other two tools are networks that perform both object detection and instance segmentation, resulting in masks of individual crystals in the first case and masks of crystallization droplets in addition to crystals in the second case, allowing the positions and sizes of these entities to be recorded. The creation of these tools used transfer learning, where weights from a pre-trained deep-learning network were used as a starting point and repurposed by further training on a relatively small set of data. Two of the tools are now integrated at the VMXi macromolecular crystallography beamline at DLS, where they have the potential to obviate the need for any user input, both for monitoring crystallization experiments and for triggering in situ data collections. The third is being integrated into the XChem fragment-based drug-discovery screening platform, also at DLS, to allow the automatic targeting of acoustic compound dispensing into crystallization droplets.
PMID:39361357 | DOI:10.1107/S2059798324009276
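The abstract describes transfer learning onto object detection plus instance segmentation; one common realization (an assumption of ours, not necessarily the CHiMP implementation) is to repurpose a COCO-pretrained Mask R-CNN from torchvision by swapping its box and mask heads for the crystal/droplet classes before fine-tuning on the small labelled set.

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

# Start from COCO-pretrained weights, then swap the heads for our classes:
# background, crystal, droplet.
num_classes = 3
model = maskrcnn_resnet50_fpn(weights="DEFAULT")

in_feats = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_feats, num_classes)

in_feats_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_feats_mask, 256, num_classes)

# Fine-tune on the (relatively small) set of labelled micrographs.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
```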
OCT Radiomic Features Used for the Assessment of Activity of Thyroid Eye Disease
J Craniofac Surg. 2024 Sep 9. doi: 10.1097/SCS.0000000000010503. Online ahead of print.
ABSTRACT
This retrospective study aimed to develop deep-learning radiomics models based on optical coherence tomography (OCT) scans to evaluate the activity of thyroid eye disease. The study included 33 patients (66 orbits) diagnosed with thyroid eye disease at Beijing Tongren Hospital between July 2021 and August 2022. We collected OCT scans, clinical activity scores, and medical records of the patients. Patients were divided into active and inactive groups based on the clinical activity score and then split into a training set and a test set at a ratio of ∼7:3. The macula-centered horizontal meridian image was used to identify the regions of interest using 3D Slicer. Radiomics features were extracted and selected by t test and the least absolute shrinkage and selection operator (LASSO) regression algorithm with 10-fold cross-validation. Random forest (RF) and support vector machine (SVM) models were built on retinal or choroid features and validated by receiver operating characteristic curves and area under the curve (AUC). For the retinal features, the AUCs in the test set were 0.800 (RF) and 0.840 (SVM); for the choroid features, the AUCs were 0.733 (RF) and 0.813 (SVM). In the confusion matrix, the choroid-based SVM model showed more balanced performance than the retina-based SVM model. OCT-based deep-learning radiomics analysis can be used to evaluate disease activity, providing convenience in clinical practice.
PMID:39361327 | DOI:10.1097/SCS.0000000000010503
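A hedged sketch of the described feature-selection and classification stage, assuming radiomics features have already been extracted per orbit: a t-test filter, LASSO with 10-fold cross-validation, then an SVM scored by AUC (the random data, thresholds, and fallbacks are illustrative only, not the authors' settings).

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.linear_model import LassoCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(66, 100))          # 66 orbits x 100 radiomics features
y = rng.integers(0, 2, size=66)         # active vs. inactive

# 1) t-test filter: keep features that differ between groups.
pvals = np.array([ttest_ind(X[y == 1, j], X[y == 0, j]).pvalue
                  for j in range(X.shape[1])])
keep = pvals < 0.05
X = X[:, keep] if keep.any() else X     # fall back if nothing passes (toy data)

# 2) LASSO with 10-fold CV selects a sparse subset of the remaining features.
lasso = LassoCV(cv=10).fit(X, y)
mask = lasso.coef_ != 0
X = X[:, mask] if mask.any() else X     # fall back if LASSO zeroes everything

# 3) SVM evaluated by AUC on a held-out ~30% test split (the "7:3" ratio above).
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)
svm = make_pipeline(StandardScaler(), SVC(probability=True)).fit(Xtr, ytr)
print("AUC:", roc_auc_score(yte, svm.predict_proba(Xte)[:, 1]))
```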
A flexible deep learning framework for liver tumor diagnosis using variable multi-phase contrast-enhanced CT scans
J Cancer Res Clin Oncol. 2024 Oct 3;150(10):443. doi: 10.1007/s00432-024-05977-y.
ABSTRACT
BACKGROUND: Liver cancer is a significant cause of cancer-related mortality worldwide and requires tailored treatment strategies for different types. However, accurate preoperative diagnosis of the type presents a challenge. This study aims to develop an automatic diagnostic model based on multi-phase contrast-enhanced CT (CECT) images to distinguish between hepatocellular carcinoma (HCC), intrahepatic cholangiocarcinoma (ICC), and normal individuals.
METHODS: We designed a Hierarchical Long Short-Term Memory (H-LSTM) model, whose core components consist of a shared image feature extractor across phases, an internal LSTM for each phase, and an external LSTM across phases. The internal LSTM aggregates features from different layers of 2D CECT images, while the external LSTM aggregates features across different phases. H-LSTM can handle incomplete phases and varying numbers of CECT image layers, making it suitable for real-world decision support scenarios. Additionally, we applied phase augmentation techniques to process multi-phase CECT images, improving the model's robustness.
RESULTS: The H-LSTM model achieved an overall average AUROC of 0.93 (0.90, 1.00) on the test dataset, with AUROC for HCC classification reaching 0.97 (0.93, 1.00) and for ICC classification reaching 0.90 (0.78, 1.00). Comprehensive validation in scenarios with incomplete phases was performed, with the H-LSTM model consistently achieving AUROC values over 0.9.
CONCLUSION: The proposed H-LSTM model can be employed for classification tasks involving incomplete phases of CECT images in real-world scenarios, demonstrating high performance. This highlights the potential of AI-assisted systems in achieving accurate diagnosis and treatment of liver cancer. H-LSTM offers an effective solution for processing multi-phase data and provides practical value for clinical diagnostics.
PMID:39361193 | DOI:10.1007/s00432-024-05977-y
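A minimal sketch of the H-LSTM design as described: a shared 2D slice encoder, an internal LSTM aggregating the variable number of slices within each phase, and an external LSTM aggregating across however many phases are available; layer sizes are illustrative, not the published configuration.

```python
import torch
import torch.nn as nn

class HLSTM(nn.Module):
    """Shared slice encoder -> per-phase (internal) LSTM -> cross-phase (external) LSTM."""
    def __init__(self, feat=64, hidden=64, n_classes=3):
        super().__init__()
        self.encoder = nn.Sequential(              # shared across all phases
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(), nn.Linear(8 * 16, feat),
        )
        self.internal = nn.LSTM(feat, hidden, batch_first=True)
        self.external = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)   # HCC / ICC / normal

    def forward(self, phases):                     # list of (n_slices_i, 1, H, W)
        phase_vecs = []
        for slices in phases:                      # the phase list may be incomplete
            z = self.encoder(slices).unsqueeze(0)  # (1, n_slices, feat)
            _, (h, _) = self.internal(z)
            phase_vecs.append(h[-1])               # summary vector of this phase
        seq = torch.stack(phase_vecs, dim=1)       # (1, n_phases, hidden)
        _, (h, _) = self.external(seq)
        return self.head(h[-1])                    # (1, n_classes) logits

# Two phases with different slice counts (e.g., arterial: 12, venous: 9).
logits = HLSTM()([torch.randn(12, 1, 32, 32), torch.randn(9, 1, 32, 32)])
```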
BEATRICE: Bayesian Fine-mapping from Summary Data using Deep Variational Inference
Bioinformatics. 2024 Oct 3:btae590. doi: 10.1093/bioinformatics/btae590. Online ahead of print.
ABSTRACT
MOTIVATION: We introduce a novel framework, BEATRICE, to identify putative causal variants from GWAS summary statistics. Identifying causal variants is challenging due to their sparsity and high correlation in the nearby regions. To account for these challenges, we rely on a hierarchical Bayesian model that imposes a binary concrete prior on the set of causal variants. We derive a variational algorithm for this fine-mapping problem by minimizing the KL divergence between an approximate density and the posterior probability distribution of the causal configurations. Correspondingly, we use a deep neural network as an inference machine to estimate the parameters of our proposal distribution. Our stochastic optimization procedure allows us to sample from the space of causal configurations, which we use to compute the posterior inclusion probabilities and determine credible sets for each causal variant. We conduct a detailed simulation study to quantify the performance of our framework against two state-of-the-art baseline methods across different numbers of causal variants and noise paradigms, as defined by the relative genetic contributions of causal and non-causal variants.
RESULTS: We demonstrate that BEATRICE achieves uniformly better coverage with comparable power and set sizes, and that the performance gain increases with the number of causal variants. We also show the efficacy of BEATRICE in finding causal variants from a GWAS study of Alzheimer's disease. In comparison to the baselines, only BEATRICE successfully finds the APOE ϵ2 allele, a variant commonly associated with Alzheimer's.
AVAILABILITY: BEATRICE is available for download at https://github.com/sayangsep/Beatrice-Finemapping.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
PMID:39360993 | DOI:10.1093/bioinformatics/btae590
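Central to BEATRICE is the binary concrete prior; below is a minimal sketch of the reparameterized binary concrete sample (Maddison et al., 2017) that makes the causal-configuration variables differentiable, so the variational parameters can be optimized by gradient descent on the KL objective. This is an illustration of the distribution, not the authors' implementation.

```python
import torch

def sample_binary_concrete(logits, temperature=0.5):
    """Reparameterized sample from the binary concrete distribution:
    a differentiable relaxation of Bernoulli (Maddison et al., 2017)."""
    u = torch.rand_like(logits).clamp(1e-6, 1 - 1e-6)
    logistic = torch.log(u) - torch.log1p(-u)        # Logistic(0, 1) noise
    return torch.sigmoid((logits + logistic) / temperature)

# Variational parameters for 100 variants; in BEATRICE these would be the
# output of the neural inference machine and trained to minimize the KL
# divergence to the posterior over causal configurations.
logits = torch.zeros(100, requires_grad=True)
gamma = sample_binary_concrete(logits)               # soft causal indicators
pip = torch.sigmoid(logits)                          # posterior inclusion probs
```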
Learning to Explain is a Good Biomedical Few-Shot Learner
Bioinformatics. 2024 Oct 3:btae589. doi: 10.1093/bioinformatics/btae589. Online ahead of print.
ABSTRACT
MOTIVATION: Significant progress has been achieved in biomedical text mining using deep learning methods, which rely heavily on large amounts of high-quality data annotated by human experts. However, the reality is that obtaining high-quality annotated data is extremely challenging due to data scarcity (e.g., rare or new diseases), data privacy and security concerns, and the high cost of data annotation. Additionally, nearly all research focuses on predicting labels without providing corresponding explanations. Therefore, in this paper, we investigate a more realistic scenario, biomedical few-shot learning, and explore the impact of interpretability on biomedical few-shot learning.
RESULTS: We present LetEx (Learning to Explain), a novel multi-task generative approach that leverages reasoning explanations from large language models (LLMs) to enhance the inductive reasoning ability of few-shot learning. Our approach includes (1) collecting high-quality explanations by devising a complete LLM-based workflow using chain-of-thought (CoT) prompting and self-training strategies, and (2) converting various biomedical NLP tasks into a unified text-to-text generation task, where the collected explanations serve as additional supervision between text-label pairs during multi-task training. Experiments are conducted on three few-shot settings across six biomedical benchmark datasets. The results show that learning to explain improves the performance of diverse biomedical NLP tasks in low-resource scenarios, outperforming strong baseline models significantly by up to 6.41%. Notably, the proposed method endows the 220M-parameter LetEx with reasoning-explanation ability superior to that of LLMs.
AVAILABILITY: Our source code and data are available at https://github.com/cpmss521/LetEx.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
PMID:39360976 | DOI:10.1093/bioinformatics/btae589
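A hedged sketch of the unified text-to-text conversion step, where a collected explanation is appended to the target as additional supervision; the field layout and the example are our illustration, not the actual LetEx prompt format.

```python
def to_text_to_text(task, text, label, explanation):
    """Cast one labelled example into a unified text-to-text pair, with the
    LLM-collected explanation appended to the target as extra supervision."""
    source = f"Task: {task}\nInput: {text}\nAnswer with a label and an explanation."
    target = f"Label: {label}\nExplanation: {explanation}"
    return source, target

src, tgt = to_text_to_text(
    task="relation extraction",
    text="Aspirin reduces the risk of myocardial infarction.",
    label="treatment",
    explanation="Aspirin is described as lowering the risk of the condition, "
                "indicating a treatment relation.",
)
```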
Alchemical Transformations and Beyond: Recent Advances and Real-World Applications of Free Energy Calculations in Drug Discovery
J Chem Inf Model. 2024 Oct 3. doi: 10.1021/acs.jcim.4c01024. Online ahead of print.
ABSTRACT
Computational methods constitute efficient strategies for screening and optimizing potential drug molecules. A critical factor in this process is the binding affinity between candidate molecules and targets, quantified as binding free energy. Among various estimation methods, alchemical transformation methods stand out for their theoretical rigor. Despite challenges in force field accuracy and sampling efficiency, advancements in algorithms, software, and hardware have increased the application of free energy perturbation (FEP) calculations in the pharmaceutical industry. Here, we review the practical applications of FEP in drug discovery projects since 2018, covering both ligand-centric and residue-centric transformations. We show that relative binding free energy calculations have steadily achieved chemical accuracy in real-world applications. In addition, we discuss alternative physics-based simulation methods and the incorporation of deep learning into free energy calculations.
PMID:39360948 | DOI:10.1021/acs.jcim.4c01024
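For context, the identity underlying FEP is the textbook Zwanzig relation shown below: the free-energy difference between states A and B is an exponential average of the potential-energy difference, taken over configurations sampled from state A. Relative binding free energies then follow from a thermodynamic cycle that applies such alchemical transformations in both the bound and solvated states and takes the difference.

```latex
\Delta F_{A \to B} \;=\; -\,k_{\mathrm{B}} T \,
\ln \left\langle \exp\!\left[-\frac{U_B - U_A}{k_{\mathrm{B}} T}\right] \right\rangle_{A}
```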
Artificial intelligence performance in testing microfluidics for point-of-care
Lab Chip. 2024 Oct 3. doi: 10.1039/d4lc00671b. Online ahead of print.
ABSTRACT
Artificial intelligence (AI) is revolutionizing medicine by automating tasks like image segmentation and pattern recognition. These AI approaches support seamless integration with existing platforms, enhancing diagnostics, treatment, and patient care. While recent advancements have demonstrated AI's superiority in advancing microfluidics for point-of-care (POC) diagnostics, a gap remains in comparative evaluations of AI algorithms for testing microfluidics. We conducted a comparative evaluation of AI models on the two-class classification problem of identifying the presence or absence of bubbles in microfluidic channels under various imaging conditions. Using a model microfluidic system with a single channel loaded with 3D transparent objects (bubbles), we challenged each of the tested machine learning (ML) (n = 6) and deep learning (DL) (n = 9) models across different background settings. Evaluation revealed that the random forest ML model achieved 95.52% sensitivity, 82.57% specificity, and 97% AUC, outperforming the other ML algorithms. Among DL models suitable for mobile integration, DenseNet169 demonstrated superior performance, achieving 92.63% sensitivity, 92.22% specificity, and 92% AUC. Remarkably, DenseNet169 integration into a mobile POC system demonstrated exceptional accuracy (>0.84) in testing microfluidics under challenging imaging settings. Our study confirms the transformative potential of AI in healthcare, emphasizing its capacity to revolutionize precision medicine through accurate and accessible diagnostics. The integration of AI into healthcare systems holds promise for enhancing patient outcomes and streamlining healthcare delivery.
PMID:39360887 | DOI:10.1039/d4lc00671b
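A minimal sketch of how DenseNet169 is typically adapted to such a two-class task (our assumption; the paper's exact training setup is not given in the abstract): start from ImageNet weights, replace the classifier head with a two-way output, and optionally freeze the feature extractor.

```python
import torch
import torch.nn as nn
from torchvision.models import densenet169

# ImageNet-pretrained backbone with a new bubble / no-bubble head.
model = densenet169(weights="DEFAULT")
model.classifier = nn.Linear(model.classifier.in_features, 2)

# Optionally freeze the feature extractor and train only the new head.
for p in model.features.parameters():
    p.requires_grad = False

logits = model(torch.randn(1, 3, 224, 224))   # one channel-replicated frame
```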
Deep learning models for tendinopathy detection: a systematic review and meta-analysis of diagnostic tests
EFORT Open Rev. 2024 Oct 3;9(10):941-952. doi: 10.1530/EOR-24-0016.
ABSTRACT
PURPOSE: Different deep-learning models have been employed to aid in the diagnosis of musculoskeletal pathologies. The diagnosis of tendon pathologies could particularly benefit from applying these technologies. The objective of this study is to assess the performance of deep learning models in diagnosing tendon pathologies using various imaging modalities.
METHODS: A meta-analysis was conducted, with searches performed on MEDLINE/PubMed, SCOPUS, Cochrane Library, Lilacs, and SciELO. The QUADAS-2 tool was employed to assess the quality of the studies. Diagnostic measures, such as sensitivity, specificity, diagnostic odds ratio, positive and negative likelihood ratios, area under the curve, and summary receiver operating characteristic, were included using a random-effects model. Heterogeneity and subgroup analyses were also conducted. All statistical analyses and plots were generated using the R software package. The PROSPERO ID is CRD42024506491.
RESULTS: Eleven deep-learning models from six articles were analyzed. In the random-effects models, the sensitivity and specificity of the algorithms for detecting tendon conditions were 0.910 (95% CI: 0.865; 0.940) and 0.954 (95% CI: 0.909; 0.977), respectively. The PLR, NLR, lnDOR, and AUC estimates were 37.075 (95% CI: 4.654; 69.496), 0.114 (95% CI: 0.056; 0.171), 5.160 (95% CI: 4.070; 6.250; P < 0.001), and 96%, respectively.
CONCLUSION: The deep-learning algorithms demonstrated a high level of accuracy in detecting tendon anomalies. Their overall robust performance suggests potential application as a valuable complementary tool in diagnosing medical images.
PMID:39360789 | DOI:10.1530/EOR-24-0016
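For reference, the reported measures relate through the standard diagnostic-test identities below; note that pooled estimates from a random-effects meta-analysis are fit per measure, so they need not satisfy these identities exactly.

```latex
\mathrm{PLR} = \frac{\mathrm{sensitivity}}{1 - \mathrm{specificity}}, \qquad
\mathrm{NLR} = \frac{1 - \mathrm{sensitivity}}{\mathrm{specificity}}, \qquad
\mathrm{DOR} = \frac{\mathrm{PLR}}{\mathrm{NLR}}
```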
Deep learning predicted perceived age is a reliable approach for analysis of facial ageing: A proof of principle study
J Eur Acad Dermatol Venereol. 2024 Oct 3. doi: 10.1111/jdv.20365. Online ahead of print.
ABSTRACT
BACKGROUND: Perceived age (PA) has been associated with mortality, genetic variants linked to ageing and several age-related morbidities. However, human-estimated PA in large datasets is laborious and costly to generate, limiting its practical applicability.
OBJECTIVES: To determine if estimating PA using deep learning-based algorithms results in the same associations with morbidities and genetic variants as human-estimated perceived age.
METHODS: Self-supervised learning (SSL) and deep feature transfer (DFT) deep learning (DL) approaches were trained and tested on human-estimated PAs and their corresponding frontal face images of middle-aged to elderly Dutch participants (n = 2679) from a population-based study in the Netherlands. We compared the DL-estimated PAs with morbidities previously associated with human-estimated PA as well as genetic variants in the gene MC1R; we additionally tested the PA associations with MC1R in a new validation cohort (n = 1158).
RESULTS: The DL approaches predicted PA in this population with a mean absolute error of 2.84 years (DFT) and 2.39 years (SSL). In the training-test dataset, we found the same significant (p < 0.05) associations for DL PA with osteoporosis, age-related hearing loss (ARHL), cognition, COPD, cataracts and MC1R variants as with human PA. We also found a similar but less significant association for SSL and DFT PAs (0.69 and 0.71 years per allele, p = 0.008 and 0.011, respectively) with MC1R variants in the validation dataset as that found with human, SSL and DFT PAs in the training-test dataset (0.79, 0.78 and 0.71 years per allele, respectively; all p < 0.0001).
CONCLUSIONS: Deep learning methods can automatically estimate PA from facial images with enough accuracy to replicate known links between human-estimated perceived age and several age-related morbidities. Furthermore, DL-predicted perceived age was associated with MC1R gene variants in a validation cohort. Hence, such DL PA techniques may be used instead of human estimation in perceived-age studies, thereby reducing time and costs.
PMID:39360788 | DOI:10.1111/jdv.20365
A deep learning approach to case prioritisation of colorectal biopsies
Histopathology. 2024 Oct 3. doi: 10.1111/his.15331. Online ahead of print.
ABSTRACT
AIMS: To create and validate a weakly supervised artificial intelligence (AI) model for detection of abnormal colorectal histology, including dysplasia and cancer, and prioritise biopsies according to clinical significance (severity of diagnosis).
MATERIALS AND METHODS: Triagnexia Colorectal, a weakly supervised deep learning model, was developed for the classification of colorectal samples from haematoxylin and eosin (H&E)-stained whole slide images. The model was trained on 24 983 digitised images and assessed by multiple pathologists in a simulated digital pathology environment. The AI application was implemented as part of a point-and-click graphical user interface to streamline decision-making. Pathologists assessed the accuracy of the AI tool, its value, its ease of use and its integration into the digital pathology workflow.
RESULTS: Validation of the model was conducted on two cohorts: the first, of 100 single-slide cases, achieved a micro-average model specificity of 0.984, sensitivity of 0.949 and F1 score of 0.949 across all classes. A secondary multi-institutional validation cohort, of 101 single-slide cases, achieved a micro-average model specificity of 0.978, sensitivity of 0.931 and F1 score of 0.931 across all classes. Pathologists reported positive impressions of the overall accuracy of the AI in detecting colorectal pathology abnormalities.
CONCLUSIONS: We have developed a high-performing colorectal biopsy AI triage model that can be integrated into a routine digital pathology workflow to assist pathologists in prioritising cases and identifying cases with dysplasia/cancer versus non-neoplastic biopsies.
PMID:39360579 | DOI:10.1111/his.15331
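The abstract specifies weak supervision from slide-level labels; a common realization (our illustration, not necessarily the Triagnexia architecture) is attention-based multiple-instance learning (Ilse et al., 2018), which pools tile embeddings into a slide prediction so that only the slide-level diagnosis is needed for training.

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """Attention-based MIL pooling: a slide label from a bag of tile embeddings."""
    def __init__(self, dim=512, n_classes=3):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(dim, 128), nn.Tanh(), nn.Linear(128, 1))
        self.head = nn.Linear(dim, n_classes)   # e.g. normal / dysplasia / cancer

    def forward(self, tiles):                   # tiles: (n_tiles, dim)
        a = torch.softmax(self.attn(tiles), dim=0)   # per-tile attention weights
        slide = (a * tiles).sum(dim=0)               # weighted slide embedding
        return self.head(slide), a.squeeze(1)        # logits + tile weights

logits, weights = AttentionMIL()(torch.randn(1000, 512))  # one slide, 1000 tiles
```

The attention weights double as a crude localization signal, which fits the triage use case: tiles with high weight indicate the regions driving a dysplasia/cancer call.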
YOLO-Faster: An efficient remote sensing object detection method based on AMFFN
Sci Prog. 2024 Oct-Dec;107(4):368504241280765. doi: 10.1177/00368504241280765.
ABSTRACT
As a pivotal task within computer vision, object detection finds application across a diverse spectrum of industrial scenarios. The advent of deep learning technologies has significantly elevated the accuracy of object detectors designed for general-purpose applications. Nevertheless, in contrast to conventional terrestrial environments, remote sensing object detection poses formidable challenges, including intricate and diverse backgrounds, fluctuating object scales, and pronounced interference from background noise, rendering it an enduringly demanding task. In addition, despite the superior detection performance of deep learning-based object detection networks compared to traditional counterparts, their substantial parameter and computational demands curtail their feasibility for deployment on mobile devices equipped with low-power processors. In response to these challenges, this paper introduces an enhanced lightweight remote sensing object detection network, denoted YOLO-Faster, built upon the foundation of YOLOv5. First, the lightweight design and inference speed of the detector are improved by incorporating a lightweight backbone network into YOLOv5, satisfying the demand for real-time detection on mobile devices. Second, to tackle the issue of detecting objects of different scales against large and complex backgrounds, an adaptive multiscale feature fusion network is introduced, which dynamically adjusts a large receptive field to capture dependencies among objects of different scales, enabling better modeling of object detection scenarios in remote sensing scenes. Finally, the robustness of the detector under background noise is enhanced by incorporating a decoupled detection head that separates the classification and regression processes. Results on the public remote sensing object detection dataset DOTA show that the proposed method achieves a mean average precision of 71.4% and a detection speed of 38 frames per second.
PMID:39360473 | DOI:10.1177/00368504241280765
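A minimal sketch of a decoupled detection head of the kind described, with separate classification and box-regression branches over a shared feature map; channel counts and activations are illustrative (DOTA-v1.0 has 15 object categories).

```python
import torch
import torch.nn as nn

class DecoupledHead(nn.Module):
    """Separate classification and box-regression branches per feature map."""
    def __init__(self, in_ch=256, n_classes=15, n_anchors=1):
        super().__init__()
        self.stem = nn.Conv2d(in_ch, in_ch, 1)
        self.cls_branch = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(in_ch, n_anchors * n_classes, 1),   # class logits
        )
        self.reg_branch = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(in_ch, n_anchors * 4, 1),           # box offsets
        )

    def forward(self, feat):
        x = self.stem(feat)
        return self.cls_branch(x), self.reg_branch(x)

cls, box = DecoupledHead()(torch.randn(1, 256, 40, 40))
```

Keeping the two branches separate avoids the conflict between classification and localization features that a single coupled head must trade off, which is the robustness argument made above.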
Visual Enhancement and Semantic Segmentation of Murine Tissue Scans with Pulsed THz Spectroscopy
Proc IEEE Int Conf Semant Comput. 2023 Feb;2023:80-87. doi: 10.1109/ICSC56153.2023.00018. Epub 2023 Mar 20.
ABSTRACT
Semantic artificial intelligence has certain qualities that are advantageous for deep learning-based medical imaging tasks. Medical images can be augmented by injecting semantic context into the underlying classification mechanism, increasing the information density of the scan, which can ultimately provide more trust in the result. This work considers an application of semantic AI to segment tissue types from excised breast tumors imaged with pulsed terahertz (THz) radiation, an emerging imaging technology. Prior work has demonstrated that traditional data-driven deep learning on THz has two key weaknesses: 1) low image resolution compared to other state-of-the-art imaging techniques, and 2) a lack of expertly labeled images due to domain transformation and tissue changes during histopathology. This work seeks to address these limitations through two semantic AI mechanisms. Specifically, we introduce a two-stage pipeline using an unsupervised image-to-image translation network and a supervised segmentation network. The combination of these contributions enables enhanced near-real-time visualization of excised tissue through THz scans and a supervised segmentation and classification training strategy that uses only synthetic THz scans generated by our bi-directional image-to-image translation network.
PMID:39360127 | PMC:PMC11445794 | DOI:10.1109/ICSC56153.2023.00018
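A compact sketch of the two-stage composition under stated assumptions: a stub generator stands in for the bi-directional translation network that renders labelled histology into synthetic THz scans, which then supervise the segmentation network, so no expertly labelled THz images are needed.

```python
import torch
import torch.nn as nn

# Stage 1 (stub): an image-to-image generator mapping labelled histology
# images into the THz domain; in the paper this is a bi-directional
# translation network trained without paired data.
generator = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(8, 1, 3, padding=1))

# Stage 2: a segmentation network trained purely on synthetic THz scans,
# reusing the histology labels that came with the source images.
segmenter = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(8, 3, 1))          # 3 tissue classes

histology = torch.randn(4, 1, 64, 64)
labels = torch.randint(0, 3, (4, 64, 64))              # per-pixel tissue labels
synthetic_thz = generator(histology).detach()          # stage-1 output
loss = nn.functional.cross_entropy(segmenter(synthetic_thz), labels)
loss.backward()                                        # trains the segmenter only
```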
Investigating the contribution of image time series observations to cauliflower harvest-readiness prediction
Front Artif Intell. 2024 Sep 18;7:1416323. doi: 10.3389/frai.2024.1416323. eCollection 2024.
ABSTRACT
Cauliflower cultivation is subject to high-quality control criteria during sales, which underlines the importance of accurate harvest timing. Using time series data for plant phenotyping can provide insights into the dynamic development of cauliflower and allow more accurate predictions of when the crop is ready for harvest than single-time observations. However, data acquisition on a daily or weekly basis is resource-intensive, making the selection of acquisition days highly important. We investigate which data acquisition days and development stages positively affect model accuracy, to gain insights into prediction-relevant observation days and aid future data acquisition planning. We analyze harvest-readiness using the cauliflower image time series of the GrowliFlower dataset. We use an adjusted ResNet18 classification model, including positional encoding of the data acquisition dates to add implicit information about development. The explainable machine learning approach GroupSHAP analyzes the contributions of individual time points. Time points with the lowest mean absolute contribution are excluded from the time series to determine their effect on model accuracy. Using image time series rather than single time points, we achieve an increase in accuracy of 4%. GroupSHAP allows the selection of time points that positively affect model accuracy. By using seven selected time points instead of all 11, the accuracy improves by an additional 4%, resulting in an overall accuracy of 89.3%. The selection of time points may therefore lead to a reduction in data collection in the future.
PMID:39359647 | PMC:PMC11445755 | DOI:10.3389/frai.2024.1416323
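A hedged sketch of the positional-encoding idea: a sinusoidal encoding of each acquisition day is concatenated to the per-image CNN features, giving the classifier implicit information about developmental stage (the day values, encoding dimension, and feature size are illustrative, not the paper's settings).

```python
import torch

def encode_day(day: torch.Tensor, dim: int = 8):
    """Sinusoidal positional encoding of the acquisition day (0 = sowing)."""
    freqs = 1.0 / (10000 ** (torch.arange(0, dim, 2) / dim))
    angles = day[:, None] * freqs[None, :]
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=1)

cnn_features = torch.randn(11, 512)            # one plant, 11 acquisition days
days = torch.tensor([0., 7, 14, 21, 28, 35, 42, 49, 56, 63, 70])
fused = torch.cat([cnn_features, encode_day(days)], dim=1)   # (11, 520)
```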
Predictive Modeling with Temporal Graphical Representation on Electronic Health Records
IJCAI (U S). 2024 Aug;2024:5763-5771. doi: 10.24963/ijcai.2024/637.
ABSTRACT
Deep learning-based predictive models, leveraging Electronic Health Records (EHR), are receiving increasing attention in healthcare. An effective representation of a patient's EHR should hierarchically encompass both the temporal relationships between historical visits and medical events, and the inherent structural information within these elements. Existing patient representation methods can be roughly categorized into sequential representation and graphical representation. The sequential representation methods focus only on the temporal relationships among longitudinal visits. On the other hand, the graphical representation approaches, while adept at extracting the graph-structured relationships between various medical events, fall short of effectively integrating temporal information. To capture both types of information, we model a patient's EHR as a novel temporal heterogeneous graph. This graph includes historical visit nodes and medical event nodes. It propagates structured information from medical event nodes to visit nodes and utilizes time-aware visit nodes to capture changes in the patient's health status. Furthermore, we introduce a novel temporal graph transformer (TRANS) that integrates temporal edge features, global positional encoding, and local structural encoding into heterogeneous graph convolution, capturing both temporal and structural information. We validate the effectiveness of TRANS through extensive experiments on three real-world datasets. The results show that our proposed approach achieves state-of-the-art performance.
PMID:39359569 | PMC:PMC11446542 | DOI:10.24963/ijcai.2024/637
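A minimal sketch of the temporal heterogeneous graph for one patient, as described above: event-to-visit edges carry structural information, while visit-to-visit edges carry time gaps as temporal edge features (the toy visit times and event assignments are illustrative).

```python
import torch

# Three visits, each linked to the medical-event nodes recorded at that
# visit, plus time-aware visit->visit edges carrying the gap in days.
visit_times = torch.tensor([0.0, 30.0, 95.0])          # days since first visit
events_of_visit = {0: [0, 1], 1: [1, 2], 2: [3]}       # event ids per visit

event_to_visit = torch.tensor(
    [[e, v] for v, evs in events_of_visit.items() for e in evs]).T
visit_to_visit = torch.tensor([[0, 1], [1, 2]]).T
time_gap = visit_times[visit_to_visit[1]] - visit_times[visit_to_visit[0]]

print(event_to_visit)   # structural edges: events -> visits
print(time_gap)         # temporal edge features: tensor([30., 65.])
```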