Deep learning
All-optical complex field imaging using diffractive processors
Light Sci Appl. 2024 May 28;13(1):120. doi: 10.1038/s41377-024-01482-6.
ABSTRACT
Complex field imaging, which captures both the amplitude and phase information of input optical fields or objects, can offer rich structural insights into samples, such as their absorption and refractive index distributions. However, conventional image sensors are intensity-based and inherently lack the capability to directly measure the phase distribution of a field. This limitation can be overcome using interferometric or holographic methods, often supplemented by iterative phase retrieval algorithms, leading to a considerable increase in hardware complexity and computational demand. Here, we present a complex field imager design that enables snapshot imaging of both the amplitude and quantitative phase information of input fields using an intensity-based sensor array without any digital processing. Our design utilizes successive deep learning-optimized diffractive surfaces that are structured to collectively modulate the input complex field, forming two independent imaging channels that perform amplitude-to-amplitude and phase-to-intensity transformations between the input and output planes within a compact optical design, axially spanning ~100 wavelengths. The intensity distributions of the output fields at these two channels on the sensor plane directly correspond to the amplitude and quantitative phase profiles of the input complex field, eliminating the need for any digital image reconstruction algorithms. We experimentally validated the efficacy of our complex field diffractive imager designs through 3D-printed prototypes operating at the terahertz spectrum, with the output amplitude and phase channel images closely aligning with our numerical simulations. We envision that this complex field imager will have various applications in security, biomedical imaging, sensing and material science, among others.
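The phase-to-intensity conversion described above is performed optically by the diffractive surfaces; as a purely illustrative aside, a short angular-spectrum propagation sketch (numpy; the wavelength, grid, and object below are hypothetical choices, not the paper's design) shows why such a conversion is needed: a conventional sensor records only |E|², so the input phase is lost unless the optics encode it into intensity first.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, dx, z):
    """Propagate a complex field a distance z via the angular spectrum method."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    FX, FY = np.meshgrid(fx, fx)
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    kz = 2 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    H = np.where(arg > 0, np.exp(1j * kz * z), 0.0)  # evanescent waves dropped
    return np.fft.ifft2(np.fft.fft2(field) * H)

wavelength = 0.75e-3              # ~0.75 mm (0.4 THz); illustrative only
dx = wavelength / 2
n = 64
x = (np.arange(n) - n // 2) * dx
X, Y = np.meshgrid(x, x)
amp = np.exp(-(X**2 + Y**2) / (8 * dx) ** 2)   # Gaussian amplitude object
phase = np.pi * (X > 0)                        # binary phase object
field = amp * np.exp(1j * phase)

out = angular_spectrum_propagate(field, wavelength, dx, 100 * wavelength)
intensity = np.abs(out) ** 2    # what an intensity-only sensor would record
```

Without phase-aware optics in between, `intensity` is all the sensor sees; the diffractive processor's job is to shape this propagation so the output intensity maps directly to the input amplitude and phase.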
PMID:38802376 | DOI:10.1038/s41377-024-01482-6
Emotion recognition for human-computer interaction using high-level descriptors
Sci Rep. 2024 May 27;14(1):12122. doi: 10.1038/s41598-024-59294-y.
ABSTRACT
Recent research has focused extensively on employing Deep Learning (DL) techniques, particularly Convolutional Neural Networks (CNN), for Speech Emotion Recognition (SER). This study addresses the burgeoning interest in leveraging DL for SER, specifically focusing on Punjabi language speakers. The paper presents a novel approach to constructing and preprocessing a labeled speech corpus using diverse social media sources. By utilizing spectrograms as the primary feature representation, the proposed algorithm effectively learns discriminative patterns for emotion recognition. The method is evaluated on a custom dataset derived from various Punjabi media sources, including films and web series. Results demonstrate that the proposed approach achieves an accuracy of 69%, surpassing traditional methods like decision trees, Naïve Bayes, and random forests, which achieved accuracies of 49%, 52%, and 61% respectively. Thus, the proposed method improves accuracy in recognizing emotions from Punjabi speech signals.
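A minimal sketch of the spectrogram-plus-CNN pipeline described above (numpy only; the synthetic signal, random kernel weights, and the four emotion classes are illustrative placeholders, not the paper's trained model):

```python
import numpy as np

def log_spectrogram(signal, n_fft=256, hop=128):
    """Log-magnitude spectrogram via a Hann-windowed STFT (numpy only)."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    return np.log1p(np.abs(np.fft.rfft(np.stack(frames), axis=1)).T)  # (freq, time)

def conv_relu(x, kernel):
    """Naive 'valid' 2-D correlation + ReLU, standing in for one CNN layer."""
    kh, kw = kernel.shape
    h, w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = (x[i:i + kh, j:j + kw] * kernel).sum()
    return np.maximum(out, 0.0)

rng = np.random.default_rng(0)
t = np.arange(16000) / 8000.0                      # 2 s of synthetic "speech" at 8 kHz
signal = np.sin(2 * np.pi * 220 * t) + 0.1 * rng.standard_normal(t.size)
spec = log_spectrogram(signal)                     # the primary feature representation
pooled = np.array([conv_relu(spec, rng.standard_normal((3, 3))).mean()
                   for _ in range(4)])             # 4 feature maps, global average pooling
logits = rng.standard_normal((4, 4)) @ pooled      # 4 hypothetical emotion classes
probs = np.exp(logits - logits.max()); probs /= probs.sum()
```

A trained model would learn the kernels and classifier weights from the labeled Punjabi corpus; the point here is only the data flow from waveform to spectrogram to class probabilities.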
PMID:38802373 | DOI:10.1038/s41598-024-59294-y
Basics of Artificial Intelligence (AI) Modeling
J Insur Med. 2024 Jul 1;51(1):35-40. doi: 10.17849/insm-51-1-35-40.1.
ABSTRACT
A keyword search for artificial intelligence, artificial intelligence in medicine, and artificial intelligence models in PubMed and Google Scholar yielded more than 100 articles, which were reviewed for summation in this article.
PMID:38802088 | DOI:10.17849/insm-51-1-35-40.1
Autophagy and machine learning: Unanswered questions
Biochim Biophys Acta Mol Basis Dis. 2024 May 25:167263. doi: 10.1016/j.bbadis.2024.167263. Online ahead of print.
ABSTRACT
Autophagy is a critical, conserved cellular process that maintains cellular homeostasis by clearing and recycling damaged organelles and intracellular components in lysosomes and vacuoles. Autophagy plays a vital role in cell survival, bioenergetic homeostasis, organism development, and cell death regulation. Malfunctions in autophagy are associated with various human diseases and health disorders, such as cancers and neurodegenerative diseases. Significant effort has been devoted to autophagy-related research in the context of genes, proteins, diagnosis, etc. In recent years, there has been a surge of studies utilizing state-of-the-art machine learning (ML) tools to analyze and understand the roles of autophagy in various biological processes. We taxonomize the ML techniques applicable in the autophagy context, comprehensively review existing efforts along this route, and outline principles to consider in the biomedical context. In recognition of recent groundbreaking advances in the deep learning community, we discuss new opportunities for interdisciplinary collaboration and seek to engage autophagy and computer science researchers to promote autophagy research through joint efforts.
PMID:38801963 | DOI:10.1016/j.bbadis.2024.167263
Evaluation of Artificial Intelligence Algorithms for Diabetic Retinopathy Detection: Protocol for a Systematic Review and Meta-Analysis
JMIR Res Protoc. 2024 May 27;13:e57292. doi: 10.2196/57292.
ABSTRACT
BACKGROUND: Diabetic retinopathy (DR) is one of the most common complications of diabetes mellitus. The global burden is immense with a worldwide prevalence of 8.5%. Recent advancements in artificial intelligence (AI) have demonstrated the potential to transform the landscape of ophthalmology with earlier detection and management of DR.
OBJECTIVE: This study seeks to provide an update on and evaluate the accuracy and current diagnostic ability of AI in detecting DR versus ophthalmologists. Additionally, this review will highlight the potential of AI integration to enhance DR screening, management, and monitoring of disease progression.
METHODS: A systematic review of the current landscape of AI's role in DR will be undertaken, guided by the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analysis) model. Relevant peer-reviewed papers published in English will be identified by searching 4 international databases: PubMed, Embase, CINAHL, and the Cochrane Central Register of Controlled Trials. Eligible studies will include randomized controlled trials, observational studies, and cohort studies published on or after 2022 that evaluate AI's performance in retinal imaging detection of DR in diverse adult populations. Studies that focus on specific comorbid conditions, nonimage-based applications of AI, or those lacking a direct comparison group or clear methodology will be excluded. Selected papers will be independently assessed for bias by 2 review authors (JS and DM) using the Quality Assessment of Diagnostic Accuracy Studies tool for systematic reviews. Upon systematic review completion, if it is determined that there are sufficient data, a meta-analysis will be performed. Data synthesis will use a quantitative model. Statistical software such as RevMan and STATA will be used to produce a random-effects meta-regression model to pool data from selected studies.
RESULTS: Using selected search queries across multiple databases, we accumulated 3494 studies regarding our topic of interest, of which 1588 were duplicates, leaving 1906 unique research papers to review and analyze.
CONCLUSIONS: This systematic review and meta-analysis protocol outlines a comprehensive evaluation of AI for DR detection. This active study is anticipated to assess the current accuracy of AI methods in detecting DR.
INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/57292.
PMID:38801771 | DOI:10.2196/57292
Generalization of a Deep Learning Model for Continuous Glucose Monitoring-Based Hypoglycemia Prediction: Algorithm Development and Validation Study
JMIR Med Inform. 2024 May 24;12:e56909. doi: 10.2196/56909.
ABSTRACT
BACKGROUND: Predicting hypoglycemia while maintaining a low false alarm rate is a challenge for the wide adoption of continuous glucose monitoring (CGM) devices in diabetes management. One small study suggested that a deep learning model based on the long short-term memory (LSTM) network had better performance in hypoglycemia prediction than traditional machine learning algorithms in European patients with type 1 diabetes. However, given that many well-recognized deep learning models perform poorly outside the training setting, it remains unclear whether the LSTM model could be generalized to different populations or patients with other diabetes subtypes.
OBJECTIVE: The aim of this study was to validate LSTM hypoglycemia prediction models in more diverse populations and across a wide spectrum of patients with different subtypes of diabetes.
METHODS: We assembled two large data sets of patients with type 1 and type 2 diabetes. The primary data set including CGM data from 192 Chinese patients with diabetes was used to develop the LSTM, support vector machine (SVM), and random forest (RF) models for hypoglycemia prediction with a prediction horizon of 30 minutes. Hypoglycemia was categorized into mild (glucose=54-70 mg/dL) and severe (glucose<54 mg/dL) levels. The validation data set of 427 patients of European-American ancestry in the United States was used to validate the models and examine their generalizations. The predictive performance of the models was evaluated according to the sensitivity, specificity, and area under the receiver operating characteristic curve (AUC).
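The LSTM forward pass at the core of such a predictor can be sketched as follows (numpy; the hidden size, random weights, and the 5-minute-sampled CGM window below are assumptions for illustration, not the study's trained model):

```python
import numpy as np

def lstm_predict(glucose, Wx, Wh, b, w_out, b_out):
    """Single-layer LSTM over a CGM window, then a sigmoid hypoglycemia-risk head."""
    H = Wh.shape[1]
    h, c = np.zeros(H), np.zeros(H)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for g in glucose:
        z = Wx[:, 0] * g + Wh @ h + b          # gates stacked as [i, f, g~, o]
        i, f = sig(z[:H]), sig(z[H:2 * H])
        cand, o = np.tanh(z[2 * H:3 * H]), sig(z[3 * H:])
        c = f * c + i * cand                   # cell state carries long-term context
        h = o * np.tanh(c)
    return sig(w_out @ h + b_out)

rng = np.random.default_rng(0)
H = 16
Wx = rng.standard_normal((4 * H, 1)) * 0.1     # untrained placeholder weights
Wh = rng.standard_normal((4 * H, H)) * 0.1
b = np.zeros(4 * H)
w_out, b_out = rng.standard_normal(H), 0.0

# past 2 h of CGM at 5-min sampling (24 readings, mg/dL), roughly normalized
window = (np.array([140, 138, 133, 129, 126, 120, 117, 112, 108, 104, 99, 95,
                    92, 88, 85, 83, 80, 78, 76, 75, 74, 73, 72, 71]) - 100.0) / 50.0
risk = lstm_predict(window, Wx, Wh, b, w_out, b_out)  # P(hypoglycemia within 30 min)
```

A real model would be trained on labeled hypoglycemia events; the recurrence over the window is what lets the LSTM exploit the downward glucose trend that window-agnostic classifiers such as SVM or RF must encode by hand.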
RESULTS: For the difficult-to-predict mild hypoglycemia events, the LSTM model consistently achieved AUC values greater than 97% in the primary data set, with a less than 3% AUC reduction in the validation data set, indicating that the model was robust and generalizable across populations. AUC values above 93% were also achieved when the LSTM model was applied to both type 1 and type 2 diabetes in the validation data set, further strengthening the generalizability of the model. Under different satisfactory levels of sensitivity for mild and severe hypoglycemia prediction, the LSTM model achieved higher specificity than the SVM and RF models, thereby reducing false alarms.
CONCLUSIONS: Our results demonstrate that the LSTM model is robust for hypoglycemia prediction and is generalizable across populations or diabetes subtypes. Given its additional advantage of false-alarm reduction, the LSTM model is a strong candidate to be widely implemented in future CGM devices for hypoglycemia prediction.
PMID:38801705 | DOI:10.2196/56909
Do as Sonographers Think: Contrast-enhanced Ultrasound for Thyroid Nodules Diagnosis via Microvascular Infiltrative Awareness
IEEE Trans Med Imaging. 2024 May 27;PP. doi: 10.1109/TMI.2024.3405621. Online ahead of print.
ABSTRACT
Dynamic contrast-enhanced ultrasound (CEUS) imaging can reflect the microvascular distribution and blood flow perfusion, thereby holding clinical significance in distinguishing between malignant and benign thyroid nodules. Notably, CEUS offers a meticulous visualization of the microvascular distribution surrounding the nodule, leading to an apparent increase in tumor size compared with gray-scale ultrasound (US). In the dual images obtained, the lesion appears larger on CEUS than on gray-scale US, as the microvasculature appears to continuously infiltrate the surrounding tissue. Although this infiltrative dilatation of the microvasculature remains ambiguous, sonographers believe it may aid the diagnosis of thyroid nodules. We propose a deep learning model designed to emulate the diagnostic reasoning process employed by sonographers. This model integrates the observation of microvascular infiltration on dynamic CEUS, leveraging the additional insights provided by gray-scale US for enhanced diagnostic support. Specifically, temporal projection attention is implemented on the time dimension of dynamic CEUS to represent the microvascular perfusion. Additionally, we employ a group of confidence maps with flexible Sigmoid Alpha Functions to perceive and describe the infiltrative dilatation process. Moreover, a self-adaptive integration mechanism is introduced to dynamically integrate the assisting gray-scale US and the confidence maps of CEUS for individual patients, ensuring a trustworthy diagnosis of thyroid nodules. In this retrospective study, we collected a thyroid nodule dataset of 282 CEUS videos. The method achieves a superior diagnostic accuracy and sensitivity of 89.52% and 93.75%, respectively. These results suggest that imitating the diagnostic thinking of sonographers, encompassing dynamic microvascular perfusion and infiltrative expansion, is beneficial for CEUS-based thyroid nodule diagnosis.
PMID:38801692 | DOI:10.1109/TMI.2024.3405621
Deep Closing: Enhancing Topological Connectivity in Medical Tubular Segmentation
IEEE Trans Med Imaging. 2024 May 27;PP. doi: 10.1109/TMI.2024.3405982. Online ahead of print.
ABSTRACT
Accurately segmenting tubular structures, such as blood vessels or nerves, holds significant clinical implications across various medical applications. However, existing methods often exhibit limitations in achieving satisfactory topological performance, particularly in terms of preserving connectivity. To address this challenge, we propose a novel deep-learning approach, termed Deep Closing, inspired by the well-established classic closing operation. Deep Closing first leverages an AutoEncoder trained in the Masked Image Modeling (MIM) paradigm, enhanced with digital topology knowledge, to effectively learn the inherent shape prior of tubular structures and indicate potential disconnected regions. Subsequently, a Simple Components Erosion module is employed to generate topology-focused outcomes, which refines the preceding segmentation results, ensuring all the generated regions are topologically significant. To evaluate the efficacy of Deep Closing, we conduct comprehensive experiments on 4 datasets: DRIVE, CHASE DB1, DCA1, and CREMI. The results demonstrate that our approach yields considerable improvements in topological performance compared with existing methods. Furthermore, Deep Closing exhibits the ability to generalize and transfer knowledge from external datasets, showcasing its robustness and adaptability. The code for this paper is available at: https://github.com/5k5000/DeepClosing.
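The classic closing operation that inspires Deep Closing can be shown in a few lines (numpy; the toy "vessel" image is illustrative):

```python
import numpy as np

def dilate(img, k=3):
    """Binary dilation with a k x k square structuring element."""
    p = np.pad(img, k // 2)
    return np.max([p[i:i + img.shape[0], j:j + img.shape[1]]
                   for i in range(k) for j in range(k)], axis=0)

def erode(img, k=3):
    """Binary erosion; the border is padded with 1s so closing keeps edges intact."""
    p = np.pad(img, k // 2, constant_values=1)
    return np.min([p[i:i + img.shape[0], j:j + img.shape[1]]
                   for i in range(k) for j in range(k)], axis=0)

def closing(img, k=3):
    """Morphological closing = dilation followed by erosion."""
    return erode(dilate(img, k), k)

vessel = np.zeros((9, 15), dtype=int)
vessel[4, :] = 1
vessel[4, 7] = 0                  # a one-pixel break in the "vessel"
closed = closing(vessel)          # the gap is bridged, nothing else is added
```

Deep Closing replaces the fixed structuring element with a learned shape prior, but the goal is the same: bridge small breaks in tubular structures without inventing new ones.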
PMID:38801688 | DOI:10.1109/TMI.2024.3405982
Self-Supervised Image Denoising of Third Harmonic Generation Microscopic Images of Human Glioma Tissue by Transformer-based Blind Spot (TBS) Network
IEEE J Biomed Health Inform. 2024 May 27;PP. doi: 10.1109/JBHI.2024.3405562. Online ahead of print.
ABSTRACT
Third harmonic generation (THG) microscopy shows great potential for instant pathology of brain tumor tissue during surgery. However, due to the maximal permitted exposure of laser intensity and inherent noise of the imaging system, the noise level of THG images is relatively high, which affects subsequent feature extraction analysis. Denoising THG images is challenging for modern deep-learning based methods because of the rich morphologies contained and the difficulty in obtaining the noise-free counterparts. To address this, in this work, we propose an unsupervised deep-learning network for denoising of THG images which combines a self-supervised blind spot method and a U-shape Transformer using a dynamic sparse attention mechanism. The experimental results on THG images of human glioma tissue show that our approach exhibits superior denoising performance qualitatively and quantitatively compared with previous methods. Our model achieves an improvement of 2.47-9.50 dB in SNR and 0.37-7.40 dB in CNR, compared to six recent state-of-the-art unsupervised learning models including Neighbor2Neighbor, Blind2Unblind, Self2Self+, ZS-N2N, Noise2Info and SDAP. To achieve an objective evaluation of our model, we also validate our model on public datasets including natural and microscopic images, and our model shows a better denoising performance than several recent unsupervised models such as Neighbor2Neighbor, Blind2Unblind and ZS-N2N. In addition, our model is nearly instant in denoising a THG image, which has the potential for real-time applications of THG microscopy.
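The blind-spot idea the network builds on can be sketched independently of the Transformer architecture (numpy; the gradient image, mask count, and identity "denoiser" are placeholders):

```python
import numpy as np

def blindspot_inputs(noisy, n_spots, rng):
    """Replace random pixels with a random *non-central* neighbor.

    A network trained to predict the noisy value at these blind spots never
    sees each pixel's own noise, so it can learn to denoise without clean targets.
    """
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    h, w = noisy.shape
    ys = rng.integers(1, h - 1, n_spots)
    xs = rng.integers(1, w - 1, n_spots)
    picks = rng.integers(0, 8, n_spots)
    masked = noisy.copy()
    for y, x, p in zip(ys, xs, picks):
        dy, dx = offsets[p]
        masked[y, x] = noisy[y + dy, x + dx]
    return masked, ys, xs

rng = np.random.default_rng(1)
clean = np.tile(np.linspace(0.0, 1.0, 32), (32, 1))      # toy "tissue" image
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
inp, ys, xs = blindspot_inputs(noisy, 64, rng)
denoised = inp                    # placeholder for the network's forward pass
loss = np.mean((denoised[ys, xs] - noisy[ys, xs]) ** 2)  # loss only at blind spots
```

The paper's contribution lies in the U-shaped Transformer with dynamic sparse attention that fills the denoiser's role; the masking scheme above is only the self-supervision mechanism it shares with the blind-spot family.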
PMID:38801682 | DOI:10.1109/JBHI.2024.3405562
The Impact of Performance Expectancy, Workload, Risk, and Satisfaction on Trust in ChatGPT: Cross-Sectional Survey Analysis
JMIR Hum Factors. 2024 May 27;11:e55399. doi: 10.2196/55399.
ABSTRACT
BACKGROUND: ChatGPT (OpenAI) is a powerful tool for a wide range of tasks, from entertainment and creativity to health care queries. There are potential risks and benefits associated with this technology. In the discourse concerning the deployment of ChatGPT and similar large language models, it is sensible to recommend their use primarily for tasks a human user can execute accurately. As we transition into the subsequent phase of ChatGPT deployment, establishing realistic performance expectations and understanding users' perceptions of risk associated with its use are crucial in determining the successful integration of this artificial intelligence (AI) technology.
OBJECTIVE: The aim of the study is to explore how perceived workload, satisfaction, performance expectancy, and risk-benefit perception influence users' trust in ChatGPT.
METHODS: A semistructured, web-based survey was conducted with 607 adults in the United States who actively use ChatGPT. The survey questions were adapted from constructs used in various models and theories such as the technology acceptance model, the theory of planned behavior, the unified theory of acceptance and use of technology, and research on trust and security in digital environments. To test our hypotheses and structural model, we used the partial least squares structural equation modeling method, a widely used approach for multivariate analysis.
RESULTS: A total of 607 people responded to our survey. A significant portion of the participants held at least a high school diploma (n=204, 33.6%), and the majority had a bachelor's degree (n=262, 43.1%). The primary motivations for participants to use ChatGPT were for acquiring information (n=219, 36.1%), amusement (n=203, 33.4%), and addressing problems (n=135, 22.2%). Some participants used it for health-related inquiries (n=44, 7.2%), while a few others (n=6, 1%) used it for miscellaneous activities such as brainstorming, grammar verification, and blog content creation. Our model explained 64.6% of the variance in trust. Our analysis indicated a significant relationship between (1) workload and satisfaction, (2) trust and satisfaction, (3) performance expectations and trust, and (4) risk-benefit perception and trust.
CONCLUSIONS: The findings underscore the importance of ensuring user-friendly design and functionality in AI-based applications to reduce workload and enhance user satisfaction, thereby increasing user trust. Future research should further explore the relationship between risk-benefit perception and trust in the context of AI chatbots.
PMID:38801658 | DOI:10.2196/55399
R-MFE-TCN: A correlation prediction model between body surface and tumor during respiratory movement
Med Phys. 2024 May 27. doi: 10.1002/mp.17183. Online ahead of print.
ABSTRACT
BACKGROUND: 2D CT image-guided radiofrequency ablation (RFA) is a promising minimally invasive treatment that can destroy liver tumors without removing them. However, CT images provide only limited static information, and the tumor moves with the patient's respiration. Therefore, accurately locating tumors under free-breathing conditions is an urgent problem to be solved.
PURPOSE: The purpose of this study is to propose a respiratory correlation prediction model for mixed reality surgical assistance system, Riemannian and Multivariate Feature Enhanced Temporal Convolutional Network (R-MFE-TCN), and to achieve accurate respiratory correlation prediction.
METHODS: The model adopts a respiration-oriented Riemannian information enhancement strategy to expand the diversity of the dataset. A new Multivariate Feature Enhancement (MFE) module is proposed to retain respiratory data information so that the network can fully explore the correlation between internal and external data: a dual channel retains multivariate respiratory features, and multi-headed self-attention captures periodic information on respiratory peak-to-valley values. This information significantly improves the prediction performance of the network. In addition, the PSO algorithm is used for hyperparameter optimization. In the experiment, internal and external respiratory motion trajectories from seven patients were obtained from the dataset, with the first six patients selected as the training set. Respiratory signals were collected at 21 Hz.
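The temporal-convolution backbone of a TCN can be sketched as stacked dilated causal convolutions (numpy; the weights, dilations, and synthetic breathing trace are illustrative, not the R-MFE-TCN itself):

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """1-D causal convolution: y[t] = sum_i w[i] * x[t - i*dilation], zeros before t=0."""
    k, T = len(w), len(x)
    xp = np.concatenate([np.zeros((k - 1) * dilation), x])
    start = (k - 1) * dilation
    return sum(w[i] * xp[start - i * dilation:start - i * dilation + T]
               for i in range(k))

def tcn(x, weights):
    """A stack of dilated causal layers (dilations 1, 2, 4, ...), tanh between layers."""
    y = x
    for level, w in enumerate(weights):
        y = np.tanh(causal_dilated_conv(y, w, 2 ** level))
    return y

rng = np.random.default_rng(0)
weights = [rng.standard_normal(3) for _ in range(3)]  # receptive field (3-1)*(1+2+4)+1 = 15
surface = np.sin(np.linspace(0, 8 * np.pi, 200))      # external breathing trace (synthetic)
pred = tcn(surface, weights)

# causality check: perturbing a future sample cannot change earlier outputs
bumped = surface.copy(); bumped[150] += 1.0
```

Causality is what makes TCNs suitable for respiratory prediction: the output at each time step depends only on past surface motion, exactly the constraint a real-time surgical assistance system must respect.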
RESULTS: Extensive experiments on the dataset demonstrate the good performance of this method, which improves prediction accuracy while remaining robust. The method reduces delay deviation under long-window prediction and achieves good performance. For a 400 ms horizon, the average RMSE and MAE are 0.0453 mm and 0.0361 mm, respectively, better than other research methods.
CONCLUSION: The R-MFE-TCN can be extended to respiratory correlation prediction in different clinical situations, meeting the accuracy requirements for respiratory delay prediction in surgical assistance.
PMID:38801342 | DOI:10.1002/mp.17183
DNA Virus Detection System Based on RPA-CRISPR/Cas12a-SPM and Deep Learning
J Vis Exp. 2024 May 10;(207). doi: 10.3791/64833.
ABSTRACT
We report a fast, easy-to-implement, highly sensitive, sequence-specific, point-of-care (POC) DNA virus detection system, which combines recombinase polymerase amplification (RPA) and the CRISPR/Cas12a system for trace detection of DNA viruses. Target DNA is amplified and recognized by RPA and CRISPR/Cas12a separately, which triggers the collateral cleavage activity of Cas12a: it cleaves a fluorophore-quencher labeled DNA reporter and generates fluorescence. For POC detection, a portable smartphone microscope (SPM) is built to take fluorescence images. In addition, deep learning models for binary classification of positive and negative samples, achieving high accuracy, are deployed within the system. Frog virus 3 (FV3, genus Ranavirus, family Iridoviridae) was tested as an example for this DNA virus POC detection system, and a limit of detection (LoD) of 10 aM was achieved within 40 min. Without skilled operators and bulky instruments, the portable and miniature RPA-CRISPR/Cas12a-SPM with artificial intelligence (AI) assisted classification shows great potential for POC DNA virus detection and can help prevent the spread of such viruses.
PMID:38801262 | DOI:10.3791/64833
Multi-Instance Learning for Vocal Fold Leukoplakia Diagnosis Using White Light and Narrow-Band Imaging: A Multicenter Study
Laryngoscope. 2024 May 27. doi: 10.1002/lary.31537. Online ahead of print.
ABSTRACT
OBJECTIVES: Vocal fold leukoplakia (VFL) is a precancerous lesion of laryngeal cancer, and its endoscopic diagnosis poses challenges. We aim to develop an artificial intelligence (AI) model using white light imaging (WLI) and narrow-band imaging (NBI) to distinguish benign from malignant VFL.
METHODS: A total of 7057 images from 426 patients were used for model development and internal validation. Additionally, 1617 images from two other hospitals were used for external validation. Model training on the WLI and NBI modalities was conducted using deep learning combined with a multi-instance learning (MIL) approach. Furthermore, 50 prospectively collected videos were used to evaluate real-time model performance. A human-machine comparison involving 100 patients and 12 laryngologists assessed the real-world effectiveness of the model.
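One common way to realize MIL over a bag of endoscopic frames is attention-based pooling; a hedged numpy sketch (the embedding sizes and random weights below are made up, and the fusion of WLI and NBI into one bag is one plausible design, not necessarily the paper's):

```python
import numpy as np

def attention_mil_pool(instances, V, w):
    """Attention-based MIL pooling: softmax-weight each instance, then sum."""
    scores = np.tanh(instances @ V.T) @ w      # one relevance score per instance
    a = np.exp(scores - scores.max())
    a /= a.sum()                               # attention weights, sum to 1
    return a @ instances, a

rng = np.random.default_rng(0)
wli = rng.standard_normal((12, 64))    # 12 WLI frame embeddings (hypothetical dim 64)
nbi = rng.standard_normal((9, 64))     # 9 NBI frame embeddings
bag = np.vstack([wli, nbi])            # one bag per patient, both modalities
V = rng.standard_normal((32, 64))      # untrained placeholder attention weights
w = rng.standard_normal(32)
z, attn = attention_mil_pool(bag, V, w)
w_cls = rng.standard_normal(64)
prob = 1.0 / (1.0 + np.exp(-(w_cls @ z)))   # benign-vs-malignant head
```

The appeal of MIL here is that only a patient-level label (benign/malignant) is needed; the attention weights indicate which frames drove the prediction, which is useful for the human-machine comparison described above.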
RESULTS: The model achieved the highest area under the receiver operating characteristic curve (AUC) values of 0.868 and 0.884 in the internal and external validation sets, respectively. AUC in the video validation set was 0.825 (95% CI: 0.704-0.946). In the human-machine comparison, AI significantly improved AUC and accuracy for all laryngologists (p < 0.05). With the assistance of AI, the diagnostic abilities and consistency of all laryngologists improved.
CONCLUSIONS: Our multicenter study developed an effective AI model using MIL and fusion of WLI and NBI images for VFL diagnosis, particularly aiding junior laryngologists. However, further optimization and validation are necessary to fully assess its potential impact in clinical settings.
LEVEL OF EVIDENCE: 3 Laryngoscope, 2024.
PMID:38801129 | DOI:10.1002/lary.31537
DTDO: Driving Training Development Optimization enabled deep learning approach for brain tumour classification using MRI
Network. 2024 May 27:1-42. doi: 10.1080/0954898X.2024.2351159. Online ahead of print.
ABSTRACT
A brain tumour is an abnormal mass of tissue. Brain tumours vary in size, from tiny to large, and also vary in location and shape, which adds complexity to their detection. The accurate delineation of tumour regions is challenging due to their irregular boundaries. In this research, these issues are addressed by introducing DTDO-ZFNet for the detection of brain tumours. The input Magnetic Resonance Imaging (MRI) image is fed to the pre-processing stage. Tumour areas are segmented using SegNet, whose parameters are tuned using DTDO. Image augmentation is carried out using established techniques, such as geometric and colour-space transformations. Features such as the GIST descriptor, PCA-NGIST, statistical and Haralick features, the SLBT feature, and CNN features are extracted. Finally, the categorization of the tumour is accomplished with ZFNet, which is trained using DTDO, a consolidation of DTBO and CDDO. Compared with existing methods, the proposed DTDO-ZFNet achieves the highest accuracy of 0.944, a positive predictive value (PPV) of 0.936, a true positive rate (TPR) of 0.939, a negative predictive value (NPV) of 0.937, and a minimal false-negative rate (FNR) of 0.061.
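Of the features listed, the Haralick set is the easiest to illustrate: they are statistics of a gray-level co-occurrence matrix (GLCM). A minimal numpy sketch (the single horizontal offset and the toy images are illustrative):

```python
import numpy as np

def glcm(img, levels):
    """Symmetric, normalized gray-level co-occurrence matrix for offset (0, 1)."""
    m = np.zeros((levels, levels))
    for a, b in zip(img[:, :-1].ravel(), img[:, 1:].ravel()):
        m[a, b] += 1
        m[b, a] += 1
    return m / m.sum()

def haralick_contrast(p):
    """Large when neighboring pixels differ strongly."""
    i, j = np.indices(p.shape)
    return float((p * (i - j) ** 2).sum())

def haralick_homogeneity(p):
    """Large when neighboring pixels are similar."""
    i, j = np.indices(p.shape)
    return float((p / (1.0 + (i - j) ** 2)).sum())

flat = np.zeros((8, 8), dtype=int)               # homogeneous "tissue"
checker = np.indices((8, 8)).sum(axis=0) % 2     # high-frequency texture
contrast = haralick_contrast(glcm(checker, 2))   # 1.0 for the checkerboard
```

Full Haralick feature sets aggregate several such statistics over multiple offsets and directions; tumour and healthy tissue tend to differ in exactly these texture statistics.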
PMID:38801074 | DOI:10.1080/0954898X.2024.2351159
Deep DNAshape webserver: prediction and real-time visualization of DNA shape considering extended k-mers
Nucleic Acids Res. 2024 May 27:gkae433. doi: 10.1093/nar/gkae433. Online ahead of print.
ABSTRACT
Sequence-dependent DNA shape plays an important role in understanding protein-DNA binding mechanisms. High-throughput prediction of DNA shape features has become a valuable tool in the fields of protein-DNA recognition, transcription factor-DNA binding specificity, and gene regulation. However, our widely used webserver, DNAshape, relies on statistically summarized pentamer query tables to look up DNA shape features. These query tables do not consider flanking regions longer than two base pairs, and acquiring a query table for hexamers or higher-order k-mers is currently unrealistic due to limitations in achieving sufficient statistical coverage in molecular simulations or structural biology experiments. A recent deep-learning method, Deep DNAshape, can predict DNA shape features at the core of a DNA fragment considering flanking regions of up to seven base pairs, trained on limited simulation data. However, Deep DNAshape is rather complicated to install and, unlike the pentamer-based DNAshape webserver, must be run locally, creating a barrier for users. Here, we present the Deep DNAshape webserver, which has the benefits of both methods while being accurate, fast, and accessible to all users. Additional improvements of the webserver include real-time detection of user input, interactive visualization tools, and different modes of analysis. URL: https://deepdnashape.usc.edu.
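The pentamer query-table mechanism that Deep DNAshape generalizes can be sketched in plain Python (the table values below are random placeholders, NOT real DNAshape minor-groove-width data):

```python
import itertools
import random

# toy pentamer query table; values are random placeholders, not measured MGW data
random.seed(0)
TABLE = {''.join(p): round(random.uniform(3.0, 6.0), 2)
         for p in itertools.product('ACGT', repeat=5)}

def mgw_profile(seq):
    """Shape value at each core position, determined by its +/-2 bp flanks only."""
    return [TABLE[seq[i - 2:i + 3]] for i in range(2, len(seq) - 2)]

profile = mgw_profile('GCGCAATTGCGC')   # 12 bp in, 8 core positions out
```

The table has 4^5 = 1024 entries; a hexamer table would need 4096 and a 15-mer (the +/-7 bp context Deep DNAshape models) over a billion, which is why the deep-learning model replaces the lookup rather than extending it.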
PMID:38801070 | DOI:10.1093/nar/gkae433
Prediction of prognosis using artificial intelligence-based histopathological image analysis in patients with soft tissue sarcomas
Cancer Med. 2024 May;13(10):e7252. doi: 10.1002/cam4.7252.
ABSTRACT
BACKGROUND: Prompt histopathological diagnosis with accuracy is required for soft tissue sarcomas (STSs) which are still challenging. In addition, the advances in artificial intelligence (AI) along with the development of pathology slides digitization may empower the demand for the prediction of behavior of STSs. In this article, we explored the application of deep learning for prediction of prognosis from histopathological images in patients with STS.
METHODS: Our retrospective study included a total of 35 histopathological slides from patients with STS. We trained an Inception v3 convolutional neural network for survival estimation. The F1 score, which reflects accuracy, and the area under the receiver operating characteristic curve (AUC) served as the main outcome measures from a 4-fold cross-validation.
RESULTS: The cohort included 35 patients with a mean age of 64 years and a mean follow-up of 34 months (range, 2-66 months). Our deep learning method achieved an AUC of 0.974 and an accuracy of 91.9% in predicting overall survival. For the prediction of metastasis-free survival, the accuracy was 84.2% with an AUC of 0.852.
CONCLUSION: AI might be used to help pathologists with accurate prognosis prediction. This study could substantially improve the clinical management of patients with STS.
PMID:38800990 | DOI:10.1002/cam4.7252
The application of natural language processing for the extraction of mechanistic information in toxicology
Front Toxicol. 2024 May 10;6:1393662. doi: 10.3389/ftox.2024.1393662. eCollection 2024.
ABSTRACT
To study the ways in which compounds can induce adverse effects, toxicologists have been constructing Adverse Outcome Pathways (AOPs). An AOP can be considered a pragmatic tool to capture and visualize mechanisms underlying different types of toxicity inflicted by any kind of stressor, and it describes the interactions between key entities that lead to the adverse outcome across multiple biological levels of organization. The construction or optimization of an AOP is a labor-intensive process, which currently depends on the manual search, collection, review, and synthesis of available scientific literature. This process could, however, be largely facilitated using Natural Language Processing (NLP) to extract information contained in scientific literature in a systematic, objective, and rapid manner, leading to greater accuracy and reproducibility. This would free researchers to invest their expertise in the substantive assessment of the AOPs by replacing the time spent on evidence gathering with a critical review of the data extracted by NLP. As case examples, we selected two frequent adversities observed in the liver: cholestasis and steatosis, denoting the accumulation of bile and lipid, respectively. We used deep learning language models to recognize entities of interest in text and establish causal relationships between them. We demonstrate how an NLP pipeline combining Named Entity Recognition and a simple rule-based relationship extraction model helps screen compounds related to liver adversities in the literature and also extracts mechanistic information on how such adversities develop, from the molecular to the organismal level. Finally, we provide some perspectives opened by the recent progress in Large Language Models and how these could be used in the future.
We propose this work brings two main contributions: 1) a proof-of-concept that NLP can support the extraction of information from text for modern toxicology and 2) a template open-source model for recognition of toxicological entities and extraction of their relationships. All resources are openly accessible via GitHub (https://github.com/ontox-project/en-tox).
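A rule-based relationship extraction step of the kind described can be sketched in a few lines (the entity lexicons and trigger phrases below are illustrative stand-ins, not the project's actual vocabularies or trained NER model):

```python
import re

# tiny illustrative lexicons; a real pipeline would use a trained NER model
COMPOUNDS = {"valproic acid", "amiodarone"}
ADVERSITIES = {"steatosis", "cholestasis"}
TRIGGERS = {"induces", "causes", "leads to"}

def extract_relations(text):
    """Link a compound to an adversity when a trigger phrase joins them in a sentence."""
    relations = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.lower()):
        comps = [c for c in COMPOUNDS if c in sentence]
        advs = [a for a in ADVERSITIES if a in sentence]
        trig = next((t for t in TRIGGERS if t in sentence), None)
        if trig:
            relations.extend((c, trig, a) for c in comps for a in advs)
    return relations

rels = extract_relations("Valproic acid induces steatosis in hepatocytes. "
                         "No effect was seen for caffeine.")
# rels == [('valproic acid', 'induces', 'steatosis')]
```

Sentence-level co-occurrence plus a trigger verb is deliberately simple; its appeal for AOP construction is that every extracted triple points back to an auditable sentence in the literature.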
PMID:38800806 | PMC:PMC11116573 | DOI:10.3389/ftox.2024.1393662
SAMI: an M-Health application to telemonitor intelligibility and speech disorder severity in head and neck cancers
Front Artif Intell. 2024 May 9;7:1359094. doi: 10.3389/frai.2024.1359094. eCollection 2024.
ABSTRACT
Perceptual measures, such as intelligibility and speech disorder severity, are widely used in the clinical assessment of speech disorders in patients treated for oral or oropharyngeal cancer. Despite their widespread usage, these measures are known to be subjective and hard to reproduce. An M-Health assessment based on automatic prediction has therefore been seen as a more robust and reliable alternative. Despite recent progress, these automatic approaches still remain somewhat theoretical, and a need to implement them in real clinical practice arises. Hence, in the present work we introduce SAMI, a clinical mobile application used to predict speech intelligibility and disorder severity as well as to monitor patient progress on these measures over time. The first part of this work illustrates the design and development of the systems supporting SAMI. Here, we show how deep neural speaker embeddings are used to automatically regress speech disorder measurements (intelligibility and severity), as well as the training and validation of the system on a French corpus of speech from head and neck cancer patients. Furthermore, we also test our model on a secondary corpus recorded in real clinical conditions. The second part details the results obtained from the deployment of our system in a real clinical environment over the course of several weeks. In this section, the results obtained with SAMI are compared to an a posteriori perceptual evaluation conducted by a set of experts on the newly recorded data. The comparison suggests a high correlation and a low error between the perceptual and automatic evaluations, validating the clinical usage of the proposed application.
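The core modelling step the abstract describes — regressing a perceptual score from fixed-dimensional speaker embeddings — can be sketched in a few lines. Everything below is a stand-in: the embeddings and scores are synthetic, and the embedding dimension and ridge penalty are assumptions, not SAMI's actual configuration.

```python
# Regressing a perceptual score (e.g. intelligibility) from
# fixed-dimensional speaker embeddings, using closed-form ridge
# regression on synthetic data as a stand-in for real embeddings.
import numpy as np

rng = np.random.default_rng(0)
n_utterances, dim = 500, 64

# Synthetic "speaker embeddings" and ground-truth perceptual scores.
true_w = rng.normal(size=dim)
X = rng.normal(size=(n_utterances, dim))
y = X @ true_w + rng.normal(scale=0.1, size=n_utterances)

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X'X + lam*I)^-1 X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

w = ridge_fit(X, y)
pred = X @ w
corr = np.corrcoef(pred, y)[0, 1]
print(f"correlation between predicted and perceptual scores: {corr:.3f}")
```

A deployed system would substitute embeddings from a pretrained speaker encoder and a regressor tuned on clinical data, but the shape of the problem — one fixed-length vector per utterance, one continuous score out — is the same.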
PMID:38800762 | PMC:PMC11119748 | DOI:10.3389/frai.2024.1359094
From explainable to interpretable deep learning for natural language processing in healthcare: How far from reality?
Comput Struct Biotechnol J. 2024 May 9;24:362-373. doi: 10.1016/j.csbj.2024.05.004. eCollection 2024 Dec.
ABSTRACT
Deep learning (DL) has substantially enhanced natural language processing (NLP) in healthcare research. However, the increasing complexity of DL-based NLP necessitates transparent model interpretability, or at least explainability, for reliable decision-making. This work presents a thorough scoping review of explainable and interpretable DL in healthcare NLP. The term "eXplainable and Interpretable Artificial Intelligence" (XIAI) is introduced to distinguish XAI from IAI. Different models are further categorized based on their functionality (model-, input-, output-based) and scope (local, global). Our analysis shows that attention mechanisms are the most prevalent emerging IAI technique. The use of IAI is growing, distinguishing it from XAI. The major challenges identified are that most XIAI methods do not explore "global" modelling processes, and that best practices, systematic evaluation, and benchmarks are lacking. One important opportunity is to use attention mechanisms to enhance multi-modal XIAI for personalized medicine. Additionally, combining DL with causal logic holds promise. Our discussion encourages the integration of XIAI in Large Language Models (LLMs) and domain-specific smaller models. In conclusion, XIAI adoption in healthcare requires dedicated in-house expertise. Collaboration with domain experts, end-users, and policymakers can lead to ready-to-use XIAI methods across NLP and medical tasks. While challenges exist, XIAI techniques offer a valuable foundation for interpretable NLP algorithms in healthcare.
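Why attention mechanisms dominate as an IAI technique becomes clear from their arithmetic: the weight matrix is computed explicitly inside the layer and can be read off per token, with no post-hoc attribution step. A toy single-head sketch (the tokens, dimensions, and random inputs are illustrative assumptions):

```python
# Single-head scaled dot-product attention: the weight matrix W is an
# explicit intermediate that interpretability methods inspect directly.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Return the attended output and the per-token weight matrix."""
    d = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d))
    return weights @ V, weights

rng = np.random.default_rng(1)
tokens = ["patient", "denies", "chest", "pain"]
X = rng.normal(size=(len(tokens), 8))  # toy token representations
out, W = attention(X, X, X)

# Each row of W sums to 1: the model's distribution of "focus"
# over input tokens, readable without any extra attribution method.
for tok, row in zip(tokens, W):
    print(tok, np.round(row, 2))
```

Whether such weights constitute a faithful explanation is itself debated in the literature the review covers; the sketch only shows why they are so readily available compared with post-hoc XAI methods.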
PMID:38800693 | PMC:PMC11126530 | DOI:10.1016/j.csbj.2024.05.004
Contagious infection-free medical interaction system with machine vision controlled by remote hand gesture during an operation
Comput Struct Biotechnol J. 2024 May 11;24:393-403. doi: 10.1016/j.csbj.2024.05.006. eCollection 2024 Dec.
ABSTRACT
BACKGROUND AND OBJECTIVE: Medical image visualization is a requirement in many types of surgery, such as orthopaedic, spinal, or thoracic procedures and tumour resection, to eliminate risks such as "wrong level surgery". However, direct contact with physical devices such as mice or touch screens to control images is a challenge because of the potential risk of infection. To prevent the spread of infection in sterile environments, a contagious infection-free medical interaction system has been developed for manipulating medical images.
METHODS: We proposed an integrated system with three key modules: hand landmark detection, hand pointing, and hand gesture recognition. A depth enhancement algorithm is combined with a deep learning hand landmark detector to generate hand landmarks. Building on this, a hand-pointing method combining projection and ray-pointing techniques reduces fatigue during manipulation. A landmark geometry constraint algorithm and a deep learning method were applied to detect six gestures: click, open, close, zoom, drag, and rotation. Additionally, a control menu was developed to effectively activate common functions.
RESULTS: The proposed hand-pointing system allowed for a large control range of up to 1200 mm in both the vertical and horizontal directions. The proposed hand gesture recognition method achieved over 97% accuracy with real-time response.
CONCLUSION: This paper described a contagious infection-free medical interaction system that enables precise and effective manipulation of medical images within a large control range, while minimizing hand fatigue.
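A landmark geometry constraint of the kind the methods section mentions can be as simple as a distance rule over fingertip landmarks. The sketch below classifies a "click" (pinch) when the thumb and index fingertips are close relative to the palm width; the landmark indices follow the common 21-point hand model, and the threshold and coordinates are illustrative assumptions, not the paper's actual algorithm.

```python
# Geometry-constraint rule on hand landmarks: detect a pinch/"click"
# when the thumb-index fingertip distance is small relative to palm
# width (normalizing for hand size and camera distance).
import math

THUMB_TIP, INDEX_TIP = 4, 8
INDEX_MCP, PINKY_MCP = 5, 17  # knuckles, used to estimate palm width

def is_click(landmarks, ratio_threshold=0.4):
    """True when the fingertip gap is below a fraction of palm width."""
    palm_width = math.dist(landmarks[INDEX_MCP], landmarks[PINKY_MCP])
    pinch = math.dist(landmarks[THUMB_TIP], landmarks[INDEX_TIP])
    return pinch < ratio_threshold * palm_width

# Synthetic (x, y) coordinates for the four landmarks the rule uses.
open_hand = {THUMB_TIP: (0.10, 0.50), INDEX_TIP: (0.50, 0.90),
             INDEX_MCP: (0.40, 0.50), PINKY_MCP: (0.80, 0.40)}
pinched = {THUMB_TIP: (0.45, 0.85), INDEX_TIP: (0.47, 0.87),
           INDEX_MCP: (0.40, 0.50), PINKY_MCP: (0.80, 0.40)}

print(is_click(pinched), is_click(open_hand))  # True False
```

Normalizing by palm width rather than using an absolute distance is what makes such rules robust to the hand's distance from the camera, which matters for the 1200 mm control range reported in the results.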
PMID:38800692 | PMC:PMC11127465 | DOI:10.1016/j.csbj.2024.05.006