Deep learning
Diagnostic accuracy of deep learning using speech samples in depression: a systematic review and meta-analysis
J Am Med Inform Assoc. 2024 Jul 16:ocae189. doi: 10.1093/jamia/ocae189. Online ahead of print.
ABSTRACT
OBJECTIVE: This study aims to conduct a systematic review and meta-analysis of the diagnostic accuracy of deep learning (DL) using speech samples in depression.
MATERIALS AND METHODS: This review included studies reporting diagnostic results of DL algorithms in depression using speech data, published from inception to January 31, 2024, in the PubMed, Medline, Embase, PsycINFO, Scopus, IEEE, and Web of Science databases. Pooled accuracy, sensitivity, and specificity were obtained by random-effects models. The Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool was used to assess the risk of bias.
RESULTS: A total of 25 studies met the inclusion criteria and 8 of them were used in the meta-analysis. The pooled estimates of accuracy, specificity, and sensitivity for depression detection models were 0.87 (95% CI, 0.81-0.93), 0.85 (95% CI, 0.78-0.91), and 0.82 (95% CI, 0.71-0.94), respectively. When stratified by model structure, the highest pooled diagnostic accuracy was 0.89 (95% CI, 0.81-0.97) in the handcrafted group.
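Pooled estimates of this kind are commonly computed with a DerSimonian-Laird random-effects model on logit-transformed proportions; the sketch below (Python) illustrates that calculation with hypothetical study-level sensitivities and sample sizes, not values from the included studies.

    import numpy as np

    def pooled_random_effects(props, ns):
        """DerSimonian-Laird random-effects pooling of proportions on the logit scale."""
        props, ns = np.asarray(props, float), np.asarray(ns, float)
        y = np.log(props / (1 - props))            # logit-transformed effects
        v = 1.0 / (ns * props * (1 - props))       # approximate within-study variances
        w = 1.0 / v                                # fixed-effect weights
        y_fe = np.sum(w * y) / np.sum(w)
        q = np.sum(w * (y - y_fe) ** 2)            # Cochran's Q
        c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
        tau2 = max(0.0, (q - (len(y) - 1)) / c)    # between-study variance
        w_re = 1.0 / (v + tau2)                    # random-effects weights
        y_re = np.sum(w_re * y) / np.sum(w_re)
        se = np.sqrt(1.0 / np.sum(w_re))
        inv = lambda x: 1.0 / (1.0 + np.exp(-x))   # back-transform to a proportion
        return inv(y_re), (inv(y_re - 1.96 * se), inv(y_re + 1.96 * se))

    # hypothetical per-study sensitivities and sample sizes
    est, ci = pooled_random_effects([0.80, 0.85, 0.78, 0.90], [60, 120, 80, 100])
    print(f"pooled: {est:.2f}, 95% CI ({ci[0]:.2f}, {ci[1]:.2f})")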
DISCUSSION: To our knowledge, our study is the first meta-analysis on the diagnostic performance of DL for depression detection from speech samples. All studies included in the meta-analysis used convolutional neural network (CNN) models, which makes it difficult to judge the performance of other DL algorithms. The handcrafted model performed better than the end-to-end model in speech depression detection.
CONCLUSIONS: The application of DL in speech provided a useful tool for depression detection. CNN models with handcrafted acoustic features could help to improve the diagnostic performance.
PROTOCOL REGISTRATION: The study protocol was registered on PROSPERO (CRD42023423603).
PMID:39013193 | DOI:10.1093/jamia/ocae189
QC-GN²oMS²: a Graph Neural Net for High Resolution Mass Spectra Prediction
J Chem Inf Model. 2024 Jul 16. doi: 10.1021/acs.jcim.4c00446. Online ahead of print.
ABSTRACT
Predicting the mass spectrum of a molecular ion is often accomplished via three generalized approaches: rules-based methods for bond breaking, deep learning, or quantum chemical (QC) modeling. Rules-based approaches are often limited by the conditions for different chemical subspaces and perform poorly under chemical regimes with few defined rules. QC modeling is theoretically robust but requires significant amounts of computational time to produce a spectrum for a given target. Among deep learning techniques, graph neural networks (GNNs) have performed better than previous work with fingerprint-based neural networks in mass spectra prediction. To explore this technique further, we investigate the effects of including quantum chemically derived information as edge features in the GNN to increase predictive accuracy. The models we investigated include categorical bond order, bond force constants derived from extended tight-binding (xTB) quantum chemistry, and acyclic bond dissociation energies. We evaluated these models against a control GNN with no edge features in the input graphs. Bond dissociation enthalpies yielded the best improvement, with a cosine similarity score of 0.462 relative to the baseline model (0.437). In this work, we also apply dynamic graph attention, which improves performance on benchmark problems and supports the inclusion of edge features. Between implementations, we investigate the nature of the molecular embedding for spectra prediction and discuss the recognition of fragment topographies in distinct chemistries for further development in tandem mass spectrometry prediction.
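As a hedged sketch of the general pattern (not the authors' QC-GN²oMS² implementation): a message-passing model with dynamic graph attention that consumes a scalar bond feature, such as an xTB force constant or a bond dissociation energy, on each edge. It assumes PyTorch Geometric's GATv2Conv, which accepts edge features via edge_dim; all names and dimensions are illustrative.

    import torch
    from torch_geometric.nn import GATv2Conv, global_add_pool

    class EdgeFeatureGNN(torch.nn.Module):
        """Toy GNN with dynamic graph attention and scalar bond features on edges."""
        def __init__(self, node_dim=32, edge_dim=1, hidden=64, out_bins=1000):
            super().__init__()
            self.conv1 = GATv2Conv(node_dim, hidden, edge_dim=edge_dim)
            self.conv2 = GATv2Conv(hidden, hidden, edge_dim=edge_dim)
            self.readout = torch.nn.Linear(hidden, out_bins)   # binned m/z intensities

        def forward(self, x, edge_index, edge_attr, batch):
            h = self.conv1(x, edge_index, edge_attr).relu()
            h = self.conv2(h, edge_index, edge_attr).relu()
            # pool node states into one molecule embedding, predict a normalized spectrum
            return torch.softmax(self.readout(global_add_pool(h, batch)), dim=-1)

A predicted spectrum produced this way can then be scored against a reference spectrum with cosine similarity, as in the paper's evaluation.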
PMID:39013165 | DOI:10.1021/acs.jcim.4c00446
Stretchable Piezoresistive Pressure Sensor Array with Sophisticated Sensitivity, Strain-Insensitivity, and Reproducibility
Adv Sci (Weinh). 2024 Jul 16:e2405374. doi: 10.1002/advs.202405374. Online ahead of print.
ABSTRACT
This study delves into the development of a novel 10 × 10 sensor array featuring 100 pressure sensor pixels, achieving remarkable sensitivity of up to 888.79 kPa⁻¹ through an innovative sensor structure design. It addresses the critical challenge of strain sensitivity inherent in stretchable piezoresistive pressure sensors, a domain that has seen significant interest due to its potential for practical applications. The approach involves synthesizing and electrospinning polybutadiene-urethane (PBU), a reversible cross-linking polymer, which is subsequently coated with MXene nanosheets to create a conductive fabric. This fabrication technique strategically enhances sensor sensitivity by minimizing initial current values and incorporating semi-cylindrical electrodes selectively coated with Ag nanowires (AgNWs) for optimal conductivity. The application of a pre-strain method to electrode construction ensures strain immunity, preserving the sensor's electrical properties under expansion. The sensor array consistently detected even subtle airflow from an air gun in a wind sensing test, while a novel deep learning methodology significantly enhanced the long-term sensing accuracy of polymer-based stretchable mechanical sensors, marking a major advancement in sensor technology. This research presents a significant step forward in enhancing the reliability and performance of stretchable piezoresistive pressure sensors, offering a comprehensive solution to their current limitations.
PMID:39013112 | DOI:10.1002/advs.202405374
Triple-0: Zero-shot denoising and dereverberation on an end-to-end frozen anechoic speech separation network
PLoS One. 2024 Jul 16;19(7):e0301692. doi: 10.1371/journal.pone.0301692. eCollection 2024.
ABSTRACT
Speech enhancement is crucial for both human and machine listening applications. Over the last decade, the use of deep learning for speech enhancement has resulted in tremendous improvement over classical signal processing and machine learning methods. However, training a deep neural network is not only time-consuming; it also requires extensive computational resources and a large training dataset. Transfer learning, i.e., using a pretrained network for a new task, comes to the rescue by reducing the required training time, computational resources, and dataset size, but the network still needs to be fine-tuned for the new task. This paper presents a novel method of speech denoising and dereverberation (SD&D) on an end-to-end frozen binaural anechoic speech separation network. The frozen network requires neither any architectural change nor any fine-tuning for the new task, as is usually required for transfer learning. The interaural cues of a source placed inside noisy and echoic surroundings are given as input to this pretrained network to extract the target speech from noise and reverberation. Although the pretrained model used in this paper has never seen noisy reverberant conditions during its training, it performs satisfactorily for zero-shot testing (ZST) under these conditions. This is because the pretrained model has been trained on the direct-path interaural cues of an active source and so can recognize them even in the presence of echoes and noise. ZST on the same dataset on which the pretrained network was trained (homo-corpus), for the unseen class of interference, has shown considerable improvement over the weighted prediction error (WPE) algorithm in terms of four objective speech quality and intelligibility metrics. The proposed model also offers performance similar to that of a deep learning SD&D algorithm for this dataset under varying conditions of noise and reverberation. Similarly, ZST on a different dataset has provided an improvement in intelligibility and almost equivalent quality to the WPE algorithm.
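The "frozen network" pattern described here is simple to express in code; a minimal PyTorch sketch follows, in which the separation network is a stand-in module and the input tensor shape is a placeholder, not the authors' network or data layout.

    import torch
    import torch.nn as nn

    # stand-in for the pretrained binaural anechoic separation network (weights assumed available)
    separator = nn.Identity()
    separator.eval()                            # no fine-tuning, no architectural change
    for p in separator.parameters():
        p.requires_grad_(False)                 # freeze every weight

    with torch.no_grad():
        cues = torch.randn(1, 2, 512, 64)       # placeholder interaural cues of a noisy, echoic source
        enhanced = separator(cues)              # zero-shot denoising and dereverberation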
PMID:39012881 | DOI:10.1371/journal.pone.0301692
Streak artefact removal in x-ray dark-field computed tomography using a convolutional neural network
Med Phys. 2024 Jul 16. doi: 10.1002/mp.17305. Online ahead of print.
ABSTRACT
BACKGROUND: Computed tomography (CT) relies on the attenuation of x-rays, and is, hence, of limited use for weakly attenuating organs of the body, such as the lung. X-ray dark-field (DF) imaging is a recently developed technology that utilizes x-ray optical gratings to enable small-angle scattering as an alternative contrast mechanism. The DF signal provides structural information about the micromorphology of an object, complementary to the conventional attenuation signal. A first human-scale x-ray DF CT has been developed by our group. Despite specialized processing algorithms, reconstructed images remain affected by streaking artifacts, which often hinder image interpretation. In recent years, convolutional neural networks (CNNs) have gained popularity in the field of CT reconstruction, including for streak artefact removal.
PURPOSE: Reducing streak artifacts is essential for the optimization of image quality in DF CT, and artefact-free images are a prerequisite for potential future clinical application. The purpose of this paper is to demonstrate the feasibility of CNN post-processing for artefact reduction in x-ray DF CT and to show how multi-rotation scans can serve as a pathway for training data.
METHODS: We employed a supervised deep-learning approach using a three-dimensional dual-frame UNet in order to remove streak artifacts. Required training data were obtained from the experimental x-ray DF CT prototype at our institute. Two different operating modes were used to generate input and corresponding ground truth data sets. Clinically relevant scans at dose-compatible radiation levels were used as input data, and extended scans with substantially fewer artifacts were used as ground truth data. The latter is neither dose-, nor time-compatible and, therefore, unfeasible for clinical imaging of patients.
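The supervised setup described here, dose-compatible scans as input and extended multi-rotation scans as ground truth, reduces to an ordinary image-to-image regression loop; a minimal sketch follows, where the small Conv3d stack is a stand-in for the paper's 3D dual-frame UNet and the tensors are placeholders.

    import torch
    import torch.nn as nn

    # stand-in for the 3D dual-frame UNet (the actual architecture is not reproduced here)
    model = nn.Sequential(
        nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
        nn.Conv3d(16, 1, 3, padding=1),
    )
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.MSELoss()

    # hypothetical pair: dose-compatible scan (input) and extended multi-rotation scan (ground truth)
    noisy = torch.randn(1, 1, 32, 64, 64)
    clean = torch.randn(1, 1, 32, 64, 64)

    for step in range(10):                      # illustrative training loop
        opt.zero_grad()
        loss = loss_fn(model(noisy), clean)     # learn to suppress streak artefacts
        loss.backward()
        opt.step()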
RESULTS: The trained CNN was able to greatly reduce streak artifacts in DF CT images. The network was tested against images with entirely different, previously unseen image characteristics. In all cases, CNN processing substantially increased the image quality, which was quantitatively confirmed by increased image quality metrics. Fine details are preserved during processing, despite the output images appearing smoother than the ground truth images.
CONCLUSIONS: Our results showcase the potential of a neural network to reduce streak artifacts in x-ray DF CT. The image quality is successfully enhanced in dose-compatible x-ray DF CT, which plays an essential role for the adoption of x-ray DF CT into modern clinical radiology.
PMID:39012833 | DOI:10.1002/mp.17305
Surface Reconstruction from Point Clouds: A Survey and a Benchmark
IEEE Trans Pattern Anal Mach Intell. 2024 Jul 16;PP. doi: 10.1109/TPAMI.2024.3429209. Online ahead of print.
ABSTRACT
Reconstruction of a continuous surface of a two-dimensional manifold from its raw, discrete point cloud observation is a long-standing problem in computer vision and graphics research. The problem is technically ill-posed, and becomes more difficult considering that various sensing imperfections appear in point clouds obtained by practical depth scanning. In the literature, a rich set of methods has been proposed, and reviews of existing methods are also provided. However, existing reviews fall short of thorough investigations on a common benchmark. The present paper aims to review and benchmark existing methods in the new era of deep learning surface reconstruction. To this end, we contribute a large-scale benchmarking dataset consisting of both synthetic and real-scanned data; the benchmark includes object- and scene-level surfaces and takes into account various sensing imperfections that are commonly encountered in practical depth scanning. We conduct thorough empirical studies by comparing existing methods on the constructed benchmark, and pay special attention to the robustness of existing methods against various scanning imperfections; we also study how different methods generalize in terms of reconstructing complex surface shapes. Our studies help identify the best conditions under which different methods work, and suggest some empirical findings. For example, while deep learning methods are increasingly popular in the research community, our systematic studies suggest that, surprisingly, a few classical methods perform even better in terms of both robustness and generalization; our studies also suggest that the practical challenges of misalignment of point sets from multi-view scanning, missing surface points, and point outliers remain unsolved by all the existing surface reconstruction methods. We expect that the benchmark and our studies will be valuable both for practitioners and as guidance for new innovations in future research. We make the benchmark publicly accessible at https://Gorilla-Lab-SCUT.github.io/SurfaceReconstructionBenchmark.
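Benchmarks of this kind commonly score a reconstruction against a ground-truth surface with point-set metrics such as the Chamfer distance; the sketch below shows one standard formulation and is not necessarily the paper's exact evaluation protocol.

    import numpy as np
    from scipy.spatial import cKDTree

    def chamfer_distance(p, q):
        """Symmetric Chamfer distance between two point sets of shape (N, 3) and (M, 3)."""
        d_pq, _ = cKDTree(q).query(p)   # distance to nearest neighbour in q for each point of p
        d_qp, _ = cKDTree(p).query(q)   # and vice versa
        return d_pq.mean() + d_qp.mean()

    print(chamfer_distance(np.random.rand(1000, 3), np.random.rand(1200, 3)))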
PMID:39012756 | DOI:10.1109/TPAMI.2024.3429209
Enhancing Generalizability in Biomedical Entity Recognition: Self-Attention PCA-CLS Model
IEEE/ACM Trans Comput Biol Bioinform. 2024 Jul 16;PP. doi: 10.1109/TCBB.2024.3429234. Online ahead of print.
ABSTRACT
One of the primary tasks in the early stages of data mining involves the identification of entities from biomedical corpora. Traditional approaches relying on robust feature engineering face challenges when learning from available (un-)annotated data using data-driven models like deep learning-based architectures. Despite leveraging large corpora and advanced deep learning models, domain generalization remains an issue. Attention mechanisms are effective in capturing longer sentence dependencies and extracting semantic and syntactic information from limited annotated datasets. To address out-of-vocabulary challenges in biomedical text, the PCA-CLS (Position and Contextual Attention with CNN-LSTM-Softmax) model combines global self-attention and character-level convolutional neural network techniques. The model's performance is evaluated on eight distinct biomedical domain datasets encompassing entities such as genes, drugs, diseases, and species. The PCA-CLS model outperforms several state-of-the-art models, achieving notable F1-scores, including 88.19% on BC2GM, 85.44% on JNLPBA, 90.80% on BC5CDR-chemical, 87.07% on BC5CDR-disease, 89.18% on BC4CHEMD, 88.81% on NCBI, and 91.59% on the s800 dataset.
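A schematic of the kind of architecture named here, character-level CNN features combined with global self-attention feeding an LSTM-softmax tagger, is sketched below in PyTorch; the dimensions and layer choices are illustrative, not the published PCA-CLS model.

    import torch
    import torch.nn as nn

    class AttnCnnLstmTagger(nn.Module):
        """Word + char-CNN embeddings -> global self-attention -> BiLSTM -> softmax tags."""
        def __init__(self, vocab=5000, chars=100, emb=128, char_emb=32, hidden=256, tags=5):
            super().__init__()
            self.word_emb = nn.Embedding(vocab, emb)
            self.char_emb = nn.Embedding(chars, char_emb)
            self.char_cnn = nn.Conv1d(char_emb, char_emb, kernel_size=3, padding=1)
            self.attn = nn.MultiheadAttention(emb + char_emb, num_heads=4, batch_first=True)
            self.lstm = nn.LSTM(emb + char_emb, hidden, bidirectional=True, batch_first=True)
            self.out = nn.Linear(2 * hidden, tags)

        def forward(self, words, chars):                    # chars: (batch, seq, char_len)
            w = self.word_emb(words)
            b, s, L = chars.shape
            c = self.char_emb(chars.view(b * s, L)).transpose(1, 2)
            c = self.char_cnn(c).max(dim=2).values.view(b, s, -1)   # char-level features
            x = torch.cat([w, c], dim=-1)
            x, _ = self.attn(x, x, x)                       # global self-attention
            x, _ = self.lstm(x)
            return self.out(x).log_softmax(dim=-1)

    tagger = AttnCnnLstmTagger()
    words = torch.randint(0, 5000, (2, 12))
    chars = torch.randint(0, 100, (2, 12, 15))
    print(tagger(words, chars).shape)                       # (2, 12, 5) tag log-probabilities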
PMID:39012749 | DOI:10.1109/TCBB.2024.3429234
fNIRS-Driven Depression Recognition Based on Cross-Modal Data Augmentation
IEEE Trans Neural Syst Rehabil Eng. 2024 Jul 16;PP. doi: 10.1109/TNSRE.2024.3429337. Online ahead of print.
ABSTRACT
Early diagnosis and intervention of depression promote complete recovery, but traditional clinical assessments depend on diagnostic scales, the clinical experience of doctors, and patient cooperation. Recent research indicates that functional near-infrared spectroscopy (fNIRS) based on deep learning provides a promising approach to depression diagnosis. However, collecting large fNIRS datasets within a standard experimental paradigm remains challenging, limiting the applications of deep networks that require more data. To address these challenges, in this paper we propose an fNIRS-driven depression recognition architecture based on cross-modal data augmentation (fCMDA), which converts fNIRS data into pseudo-sequence activation images. The approach incorporates a time-domain augmentation mechanism, including time warping and time masking, to generate diverse data. Additionally, we design a stimulation task-driven pseudo-sequence method to map fNIRS data into pseudo-sequence activation images, facilitating the extraction of spatial-temporal, contextual, and dynamic characteristics. Ultimately, we construct a depression recognition model based on deep classification networks using an imbalance loss function. Extensive experiments are performed on two-class depression diagnosis and five-class depression severity recognition, which reveal impressive results with accuracies of 0.905 and 0.889, respectively. The fCMDA architecture provides a novel solution for effective depression recognition with limited data.
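The two time-domain augmentations named here, time masking and time warping, are easy to sketch; the version below uses zeroing of a random span and linear resampling respectively, with illustrative parameters rather than the paper's settings.

    import numpy as np

    def time_mask(x, max_len=20):
        """Zero out a random temporal span of a (channels, time) fNIRS record."""
        x = x.copy()
        t0 = np.random.randint(0, x.shape[1] - max_len)
        x[:, t0:t0 + np.random.randint(1, max_len)] = 0.0
        return x

    def time_warp(x, factor=1.1):
        """Stretch or compress the time axis by linear interpolation."""
        t_old = np.arange(x.shape[1])
        t_new = np.linspace(0, x.shape[1] - 1, int(x.shape[1] * factor))
        return np.stack([np.interp(t_new, t_old, ch) for ch in x])

    x = np.random.randn(16, 200)      # hypothetical (channels, time) fNIRS record
    aug = time_warp(time_mask(x))     # one augmented variant of the record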
PMID:39012734 | DOI:10.1109/TNSRE.2024.3429337
Concept-based Lesion Aware Transformer for Interpretable Retinal Disease Diagnosis
IEEE Trans Med Imaging. 2024 Jul 16;PP. doi: 10.1109/TMI.2024.3429148. Online ahead of print.
ABSTRACT
Existing deep learning methods have achieved remarkable results in diagnosing retinal diseases, showcasing the potential of advanced AI in ophthalmology. However, the black-box nature of these methods obscures the decision-making process, compromising their trustworthiness and acceptability. Inspired by the concept-based approaches and recognizing the intrinsic correlation between retinal lesions and diseases, we regard retinal lesions as concepts and propose an inherently interpretable framework designed to enhance both the performance and explainability of diagnostic models. Leveraging the transformer architecture, known for its proficiency in capturing long-range dependencies, our model can effectively identify lesion features. By integrating with image-level annotations, it achieves the alignment of lesion concepts with human cognition under the guidance of a retinal foundation model. Furthermore, to attain interpretability without losing lesion-specific information, our method employs a classifier built on a cross-attention mechanism for disease diagnosis and explanation, where explanations are grounded in the contributions of human-understandable lesion concepts and their visual localization. Notably, due to the structure and inherent interpretability of our model, clinicians can implement concept-level interventions to correct the diagnostic errors by simply adjusting erroneous lesion predictions. Experiments conducted on four fundus image datasets demonstrate that our method achieves favorable performance against state-of-the-art methods while providing faithful explanations and enabling concept-level interventions. Our code is publicly available at https://github.com/Sorades/CLAT.
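A hedged sketch of a cross-attention classification head of the general kind described, where learned disease queries attend over lesion-concept features and the attention weights double as concept-level contributions, is shown below; it is illustrative and not the CLAT implementation.

    import torch
    import torch.nn as nn

    class ConceptCrossAttnHead(nn.Module):
        """Disease logits from attention of class queries over lesion-concept features."""
        def __init__(self, dim=256, n_concepts=8, n_diseases=4):
            super().__init__()
            self.query = nn.Parameter(torch.randn(n_diseases, dim))   # one query per disease
            self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
            self.score = nn.Linear(dim, 1)

        def forward(self, concept_feats):                   # (batch, n_concepts, dim)
            q = self.query.unsqueeze(0).expand(concept_feats.size(0), -1, -1)
            ctx, weights = self.attn(q, concept_feats, concept_feats)
            # weights: (batch, n_diseases, n_concepts) -> per-concept contributions
            return self.score(ctx).squeeze(-1), weights

    head = ConceptCrossAttnHead()
    logits, contributions = head(torch.randn(2, 8, 256))    # hypothetical concept features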
PMID:39012729 | DOI:10.1109/TMI.2024.3429148
Radiomics of pituitary adenoma using computer vision: a review
Med Biol Eng Comput. 2024 Jul 16. doi: 10.1007/s11517-024-03163-3. Online ahead of print.
ABSTRACT
Pituitary adenomas (PA) represent the most common type of sellar neoplasm. Extracting relevant information from radiological images is essential for decision support in addressing various objectives related to PA. Given the critical need for an accurate assessment of the natural progression of PA, computer vision (CV) and artificial intelligence (AI) play a pivotal role in automatically extracting features from radiological images. The field of "Radiomics" involves the extraction of high-dimensional features, often referred to as "radiomic features," from digital radiological images. This survey offers an analysis of the current state of research in PA radiomics. Our work comprises a systematic review of 34 publications focused on PA radiomics and other automated information mining pertaining to PA through the analysis of radiological data using computer vision methods. We begin with an exploration essential for understanding the theoretical background of radiomics, encompassing traditional approaches from computer vision and machine learning as well as the latest methodologies in deep radiomics utilizing deep learning (DL). The thirty-four research works under examination are comprehensively compared and evaluated. The overall results achieved in the analyzed papers are high; e.g., the best accuracy is up to 96% and the best achieved AUC is up to 0.99, which establishes optimism for the successful use of radiomic features. Methods based on deep learning seem to be the most promising for the future. In this respect, several notable challenges remain for DL methods: it is important to create high-quality and sufficiently extensive datasets for training deep neural networks, and the interpretability of deep radiomics is still a major open challenge. Methods must be developed and verified that explain how deep radiomic features reflect various physically explainable aspects.
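For the traditional, handcrafted side of radiomics, feature extraction is typically a few lines with the pyradiomics package; a minimal sketch follows, assuming a co-registered image/mask pair, with hypothetical file names.

    from radiomics import featureextractor

    extractor = featureextractor.RadiomicsFeatureExtractor()
    extractor.disableAllFeatures()
    extractor.enableFeatureClassByName("firstorder")   # intensity statistics
    extractor.enableFeatureClassByName("glcm")         # gray-level co-occurrence texture features

    # hypothetical NRRD files: a T1 MRI volume and the segmented adenoma mask
    features = extractor.execute("pa_t1_mri.nrrd", "pa_mask.nrrd")
    print({k: v for k, v in features.items() if k.startswith("original_")})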
PMID:39012416 | DOI:10.1007/s11517-024-03163-3
The emerging paradigm in pediatric rheumatology: harnessing the power of artificial intelligence
Rheumatol Int. 2024 Jul 16. doi: 10.1007/s00296-024-05661-x. Online ahead of print.
ABSTRACT
Artificial intelligence algorithms, whose roots extend into the past but which have experienced a resurgence and evolution in recent years owing to their superiority over traditional methods and their contributions to human capabilities, have begun to make their presence felt in the field of pediatric rheumatology. In this ever-evolving realm, there have been incremental advancements supported by artificial intelligence in understanding and stratifying diseases, developing biomarkers, refining visual analyses, and facilitating individualized treatment approaches. However, as in many other domains, these strides have yet to gain clinical applicability and validation, and ethical issues remain unresolved. Furthermore, mastering the different and novel terminologies appears challenging for clinicians. This review aims to provide a comprehensive overview of the current literature, categorizing algorithms and their applications, thus offering a fresh perspective on the nascent relationship between pediatric rheumatology and artificial intelligence, highlighting both its advancements and constraints.
PMID:39012357 | DOI:10.1007/s00296-024-05661-x
Automatic classification and grading of canine tracheal collapse on thoracic radiographs by using deep learning
Vet Radiol Ultrasound. 2024 Jul 16. doi: 10.1111/vru.13413. Online ahead of print.
ABSTRACT
Tracheal collapse is a chronic and progressively worsening disease; the severity of clinical symptoms experienced by affected individuals depends on the degree of airway collapse. Cutting-edge automated tools are necessary to modernize disease screening using radiographs across various veterinary settings, such as animal clinics and hospitals, primarily because of the inherent challenges veterinarians face in interpreting uncertain findings. In this study, an artificial intelligence model was developed to screen canine tracheal collapse using archived lateral cervicothoracic radiographs. This model can differentiate between a normal and a collapsed trachea, ranging from early to severe degrees. The you-only-look-once (YOLO) models, including YOLO v3, YOLO v4, and YOLO v4 tiny, were used to train and test data sets under the in-house XXX platform. The results showed that the YOLO v4 tiny-416 model had satisfactory performance in screening among the normal trachea, grade 1-2 tracheal collapse, and grade 3-4 tracheal collapse, with 98.30% sensitivity, 99.20% specificity, and 98.90% accuracy. The area under the precision-recall curve was >0.8, which demonstrated high diagnostic accuracy. The interobserver agreement between the deep learning model and the radiologists was κ = 0.975 (P < .001), with all observers having excellent agreement (κ = 1.00, P < .001). The intraclass correlation coefficient between observers was >0.90, representing excellent consistency. Therefore, the deep learning model can be a useful and reliable method for effective screening and classification of the degree of tracheal collapse based on routine lateral cervicothoracic radiographs.
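The reported screening metrics reduce to simple confusion-matrix arithmetic; the sketch below shows the calculation for a binary normal-vs-collapse split, with hypothetical counts rather than the study's data.

    def screening_metrics(tp, fn, tn, fp):
        """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        accuracy = (tp + tn) / (tp + fn + tn + fp)
        return sensitivity, specificity, accuracy

    print(screening_metrics(tp=289, fn=5, tn=248, fp=2))   # hypothetical counts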
PMID:39012062 | DOI:10.1111/vru.13413
User experience of and satisfaction with computer-aided design software when designing dental prostheses: A multicenter survey study
Int J Comput Dent. 2024 Jul 16;0(0):0. doi: 10.3290/j.ijcd.b5582929. Online ahead of print.
ABSTRACT
AIM: The current study aimed to compare the responses and satisfaction reported by users with varying levels of experience when using different types of computer-aided design (CAD) software programs to design crowns.
MATERIALS AND METHODS: A questionnaire was used to evaluate user responses to five domains (software visibility, 3D-scanned data preparation, crown design and adjustment, finish line registration, and overall experience) of various CAD software programs. The study included 50 undergraduate dental students (inexperienced group) and 50 dentists or dental technicians from two hospitals (experienced group). The participants used four different CAD software programs (Meshmixer, Exocad, BlueSkyPlan, and Dentbird) to design crowns and recorded their evaluations using the questionnaire. Statistical analyses included one-way and two-way analysis of variance (ANOVA) tests to compare scores and examine the interaction between user response and experience.
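A two-way ANOVA of this design is a one-liner with statsmodels; the sketch below uses a small hypothetical long-format table (column names and scores are illustrative, not the study's data).

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    # hypothetical long-format responses: one satisfaction score per participant x software
    df = pd.DataFrame({
        "score":      [7, 9, 5, 8, 6, 9, 4, 7, 5, 8, 4, 9, 6, 9, 5, 8],
        "software":   ["Meshmixer", "Exocad", "BlueSkyPlan", "Dentbird"] * 4,
        "experience": ["novice"] * 8 + ["experienced"] * 8,
    })
    model = ols("score ~ C(software) * C(experience)", data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))   # two-way ANOVA with interaction term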
RESULTS: User evaluation scores in the domains of software visibility and 3D-scanned data preparation varied between software programs (P < 0.001), with Exocad being favored by the experienced group. When evaluating crown design and finish line registration, Dentbird and Exocad scored significantly higher than the other software programs in both groups, as they offered automation of the process using deep learning (P < 0.001). Two-way ANOVA showed that prior experience of using CAD significantly affected the users' responses to all queries (P < 0.001).
CONCLUSION: User response and satisfaction varied with the type of CAD software used to design dental prostheses, with prior experience of using CAD playing a significant role. Automation of design functions can enhance user satisfaction with the software.
PMID:39011633 | DOI:10.3290/j.ijcd.b5582929
Brief Review and Primer of Key Terminology for Artificial Intelligence and Machine Learning in Hypertension
Hypertension. 2024 Jul 16. doi: 10.1161/HYPERTENSIONAHA.123.22347. Online ahead of print.
ABSTRACT
Recent breakthroughs in artificial intelligence (AI) have caught the attention of many fields, including health care. The vision for AI is that a computer model can process information and provide output that is indistinguishable from that of a human and, in specific repetitive tasks, outperform a human's capability. Two critical underlying technologies in AI are supervised and unsupervised machine learning. Machine learning uses neural networks and deep learning, modeled after the human brain, to learn from structured or unstructured data sets, make decisions, and continuously improve the model. Natural language processing, which relies on supervised learning, is the understanding, interpretation, and generation of information using human language, as in chatbots and generative and conversational AI. These breakthroughs result from increased computing power and access to large data sets, setting the stage for the release of large language models, such as ChatGPT and others, and of new imaging models using computer vision. Hypertension management involves using blood pressure and other biometric data from connected devices and generative AI to communicate with patients and health care professionals. AI can potentially improve hypertension diagnosis and treatment through remote patient monitoring and digital therapeutics.
PMID:39011632 | DOI:10.1161/HYPERTENSIONAHA.123.22347
Moss-m7G: A Motif-Based Interpretable Deep Learning Method for RNA N7-Methylguanosine Site Prediction
J Chem Inf Model. 2024 Jul 16. doi: 10.1021/acs.jcim.4c00802. Online ahead of print.
ABSTRACT
N7-methylguanosine (m7G) modification plays a crucial role in various biological processes and is closely associated with the development and progression of many cancers. Accurate identification of m7G modification sites is essential for understanding their regulatory mechanisms and advancing cancer therapy. Previous studies often suffered from insufficient research data, underutilization of motif information, and lack of interpretability. In this work, we designed a novel motif-based interpretable method for m7G modification site prediction, called Moss-m7G. This approach enables the analysis of RNA sequences from a motif-centric perspective. Our proposed word-detection module and motif-embedding module within Moss-m7G extract motif information from sequences, transforming the raw sequences from the base level to the motif level and generating embeddings for these motif sequences. Compared with base sequences, motif sequences contain richer contextual information, which is further analyzed and integrated through the Transformer model. To address the data insufficiency noted in prior research, we constructed a comprehensive m7G data set for training and testing. Our experimental results affirm the effectiveness and superiority of Moss-m7G in predicting m7G modification sites. Moreover, the introduction of the word-detection module enhances the interpretability of the model, providing insights into its predictive mechanisms.
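A toy sketch of the base-level to motif-level idea is given below: a sliding window maps an RNA sequence onto ids from a motif vocabulary that a downstream Transformer could embed. The vocabulary and window size are illustrative, not the paper's learned word-detection module.

    MOTIFS = {"GAC": 0, "ACU": 1, "CUG": 2}   # illustrative motif vocabulary
    UNK = len(MOTIFS)                          # id for windows matching no known motif

    def to_motif_tokens(seq, k=3):
        """Slide a k-length window over an RNA sequence and emit motif ids."""
        return [MOTIFS.get(seq[i:i + k], UNK) for i in range(len(seq) - k + 1)]

    print(to_motif_tokens("GACUGAC"))          # [0, 1, 2, UNK, 0]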
PMID:39011571 | DOI:10.1021/acs.jcim.4c00802
Classification of pain expression images in elderly with hip fractures based on improved ResNet50 network
Front Med (Lausanne). 2024 Jul 1;11:1421800. doi: 10.3389/fmed.2024.1421800. eCollection 2024.
ABSTRACT
This study designed an improved ResNet50 network to achieve an automatic classification model for pain expressions of elderly patients with hip fractures. Leveraging the strengths of deep learning in image recognition, a dataset was built using the Multi-Task Cascaded Convolutional Neural Network (MTCNN), and the model was implemented via transfer learning on the ResNet50 framework. Hyperparameters were tuned by Bayesian optimization during training. The intraclass correlation was calculated between visual analog scale scores provided independently by clinicians and those provided by the pain expression evaluation assistant (PEEA). The automatic pain expression recognition model constructed with this algorithm achieved an accuracy of 99.6% on the training set, 98.7% on the validation set, and 98.2% on the test set. A substantial kappa coefficient of 0.683 confirmed the efficacy of PEEA in the clinic. This study demonstrates that the improved ResNet50 network can be used to construct an automatic, highly accurate pain expression recognition model for elderly patients with hip fractures.
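The transfer-learning step has a standard shape in torchvision; a minimal sketch follows, in which the class count and the freeze-the-backbone policy are illustrative choices, not details confirmed by the abstract.

    import torch
    import torchvision.models as models

    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    for p in model.parameters():
        p.requires_grad_(False)                             # freeze the pretrained backbone
    model.fc = torch.nn.Linear(model.fc.in_features, 2)     # hypothetical pain vs. no-pain head
    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)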
PMID:39011450 | PMC:PMC11247008 | DOI:10.3389/fmed.2024.1421800
Automated magnetic resonance imaging-based grading of the lumbar intervertebral disc and facet joints
JOR Spine. 2024 Jul 15;7(3):e1353. doi: 10.1002/jsp2.1353. eCollection 2024 Sep.
ABSTRACT
BACKGROUND: Degeneration of both intervertebral discs (IVDs) and facet joints in the lumbar spine has been associated with low back pain, but whether and how IVD/joint degeneration contributes to pain remains an open question. Joint degeneration can be identified by pairing T1 and T2 magnetic resonance imaging (MRI) with analysis techniques such as Pfirrmann grades (IVD degeneration) and Fujiwara scores (facet degeneration). However, these grades are subjective, prompting the need to develop an automated technique to enhance inter-rater reliability. This study introduces an automated convolutional neural network (CNN) technique trained on clinical MRI images of IVD and facet joints obtained from public-access Lumbar Spine MRI Dataset. The primary goal of the automated system is to classify health of lumbar discs and facet joints according to Pfirrmann and Fujiwara grading systems and to enhance inter-rater reliability associated with these grading systems.
METHODS: Performance of the CNN on both the Pfirrmann and Fujiwara scales was measured by comparing the percent agreement, Pearson's correlation and Fleiss kappa value for results from the classifier to the grades assigned by an expert grader.
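Fleiss' kappa between the classifier and an expert grader can be computed directly with statsmodels; the sketch below uses hypothetical Pfirrmann grades, one row per image and one column per rater.

    import numpy as np
    from statsmodels.stats.inter_rater import fleiss_kappa, aggregate_raters

    # hypothetical grades: rows = images, columns = raters (e.g., CNN and expert grader)
    grades = np.array([[2, 2], [3, 3], [1, 2], [4, 4], [2, 2]])
    table, _ = aggregate_raters(grades)        # per-image counts for each grade category
    print(fleiss_kappa(table))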
RESULTS: The CNN demonstrates comparable performance to human graders for both Pfirrmann and Fujiwara grading systems, but with larger errors in Fujiwara grading. The CNN improves the reliability of the Pfirrmann system, aligning with previous findings for IVD assessment.
CONCLUSION: The study highlights the potential of deep learning for classifying IVD and facet joint health and, given the high variability in the Fujiwara scoring system, highlights the need for improved imaging and scoring techniques to evaluate facet joint health. All code required to use the automatic grading routines described herein is available in the Data Repository for University of Minnesota (DRUM).
PMID:39011368 | PMC:PMC11249006 | DOI:10.1002/jsp2.1353
Benchmarking Deep Learning-Based Image Retrieval of Oral Tumor Histology
Cureus. 2024 Jun 12;16(6):e62264. doi: 10.7759/cureus.62264. eCollection 2024 Jun.
ABSTRACT
INTRODUCTION: Oral tumors necessitate a dependable computer-assisted pathological diagnosis system considering their rarity and diversity. A content-based image retrieval (CBIR) system using deep neural networks has been successfully devised for digital pathology. No CBIR system for oral pathology has been investigated because of the lack of an extensive image database and feature extractors tailored to oral pathology.
MATERIALS AND METHODS: This study uses a large CBIR database constructed from 30 categories of oral tumors to compare deep learning methods as feature extractors.
RESULTS: The highest average area under the receiver operating characteristic curve (AUC) was achieved by models trained on database images using self-supervised learning (SSL) methods (0.900 with SimCLR and 0.897 with TiCo). The generalizability of the models was validated using query images from the same cases taken with smartphones. When smartphone images were tested as queries, both models yielded the highest mean AUC (0.871 with SimCLR and 0.857 with TiCo). We also confirmed that the retrieved results are readily interpretable by evaluating the top-10 mean accuracy of retrieving the exact diagnostic category and its differential diagnostic categories.
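Once images are embedded by an SSL-trained encoder such as SimCLR, the retrieval step of a CBIR system reduces to nearest-neighbour search over the feature vectors; a cosine-similarity sketch follows, with placeholder embeddings in place of a real encoder.

    import numpy as np

    def retrieve_top_k(query_feat, db_feats, k=10):
        """Rank database images by cosine similarity to the query embedding."""
        q = query_feat / np.linalg.norm(query_feat)
        db = db_feats / np.linalg.norm(db_feats, axis=1, keepdims=True)
        sims = db @ q
        top = np.argsort(-sims)[:k]
        return top, sims[top]

    # hypothetical (N, d) embeddings from an SSL-trained encoder and one query embedding
    top_idx, top_sims = retrieve_top_k(np.random.rand(512), np.random.rand(1000, 512))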
CONCLUSION: Training deep learning models with SSL methods using image data specific to the target site is beneficial for CBIR tasks in oral tumor histology to obtain histologically meaningful results and high performance. This result provides insight into the effective development of a CBIR system to help improve the accuracy and speed of histopathology diagnosis and advance oral tumor research in the future.
PMID:39011227 | PMC:PMC11247249 | DOI:10.7759/cureus.62264
Artificial intelligence automatic measurement technology of lumbosacral radiographic parameters
Front Bioeng Biotechnol. 2024 Jul 1;12:1404058. doi: 10.3389/fbioe.2024.1404058. eCollection 2024.
ABSTRACT
BACKGROUND: Currently, manual measurement of lumbosacral radiological parameters is time-consuming and laborious, and inevitably produces considerable variability. This study aimed to develop and evaluate a deep learning-based model for automatically measuring lumbosacral radiographic parameters on lateral lumbar radiographs.
METHODS: We retrospectively collected 1,240 lateral lumbar radiographs to train the model. The included images were randomly divided into training, validation, and test sets in a ratio of approximately 8:1:1 for model training, fine-tuning, and performance evaluation, respectively. The parameters measured in this study were lumbar lordosis (LL), sacral horizontal angle (SHA), intervertebral space angle (ISA) at L4-L5 and L5-S1 segments, and the percentage of lumbar spondylolisthesis (PLS) at L4-L5 and L5-S1 segments. The model identified key points using image segmentation results and calculated measurements. The average results of key points annotated by the three spine surgeons were used as the reference standard. The model's performance was evaluated using the percentage of correct key points (PCK), intra-class correlation coefficient (ICC), Pearson correlation coefficient (r), mean absolute error (MAE), root mean square error (RMSE), and box plots.
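Once the key points are identified, angular parameters such as LL, SHA, and ISA reduce to vector geometry; the sketch below computes the angle between two lines, each defined by two key points, with hypothetical pixel coordinates.

    import numpy as np

    def line_angle_deg(p1, p2, q1, q2):
        """Angle in degrees between the lines p1-p2 and q1-q2."""
        u = np.asarray(p2, float) - np.asarray(p1, float)
        v = np.asarray(q2, float) - np.asarray(q1, float)
        cos = abs(np.dot(u, v)) / (np.linalg.norm(u) * np.linalg.norm(v))
        return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

    # e.g., an angle between two vertebral endplate lines (hypothetical pixel coordinates)
    print(line_angle_deg((10, 40), (60, 38), (12, 90), (58, 70)))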
RESULTS: The model's mean differences from the reference standard for LL, SHA, ISA (L4-L5), ISA (L5-S1), PLS (L4-L5), and PLS (L5-S1) were 1.69°, 1.36°, 1.55°, 1.90°, 1.60%, and 2.43%, respectively. When compared with the reference standard, the measurements of the model had better correlation and consistency (LL, SHA, and ISA: ICC = 0.91-0.97, r = 0.91-0.96, MAE = 1.89-2.47, RMSE = 2.32-3.12; PLS: ICC = 0.90-0.92, r = 0.90-0.91, MAE = 1.95-2.93, RMSE = 2.52-3.70), and the differences between them were not statistically significant (p > 0.05).
CONCLUSION: The model developed in this study could correctly identify key vertebral points on lateral lumbar radiographs and automatically calculate lumbosacral radiographic parameters. The measurement results of the model had good consistency and reliability compared to manual measurements. With additional training and optimization, this technology holds promise for future measurements in clinical practice and analysis of large datasets.
PMID:39011157 | PMC:PMC11246908 | DOI:10.3389/fbioe.2024.1404058
IRTCI: Item Response Theory for Categorical Imputation
Res Sq [Preprint]. 2024 Jul 2:rs.3.rs-4529519. doi: 10.21203/rs.3.rs-4529519/v1.
ABSTRACT
Most datasets suffer from partial or complete missing values, which has downstream limitations on the models available for testing the data and on any statistical inferences that can be made from the data. Several imputation techniques have been designed to replace missing data with stand-in values. The various approaches have implications for calculating clinical scores, model building, and model testing. The work showcased here offers a novel means of categorical imputation based on item response theory (IRT) and compares it against several methodologies currently used in the machine learning field, including k-nearest neighbors (kNN), multiple imputation by chained equations (MICE), and the Amazon Web Services (AWS) deep learning method Datawig. Analyses comparing these techniques were performed on three different datasets that represented ordinal, nominal, and binary categories. The data were modified so that they also varied in both the proportion of data missing and the systematization of the missingness. Two different assessments of performance were conducted: accuracy in reproducing the missing values, and predictive performance using the imputed data. Results demonstrated that the new method, Item Response Theory for Categorical Imputation (IRTCI), fared quite well compared with currently used methods, outperforming several of them in many conditions. Given the theoretical basis of the new approach, and its unique generation of probabilistic terms for determining the category to which a missing cell belongs, IRTCI offers a viable alternative to current approaches.
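The core IRT ingredient is a probabilistic model of category membership; the sketch below shows a two-parameter logistic (2PL) item response function used to score the most probable value for a missing binary cell. The surrounding imputation logic is illustrative, not the published IRTCI algorithm.

    import numpy as np

    def p_response_2pl(theta, a, b):
        """2PL item response function: P(response = 1 | latent trait theta)."""
        return 1.0 / (1.0 + np.exp(-a * (theta - b)))

    # impute a missing binary item for a respondent with an estimated latent trait
    theta, a, b = 0.4, 1.2, -0.3     # hypothetical trait, discrimination, and difficulty
    p = p_response_2pl(theta, a, b)
    imputed = int(p >= 0.5)          # or draw probabilistically: np.random.rand() < p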
PMID:39011102 | PMC:PMC11247932 | DOI:10.21203/rs.3.rs-4529519/v1