Deep learning
Comparing generative and extractive approaches to information extraction from abstracts describing randomized clinical trials
J Biomed Semantics. 2024 Apr 23;15(1):3. doi: 10.1186/s13326-024-00305-2.
ABSTRACT
BACKGROUND: Systematic reviews of Randomized Controlled Trials (RCTs) are an important part of the evidence-based medicine paradigm. However, the creation of such systematic reviews by clinical experts is costly as well as time-consuming, and results can get quickly outdated after publication. Most RCTs are structured based on the Patient, Intervention, Comparison, Outcomes (PICO) framework and there exist many approaches which aim to extract PICO elements automatically. The automatic extraction of PICO information from RCTs has the potential to significantly speed up the creation process of systematic reviews and this way also benefit the field of evidence-based medicine.
RESULTS: Previous work has addressed the extraction of PICO elements as the task of identifying relevant text spans or sentences, but without populating a structured representation of a trial. In contrast, in this work, we treat PICO elements as structured templates with slots, doing justice to the complex nature of the information they represent. We present two different approaches to extract this structured information from the abstracts of RCTs. The first is an extractive approach based on our previous work, extended to capture full document representations and by a clustering step to infer the number of instances of each template type. The second is a generative approach based on a seq2seq model that encodes the abstract describing the RCT and uses a decoder to infer a structured representation of the trial, including its arms, treatments, endpoints and outcomes. Both approaches are evaluated with different base models on an existing manually annotated dataset comprising 211 clinical trial abstracts on Type 2 Diabetes and Glaucoma. For both diseases, the extractive approach (with flan-t5-base) reached the best F1 score, i.e. 0.547 (±0.006) for type 2 diabetes and 0.636 (±0.006) for glaucoma. Generally, the F1 scores were higher for glaucoma than for type 2 diabetes, and the standard deviation was higher for the generative approach.
CONCLUSION: In our experiments, both approaches show promising performance extracting structured PICO information from RCTs, especially considering that most related work focuses on the far easier task of predicting less structured objects. In our experimental results, the extractive approach performs best in both cases, although the lead is greater for glaucoma than for type 2 diabetes. For future work, it remains to be investigated how the base model size affects the performance of both approaches in comparison. Although the extractive approach currently leaves more room for direct improvements, the generative approach might benefit from larger models.
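The slot-filling evaluation described above can be illustrated with a minimal sketch of micro-averaged slot-level F1, assuming templates are flattened to (template type, slot, value) triples; the triples and the `slot_f1` helper below are invented for illustration and are not the paper's scoring code.

```python
def slot_f1(predicted, gold):
    """Micro-averaged F1 over sets of (template_type, slot, value) triples."""
    if not predicted or not gold:
        return 0.0
    tp = len(predicted & gold)          # exactly matching slot fills
    precision = tp / len(predicted)
    recall = tp / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = {("Arm", "treatment", "metformin"),
        ("Outcome", "endpoint", "HbA1c"),
        ("Outcome", "value", "-0.9%")}
pred = {("Arm", "treatment", "metformin"),
        ("Outcome", "endpoint", "HbA1c"),
        ("Outcome", "value", "-0.8%")}   # one slot value extracted wrongly
score = slot_f1(pred, gold)              # 2 of 3 slots correct -> F1 = 2/3
```

A strict exact-match criterion like this is deliberately harsh; partial-credit matching of slot values would yield higher scores.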
PMID:38654304 | DOI:10.1186/s13326-024-00305-2
Deep learning-assisted diagnosis of benign and malignant parotid tumors based on ultrasound: a retrospective study
BMC Cancer. 2024 Apr 23;24(1):510. doi: 10.1186/s12885-024-12277-8.
ABSTRACT
BACKGROUND: To develop a deep learning (DL) model utilizing ultrasound images and evaluate its efficacy in distinguishing between benign and malignant parotid tumors (PTs), as well as its practicality in assisting clinicians with accurate diagnosis.
METHODS: A total of 2211 ultrasound images of 980 pathologically confirmed PTs (training set: n = 721; validation set: n = 82; internal test set: n = 89; external test set: n = 88) from 907 patients were retrospectively included in this study. Five DL networks of varying depths were constructed, and the optimal model was selected by evaluating diagnostic performance using the area under the curve (AUC) of the receiver operating characteristic (ROC). Furthermore, radiologists of different seniority were compared with and without the assistance of the optimal model. Additionally, the diagnostic confusion matrix of the optimal model was calculated, and the characteristics of misjudged cases were analyzed and summarized.
RESULTS: ResNet18 demonstrated superior diagnostic performance, with an AUC of 0.947, accuracy of 88.5%, sensitivity of 78.2%, and specificity of 92.7% in the internal test set, and an AUC of 0.925, accuracy of 89.8%, sensitivity of 83.3%, and specificity of 90.6% in the external test set. The PTs were subjectively assessed twice by six radiologists, with and without the assistance of the model. With the model's assistance, both junior and senior radiologists demonstrated enhanced diagnostic performance. In the internal test set, AUC values increased by 0.062 and 0.082 for junior radiologists, and by 0.066 and 0.106 for senior radiologists.
CONCLUSIONS: The DL model based on ultrasound images demonstrates exceptional capability in distinguishing between benign and malignant PTs, assisting radiologists of varying expertise levels to achieve heightened diagnostic performance, and can serve as a noninvasive adjunct imaging method for clinical diagnosis.
PMID:38654281 | DOI:10.1186/s12885-024-12277-8
Imaging segmentation mechanism for rectal tumors using improved U-Net
BMC Med Imaging. 2024 Apr 23;24(1):95. doi: 10.1186/s12880-024-01269-6.
ABSTRACT
OBJECTIVE: In radiation therapy, cancerous region segmentation in magnetic resonance images (MRI) is a critical step. For rectal cancer, the automatic segmentation of rectal tumors from an MRI is a great challenge. There are two main shortcomings in existing deep learning-based methods that lead to incorrect segmentation: 1) there are many organs surrounding the rectum, and the shape of some organs is similar to that of rectal tumors; 2) high-level features extracted by conventional neural networks often do not contain enough high-resolution information. Therefore, an improved U-Net segmentation network based on attention mechanisms is proposed to replace the traditional U-Net network.
METHODS: The overall framework of the proposed method is based on traditional U-Net. A ResNeSt module was added to extract the overall features, and a shape module was added after the encoder layer. We then combined the outputs of the shape module and the decoder to obtain the results. Moreover, the model used different types of attention mechanisms, so that the network learned information to improve segmentation accuracy.
RESULTS: We validated the effectiveness of the proposed method using 3773 2D MRI datasets from 304 patients. The results showed that the proposed method achieved 0.987, 0.946, 0.897, and 0.899 for Dice, MPA, MIoU, and FWIoU, respectively; these values are significantly better than those of other existing methods.
CONCLUSION: Due to time savings, the proposed method can help radiologists segment rectal tumors effectively and enable them to focus on patients whose cancerous regions are difficult for the network to segment.
SIGNIFICANCE: The proposed method can help doctors segment rectal tumors, thereby ensuring good diagnostic quality and accuracy.
PMID:38654162 | DOI:10.1186/s12880-024-01269-6
Can AlphaFold's breakthrough in protein structure help decode the fundamental principles of adaptive cellular immunity?
Nat Methods. 2024 Apr 23. doi: 10.1038/s41592-024-02240-7. Online ahead of print.
ABSTRACT
T cells are essential immune cells responsible for identifying and eliminating pathogens. Through interactions between their T-cell antigen receptors (TCRs) and antigens presented by major histocompatibility complex molecules (MHCs) or MHC-like molecules, T cells discriminate between foreign and self peptides. Determining the fundamental principles that govern these interactions has important implications in numerous medical contexts. However, reconstructing a map between T cells and their cognate antigens remains an open challenge for the field of immunology, and success of in silico reconstructions of this relationship has remained incremental. In this Perspective, we discuss the role that new state-of-the-art deep-learning models for predicting protein structure may play in resolving some of the unanswered questions the field faces linking TCR and peptide-MHC properties to T-cell specificity. We provide a comprehensive overview of structural databases and the evolution of predictive models, and highlight the breakthrough that AlphaFold provided for the field.
PMID:38654083 | DOI:10.1038/s41592-024-02240-7
Independent Associations of Aortic Calcification with Cirrhosis and Liver Related Mortality in Veterans with Chronic Liver Disease
Dig Dis Sci. 2024 Apr 23. doi: 10.1007/s10620-024-08450-5. Online ahead of print.
ABSTRACT
INTRODUCTION: Abdominal aortic calcifications (AAC) are incidentally found on medical imaging and serve as useful approximations of cardiovascular burden. The Morphomic Aortic Calcification Score (MAC) leverages automated deep learning methods to quantify and score AAC. While associations between AAC and non-alcoholic fatty liver disease (NAFLD) have been described, data on the relationship of AAC with other liver diseases and clinical outcomes are sparse. The purpose of this study was to evaluate AAC and liver-related death in a cohort of Veterans with chronic liver disease (CLD).
METHODS: We utilized the VISN 10 CLD cohort, a regional cohort of Veterans with three forms of CLD: NAFLD, hepatitis C (HCV), and alcohol-associated liver disease (ETOH), seen between 2008 and 2014, with abdominal CT scans (n = 3604). Associations between MAC and cirrhosis development, liver decompensation, liver-related death, and overall death were evaluated with Cox proportional hazard models.
RESULTS: The full cohort demonstrated strong associations of MAC and cirrhosis after adjustment: HR 2.13 (95% CI 1.63, 2.78), decompensation HR 2.19 (95% CI 1.60, 3.02), liver-related death HR 2.13 (95% CI 1.46, 3.11), and overall death HR 1.47 (95% CI 1.27, 1.71). These associations seemed to be driven by the non-NAFLD groups for decompensation and liver-related death [HR 2.80 (95% CI 1.52, 5.17) and HR 2.34 (95% CI 1.14, 4.83), respectively].
DISCUSSION: MAC was strongly and independently associated with cirrhosis, liver decompensation, liver-related death, and overall death. Surprisingly, stratification results demonstrated comparable or stronger associations among those with non-NAFLD etiology. These findings suggest abdominal aortic calcification may predict liver disease severity and clinical outcomes in patients with CLD.
PMID:38653948 | DOI:10.1007/s10620-024-08450-5
Classification Method of ECG Signals Based on RANet
Cardiovasc Eng Technol. 2024 Apr 23. doi: 10.1007/s13239-024-00730-5. Online ahead of print.
ABSTRACT
BACKGROUND: Electrocardiograms (ECG) are an important source of information on human heart health and are widely used to detect different types of arrhythmias.
OBJECTIVE: With the advancement of deep learning, end-to-end ECG classification models based on neural networks have been developed. However, deeper network layers suffer from vanishing gradients. Moreover, different channels and periods of an ECG signal hold varying significance for identifying different types of ECG abnormalities.
METHODS: To solve these two problems, an ECG classification method based on a residual attention neural network is proposed in this paper. The residual network (ResNet) is used to solve the gradient vanishing problem. Moreover, it has fewer model parameters, and its structure is simpler. An attention mechanism is added to focus on key information, integrate channel features, and improve voting methods to alleviate the problem of data imbalance.
RESULTS: Experiments and verification are conducted using the PhysioNet/CinC Challenge 2017 dataset. The average F1 value is 0.817, which is 0.064 higher than that of the ResNet model, and the performance compares favorably with mainstream methods.
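The residual connection that ResNet-style networks use to counter vanishing gradients can be sketched in a few lines; the vector-valued toy block below is illustrative, not the paper's implementation.

```python
def residual_block(x, transform):
    """Output = transform(x) + x: the skip path keeps an identity route for
    gradients, which is what counters vanishing gradients in deep stacks."""
    return [xi + fi for xi, fi in zip(x, transform(x))]

# If the learned transform contributes nothing, the block is the identity,
# so stacking many such blocks cannot make the signal path worse.
unchanged = residual_block([1.0, 2.0, 3.0], lambda v: [0.0] * len(v))
shifted = residual_block([1.0, 2.0, 3.0], lambda v: [0.5] * len(v))
```

In a real network `transform` would be a learned convolution-plus-nonlinearity, and the attention mechanism described above would reweight channels before the addition.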
PMID:38653933 | DOI:10.1007/s13239-024-00730-5
A Semi-Supervised Learning Framework for Classifying Colorectal Neoplasia Based on the NICE Classification
J Imaging Inform Med. 2024 Apr 23. doi: 10.1007/s10278-024-01123-9. Online ahead of print.
ABSTRACT
Labelling medical images is an arduous and costly task that necessitates clinical expertise and large numbers of qualified images. Insufficient samples can lead to underfitting during training and poor performance of supervised learning models. In this study, we aim to develop a SimCLR-based semi-supervised learning framework to classify colorectal neoplasia based on the NICE classification. First, the proposed framework was trained under self-supervised learning using a large unlabelled dataset; subsequently, it was fine-tuned on a limited labelled dataset based on the NICE classification. The model was evaluated on an independent dataset and compared with models based on supervised transfer learning and endoscopists using accuracy, Matthews correlation coefficient (MCC), and Cohen's kappa. Finally, Grad-CAM and t-SNE were applied to visualize the models' interpretations. A ResNet-backboned SimCLR model (accuracy of 0.908, MCC of 0.862, and Cohen's kappa of 0.896) outperformed supervised transfer learning-based models (means: 0.803, 0.698, and 0.742) and junior endoscopists (0.816, 0.724, and 0.863), while performing only slightly worse than senior endoscopists (0.916, 0.875, and 0.944). Moreover, t-SNE showed a better clustering of ternary samples through self-supervised learning in SimCLR than through supervised transfer learning. Compared with traditional supervised learning, semi-supervised learning enables deep learning models to achieve improved performance with limited labelled endoscopic images.
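The SimCLR pretraining phase mentioned above optimizes a contrastive (NT-Xent) loss over pairs of augmented views of the same image. A minimal pure-Python sketch with invented toy embeddings follows; real implementations operate on batched, projected network outputs.

```python
import math

def cos(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nt_xent(embeddings, pairs, tau=0.5):
    """NT-Xent loss: pull each embedding toward its positive pair and away
    from every other embedding in the batch."""
    losses = []
    for i, j in pairs.items():
        denom = sum(math.exp(cos(embeddings[i], embeddings[k]) / tau)
                    for k in range(len(embeddings)) if k != i)
        pos = math.exp(cos(embeddings[i], embeddings[j]) / tau)
        losses.append(-math.log(pos / denom))
    return sum(losses) / len(losses)

pairs = {0: 1, 1: 0, 2: 3, 3: 2}                      # each view's positive partner
aligned = [[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9]]    # positives similar
misaligned = [[1, 0], [0, 1], [0, 1], [1, 0]]         # positives orthogonal
loss_good = nt_xent(aligned, pairs)
loss_bad = nt_xent(misaligned, pairs)
```

Minimizing this loss is what lets the encoder learn useful features from unlabelled endoscopic images before the supervised fine-tuning step.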
PMID:38653910 | DOI:10.1007/s10278-024-01123-9
A new intelligent system based deep learning to detect DME and AMD in OCT images
Int Ophthalmol. 2024 Apr 23;44(1):191. doi: 10.1007/s10792-024-03115-8.
ABSTRACT
Optical Coherence Tomography (OCT) is widely recognized as the leading modality for assessing ocular retinal diseases, playing a crucial role in diagnosing retinopathy while remaining non-invasive. The increasing volume of OCT images underscores the growing importance of automating image analysis. Age-related Macular Degeneration (AMD) and Diabetic Macular Edema (DME) are among the most common causes of visual impairment. Early detection and timely intervention for diabetes-related conditions are essential for preventing optical complications and reducing the risk of blindness. This study introduces a novel Computer-Aided Diagnosis (CAD) system based on a Convolutional Neural Network (CNN) model, aiming to identify and classify OCT retinal images into AMD, DME, and Normal classes. Leveraging the efficiency of CNNs in feature learning and classification, various CNNs, including pre-trained VGG16, VGG19, and Inception_V3 models, a custom model trained from scratch, and BCNN(VGG16)², BCNN(VGG19)², and BCNN(Inception_V3)², are developed for the classification of AMD, DME, and Normal OCT images. The proposed approach has been evaluated on two datasets: the public DUKE dataset and a private Tunisian dataset. The combination of the Inception_V3 model and the features extracted by the proposed custom CNN achieved the highest accuracy of 99.53% on the DUKE dataset. The results obtained on the public DUKE and private Tunisian datasets demonstrate the proposed approach as a significant tool for efficient and automatic retinal OCT image classification.
PMID:38653842 | DOI:10.1007/s10792-024-03115-8
Improved Resolution and Image Quality of Musculoskeletal Magnetic Resonance Imaging using Deep Learning-based Denoising Reconstruction: A Prospective Clinical Study
Skeletal Radiol. 2024 Apr 24. doi: 10.1007/s00256-024-04679-3. Online ahead of print.
ABSTRACT
OBJECTIVE: To prospectively evaluate a deep learning-based denoising reconstruction (DLR) for improved resolution and image quality in musculoskeletal (MSK) magnetic resonance imaging (MRI).
METHODS: Images from 137 contrast-weighted sequences in 40 MSK patients were evaluated. Each sequence was performed twice: first with the routine parameters and reconstructed with a routine reconstruction filter (REF), then with higher resolution and reconstructed with DLR as well as with three conventional reconstruction filters (NL2, GA43, GA53). The five reconstructions (REF, DLR, NL2, GA43, and GA53) were de-identified, randomized, and blindly reviewed by three MSK radiologists using eight scoring criteria and a forced ranking. Quantitative SNR, CNR, and structures' full width at half maximum (FWHM), for resolution assessment, were measured and compared. To account for repeated measures, Generalized Estimating Equations (GEE) with Bonferroni adjustment were used to compare the readers' scores, SNR, CNR, and FWHM between DLR and NL2, GA43, GA53, and REF.
RESULTS: Compared to the routine REF images, resolution was improved by 47.61% with DLR, from 0.39 ± 0.15 mm² to 0.20 ± 0.06 mm² (p < 0.001). Per-sequence average scan time was shortened by 7.93% with DLR, from 165.58 ± 21.86 s to 152.45 ± 25.65 s (p < 0.001). Based on the average scores, DLR images were rated significantly higher on all image quality criteria and in the forced ranking (p < 0.001).
CONCLUSION: This prospective clinical evaluation demonstrated that DLR allows approximately two times finer resolution and improved image quality compared to the standard-of-care images.
PMID:38653786 | DOI:10.1007/s00256-024-04679-3
Deep Learning and Multimodal Artificial Intelligence in Orthopaedic Surgery
J Am Acad Orthop Surg. 2024 Apr 17. doi: 10.5435/JAAOS-D-23-00831. Online ahead of print.
ABSTRACT
This review article focuses on the applications of deep learning with neural networks and multimodal neural networks in the orthopaedic domain. By providing practical examples of how artificial intelligence (AI) is being applied successfully in orthopaedic surgery, particularly in the realm of imaging data sets and the integration of clinical data, this study aims to provide orthopaedic surgeons with the necessary tools not only to evaluate existing literature but also to consider AI's potential in their own clinical or research pursuits. We first review standard deep neural networks, which can analyze numerical clinical variables; then describe convolutional neural networks, which can analyze image data; and then introduce multimodal AI models, which analyze multiple types of data. We then contrast these deep learning techniques with related but more limited techniques such as radiomics, describe how to interpret deep learning studies, and explain how to initiate such studies at your institution. Ultimately, by empowering orthopaedic surgeons with the knowledge and know-how of deep learning, this review aspires to facilitate the translation of research into clinical practice, thereby enhancing the efficacy and precision of real-world orthopaedic care for patients.
PMID:38652882 | DOI:10.5435/JAAOS-D-23-00831
Deep Probabilistic Principal Component Analysis for Process Monitoring
IEEE Trans Neural Netw Learn Syst. 2024 Apr 23;PP. doi: 10.1109/TNNLS.2024.3386890. Online ahead of print.
ABSTRACT
Probabilistic latent variable models (PLVMs), such as probabilistic principal component analysis (PPCA), are widely employed in process monitoring and fault detection of industrial processes. This article proposes a novel deep PPCA (DePPCA) model, which has the advantages of both probabilistic modeling and deep learning. The construction of DePPCA includes a greedy layer-wise pretraining phase and a unified end-to-end fine-tuning phase. The former establishes a hierarchical deep structure based on cascading multiple layers of the PPCA module to extract high-level features. The latter builds an end-to-end connection between the raw inputs and the final outputs to further improve the model's representation of high-level features. After constructing the model structure of DePPCA, we first present the detailed training processes of the pretraining and fine-tuning stages, then clarify the theoretical merits of the proposed model from the perspective of variational inference. For process monitoring purposes, we develop two statistics based on the established DePPCA. The monitoring performance of these two statistics can remain superior even if the features extracted by DePPCA are compressed to a single variable. This makes the feature extraction process and online monitoring procedure of DePPCA quite fast. In other words, the proposed DePPCA can achieve accurate and efficient process monitoring by extracting only one feature for each sample. Finally, the effectiveness of DePPCA is evaluated on the Tennessee Eastman (TE) process and the multiphase flow (MPF) facility.
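The abstract does not spell out its two monitoring statistics, but the general idea of monitoring a single extracted feature can be sketched with a univariate T²-style statistic against a χ² control limit; the data, threshold, and fault value below are purely illustrative.

```python
def t2(x, mean, var):
    """Hotelling-style T^2 statistic for a single monitored feature."""
    return (x - mean) ** 2 / var

# "Normal operating" feature values (invented); fit mean and variance.
normal = [0.10, -0.20, 0.05, 0.00, 0.15, -0.10]
mu = sum(normal) / len(normal)
var = sum((v - mu) ** 2 for v in normal) / (len(normal) - 1)

LIMIT = 6.63  # ~99th percentile of chi-square with 1 degree of freedom

in_control = t2(0.12, mu, var)   # stays below the control limit
fault = t2(1.50, mu, var)        # far above the limit -> flagged as a fault
```

Because the statistic is computed on one scalar feature per sample, online monitoring reduces to a single squared-deviation check, which is what makes the univariate-compression claim above plausible in terms of speed.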
PMID:38652625 | DOI:10.1109/TNNLS.2024.3386890
Multiscale Deep Learning for Detection and Recognition: A Comprehensive Survey
IEEE Trans Neural Netw Learn Syst. 2024 Apr 23;PP. doi: 10.1109/TNNLS.2024.3389454. Online ahead of print.
ABSTRACT
Recently, the multiscale problem in computer vision has gradually attracted people's attention. This article focuses on multiscale representation for object detection and recognition, comprehensively introduces the development of multiscale deep learning, and constructs an easy-to-understand, but powerful knowledge structure. First, we give the definition of scale, explain the multiscale mechanism of human vision, and then lead to the multiscale problem discussed in computer vision. Second, advanced multiscale representation methods are introduced, including pyramid representation, scale-space representation, and multiscale geometric representation. Third, the theory of multiscale deep learning is presented, which mainly discusses the multiscale modeling in convolutional neural networks (CNNs) and Vision Transformers (ViTs). Fourth, we compare the performance of multiple multiscale methods on different tasks, illustrating the effectiveness of different multiscale structural designs. Finally, based on the in-depth understanding of the existing methods, we point out several open issues and future directions for multiscale deep learning.
PMID:38652624 | DOI:10.1109/TNNLS.2024.3389454
Toward Efficient Convolutional Neural Networks With Structured Ternary Patterns
IEEE Trans Neural Netw Learn Syst. 2024 Apr 23;PP. doi: 10.1109/TNNLS.2024.3380827. Online ahead of print.
ABSTRACT
High-efficiency deep learning (DL) models are necessary not only to facilitate their use in devices with limited resources but also to reduce the resources required for training. Convolutional neural networks (ConvNets) typically place severe demands on local device resources, which conventionally limits their adoption within mobile and embedded platforms. This brief presents work toward utilizing static convolutional filters generated from the space of local binary patterns (LBPs) and Haar features to design efficient ConvNet architectures. These are referred to as Structured Ternary Patterns (STePs) and can be generated during network initialization in a systematic way instead of having learnable weight parameters, thus reducing the total number of weight updates. The ternary values require significantly less storage and, with an appropriate low-level implementation, can also lead to inference improvements. The proposed approach is validated using four image classification datasets, demonstrating that common network backbones can be made more efficient and provide competitive results. It is also demonstrated that it is possible to generate completely custom STeP-based networks that provide good trade-offs for on-device applications such as unmanned aerial vehicle (UAV)-based aerial vehicle detection. The experimental results show that the proposed method maintains high detection accuracy while reducing the trainable parameters by 40%-80%. This work motivates further research toward good priors for nonlearnable weights that can make DL architectures more efficient without having to alter the network during or after training.
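A fixed ternary filter applied by ordinary convolution can be sketched as follows; the filter below is a hand-picked Haar-like vertical edge, whereas in STeP such filters would be generated systematically at initialization and kept frozen during training.

```python
def conv2d_valid(image, kernel):
    """'Valid' 2D correlation of a small image with a fixed kernel."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(kernel[a][b] * image[i + a][j + b]
                 for a in range(kh) for b in range(kw))
             for j in range(len(image[0]) - kw + 1)]
            for i in range(len(image) - kh + 1)]

# Ternary weights in {-1, 0, +1}: cheap to store, and multiplications
# degenerate into additions/subtractions at the low level.
VERTICAL_EDGE = [[1, 0, -1],
                 [1, 0, -1],
                 [1, 0, -1]]

step = [[0, 0, 1, 1]] * 4          # image with a vertical step edge
flat = [[5, 5, 5, 5]] * 4          # constant image, no edges
edge_response = conv2d_valid(step, VERTICAL_EDGE)
flat_response = conv2d_valid(flat, VERTICAL_EDGE)
```

The filter responds strongly at the step edge and not at all on the flat region, which is the sense in which structured nonlearnable filters can still act as useful feature extractors.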
PMID:38652622 | DOI:10.1109/TNNLS.2024.3380827
Dual-Channel Adaptive Scale Hypergraph Encoders With Cross-View Contrastive Learning for Knowledge Tracing
IEEE Trans Neural Netw Learn Syst. 2024 Apr 23;PP. doi: 10.1109/TNNLS.2024.3386810. Online ahead of print.
ABSTRACT
Knowledge tracing (KT) refers to predicting learners' performance in the future according to their historical responses, which has become an essential task in intelligent tutoring systems. Most deep learning-based methods usually model the learners' knowledge states via recurrent neural networks (RNNs) or attention mechanisms. Recently emerging graph neural networks (GNNs) assist the KT model to capture the relationships such as question-skill and question-learner. However, non-pairwise and complex higher-order information among responses is ignored. In addition, a single-channel encoded hidden vector struggles to represent multigranularity knowledge states. To tackle the above problems, we propose a novel KT model named dual-channel adaptive scale hypergraph encoders with cross-view contrastive learning (HyperKT). Specifically, we design an adaptive scale hyperedge distillation component for generating knowledge-aware hyperedges and pattern-aware hyperedges that reflect non-pairwise higher-order features among responses. Then, we propose dual-channel hypergraph encoders to capture multigranularity knowledge states from global and local state hypergraphs. The encoders consist of a simplified hypergraph convolution network and a collaborative hypergraph convolution network. To enhance the supervisory signal in the state hypergraphs, we introduce the cross-view contrastive learning mechanism, which performs among state hypergraph views and their transformed line graph views. Extensive experiments on three real-world datasets demonstrate the superior performance of our HyperKT over the state-of-the-art (SOTA).
PMID:38652621 | DOI:10.1109/TNNLS.2024.3386810
Better Rough than Scarce: Proximal Femur Fracture Segmentation with Rough Annotations
IEEE Trans Med Imaging. 2024 Apr 23;PP. doi: 10.1109/TMI.2024.3392854. Online ahead of print.
ABSTRACT
Proximal femoral fracture segmentation in computed tomography (CT) is essential in the preoperative planning of orthopedic surgeons. Recently, numerous deep learning-based approaches have been proposed for segmenting various structures within CT scans. Nevertheless, distinguishing the various attributes of fracture fragments and soft-tissue regions in CT scans frequently poses challenges, which have received comparatively limited research attention. Moreover, the cornerstone of contemporary deep learning methodologies is the availability of annotated data, while detailed CT annotations remain scarce. To address these challenges, we propose a novel weakly supervised framework, Rough Turbo Net (RT-Net), for the segmentation of proximal femoral fractures. We emphasize the use of human resources to produce rough annotations at a substantial scale, as opposed to relying on limited fine-grained annotations that demand substantial time to create. In RT-Net, rough annotations impose fractured-region constraints, which have demonstrated significant efficacy in enhancing the accuracy of the network. Conversely, fine annotations can provide more detail for recognizing edges and soft tissues. In addition, we design a spatial adaptive attention module (SAAM) that adapts to the spatial distribution of the fracture regions and aligns features in each decoder. Moreover, we propose a fine-edge loss, applied through an edge discrimination network, to penalize absent or imprecise edge features. Extensive quantitative and qualitative experiments demonstrate the superiority of RT-Net over state-of-the-art approaches. Furthermore, additional experiments show that RT-Net can produce pseudo-labels for raw CT images that further improve fracture segmentation performance, and has the potential to improve segmentation performance on public datasets. The code is available at: https://github.com/zyairelu/RT-Net.
PMID:38652607 | DOI:10.1109/TMI.2024.3392854
Application of a novel deep learning-based 3D videography workflow to bat flight
Ann N Y Acad Sci. 2024 Apr 23. doi: 10.1111/nyas.15143. Online ahead of print.
ABSTRACT
Studying the detailed biomechanics of flying animals requires accurate three-dimensional coordinates for key anatomical landmarks. Traditionally, this relies on manually digitizing animal videos, a labor-intensive task that scales poorly with increasing framerates and numbers of cameras. Here, we present a workflow that combines deep learning-powered automatic digitization with filtering and correction of mislabeled points using quality metrics from deep learning and 3D reconstruction. We tested our workflow using a particularly challenging scenario: bat flight. First, we documented four bats flying steadily in a 2 m³ wind tunnel test section. Wing kinematic parameters resulting from manually digitizing bats with markers applied to anatomical landmarks were not significantly different from those resulting from applying our workflow to the same bats without markers for five out of six parameters. Second, we compared coordinates from manual digitization against those yielded via our workflow for bats flying freely in a 344 m³ enclosure. Average distance between coordinates from our workflow and those from manual digitization was less than a millimeter larger than the average human-to-human coordinate distance. The improved efficiency of our workflow has the potential to increase the scalability of studies on animal flight biomechanics.
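The filtering step that uses "quality metrics from deep learning and 3D reconstruction" can be illustrated with a hypothetical quality gate combining detection confidence and reprojection error; the field names, thresholds, and landmark data below are invented, not the paper's actual criteria.

```python
def filter_landmarks(points, min_conf=0.9, max_reproj_px=2.0):
    """Keep a triangulated landmark only if the network was confident in the
    2D detections and the 3D point reprojects close to them."""
    return [p for p in points
            if p["confidence"] >= min_conf and p["reproj_error"] <= max_reproj_px]

tracked = [
    {"name": "wingtip", "confidence": 0.98, "reproj_error": 0.6},
    {"name": "wrist",   "confidence": 0.55, "reproj_error": 0.4},  # low confidence
    {"name": "elbow",   "confidence": 0.95, "reproj_error": 7.2},  # bad reprojection
]
kept = filter_landmarks(tracked)  # only "wingtip" survives both checks
```

Rejected points would then be candidates for correction or interpolation rather than being fed directly into kinematic analyses.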
PMID:38652595 | DOI:10.1111/nyas.15143
Detection and classification of mandibular fractures in panoramic radiography using artificial intelligence
Dentomaxillofac Radiol. 2024 Apr 23:twae018. doi: 10.1093/dmfr/twae018. Online ahead of print.
ABSTRACT
PURPOSE: This study aimed to assess the performance of a deep learning algorithm (YOLOv5) in detecting different mandibular fracture types in panoramic images.
METHODS: This study utilized a dataset of panoramic radiographic images with mandibular fractures. The dataset was divided into training, validation, and testing sets, with 60%, 20%, and 20% of the images, respectively. An equal number of control panoramic radiographs, which did not contain any fractures, were also randomly distributed among the three sets. The YOLOv5 deep learning model was trained to detect six fracture types in the mandible based on the anatomical location including symphysis, body, angle, ramus, condylar neck, and condylar head. Performance metrics of accuracy, precision, sensitivity (recall), specificity, dice coefficient (F1 score), and area under the curve (AUC) were calculated for each class.
RESULTS: A total of 498 panoramic images containing 673 fractures were collected. Accuracy was highest in detecting body (96.21%) and symphysis (95.87%) fractures, and lowest in angle (90.51%) fractures. The highest and lowest precision values were observed in detecting symphysis (95.45%) and condylar head (63.16%) fractures, respectively. Sensitivity was highest in body (96.67%) fractures and lowest in condylar head (80.00%) and condylar neck (81.25%) fractures. The highest specificity was noted in symphysis (98.96%), body (96.08%), and ramus (96.04%) fractures. The dice coefficient and AUC were highest in detecting body fractures (0.921 and 0.942, respectively), and lowest in detecting condylar head fractures (0.706 and 0.812, respectively).
CONCLUSION: The trained algorithm achieved promising performance metrics for the automated detection of most fracture types, with the highest performance observed in detecting body and symphysis fractures. Machine learning can provide a potential tool for assisting clinicians in mandibular fracture diagnosis.
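The per-class metrics reported above all derive from a binary confusion matrix; a self-contained sketch of the standard definitions follows, with invented counts (these are not the study's data).

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard detection metrics from a per-class confusion matrix."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)            # recall
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)  # dice coefficient
    return {"precision": precision, "sensitivity": sensitivity,
            "specificity": specificity, "accuracy": accuracy, "f1": f1}

# Illustrative counts for one fracture class.
m = binary_metrics(tp=8, fp=2, fn=2, tn=88)
```

Note how a class can have high specificity and accuracy while precision stays low when true positives are rare, which is consistent with the condylar head results above.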
PMID:38652576 | DOI:10.1093/dmfr/twae018
An investigation into augmentation and preprocessing for optimising X-ray classification in limited datasets: a case study on necrotising enterocolitis
Int J Comput Assist Radiol Surg. 2024 Apr 23. doi: 10.1007/s11548-024-03107-0. Online ahead of print.
ABSTRACT
PURPOSE: Obtaining large volumes of medical images, required for deep learning development, can be challenging in rare pathologies. Image augmentation and preprocessing offer viable solutions. This work explores the case of necrotising enterocolitis (NEC), a rare but life-threatening condition affecting premature neonates, with challenging radiological diagnosis. We investigate data augmentation and preprocessing techniques and propose two optimised pipelines for developing reliable computer-aided diagnosis models on a limited NEC dataset.
METHODS: We present a NEC dataset of 1090 abdominal X-rays (AXRs) from 364 patients and investigate the effect of geometric augmentations, colour scheme augmentations, and their combination for NEC classification based on the ResNet-50 backbone. We introduce two pipelines, based on colour contrast and edge enhancement, to increase the visibility of subtle, difficult-to-identify critical NEC findings on AXRs and achieve robust accuracy in a challenging three-class NEC classification task.
RESULTS: Our results show that geometric augmentations improve performance, with Translation achieving +6.2%, while Flipping and Occlusion decrease performance. Colour augmentations, like Equalisation, yield modest improvements. The proposed Pr-1 and Pr-2 pipelines enhance model accuracy by +2.4% and +1.7%, respectively. Combining Pr-1/Pr-2 with geometric augmentation, we achieve a maximum performance increase of +7.1%, yielding robust NEC classification.
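The transforms evaluated above can be sketched in plain NumPy. This toy illustration of translation, flipping, and histogram equalisation operates on a stand-in 8x8 grayscale image; it shows only the transforms themselves (their effect on accuracy is the empirical finding above), not the authors' ResNet-50 training pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(8, 8)).astype(np.uint8)  # toy grayscale AXR stand-in

# Geometric augmentation: horizontal flip.
flipped = np.fliplr(img)

# Geometric augmentation: translation with zero padding (no wrap-around).
def translate(img, dy, dx):
    out = np.zeros_like(img)
    h, w = img.shape
    out[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)] = \
        img[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
    return out

shifted = translate(img, 2, -1)  # 2 px down, 1 px left

# Colour augmentation: histogram equalisation via a cumulative-histogram LUT.
def equalise(img):
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    lut = np.round(255 * (cdf - cdf.min()) / (cdf.max() - cdf.min())).astype(np.uint8)
    return lut[img]

equalised = equalise(img)
```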
CONCLUSION: Based on an extensive validation of preprocessing and augmentation techniques, our work showcases the previously unreported potential of image preprocessing in AXR classification tasks with limited datasets. Our findings can be extended to other medical tasks for designing reliable classifier models with limited X-ray datasets. Ultimately, we also provide a benchmark for automated NEC detection and classification from AXRs.
PMID:38652416 | DOI:10.1007/s11548-024-03107-0
Artificial intelligence-enhanced automation for M-mode echocardiographic analysis: ensuring fully automated, reliable, and reproducible measurements
Int J Cardiovasc Imaging. 2024 Apr 23. doi: 10.1007/s10554-024-03095-x. Online ahead of print.
ABSTRACT
To enhance M-mode echocardiography's utility for measuring cardiac structures, we developed and evaluated an artificial intelligence (AI)-based automated analysis system for M-mode images through the aorta and left atrium [M-mode (Ao-LA)] and through the left ventricle [M-mode (LV)]. Our system, integrating two deep neural networks (DNNs) for view classification and image segmentation alongside an auto-measurement algorithm, was developed using 5,958 M-mode images [3,258 M-mode (Ao-LA) and 2,700 M-mode (LV)] drawn from a nationwide echocardiographic dataset collated from five tertiary hospitals. The view classification and segmentation DNNs were evaluated on 594 M-mode images, while automatic measurement accuracy was tested on a separate internal test set of 100 M-mode images as well as an external test set of 280 images (140 sinus rhythm and 140 atrial fibrillation). Performance evaluation showed an overall view classification accuracy of 99.8% and a segmentation Dice similarity coefficient of 94.3%. Within the internal test set, all automated measurements, including LA, Ao, and LV wall and cavity, agreed strongly with expert evaluations, exhibiting Pearson's correlation coefficients (PCCs) of 0.81-0.99. This performance persisted in the external test set for both sinus rhythm (PCC, 0.84-0.98) and atrial fibrillation (PCC, 0.70-0.97). Notably, the automatic measurements, which consistently offer multi-cardiac-cycle readings, showed a stronger correlation with averaged multi-cycle manual measurements than with those of a single representative cycle. Our AI-based system for automatic M-mode echocardiographic analysis demonstrated excellent accuracy, reproducibility, and speed. This automated approach has the potential to improve efficiency and reduce variability in clinical practice.
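The agreement statistic used throughout, Pearson's correlation coefficient, is straightforward to compute on paired automated and manual readings. A small sketch with hypothetical paired measurements (not data from the study):

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's correlation coefficient between paired measurements."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Hypothetical paired LV cavity measurements (mm): automated vs. expert.
auto_mm   = [46.1, 51.3, 38.9, 55.0, 42.7]
manual_mm = [45.8, 52.0, 39.5, 54.2, 43.1]
r = pearson_r(auto_mm, manual_mm)
```

A PCC close to 1, as in this toy example, indicates the near-linear agreement between automated and expert measurements that the study reports per structure.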
PMID:38652399 | DOI:10.1007/s10554-024-03095-x
Transfer learning and self-distillation for automated detection of schizophrenia using single-channel EEG and scalogram images
Phys Eng Sci Med. 2024 Apr 23. doi: 10.1007/s13246-024-01420-1. Online ahead of print.
ABSTRACT
Schizophrenia (SZ) has long been acknowledged as a highly intricate mental disorder. Individuals with SZ experience a blurred line between fantasy and reality, leading to a lack of awareness about their condition, which can pose significant challenges during the treatment process. Given the importance of the issue, timely diagnosis of this illness can not only assist patients and their families in managing the condition but also enable early intervention, which may help prevent its advancement. EEG is a widely utilized technique for investigating mental disorders like SZ due to its non-invasive nature, affordability, and wide accessibility. In this study, our main goal is to develop an optimized system that can achieve automatic diagnosis of SZ with minimal input information. To optimize the system, we adopted a strategy of using single-channel EEG signals and integrated knowledge distillation and transfer learning techniques into the model. This approach was designed to improve the performance and efficiency of our proposed method for SZ diagnosis. Additionally, to leverage pre-trained models effectively, we converted the EEG signals into images using the Continuous Wavelet Transform (CWT). This transformation allowed us to harness the capabilities of pre-trained models in the image domain, enabling automatic SZ detection with enhanced efficiency. To achieve a more robust estimate of the model's performance, we employed fivefold cross-validation. Using 5-s EEG records from the P4 channel together with self-distillation and VGG16, the method achieved an accuracy of 97.81%, indicating a high level of accuracy in diagnosing SZ.
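The CWT step that turns a 1-D EEG record into a 2-D scalogram image (which a pre-trained CNN such as VGG16 can then consume) can be sketched in pure NumPy. The Morlet wavelet, sampling rate, and scale range below are illustrative assumptions, not the authors' exact settings:

```python
import numpy as np

def morlet_cwt(signal, scales, w0=6.0):
    """Scalogram of |CWT| magnitudes using a Morlet wavelet (pure-NumPy sketch).

    Returns an array of shape (len(scales), len(signal)) that can be
    rendered as an image and resized to a pre-trained CNN's input size."""
    out = np.empty((len(scales), len(signal)))
    for i, s in enumerate(scales):
        t = np.arange(-4 * s, 4 * s + 1)  # wavelet support at this scale
        wavelet = np.exp(1j * w0 * t / s - 0.5 * (t / s) ** 2) / np.sqrt(s)
        out[i] = np.abs(np.convolve(signal, np.conj(wavelet), mode="same"))
    return out

fs = 128                           # assumed sampling rate (Hz)
t = np.arange(0, 5, 1 / fs)        # one 5-s "EEG" record
sig = np.sin(2 * np.pi * 10 * t)   # toy 10 Hz oscillation as a stand-in
scalogram = morlet_cwt(sig, scales=np.arange(2, 32))
```

The magnitude array peaks at the scale matching the signal's dominant frequency (here the 10 Hz component), which is what gives scalograms the image-like structure that transfer learning from natural-image models can exploit.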
PMID:38652347 | DOI:10.1007/s13246-024-01420-1