Deep learning
The new paradigm in machine learning - foundation models, large language models and beyond: a primer for physicians
Intern Med J. 2024 May 7. doi: 10.1111/imj.16393. Online ahead of print.
ABSTRACT
Foundation machine learning models are deep learning models capable of performing many different tasks using different data modalities such as text, audio, images and video. They represent a major shift from traditional task-specific machine learning prediction models. Large language models (LLMs), brought to wide public prominence in the form of ChatGPT, are text-based foundation models that have the potential to transform medicine by enabling automation of a range of tasks, including writing discharge summaries, answering patients' questions and assisting in clinical decision-making. However, such models are not without risk and can potentially cause harm if their development, evaluation and use are devoid of proper scrutiny. This narrative review describes the different types of LLMs, their emerging applications and potential limitations and biases and likely future translation into clinical practice.
PMID:38715436 | DOI:10.1111/imj.16393
DenseNet model incorporating hybrid attention mechanisms and clinical features for pancreatic cystic tumor classification
J Appl Clin Med Phys. 2024 May 7:e14380. doi: 10.1002/acm2.14380. Online ahead of print.
ABSTRACT
PURPOSE: The aim of this study is to develop a deep learning model capable of discriminating between pancreatic serous cystic neoplasms (SCN) and mucinous cystic neoplasms (MCN) by leveraging patient-specific clinical features and imaging outcomes. The intent is to offer valuable diagnostic support to clinicians in their clinical decision-making processes.
METHODS: The construction of the deep learning model involved utilizing a dataset comprising abdominal magnetic resonance T2-weighted images obtained from patients diagnosed with pancreatic cystic tumors at Changhai Hospital. The dataset comprised 207 patients with SCN and 93 patients with MCN, encompassing a total of 1761 images. The foundational architecture employed was DenseNet-161, augmented with a hybrid attention mechanism module. This integration aimed to enhance the network's attentiveness toward channel and spatial features, thereby amplifying its performance. Additionally, clinical features were incorporated prior to the fully connected layer of the network to actively contribute to subsequent decision-making processes, thereby significantly augmenting the model's classification accuracy. The final patient classification outcomes were derived using a joint voting methodology, and the model underwent comprehensive evaluation.
RESULTS: Using the five-fold cross validation, the accuracy of the classification model in this paper was 92.44%, with an AUC value of 0.971, a precision rate of 0.956, a recall rate of 0.919, a specificity of 0.933, and an F1-score of 0.936.
CONCLUSION: This study demonstrates that the DenseNet model, which incorporates hybrid attention mechanisms and clinical features, is effective for distinguishing between SCN and MCN, and has potential application for the diagnosis of pancreatic cystic tumors in clinical practice.
PMID:38715381 | DOI:10.1002/acm2.14380
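The patient-level classification above is derived from per-slice predictions via a "joint voting methodology". The abstract does not specify the voting rule, so the sketch below assumes a simple majority vote over slice-level labels; it is illustrative only, not the authors' implementation.

```python
from collections import Counter

def patient_vote(image_predictions):
    """Aggregate per-image labels ('SCN' or 'MCN') into one patient label
    by simple majority vote; ties fall to the first most-common label."""
    counts = Counter(image_predictions)
    return counts.most_common(1)[0][0]

# Five hypothetical T2-weighted slices of one patient, three voted SCN
label = patient_vote(["SCN", "MCN", "SCN", "SCN", "MCN"])
```

Any per-slice classifier can feed this step, which is why abstracts often report both slice-level and patient-level metrics.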
Learning shared template representation with augmented feature for multi-object pose estimation
Neural Netw. 2024 Apr 30;176:106352. doi: 10.1016/j.neunet.2024.106352. Online ahead of print.
ABSTRACT
Template matching pose estimation methods based on deep learning have made significant advancements via metric learning or reconstruction learning. Existing approaches primarily build distinct template representation libraries (codebooks) from rendered images for each object, which complicates the training process and increases memory costs in multi-object tasks. Additionally, they struggle to handle discrepancies between the distributions of training and test sets, particularly for occluded objects, resulting in suboptimal matching accuracy. In this study, we propose a shared template representation learning method with augmented semantic features to address these issues. Our method learns representations concurrently using metric and reconstruction learning as similarity constraints, and augments the network's response to objects through semantic feature constraints for better generalization performance. Furthermore, rotation matrices serve as templates for codebook construction, leading to excellent matching accuracy compared to rendered images. Notably, this contributes to the effective decoupling of object categories and templates, so that only a single shared codebook need be maintained in multi-object pose estimation tasks. Extensive experiments on the Linemod, Linemod-Occluded and TLESS datasets demonstrate that the proposed method, employing shared templates, achieves superior matching accuracy. Moreover, the proposed method exhibits robustness on a collected aircraft dataset, further validating its efficacy.
PMID:38713968 | DOI:10.1016/j.neunet.2024.106352
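The codebook matching described above (comparing a query embedding against shared templates) can be illustrated with a toy nearest-template search. The template ids, embeddings, and cosine-similarity metric below are assumptions for illustration; the paper's learned representations are far richer.

```python
def best_template(query, codebook):
    """Nearest-template lookup by cosine similarity; `codebook` maps a
    template id (here standing in for a rotation) to its embedding."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = sum(a * a for a in u) ** 0.5
        nv = sum(b * b for b in v) ** 0.5
        return dot / (nu * nv)
    return max(codebook, key=lambda k: cos(query, codebook[k]))

# Hypothetical two-template codebook shared across objects
codebook = {"R0": [1.0, 0.0], "R90": [0.0, 1.0]}
match = best_template([0.9, 0.1], codebook)
```

Because the codebook is keyed by rotation rather than by object, one table serves every object category, which is the memory saving the abstract emphasizes.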
Development and Application of Traditional Chinese Medicine Using AI Machine Learning and Deep Learning Strategies
Am J Chin Med. 2024 May 8:1-19. doi: 10.1142/S0192415X24500265. Online ahead of print.
ABSTRACT
Traditional Chinese medicine (TCM) has been used for thousands of years and has been proven to be effective at treating many complicated illnesses with minimal side effects. The application and advancement of TCM are, however, constrained by the absence of objective measuring standards due to its relatively abstract diagnostic methods and syndrome differentiation theories. Ongoing developments in machine learning (ML) and deep learning (DL), specifically in computer vision (CV) and natural language processing (NLP), offer novel opportunities to modernize TCM by exploring the profound connotations of its theory. This review begins with an overview of the ML and DL methods employed in TCM; this is followed by practical instances of these applications. Furthermore, extensive discussions emphasize the mature integration of ML and DL in TCM, such as tongue diagnosis, pulse diagnosis, and syndrome differentiation treatment, highlighting their early successful application in the TCM field. Finally, this review summarizes these accomplishments and addresses the problems and challenges posed by the application and development of TCM powered by ML and DL. As ML and DL techniques continue to evolve, modern technology will spark new advances in TCM.
PMID:38715181 | DOI:10.1142/S0192415X24500265
Computer vision digitization of smartphone images of anesthesia paper health records from low-middle income countries
BMC Bioinformatics. 2024 May 7;25(1):178. doi: 10.1186/s12859-024-05785-8.
ABSTRACT
BACKGROUND: In low-middle income countries, healthcare providers primarily use paper health records for capturing data, predominantly because of the prohibitive cost of acquiring and maintaining automated data capture devices and electronic medical records. Data recorded on paper health records are not easily accessible to healthcare providers in a digital format. The lack of real-time, accessible digital data limits the ability of healthcare providers, researchers, and quality improvement champions to leverage data to improve patient outcomes. In this project, we demonstrate the novel use of computer vision software to digitize handwritten intraoperative data elements from smartphone photographs of paper anesthesia charts from the University Teaching Hospital of Kigali. We specifically report our approach to digitizing checkbox data, symbol-denoted systolic and diastolic blood pressure, and physiological data.
METHODS: We implemented approaches for removing perspective distortions from smartphone photographs, removing shadows, and improving image readability through morphological operations. YOLOv8 models were used to deconstruct the anesthesia paper chart into specific data sections. Handwritten blood pressure symbols and physiological data were identified, and values were assigned using deep neural networks. Our work builds upon the contributions of previous research by improving upon their methods, updating the deep learning models to newer architectures, as well as consolidating them into a single piece of software.
RESULTS: The model for extracting the sections of the anesthesia paper chart achieved an average box precision of 0.99, an average box recall of 0.99, and an mAP@0.5-0.95 of 0.97. Our software digitizes checkbox data with greater than 99% accuracy and digitizes blood pressure data with a mean absolute error of 1.0 and 1.36 mmHg for systolic and diastolic blood pressure, respectively. Overall accuracy for physiological data, which includes oxygen saturation, inspired oxygen concentration and end tidal carbon dioxide concentration, was 85.2%.
CONCLUSIONS: We demonstrate that under normal photography conditions we can digitize checkbox, blood pressure and physiological data to within human accuracy when provided legible handwriting. Our contributions provide improved access to digital data to healthcare practitioners in low-middle income countries.
PMID:38714921 | DOI:10.1186/s12859-024-05785-8
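The abstract mentions "improving image readability through morphological operations". One such operation, binary closing (dilation followed by erosion), fills small gaps in handwritten strokes. The pure-Python sketch below on a 0/1 grid is an illustration with an assumed 3x3 structuring element; a real pipeline would use OpenCV on grayscale images.

```python
def dilate(img):
    """Set a pixel if any pixel in its 3x3 neighbourhood is set."""
    h, w = len(img), len(img[0])
    return [[int(any(img[ny][nx]
                     for ny in range(max(0, y - 1), min(h, y + 2))
                     for nx in range(max(0, x - 1), min(w, x + 2))))
             for x in range(w)] for y in range(h)]

def erode(img):
    """Keep a pixel only if its whole (in-bounds) 3x3 neighbourhood is set."""
    h, w = len(img), len(img[0])
    return [[int(all(img[ny][nx]
                     for ny in range(max(0, y - 1), min(h, y + 2))
                     for nx in range(max(0, x - 1), min(w, x + 2))))
             for x in range(w)] for y in range(h)]

def close_gaps(img):
    """Morphological closing: fills one-pixel gaps in strokes."""
    return erode(dilate(img))
```

Closing repairs broken pen strokes before symbol detection, while the companion operation (opening) would instead remove speckle noise.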
Precise and automated lung cancer cell classification using deep neural network with multiscale features and model distillation
Sci Rep. 2024 May 7;14(1):10471. doi: 10.1038/s41598-024-61101-7.
ABSTRACT
Lung diseases impose a significant global burden of morbidity and mortality. In particular, the differential diagnosis between adenocarcinoma, squamous cell carcinoma, and small cell lung carcinoma is paramount in determining optimal treatment strategies and improving clinical prognoses. Faced with the challenge of improving diagnostic precision and stability, this study has developed an innovative deep learning-based model. This model employs a Feature Pyramid Network (FPN) and Squeeze-and-Excitation (SE) modules combined with a Residual Network (ResNet18) to enhance the processing of complex images and conduct multi-scale analysis of each channel's importance in classifying lung cancer. Moreover, the performance of the model is further enhanced by distilling knowledge from larger teacher models into more compact student models. Subjected to rigorous five-fold cross-validation, our model outperforms existing models on all performance metrics, exhibiting exceptional diagnostic accuracy. Ablation studies on various model components verified that each addition effectively improves model performance, achieving an average accuracy of 98.84% and a Matthews Correlation Coefficient (MCC) of 98.83%. Collectively, the results indicate that our model significantly improves the accuracy of disease diagnosis, providing physicians with more precise clinical decision-making support.
PMID:38714840 | DOI:10.1038/s41598-024-61101-7
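The teacher-to-student knowledge distillation above is commonly implemented as a KL divergence between temperature-softened output distributions. The sketch below uses that standard (Hinton-style) formulation; the paper's exact loss and temperature are not given in the abstract, so both are assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between softened teacher and student distributions;
    the T^2 factor keeps gradient magnitudes comparable across temperatures."""
    p = softmax(teacher_logits, temperature)   # soft teacher targets
    q = softmax(student_logits, temperature)   # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl
```

In training, this term is typically mixed with the ordinary cross-entropy on hard labels, so the student learns from both the data and the teacher's soft targets.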
A comparative study of an on premise AutoML solution for medical image classification
Sci Rep. 2024 May 7;14(1):10483. doi: 10.1038/s41598-024-60429-4.
ABSTRACT
Automated machine learning (AutoML) allows for the simplified application of machine learning to real-world problems, by the implicit handling of necessary steps such as data pre-processing, feature engineering, model selection and hyperparameter optimization. This has encouraged its use in medical applications such as imaging. However, the impact of common parameter choices such as the number of trials allowed, and the resolution of the input images, has not been comprehensively explored in existing literature. We therefore benchmark AutoKeras (AK), an open-source AutoML framework, against several bespoke deep learning architectures, on five public medical datasets representing a wide range of imaging modalities. It was found that AK could outperform the bespoke models in general, although at the cost of increased training time. Moreover, our experiments suggest that a large number of trials and higher resolutions may not be necessary for optimal performance to be achieved.
PMID:38714764 | DOI:10.1038/s41598-024-60429-4
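The trial-budget question studied above can be illustrated with a toy random-search loop over a hyperparameter space. The objective below is a synthetic stand-in for validation accuracy (AutoKeras' actual search is far more sophisticated), so all names and values here are illustrative assumptions.

```python
import random

def random_search(objective, space, n_trials, seed=0):
    """Evaluate `objective` on `n_trials` random configurations drawn
    from `space`; return the best (score, config) pair found."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        config = {name: rng.choice(choices) for name, choices in space.items()}
        score = objective(config)
        if best is None or score > best[0]:
            best = (score, config)
    return best

# Toy search space and objective standing in for model validation accuracy
space = {"lr": [1e-4, 1e-3, 1e-2], "width": [32, 64, 128]}
toy = lambda c: 1.0 - abs(c["lr"] - 1e-3) * 10 - abs(c["width"] - 64) / 1000
```

Because each extra trial can only match or improve the incumbent, the best score is non-decreasing in the budget, while the marginal gain per trial shrinks, which is the diminishing-returns pattern the study reports.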
Applications of artificial intelligence in urologic oncology
Investig Clin Urol. 2024 May;65(3):202-216. doi: 10.4111/icu.20230435.
ABSTRACT
PURPOSE: With the recent rising interest in artificial intelligence (AI) in medicine, many studies have explored the potential and usefulness of AI in urological diseases. This study aimed to comprehensively review recent applications of AI in urologic oncology.
MATERIALS AND METHODS: We searched the PubMed-MEDLINE databases for articles in English on machine learning (ML) and deep learning (DL) models related to general surgery and prostate, bladder, and kidney cancer. The search terms were a combination of keywords, including both "urology" and "artificial intelligence" with one of the following: "machine learning," "deep learning," "neural network," "renal cell carcinoma," "kidney cancer," "urothelial carcinoma," "bladder cancer," "prostate cancer," and "robotic surgery."
RESULTS: A total of 58 articles were included. The studies on prostate cancer were related to grade prediction, improved diagnosis, and predicting outcomes and recurrence. The studies on bladder cancer mainly used radiomics to identify aggressive tumors and predict treatment outcomes, recurrence, and survival rates. Most studies on the application of ML and DL in kidney cancer were focused on the differentiation of benign and malignant tumors as well as prediction of their grade and subtype. Most studies suggested that methods using AI may be better than or similar to existing traditional methods.
CONCLUSIONS: AI technology is actively being investigated in the field of urological cancers as a tool for diagnosis, prediction of prognosis, and decision-making and is expected to be applied in additional clinical areas soon. Despite technological, legal, and ethical concerns, AI will change the landscape of urological cancer management.
PMID:38714511 | DOI:10.4111/icu.20230435
SurfPro-NN: A 3D point cloud neural network for the scoring of protein-protein docking models based on surfaces features and protein language models
Comput Biol Chem. 2024 Apr 4:108067. doi: 10.1016/j.compbiolchem.2024.108067. Online ahead of print.
ABSTRACT
Protein-protein interactions (PPI) play a crucial role in numerous key biological processes, and the structure of protein complexes provides valuable clues for in-depth exploration of molecular-level biological processes. Protein-protein docking technology is widely used to simulate the spatial structure of proteins. However, there are still challenges in selecting candidate decoys that closely resemble the native structure from protein-protein docking simulations. In this study, we introduce a docking evaluation method based on three-dimensional point cloud neural networks named SurfPro-NN, which represents protein structures as point clouds and learns interaction information from protein interfaces by applying a point cloud neural network. With the continuous advancement of deep learning in the field of biology, a series of knowledge-rich pre-trained models have emerged. We incorporate protein surface representation models and language models into our approach, greatly enhancing feature representation capabilities and achieving superior performance in protein docking model scoring tasks. Through comprehensive testing on public datasets, we find that our method outperforms state-of-the-art deep learning approaches in protein-protein docking model scoring. Not only does it significantly improve performance, but it also greatly accelerates training speed. This study demonstrates the potential of our approach in addressing protein interaction assessment problems, providing strong support for future research and applications in the field of biology.
PMID:38714420 | DOI:10.1016/j.compbiolchem.2024.108067
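The protein-protein interface that SurfPro-NN learns from can be crudely approximated as the set of points of one chain lying within a distance cutoff of the other chain. The sketch below uses an assumed 5 Å cutoff; the paper's surface representation is learned, not rule-based.

```python
def interface_points(chain_a, chain_b, cutoff=5.0):
    """Return the points of chain_a within `cutoff` of any point of
    chain_b -- a crude stand-in for the docking interface region."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    c2 = cutoff ** 2
    return [p for p in chain_a if any(dist2(p, q) <= c2 for q in chain_b)]
```

A point cloud network would then consume these interface points (with surface and language-model features attached per point) to score the docking decoy.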
Unsupervised MRI motion artifact disentanglement: introducing MAUDGAN
Phys Med Biol. 2024 May 7. doi: 10.1088/1361-6560/ad4845. Online ahead of print.
ABSTRACT
This study developed an unsupervised motion artifact reduction method for MRI images of patients with brain tumors. The proposed novel design uses multi-parametric multicenter contrast-enhanced T1W (ceT1W) and T2-FLAIR MRI images.
Approach: The proposed framework included two generators, two discriminators, and two feature extractor networks. Three-fold cross-validation was used to train and fine-tune the hyperparameters of the proposed model using 230 brain MRI images with tumors, which were then tested on 148 patients' in-vivo datasets. An ablation study was performed to evaluate the model's components. Our model was compared with Pix2pix and CycleGAN. Six evaluation metrics were reported, including normalized mean squared error (NMSE), structural similarity index (SSIM), multi-scale SSIM (MS-SSIM), peak signal-to-noise ratio (PSNR), visual information fidelity (VIF), and multi-scale gradient magnitude similarity deviation (MS-GMSD). Artifact reduction and the consistency of tumor regions, image contrast, and sharpness were rated by three evaluators using Likert scales and compared with ANOVA and Tukey's HSD tests.
Main results: On average, our method outperforms the comparative models in removing heavy motion artifacts, with the lowest NMSE (18.34±5.07%) and MS-GMSD (0.07±0.03) at the heavy motion artifact level. Additionally, our method creates motion-free images with the highest SSIM (0.93±0.04), PSNR (30.63±4.96), and VIF (0.45±0.05) values, along with comparable MS-SSIM (0.96±0.31). Similarly, our method outperformed the comparative models in removing in-vivo motion artifacts at different distortion levels, except for MS-SSIM and VIF, for which performance was comparable to CycleGAN. Moreover, our method performed consistently across artifact levels. For the heavy level of motion artifacts, Likert scores were 2.82±0.52, 1.88±0.71, and 1.02±0.14 (p < 0.0001) for our method, CycleGAN, and Pix2pix, respectively. Similar trends were found for the other motion artifact levels.
Significance: Our proposed unsupervised method was demonstrated to reduce motion artifacts from the ceT1W brain images under a multi-parametric framework.
PMID:38714192 | DOI:10.1088/1361-6560/ad4845
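Two of the six reported metrics, NMSE and PSNR, can be computed directly. The pure-Python sketch below operates on flattened intensity lists; the exact normalization convention the paper uses for NMSE is an assumption here.

```python
import math

def nmse(ref, img):
    """Normalized mean squared error: squared error relative to the
    reference image's total energy."""
    num = sum((r - x) ** 2 for r, x in zip(ref, img))
    den = sum(r ** 2 for r in ref)
    return num / den

def psnr(ref, img, max_val=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    mse = sum((r - x) ** 2 for r, x in zip(ref, img)) / len(ref)
    return float("inf") if mse == 0 else 10 * math.log10(max_val ** 2 / mse)
```

Lower NMSE and higher PSNR both indicate a corrected image closer to the motion-free reference, which is how the comparison in the abstract should be read.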
Deep learning based linear energy transfer calculation for proton therapy
Phys Med Biol. 2024 May 7. doi: 10.1088/1361-6560/ad4844. Online ahead of print.
ABSTRACT
This study aims to address the limitations of traditional methods for calculating linear energy transfer (LET), a critical component in assessing relative biological effectiveness (RBE). Currently, Monte Carlo (MC) simulation, the gold-standard for accuracy, is resource-intensive and slow for dose optimization, while the speedier analytical approximation has compromised accuracy. Our objective was to prototype a deep-learning-based model for calculating dose-averaged LET (LETd) using patient anatomy and dose-to-water (DW) data, facilitating real-time biological dose evaluation and LET optimization within proton treatment planning systems.
Approach: 275 4-field prostate proton Stereotactic Body Radiotherapy (SBRT) plans were analyzed, rendering a total of 1100 fields. Those were randomly split into 880, 110, and 110 fields for training, validation, and testing. A 3D Cascaded UNet model, along with data processing and inference pipelines, was developed to generate patient-specific LETd distributions from CT images and DW. The accuracy of the LETd of the test dataset was evaluated against MC-generated ground truth through voxel-based mean absolute error (MAE) and gamma analysis.
Main Results: The proposed model accurately inferred LETd distributions for each proton field in the test dataset. A single-field LETd calculation took around 100 ms with trained models running on an NVIDIA A100 GPU. The selected model yielded an average MAE of 0.94±0.14 MeV/cm and a gamma passing rate of 97.4% ± 1.3% when applied to the test dataset, with the largest discrepancy at the edge of fields, where the dose gradient was steepest and counting statistics were lowest.
Significance: This study demonstrates that deep-learning-based models can efficiently calculate LETd with high accuracy as a fast-forward approach. The model shows great potential to be utilized for optimizing the RBE of proton treatment plans. Future efforts will focus on enhancing the model's performance and evaluating its adaptability to different clinical scenarios.
PMID:38714191 | DOI:10.1088/1361-6560/ad4844
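Gamma analysis, the agreement metric reported above, combines a dose-difference criterion with a distance-to-agreement criterion. The sketch below is a simplified 1D version (the clinical version is 3D and interpolates between voxels), and the 3%/3 mm criteria are assumed defaults, not values taken from the paper.

```python
import math

def gamma_pass_rate(ref, test, spacing=1.0, dd=0.03, dta=3.0):
    """Simplified 1D global gamma analysis: fraction of test points whose
    gamma index (dose-difference / distance-to-agreement composite) is <= 1.
    `dd` is the dose criterion as a fraction of the reference maximum,
    `dta` the distance criterion in the same units as `spacing` (mm)."""
    ref_max = max(ref)
    passed = 0
    for i, t in enumerate(test):
        gamma = min(
            math.sqrt(((t - r) / (dd * ref_max)) ** 2
                      + ((i - j) * spacing / dta) ** 2)
            for j, r in enumerate(ref))
        passed += gamma <= 1.0
    return passed / len(test)
```

The distance term is what makes gamma forgiving in steep-gradient regions, which is also where the abstract notes the model's largest discrepancies occur.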
CAVE: Cerebral artery-vein segmentation in digital subtraction angiography
Comput Med Imaging Graph. 2024 May 1;115:102392. doi: 10.1016/j.compmedimag.2024.102392. Online ahead of print.
ABSTRACT
Cerebral X-ray digital subtraction angiography (DSA) is a widely used imaging technique in patients with neurovascular disease, allowing for vessel and flow visualization with high spatio-temporal resolution. Automatic artery-vein segmentation in DSA plays a fundamental role in vascular analysis with quantitative biomarker extraction, facilitating a wide range of clinical applications. The widely adopted U-Net applied on static DSA frames often struggles with disentangling vessels from subtraction artifacts. Further, it falls short in effectively separating arteries and veins as it disregards the temporal perspectives inherent in DSA. To address these limitations, we propose to simultaneously leverage spatial vasculature and temporal cerebral flow characteristics to segment arteries and veins in DSA. The proposed network, coined CAVE, encodes a 2D+time DSA series using spatial modules, aggregates all the features using temporal modules, and decodes it into 2D segmentation maps. On a large multi-center clinical dataset, CAVE achieves a vessel segmentation Dice of 0.84 (±0.04) and an artery-vein segmentation Dice of 0.79 (±0.06). CAVE surpasses traditional Frangi-based k-means clustering (P < 0.001) and U-Net (P < 0.001) by a significant margin, demonstrating the advantages of harvesting spatio-temporal features. This study represents the first investigation into automatic artery-vein segmentation in DSA using deep learning. The code is publicly available at https://github.com/RuishengSu/CAVE_DSA.
PMID:38714020 | DOI:10.1016/j.compmedimag.2024.102392
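The Dice coefficient used to evaluate CAVE is straightforward to compute from binary masks. A minimal sketch on flattened 0/1 masks:

```python
def dice(mask_a, mask_b):
    """Dice coefficient between two binary masks given as flat 0/1 lists:
    twice the overlap divided by the total foreground of both masks."""
    inter = sum(a & b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    return 1.0 if total == 0 else 2 * inter / total
```

For artery-vein evaluation, the same formula is applied per class (artery mask, vein mask) and the scores averaged.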
3DFRINet: A Framework for the Detection and Diagnosis of Fracture Related Infection in Lower Extremities Based on 18F-FDG PET/CT 3D Images
Comput Med Imaging Graph. 2024 May 3;115:102394. doi: 10.1016/j.compmedimag.2024.102394. Online ahead of print.
ABSTRACT
Fracture related infection (FRI) is one of the most devastating complications after fracture surgery in the lower extremities and can lead to extremely high morbidity and medical costs. Therefore, early comprehensive evaluation and accurate diagnosis are critical for appropriate treatment, prevention of complications, and good prognosis. 18F-fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) is one of the most commonly used medical imaging modalities for diagnosing FRI. With the development of deep learning, more neural networks have been proposed and have become powerful computer-aided diagnosis tools in medical imaging. Therefore, a fully automated two-stage framework for FRI detection and diagnosis, 3DFRINet (Three Dimension FRI Network), is proposed for 18F-FDG PET/CT 3D imaging. The first stage effectively extracts and fuses the features of both modalities to accurately locate the lesion through a dual-branch design and an attention module. The second stage reduces the dimensionality of the image using maximum intensity projection, which retains the effective features while reducing the computational effort and achieving excellent diagnostic performance. Lesion diagnosis reached 91.55% accuracy, an AUC of 0.9331, and an F1 score of 0.9250. 3DFRINet outperformed six nuclear medicine experts on each classification metric. Statistical analysis shows that 3DFRINet is equivalent or superior to primary nuclear medicine physicians and comparable to senior nuclear medicine physicians. In conclusion, this study is the first to propose a method based on 18F-FDG PET/CT three-dimensional imaging for FRI localization and diagnosis. The method shows a superior lesion detection rate and diagnostic efficiency and therefore has good prospects for clinical application.
PMID:38714019 | DOI:10.1016/j.compmedimag.2024.102394
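The second-stage dimensionality reduction above is a maximum intensity projection (MIP), which collapses a 3D volume to 2D by keeping the brightest voxel along one axis. A pure-Python sketch on nested lists (a real pipeline would use numpy):

```python
def max_intensity_projection(volume, axis=0):
    """Maximum intensity projection of a depth x height x width volume
    (nested lists) along the chosen axis."""
    d, h, w = len(volume), len(volume[0]), len(volume[0][0])
    if axis == 0:  # project along depth -> height x width image
        return [[max(volume[z][y][x] for z in range(d))
                 for x in range(w)] for y in range(h)]
    if axis == 1:  # project along height -> depth x width image
        return [[max(volume[z][y][x] for y in range(h))
                 for x in range(w)] for z in range(d)]
    return [[max(volume[z][y][x] for x in range(w))  # along width
             for y in range(h)] for z in range(d)]
```

Because FDG-avid lesions are hot spots, the maximum operator preserves them in the 2D projection while discarding most of the volume's background, which is the compute saving the abstract describes.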
Influence of learned landmark correspondences on lung CT registration
Med Phys. 2024 May 7. doi: 10.1002/mp.17120. Online ahead of print.
ABSTRACT
BACKGROUND: Disease or injury may cause a change in the biomechanical properties of the lungs, which can alter lung function. Image registration can be used to measure lung ventilation and quantify volume change, which can be a useful diagnostic aid. However, lung registration is a challenging problem because of the variation in deformation along the lungs, sliding motion of the lungs along the ribs, and change in density.
PURPOSE: Landmark correspondences have been used to make deformable image registration robust to large displacements.
METHODS: To tackle the challenging task of intra-patient lung computed tomography (CT) registration, we extend the landmark correspondence prediction model deep convolutional neural network-Match by introducing a soft mask loss term to encourage landmark correspondences in specific regions and avoid the use of a mask during inference. To produce realistic deformations to train the landmark correspondence model, we use data-driven synthetic transformations. We study the influence of these learned landmark correspondences on lung CT registration by integrating them into intensity-based registration as a distance-based penalty.
RESULTS: Our results on the public thoracic CT dataset COPDgene show that using learned landmark correspondences as a soft constraint can reduce median registration error from approximately 5.46 to 4.08 mm compared to standard intensity-based registration, in the absence of lung masks.
CONCLUSIONS: We show that using landmark correspondences results in minor improvements in local alignment, while significantly improving global alignment.
PMID:38713916 | DOI:10.1002/mp.17120
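The registration error reported above is a median distance over corresponding landmark pairs after registration. A minimal sketch, assuming landmarks are given as 3D coordinate tuples in millimetres:

```python
import statistics

def median_registration_error(fixed_pts, moved_pts):
    """Median Euclidean distance between corresponding landmark pairs,
    i.e. the target registration error summarized in the abstract."""
    dists = [sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
             for p, q in zip(fixed_pts, moved_pts)]
    return statistics.median(dists)
```

The reported improvement (roughly 5.46 mm to 4.08 mm) is a drop in exactly this statistic over the COPDgene landmark sets.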
Examining the Gateway Hypothesis and Mapping Substance Use Pathways on Social Media: Machine Learning Approach
JMIR Form Res. 2024 May 7;8:e54433. doi: 10.2196/54433.
ABSTRACT
BACKGROUND: Substance misuse presents significant global public health challenges. Understanding transitions between substance types and the timing of shifts to polysubstance use is vital to developing effective prevention and recovery strategies. The gateway hypothesis suggests that high-risk substance use is preceded by lower-risk substance use. However, the source of this correlation is hotly contested. While some claim that low-risk substance use causes subsequent, riskier substance use, most people using low-risk substances also do not escalate to higher-risk substances. Social media data hold the potential to shed light on the factors contributing to substance use transitions.
OBJECTIVE: By leveraging social media data, our study aimed to gain a better understanding of substance use pathways. By identifying and analyzing the transitions of individuals between different risk levels of substance use, our goal was to find specific linguistic cues in individuals' social media posts that could indicate escalating or de-escalating patterns in substance use.
METHODS: We conducted a large-scale analysis using data from Reddit, collected between 2015 and 2019, consisting of over 2.29 million posts and approximately 29.37 million comments by around 1.4 million users of substance use subreddits. These data facilitated the creation of a risk transition data set reflecting those users' substance use behaviors. We deployed deep learning and machine learning techniques to predict escalation or de-escalation transitions in risk levels, based on initial transition phases documented in posts and comments. We conducted a linguistic analysis of the language patterns associated with transitions in substance use, emphasizing the role of n-gram features in predicting future risk trajectories.
RESULTS: Our results showed promise in predicting escalation or de-escalation transitions in risk levels from Reddit users' historical activity during initial transition phases in drug-related subreddits, with an accuracy of 78.48% and an F1-score of 79.20%. We highlighted vital predictive features, such as specific substance names and tools indicative of future risk escalation. Our linguistic analysis showed that terms linked with harm reduction strategies were instrumental in signaling de-escalation, whereas descriptors of frequent substance use were characteristic of escalating transitions.
CONCLUSIONS: This study sheds light on the complexities surrounding the gateway hypothesis of substance use through an examination of web-based behavior on Reddit. While certain findings validate the hypothesis, indicating a progression from lower-risk substances such as marijuana to higher-risk ones, a significant number of individuals did not show this transition. The research underscores the potential of using machine learning with social media analysis to predict substance use transitions. Our results point toward future directions for leveraging social media data in substance use research, underlining the importance of continued exploration before suggesting direct implications for interventions.
PMID:38713904 | DOI:10.2196/54433
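The n-gram features highlighted above can be extracted in a few lines of Python. The whitespace tokenization and lowercasing below are simplifying assumptions; the study's actual preprocessing is not detailed in the abstract.

```python
from collections import Counter

def ngram_features(text, n=2):
    """Counter of word n-grams -- the kind of linguistic features used
    to predict escalation vs. de-escalation transitions."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
```

Counts like these are typically turned into sparse vectors (e.g. TF-IDF weighted) before being fed to a classifier.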
A multiview deep learning-based prediction pipeline augmented with confident learning can improve performance in determining knee arthroplasty candidates
Knee Surg Sports Traumatol Arthrosc. 2024 May 7. doi: 10.1002/ksa.12221. Online ahead of print.
ABSTRACT
PURPOSE: Preoperative prudent patient selection plays a crucial role in knee osteoarthritis management but faces challenges in appropriate referrals such as total knee arthroplasty (TKA), unicompartmental knee arthroplasty (UKA) and nonoperative intervention. Deep learning (DL) techniques can build prediction models for treatment decision-making. The aim is to develop and evaluate a knee arthroplasty prediction pipeline using three-view X-rays to determine the suitable candidates for TKA, UKA or are not arthroplasty candidates.
METHODS: A study was conducted using three-view (anterior-posterior, lateral and patellar) X-rays and surgical data of patients undergoing TKA, UKA or nonarthroplasty interventions from sites A and B. Data from site A were used to derive and validate models. Data from site B were used as external test set. A DL pipeline combining YOLOv3 and ResNet-18 with confident learning (CL) was developed. Multiview Convolutional Neural Network, EfficientNet-b4, ResNet-101 and the proposed model without CL were also trained and tested. The models were evaluated using metrics such as area under the receiver operating characteristic curve (AUC), accuracy, precision, specificity, sensitivity and F1 score.
RESULTS: The data set comprised a total of 1779 knees, of which 1645 were from site A, serving as the derivation set and internal validation cohort; the external validation cohort consisted of 134 knees from site B. The internal validation cohort demonstrated superior performance for the proposed model augmented with CL, achieving an AUC of 0.94 and an accuracy of 85.9%. External validation further confirmed the model's generalisation, with an AUC of 0.93 and an accuracy of 82.1%. Comparative analysis with other neural network models showed the proposed model's superiority.
CONCLUSIONS: The proposed DL pipeline, integrating YOLOv3, ResNet-18 and CL, provides accurate predictions of knee arthroplasty candidacy based on three-view X-rays. This prediction model could support automated decision-making about the appropriate type of arthroplasty procedure.
LEVEL OF EVIDENCE: Level III, diagnostic study.
PMID:38713857 | DOI:10.1002/ksa.12221
GELT: A graph embeddings based lite-transformer for knowledge tracing
PLoS One. 2024 May 7;19(5):e0301714. doi: 10.1371/journal.pone.0301714. eCollection 2024.
ABSTRACT
The development of intelligent education has led to the emergence of knowledge tracing as a fundamental task in the learning process. Traditionally, the knowledge state of each student has been determined by assessing their performance in previous learning activities. In recent years, deep learning approaches have shown promising results in capturing complex representations of human learning activities. However, the interpretability of these models is often compromised due to the end-to-end training strategy they employ. To address this challenge, we draw inspiration from advancements in graph neural networks and propose a novel model called GELT (Graph Embeddings based Lite-Transformer). The purpose of this model is to uncover and understand the relationships between skills and questions. Additionally, we introduce an energy-saving attention mechanism for predicting knowledge states that is both simple and effective. This approach maintains high prediction accuracy while significantly reducing computational costs compared to conventional attention mechanisms. Extensive experimental results demonstrate the superior performance of our proposed model compared to other state-of-the-art baselines on three publicly available real-world datasets for knowledge tracing.
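The abstract does not detail the energy-saving attention mechanism, but the cost it targets is visible in plain scaled dot-product attention, where the work per query grows with the number of keys; "lite" variants shrink that key set or replace the softmax with cheaper kernels. A minimal single-query sketch (names illustrative, not the GELT implementation):

```python
import math

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Cost per query is proportional to the number of keys, which is why
    full self-attention over a length-n sequence costs O(n^2); efficient
    variants reduce the effective key set.
    """
    d = len(query)
    # Similarity of the query to every key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    # Numerically stable softmax over the scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Weighted sum of the value vectors.
    return [sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))]
```

Attending to a small fixed set of summary keys instead of all positions keeps this same interface while cutting the per-query cost from O(n) to O(1) in the sequence length.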
PMID:38713679 | DOI:10.1371/journal.pone.0301714
Combining enhanced spectral resolution of EMG and a deep learning approach for knee pathology diagnosis
PLoS One. 2024 May 7;19(5):e0302707. doi: 10.1371/journal.pone.0302707. eCollection 2024.
ABSTRACT
Knee osteoarthritis (OA) is a prevalent, debilitating joint condition primarily affecting the elderly. This investigation aims to develop an electromyography (EMG)-based method for diagnosing knee pathologies. EMG signals of the muscles surrounding the knee joint were recorded and examined. The principal components of the proposed method were preprocessing, high-order spectral analysis (HOSA) and diagnosis/recognition through deep learning. EMG signals recorded from individuals with normal and OA knees while walking were extracted from a publicly available database. The examination focused on the quadriceps femoris, the medial gastrocnemius, the rectus femoris, the semitendinosus and the vastus medialis. Filtration and rectification were applied beforehand to remove noise and smooth the EMG signals. The signals' higher-order spectra were analyzed with HOSA to obtain information about nonlinear interactions and phase coupling: the bicoherence representation of each EMG signal was computed, and the resulting images were fed into a deep learning system for identification and analysis. An adapted ResNet101 CNN model examined the images to determine whether the EMG signals were normal or indicative of knee OA. The validated test results demonstrated high accuracy and robust metrics, indicating that the proposed method is effective. The medial gastrocnemius (MG) muscle distinguished knee OA (KOA) patients from normal controls with 96.3±1.7% accuracy and 0.994±0.008 AUC. MG yielded the highest prediction accuracy for KOA and can be used as the muscle of interest in future analyses. Despite the proposed method's strengths, some limitations still require special consideration and will be addressed in future research.
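Bicoherence, the HOSA representation used above, measures quadratic phase coupling: the bispectrum X(f1)·X(f2)·X*(f1+f2) is averaged over signal segments and normalized into [0, 1]. A minimal sketch using a naive DFT (segment lengths and frequency indices are illustrative; a real pipeline would use an FFT and a full 2-D frequency grid):

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform (fine for short segments)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n)
                for t in range(n)) for f in range(n)]

def bicoherence(segments, f1, f2):
    """Normalized bispectrum estimate at frequency-bin pair (f1, f2).

    segments: list of equal-length signal windows (e.g. EMG epochs).
    Values near 1 indicate strong quadratic phase coupling between the
    components at f1, f2 and f1 + f2; values near 0 indicate none.
    """
    num = 0j
    den1 = den2 = 0.0
    for seg in segments:
        X = dft(seg)
        # Triple product: its phase is phi(f1) + phi(f2) - phi(f1 + f2),
        # which stays constant across segments only under phase coupling.
        num += X[f1] * X[f2] * X[(f1 + f2) % len(seg)].conjugate()
        den1 += abs(X[f1] * X[f2]) ** 2
        den2 += abs(X[f1 + f2]) ** 2
    return abs(num) / ((den1 * den2) ** 0.5 + 1e-12)
```

Evaluating this over a grid of (f1, f2) pairs yields the bicoherence image that the CNN then classifies.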
PMID:38713653 | DOI:10.1371/journal.pone.0302707
GraphEGFR: Multi-task and transfer learning based on molecular graph attention mechanism and fingerprints improving inhibitor bioactivity prediction for EGFR family proteins on data scarcity
J Comput Chem. 2024 May 7. doi: 10.1002/jcc.27388. Online ahead of print.
ABSTRACT
The proteins within the human epidermal growth factor receptor (EGFR) family, members of the tyrosine kinase receptor family, play a pivotal role in the molecular mechanisms driving the development of various tumors. Tyrosine kinase inhibitors, key compounds in targeted therapy, encounter challenges in cancer treatment due to emerging drug resistance mutations. Consequently, machine learning has undergone significant evolution to address the challenges of cancer drug discovery related to EGFR family proteins. However, the application of deep learning in this area is hindered by inherent difficulties associated with small-scale data, particularly the risk of overfitting. Moreover, the design of a model architecture that facilitates learning through multi-task and transfer learning, coupled with appropriate molecular representation, poses substantial challenges. In this study, we introduce GraphEGFR, a deep learning regression model designed to enhance molecular representation and model architecture for predicting the bioactivity of inhibitors against both wild-type and mutant EGFR family proteins. GraphEGFR integrates a graph attention mechanism for molecular graphs with deep and convolutional neural networks for molecular fingerprints. We observed that GraphEGFR models employing multi-task and transfer learning strategies generally achieve predictive performance comparable to existing competitive methods. The integration of molecular graphs and fingerprints adeptly captures relationships between atoms and enables both global and local pattern recognition. We further validated potential multi-targeted inhibitors for wild-type and mutant HER1 kinases, exploring key amino acid residues through molecular dynamics simulations to understand molecular interactions. This predictive model offers a robust strategy that could significantly contribute to overcoming the challenges of developing deep learning models for drug discovery with limited data and exploring new frontiers in multi-targeted kinase drug discovery for EGFR family proteins.
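The fusion-plus-multi-task idea can be sketched schematically: a graph-derived embedding and a fingerprint vector are concatenated, passed through a shared hidden layer, and then through one head per protein target, so that related targets can share signal under data scarcity. The forward pass below is a toy illustration with hand-supplied weights, not the GraphEGFR architecture:

```python
def multitask_forward(graph_emb, fingerprint, shared_w, heads_w):
    """One forward pass of a fused multi-task regressor (illustrative).

    graph_emb:   molecular-graph embedding (list of floats)
    fingerprint: molecular fingerprint features (list of floats)
    shared_w:    weight matrix (rows) of the shared hidden layer
    heads_w:     one weight vector per protein target (multi-task heads)
    Returns one predicted bioactivity per target.
    """
    fused = graph_emb + fingerprint  # concatenate the two modalities
    # Shared representation with a ReLU nonlinearity.
    hidden = [max(0.0, sum(w * x for w, x in zip(row, fused)))
              for row in shared_w]
    # Each task head reads the same shared `hidden`, so related targets
    # (e.g. wild-type and mutant kinases) can transfer learned signal
    # to the data-scarce tasks.
    return [sum(w * h for w, h in zip(head, hidden)) for head in heads_w]
```

Transfer learning in this setting amounts to pretraining the shared layer on data-rich targets, then fine-tuning only the heads for the scarce ones.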
PMID:38713612 | DOI:10.1002/jcc.27388
MOCNN: A Multiscale Deep Convolutional Neural Network for ERP-Based Brain-Computer Interfaces
IEEE Trans Cybern. 2024 May 7;PP. doi: 10.1109/TCYB.2024.3390805. Online ahead of print.
ABSTRACT
Event-related potentials (ERPs) reflect neurophysiological changes of the brain in response to external events, and their associated underlying complex spatiotemporal feature information is governed by ongoing oscillatory activity within the brain. Deep learning methods have been increasingly adopted for ERP-based brain-computer interfaces (BCIs) due to their excellent feature representation abilities, which allow for deep analysis of oscillatory activity within the brain. Features with higher spatiotemporal frequencies usually represent detailed and localized information, while features with lower spatiotemporal frequencies usually represent global structures. Mining EEG features from multiple spatiotemporal frequencies is conducive to obtaining more discriminative information. A multiscale feature fusion octave convolution neural network (MOCNN) is proposed in this article. MOCNN divides the ERP signals into high-, medium- and low-frequency components corresponding to different resolutions and processes them in different branches. By adding mid- and low-frequency components, the feature information used by MOCNN can be enriched, and the required amount of calculation can be reduced. After successive feature mapping using temporal and spatial convolutions, MOCNN realizes interactive learning among different components through the exchange of feature information among branches. Classification is accomplished by feeding the fused deep spatiotemporal features from the various components into a fully connected layer. The results, obtained on two public datasets and a self-collected ERP dataset, show that MOCNN can achieve state-of-the-art ERP classification performance. In this study, the generalized concept of octave convolution is introduced into the field of ERP-BCI research, which allows effective spatiotemporal features to be extracted from multiscale networks through branch width optimization and information interaction at various scales.
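Octave convolution's core trick is to keep a low-frequency copy of the features at reduced resolution (which is where the computation savings come from) and to exchange information between the branches. A two-branch 1-D sketch of that exchange (MOCNN itself uses three branches plus temporal and spatial convolutions; the names and mixing rule here are illustrative only):

```python
def avg_pool2(x):
    """Halve temporal resolution by averaging adjacent pairs."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]

def upsample2(x):
    """Restore resolution by nearest-neighbour repetition."""
    return [v for v in x for _ in range(2)]

def octave_mix(high, low, alpha=0.5):
    """One information exchange between octave branches (1-D sketch).

    high: full-resolution component carrying fine, localized detail
    low:  half-resolution component carrying global structure; it is
          stored and processed at half size, reducing computation.
    """
    # High branch receives upsampled context from the low branch.
    new_high = [(1 - alpha) * h + alpha * u
                for h, u in zip(high, upsample2(low))]
    # Low branch receives pooled detail from the high branch.
    new_low = [(1 - alpha) * l + alpha * p
               for l, p in zip(low, avg_pool2(high))]
    return new_high, new_low
```

Stacking such exchanges between convolutions lets each branch specialize in its own frequency band while still seeing what the other branches have learned.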
PMID:38713574 | DOI:10.1109/TCYB.2024.3390805