Deep learning

Independently Trained Multi-Scale Registration Network Based on Image Pyramid

Tue, 2024-03-05 06:00

J Imaging Inform Med. 2024 Mar 5. doi: 10.1007/s10278-024-01019-8. Online ahead of print.

ABSTRACT

Image registration is a fundamental task in many applications of medical image analysis and plays a crucial role in auxiliary diagnosis, treatment, and surgical navigation. However, cardiac image registration is challenging due to the heart's large non-rigid deformations and complex anatomical structure. To address this challenge, this paper proposes an independently trained multi-scale registration network based on an image pyramid. By repeatedly down-sampling the original input images, we construct image pyramid pairs and design a multi-scale registration network that uses pyramid pairs of different resolutions as training sets. Each registration network is trained independently on image pairs of a single resolution, so that it extracts image features at that scale. During the testing stage, large-deformation registration is decomposed into a multi-scale registration process: the deformation fields of different resolutions are fused by a step-by-step deformation method, thereby avoiding the need to handle large deformations directly. Experiments were conducted on the open cardiac dataset ACDC (Automated Cardiac Diagnosis Challenge), where the proposed method achieved an average Dice score of 0.828. Comparative experiments demonstrate that the proposed method effectively addresses the challenge of cardiac image registration and achieves superior registration results for cardiac images.
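A toy sketch of the coarse-to-fine idea the abstract describes, reduced to a 1-D setting for clarity. This is not the authors' code: the down-sampling scheme, the field-composition rule, and all names are illustrative assumptions.

```python
# Hypothetical sketch: build an image pyramid by repeated 2x down-sampling,
# then fuse per-level deformation fields coarse-to-fine by upsampling and
# composing them step by step (1-D signals stand in for images).

def downsample(signal):
    """Halve resolution by averaging adjacent samples."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]

def build_pyramid(signal, levels):
    """Return [full-res, half-res, quarter-res, ...]."""
    pyramid = [signal]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

def upsample(field):
    """Double resolution by repetition, scaling displacements by 2."""
    out = []
    for v in field:
        out.extend([2 * v, 2 * v])
    return out

def compose_coarse_to_fine(fields):
    """fields[0] is the coarsest deformation field; add each finer
    field to the upsampled running composition."""
    total = fields[0]
    for finer in fields[1:]:
        total = [a + b for a, b in zip(upsample(total), finer)]
    return total

pyramid = build_pyramid([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], levels=3)
print([len(level) for level in pyramid])  # [8, 4, 2]
```

In the paper's setting each level's field would come from an independently trained registration network; here the fusion step is reduced to upsample-and-add, which is the simplest form of step-by-step composition.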

PMID:38441699 | DOI:10.1007/s10278-024-01019-8

Categories: Literature Watch

Modern acceleration in musculoskeletal MRI: applications, implications, and challenges

Tue, 2024-03-05 06:00

Skeletal Radiol. 2024 Mar 5. doi: 10.1007/s00256-024-04634-2. Online ahead of print.

ABSTRACT

Magnetic resonance imaging (MRI) is crucial for accurately diagnosing a wide spectrum of musculoskeletal conditions due to its superior soft tissue contrast resolution. However, the long acquisition times of traditional two-dimensional (2D) and three-dimensional (3D) fast and turbo spin-echo (TSE) pulse sequences can limit patient access and comfort. Recent technical advancements have introduced acceleration techniques that significantly reduce MRI times for musculoskeletal examinations. Key acceleration methods include parallel imaging (PI), simultaneous multi-slice acquisition (SMS), and compressed sensing (CS), enabling up to eightfold faster scans while maintaining image quality, resolution, and safety standards. These innovations now allow for 3- to 6-fold accelerated clinical musculoskeletal MRI exams, reducing scan times to 4 to 6 min for joints and spine imaging. Evolving deep learning-based image reconstruction promises even faster scans without compromising quality. Current research indicates that combining acceleration techniques, deep learning image reconstruction, and superresolution algorithms will eventually facilitate tenfold accelerated musculoskeletal MRI in routine clinical practice. Such rapid MRI protocols can drastically reduce scan times by 80-90% compared to conventional methods. Implementing these rapid imaging protocols does impact workflow, indirect costs, and workload for MRI technologists and radiologists, which requires careful management. However, the shift from conventional to accelerated, deep learning-based MRI enhances the value of musculoskeletal MRI by improving patient access and comfort and promoting sustainable imaging practices. This article offers a comprehensive overview of the technical aspects, benefits, and challenges of modern accelerated musculoskeletal MRI, guiding radiologists and researchers in this evolving field.

PMID:38441617 | DOI:10.1007/s00256-024-04634-2

Categories: Literature Watch

AbDPP: Target-oriented antibody design with pretraining and prior biological structure knowledge

Tue, 2024-03-05 06:00

Proteins. 2024 Mar 5. doi: 10.1002/prot.26676. Online ahead of print.

ABSTRACT

Antibodies represent a crucial class of complex protein therapeutics and are essential in the treatment of a wide range of human diseases. Traditional antibody discovery methods, such as hybridoma and phage display technologies, suffer from limitations including inefficiency and a restricted exploration of the immense space of potential antibodies. To overcome these limitations, we propose a novel method for generating antibody sequences using deep learning algorithms called AbDPP (target-oriented antibody design with pretraining and prior biological knowledge). AbDPP integrates a pretrained model for antibodies with biological region information, enabling the effective use of vast antibody sequence data and intricate biological system understanding to generate sequences. To target specific antigens, AbDPP incorporates an antibody property evaluation model, which is further optimized based on evaluation results to generate more focused sequences. The efficacy of AbDPP was assessed through multiple experiments, evaluating its ability to generate amino acids, improve neutralization and binding, maintain sequence consistency, and improve sequence diversity. Results demonstrated that AbDPP outperformed other methods in terms of the performance and quality of generated sequences, showcasing its potential to enhance antibody design and screening efficiency. In summary, this study contributes to the field by offering an innovative deep learning-based method for antibody generation, addressing some limitations of traditional approaches, and underscoring the importance of integrating a specific antibody pretrained model and the biological properties of antibodies in generating novel sequences. The code and documentation underlying this article are freely available at https://github.com/zlfyj/AbDPP.

PMID:38441337 | DOI:10.1002/prot.26676

Categories: Literature Watch

Studying Psychosis Using Natural Language Generation: A Review of Emerging Opportunities

Tue, 2024-03-05 06:00

Biol Psychiatry Cogn Neurosci Neuroimaging. 2023 Oct;8(10):994-1004. doi: 10.1016/j.bpsc.2023.04.009. Epub 2023 Apr 30.

ABSTRACT

Disrupted language in psychotic disorders, such as schizophrenia, can manifest as false contents and formal deviations, often described as thought disorder. These features play a critical role in the social dysfunction associated with psychosis, but we continue to lack insights regarding how and why these symptoms develop. Natural language generation (NLG) is a field of computer science that focuses on generating human-like language for various applications. The theory that psychosis is related to the evolution of language in humans suggests that NLG systems that are sufficiently evolved to generate human-like language may also exhibit psychosis-like features. In this conceptual review, we propose using NLG systems that are at various stages of development as in silico tools to study linguistic features of psychosis. We argue that a program of in silico experimental research on the network architecture, function, learning rules, and training of NLG systems can help us understand better why thought disorder occurs in patients. This will allow us to gain a better understanding of the relationship between language and psychosis and potentially pave the way for new therapeutic approaches to address this vexing challenge.

PMID:38441079 | DOI:10.1016/j.bpsc.2023.04.009

Categories: Literature Watch

Post-translational modifications of proteins in cardiovascular diseases examined by proteomic approaches

Tue, 2024-03-05 06:00

FEBS J. 2024 Mar 5. doi: 10.1111/febs.17108. Online ahead of print.

ABSTRACT

Over 400 different types of post-translational modifications (PTMs) have been reported, and over 200 of them have been discovered using mass spectrometry (MS)-based proteomics. MS-based proteomics has proven to be a powerful method capable of global PTM mapping, with identification of modified proteins/peptides, localization of PTM sites, and PTM quantitation. PTMs play regulatory roles in protein functions, activities, and interactions in various heart-related diseases, such as ischemia/reperfusion injury, cardiomyopathy, and heart failure. The recognition of PTMs that are specific to cardiovascular pathology and the clarification of the mechanisms underlying these PTMs at the molecular level are crucial for the discovery of novel biomarkers and their application in clinical settings. With sensitive MS instrumentation and novel biostatistical methods for precise processing of the data, low-abundance PTMs can be successfully detected and the beneficial or unfavorable effects of specific PTMs on cardiac function can be determined. Moreover, computational proteomic strategies that can predict PTM sites from MS data have gained increasing interest and can contribute to the characterization of PTM profiles in cardiovascular disorders. More recently, machine learning- and deep learning-based methods have been employed to predict the locations of PTMs and explore PTM crosstalk. In this review article, the types of PTMs are briefly overviewed, approaches for PTM identification/quantitation in MS-based proteomics are discussed, and recently published proteomic studies on PTMs associated with cardiovascular diseases are reviewed.

PMID:38440918 | DOI:10.1111/febs.17108

Categories: Literature Watch

Using a deep learning prior for accelerating hyperpolarized 13C MRSI on synthetic cancer datasets

Tue, 2024-03-05 06:00

Magn Reson Med. 2024 Mar 5. doi: 10.1002/mrm.30053. Online ahead of print.

ABSTRACT

PURPOSE: We aimed to incorporate a deep learning prior with k-space data fidelity for accelerating hyperpolarized carbon-13 MRSI, demonstrated on synthetic cancer datasets.

METHODS: A two-site exchange model, derived from the Bloch equations of MR signal evolution, was first used to simulate training and testing data, that is, synthetic phantom datasets. Five singular maps generated from each simulated dataset were used to train a deep learning prior, which was then employed together with the fidelity term to reconstruct the undersampled MRI k-space data. The proposed method was assessed on synthetic human brain tumor images (N = 33), prostate cancer images (N = 72), and mouse tumor images (N = 58) for three undersampling factors and 2.5% additive Gaussian noise. Furthermore, Gaussian noise with SDs of 2.5%, 5%, and 10% was added to the synthetic prostate cancer data, and the corresponding reconstruction results were evaluated.
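A minimal numerical sketch of a two-site exchange model of the kind the methods describe (e.g. pyruvate-to-lactate conversion), integrated with forward Euler. The rate constants, pool names, and initial magnetization are illustrative assumptions, not the paper's values.

```python
# Toy two-site exchange: pool A (e.g. pyruvate) converts to pool B
# (e.g. lactate) at rate k_pl while both relax at their own R1 rates.
def simulate_two_site(m_pyr0=1.0, k_pl=0.05, r1_pyr=1 / 30, r1_lac=1 / 25,
                      dt=0.1, n_steps=600):
    """Return time courses of the two exchanging magnetization pools."""
    m_pyr, m_lac = m_pyr0, 0.0
    pyr, lac = [m_pyr], [m_lac]
    for _ in range(n_steps):
        d_pyr = (-k_pl - r1_pyr) * m_pyr          # loss: conversion + relaxation
        d_lac = k_pl * m_pyr - r1_lac * m_lac      # gain from pyruvate, own decay
        m_pyr += d_pyr * dt
        m_lac += d_lac * dt
        pyr.append(m_pyr)
        lac.append(m_lac)
    return pyr, lac

pyr, lac = simulate_two_site()
# Pyruvate decays monotonically; lactate rises to a peak and then falls.
```

Simulated signal evolutions like these are what the synthetic phantom datasets (and hence the singular maps used to train the prior) would be built from.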

RESULTS: For quantitative evaluation, peak SNRs were approximately 32 dB, an improvement of 5 to 8 dB over compressed sensing with L1-norm or total variation regularization, and reasonable normalized RMS errors were obtained. Our method was also robust against noise, even on data with a noise SD of 10%.

CONCLUSION: The proposed singular value decomposition + iterative deep learning model can be considered a general framework that extends deep learning MRI reconstruction to metabolic imaging. With our method, tumor morphology and metabolic images can be measured robustly at sixfold acceleration.

PMID:38440832 | DOI:10.1002/mrm.30053

Categories: Literature Watch

Automated and reusable deep learning (AutoRDL) framework for predicting response to neoadjuvant chemotherapy and axillary lymph node metastasis in breast cancer using ultrasound images: a retrospective, multicentre study

Tue, 2024-03-05 06:00

EClinicalMedicine. 2024 Feb 27;69:102499. doi: 10.1016/j.eclinm.2024.102499. eCollection 2024 Mar.

ABSTRACT

BACKGROUND: Previous deep learning models have been proposed to predict the pathological complete response (pCR) and axillary lymph node metastasis (ALNM) in breast cancer. Yet, the models often leveraged multiple frameworks, required manual annotation, and discarded low-quality images. We aimed to develop an automated and reusable deep learning (AutoRDL) framework for tumor detection and prediction of pCR and ALNM using ultrasound images with diverse qualities.

METHODS: The AutoRDL framework includes a You Only Look Once version 5 (YOLOv5) network for tumor detection and a progressive multi-granularity (PMG) network for pCR and ALNM prediction. The training cohort and the internal validation cohort were recruited from Guangdong Provincial People's Hospital (GPPH) between November 2012 and May 2021. The two external validation cohorts were recruited from the First Affiliated Hospital of Kunming Medical University (KMUH), between January 2016 and December 2019, and Shunde Hospital of Southern Medical University (SHSMU) between January 2014 and July 2015. Prior to model training, super-resolution via iterative refinement (SR3) was employed to improve the spatial resolution of low-quality images from the KMUH. We developed three models for predicting pCR and ALNM: a clinical model using multivariable logistic regression analysis, an image model utilizing the PMG network, and a combined model that integrates both clinical and image data using the PMG network.

FINDINGS: The YOLOv5 network demonstrated excellent accuracy in tumor detection, achieving average precisions of 0.880-0.921 during validation. In terms of pCR prediction, the combined model (post-SR3) outperformed the combined model (pre-SR3), image model (post-SR3), image model (pre-SR3), and clinical model (AUC: 0.833 vs 0.822 vs 0.806 vs 0.790 vs 0.712, all p < 0.05) in external validation cohort 1 (KMUH). Consistently, the combined model (post-SR3) exhibited the highest accuracy in ALNM prediction, surpassing the combined model (pre-SR3), image model (post-SR3), image model (pre-SR3), and clinical model (AUC: 0.825 vs 0.806 vs 0.802 vs 0.787 vs 0.703, all p < 0.05) in external validation cohort 1 (KMUH). In external validation cohort 2 (SHSMU), the combined model also showed superiority over the clinical and image models (AUC: 0.819 vs 0.712 vs 0.806, both p < 0.05).

INTERPRETATION: Our proposed AutoRDL framework is feasible in automatically predicting pCR and ALNM in real-world settings, which has the potential to assist clinicians in optimizing individualized treatment options for patients.

FUNDING: National Key Research and Development Program of China (2023YFF1204600); National Natural Science Foundation of China (82227802, 82302306); Clinical Frontier Technology Program of the First Affiliated Hospital of Jinan University, China (JNU1AF-CFTP-2022-a01201); Science and Technology Projects in Guangzhou (202201020022, 2023A03J1036, 2023A03J1038); Science and Technology Youth Talent Nurturing Program of Jinan University (21623209); and Postdoctoral Science Foundation of China (2022M721349).

PMID:38440400 | PMC:PMC10909626 | DOI:10.1016/j.eclinm.2024.102499

Categories: Literature Watch

A robust approach for multi-type classification of brain tumor using deep feature fusion

Tue, 2024-03-05 06:00

Front Neurosci. 2024 Feb 19;18:1288274. doi: 10.3389/fnins.2024.1288274. eCollection 2024.

ABSTRACT

Brain tumors can be classified into many different types based on their shape, texture, and location. Accurate diagnosis of brain tumor type helps doctors develop appropriate treatment plans and save patients' lives, so improving the accuracy of such classification systems is crucial. We propose a deep feature fusion method based on convolutional neural networks to enhance the accuracy and robustness of brain tumor classification while mitigating the risk of over-fitting. First, the extracted features of three pre-trained models, ResNet101, DenseNet121, and EfficientNetB0, are adjusted so that the feature shapes of the three models match. Second, the three models are fine-tuned to extract features from brain tumor images. Third, pairwise summation of the extracted features is carried out to achieve feature fusion. Finally, brain tumors are classified based on the fused features. The public Figshare (Dataset 1) and Kaggle (Dataset 2) datasets are used to verify the reliability of the proposed method. Experimental results demonstrate that fusing the ResNet101 and DenseNet121 features achieves the best performance, with classification accuracies of 99.18% and 97.24% on the Figshare and Kaggle datasets, respectively.
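An illustrative sketch (not the authors' code) of the two steps the abstract names: adjusting features from different backbones to a common shape, then fusing them by pairwise summation. The toy "projection" by padding/truncation stands in for whatever adjustment layer the paper actually uses, and is purely an assumption.

```python
# Toy feature fusion: bring two feature vectors to the same length,
# then fuse by element-wise summation.
def project(features, target_dim):
    """Stand-in 'adjustment' layer: truncate or zero-pad to target_dim."""
    adjusted = list(features)[:target_dim]
    adjusted += [0.0] * (target_dim - len(adjusted))
    return adjusted

def fuse(feat_a, feat_b, target_dim):
    """Pairwise summation of two shape-matched feature vectors."""
    a = project(feat_a, target_dim)
    b = project(feat_b, target_dim)
    return [x + y for x, y in zip(a, b)]

# e.g. a ResNet-style feature and a DenseNet-style feature of unequal length
fused = fuse([2, 5, 1], [4, 3, 6, 9], target_dim=4)
print(fused)  # [6, 8, 7, 9]
```

The fused vector would then feed a standard classification head; summation (rather than concatenation) keeps the fused dimensionality equal to each branch's.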

PMID:38440396 | PMC:PMC10909817 | DOI:10.3389/fnins.2024.1288274

Categories: Literature Watch

Deep-learning for automated detection of MSU deposits on DECT: evaluating impact on efficiency and reader confidence

Tue, 2024-03-05 06:00

Front Radiol. 2024 Feb 19;4:1330399. doi: 10.3389/fradi.2024.1330399. eCollection 2024.

ABSTRACT

INTRODUCTION: Dual-energy CT (DECT) is a non-invasive way to determine the presence of monosodium urate (MSU) crystals in the workup of gout. Color-coding distinguishes MSU from calcium following material decomposition and post-processing. Manually identifying these foci (most commonly labeled green) is tedious, and an automated detection system could streamline the process. This study aims to evaluate the impact of a deep-learning (DL) algorithm developed for detecting green pixelations on DECT on reader time, accuracy, and confidence.

METHODS: We collected a sample of positive and negative DECTs, each reviewed twice (once with and once without the DL tool) with a 2-week washout period. An attending musculoskeletal radiologist and a fellow separately reviewed the cases, simulating clinical workflow. Metrics such as reading time, confidence in diagnosis, and the tool's helpfulness were recorded and statistically analyzed.

RESULTS: We included thirty DECTs from different patients. The DL tool significantly reduced the reading time for the trainee radiologist (p = 0.02), but not for the attending radiologist (p = 0.15). Diagnostic confidence remained unchanged for both (p = 0.45). However, the DL model identified tiny MSU deposits that led to a change in diagnosis in two cases for the trainee radiologist and one case for the attending radiologist. In all three of these cases, the diagnosis was correct when using DL.

CONCLUSIONS: The implementation of the developed DL model slightly reduced reading time for our less experienced reader and led to improved diagnostic accuracy. There was no statistically significant difference in diagnostic confidence when studies were interpreted without and with the DL model.

PMID:38440382 | PMC:PMC10909828 | DOI:10.3389/fradi.2024.1330399

Categories: Literature Watch

Learning Representations from Heart Sound: A Comparative Study on Shallow and Deep Models

Tue, 2024-03-05 06:00

Cyborg Bionic Syst. 2024 Mar 4;5:0075. doi: 10.34133/cbsystems.0075. eCollection 2024.

ABSTRACT

Leveraging the power of artificial intelligence to automate the analysis and monitoring of heart sounds has attracted tremendous effort in the past decade. Nevertheless, the lack of a standard open-access database made it difficult to sustain comparable research before the first release of the PhysioNet CinC Challenge dataset, and inconsistent standards for data collection, annotation, and partitioning still hinder fair and efficient comparison between different works. To this end, we previously introduced and benchmarked a first version of the Heart Sounds Shenzhen (HSS) corpus. Motivated by the previous works based on HSS, we redefine the tasks and make a comprehensive investigation of shallow and deep models in this study. First, we segment each heart sound recording into shorter recordings (10 s), which better resembles the human auscultation case. Second, we redefine the classification tasks: besides the three-class categorization (normal, moderate, and mild/severe) adopted in HSS, we add a binary classification task, i.e., normal versus abnormal. We provide detailed benchmarks based on both classic machine learning and state-of-the-art deep learning technologies, reproducible with open-source toolkits. Last but not least, we analyze the feature contributions to the best benchmark performance to make the results more convincing and interpretable.
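A hypothetical sketch of the windowing step the abstract describes: splitting a long recording into fixed 10 s segments. The sample rate and the choice to drop the trailing partial window are assumptions for illustration.

```python
# Split a 1-D audio signal into non-overlapping fixed-length windows;
# any trailing partial window is discarded.
def segment(signal, sample_rate=1000, window_s=10):
    step = sample_rate * window_s
    return [signal[i:i + step] for i in range(0, len(signal) - step + 1, step)]

# A 25 s recording at 1 kHz yields two full 10 s segments.
chunks = segment(list(range(25_000)), sample_rate=1000, window_s=10)
print(len(chunks))  # 2
```

Each segment would then be labeled with its recording-level class and fed to the shallow or deep classifiers being benchmarked.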

PMID:38440319 | PMC:PMC10911857 | DOI:10.34133/cbsystems.0075

Categories: Literature Watch

Football referee gesture recognition algorithm based on YOLOv8s

Tue, 2024-03-05 06:00

Front Comput Neurosci. 2024 Feb 19;18:1341234. doi: 10.3389/fncom.2024.1341234. eCollection 2024.

ABSTRACT

Gesture serves as a crucial means of communication between individuals and between humans and machines. In football matches, referees communicate judgment information through gestures. Due to the diversity and complexity of referees' gestures and interference factors such as players, spectators, and camera angles, automated football referee gesture recognition (FRGR) has become a challenging task, and existing methods based on visual sensors often cannot provide satisfactory performance. To tackle FRGR, we develop a deep learning model based on YOLOv8s that integrates three improvement strategies. First, a Global Attention Mechanism (GAM) is employed to direct the model's attention to the hand gestures and minimize background interference. Second, a P2 detection head structure is integrated into the YOLOv8s model to enhance the accuracy of detecting smaller objects at a distance. Third, a new loss function based on the Minimum Point Distance Intersection over Union (MPDIoU) is used to effectively handle anchor boxes with the same aspect ratio but different sizes. Finally, experiments are executed on a dataset of six hand gestures comprising 1,200 images. The proposed method was compared with seven existing models and ten optimization variants, achieving a precision of 89.3%, a recall of 88.9%, a mAP@0.5 of 89.9%, and a mAP@0.5:0.95 of 77.3%. These rates are approximately 1.4%, 2.0%, 1.1%, and 5.4% better than those of the unmodified YOLOv8s, respectively. The proposed method shows strong promise for automated gesture recognition in football matches.
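A hedged sketch of the MPDIoU idea referenced above: standard IoU penalized by the squared distances between the top-left and the bottom-right corners of the predicted and ground-truth boxes, normalized by the squared image diagonal. The box format and variable names are assumptions, not the paper's code.

```python
# MPDIoU for axis-aligned boxes given as (x1, y1, x2, y2), x1 < x2, y1 < y2.
def mpd_iou(pred, target, img_w, img_h):
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    # Intersection and union areas for the standard IoU term
    ix = max(0.0, min(px2, tx2) - max(px1, tx1))
    iy = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = ix * iy
    union = ((px2 - px1) * (py2 - py1)
             + (tx2 - tx1) * (ty2 - ty1) - inter)
    iou = inter / union if union > 0 else 0.0
    # Corner-distance penalties, normalized by the squared image diagonal
    diag_sq = img_w ** 2 + img_h ** 2
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners
    return iou - d1_sq / diag_sq - d2_sq / diag_sq

# Identical boxes give MPDIoU = 1; the training loss would be 1 - MPDIoU.
print(mpd_iou((10, 10, 50, 50), (10, 10, 50, 50), img_w=640, img_h=640))  # 1.0
```

Unlike plain IoU, the corner terms keep the gradient informative when boxes share an aspect ratio but differ in size or position.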

PMID:38440133 | PMC:PMC10910025 | DOI:10.3389/fncom.2024.1341234

Categories: Literature Watch

Deep Learning-Based Psoriasis Assessment: Harnessing Clinical Trial Imaging for Accurate Psoriasis Area Severity Index Prediction

Tue, 2024-03-05 06:00

Digit Biomark. 2024 Mar 4;8(1):13-21. doi: 10.1159/000536499. eCollection 2024 Jan-Dec.

ABSTRACT

INTRODUCTION: Image-based machine learning holds great promise for facilitating clinical care; however, the datasets often used for model training differ from the interventional clinical trial-based findings frequently used to inform treatment guidelines. Here, we draw on longitudinal imaging of psoriasis patients undergoing treatment in the Ultima 2 clinical trial (NCT02684357), including 2,700 body images with psoriasis area severity index (PASI) annotations by uniformly trained dermatologists.

METHODS: An image-processing workflow integrating clinical photos of multiple body regions into one model pipeline was developed, which we refer to as the "One-Step PASI" framework due to its simultaneous body detection, lesion detection, and lesion severity classification. Group-stratified cross-validation was performed with 145 deep convolutional neural network models combined in an ensemble learning architecture.

RESULTS: The highest-performing model demonstrated a mean absolute error of 3.3, Lin's concordance correlation coefficient of 0.86, and Pearson correlation coefficient of 0.90 across a wide range of PASI scores comprising disease classifications of clear skin, mild, and moderate-to-severe disease. Within-person, time-series analysis of model performance demonstrated that PASI predictions closely tracked the trajectory of physician scores from severe to clear skin without systematically over- or underestimating PASI scores or percent changes from baseline.

CONCLUSION: This study demonstrates the potential of image processing and deep learning to translate otherwise inaccessible clinical trial data into accurate, extensible machine learning models to assess therapeutic efficacy.

PMID:38440046 | PMC:PMC10911790 | DOI:10.1159/000536499

Categories: Literature Watch

Deep-learning architecture for PM2.5 concentration prediction: A review

Tue, 2024-03-05 06:00

Environ Sci Ecotechnol. 2024 Feb 17;21:100400. doi: 10.1016/j.ese.2024.100400. eCollection 2024 Sep.

ABSTRACT

Accurately predicting the concentration of fine particulate matter (PM2.5) is crucial for evaluating air pollution levels and public exposure. Recent years have seen a significant rise in the use of deep learning (DL) models for forecasting PM2.5 concentrations; nonetheless, there is no unified, standardized framework for assessing the performance of DL-based PM2.5 prediction models. Here we extensively review DL-based and hybrid models for forecasting PM2.5 levels according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. We examine the similarities and differences among various DL models in predicting PM2.5 by comparing their complexity and effectiveness, and categorize PM2.5 DL methodologies into seven types based on performance and application conditions: four types of DL-based models and three types of hybrid learning models. Our research indicates that established deep learning architectures are commonly used and respected for their efficiency, but many of these models fall short in terms of innovation and interpretability. Conversely, models hybridized with traditional approaches, such as deterministic and statistical models, exhibit high interpretability but compromise on accuracy and speed. Hybrid DL models, representing the pinnacle of innovation among the studied models, still encounter issues with interpretability. We introduce a novel three-dimensional evaluation framework, the Dataset-Method-Experiment Standard (DMES), to unify and standardize the evaluation of PM2.5 predictions using DL models. This review provides a framework for future evaluations of DL-based models, which could inspire researchers to standardize DL model usage in PM2.5 prediction and improve the quality of related studies.

PMID:38439920 | PMC:PMC10910069 | DOI:10.1016/j.ese.2024.100400

Categories: Literature Watch

Artificial intelligence algorithms for predicting post-operative ileus after laparoscopic surgery

Tue, 2024-03-05 06:00

Heliyon. 2024 Feb 22;10(5):e26580. doi: 10.1016/j.heliyon.2024.e26580. eCollection 2024 Mar 15.

ABSTRACT

OBJECTIVE: By constructing a predictive model using machine learning and deep learning technologies, we aim to understand the risk factors for postoperative intestinal obstruction in laparoscopic colorectal cancer patients, and establish an effective artificial intelligence-based predictive model to guide individualized prevention and treatment, thus improving patient outcomes.

METHODS: We constructed artificial intelligence models in Python. Subjects were randomly assigned to either a training set, for variable identification and model construction, or a test set, for testing model performance, at a ratio of 7:3. Models were trained with ten algorithms, and performance was evaluated using the AUC of the ROC curve as well as accuracy, precision, recall, and F1 score.
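An illustrative helper (not the study's code) computing the evaluation metrics named above from binary predictions, so the definitions behind the reported numbers are explicit.

```python
# Accuracy, precision, recall, and F1 from paired binary labels/predictions.
def binary_metrics(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
print(acc)  # 0.6
```

The low recall values reported in the results below suggest a strong class imbalance (few POI cases), which is exactly the situation where accuracy alone is misleading and AUC and F1 become informative.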

RESULTS: Feature engineering combined with the GBDT algorithm showed that opioid use, anesthesia duration, and body weight were the top three factors in the development of POI. We validated the model with ten machine learning and deep learning algorithms, with the following results: the three algorithms with the best accuracy were XGB (0.807), Decision Tree (0.807), and Neural Decision Tree (0.807); the two with the best precision were XGB (0.500) and Decision Tree (0.500); the two with the best recall were AdaBoost (0.243) and Decision Tree (0.135); the two with the highest F1 score were AdaBoost (0.290) and Decision Tree (0.213); and the three with the best AUC were Gradient Boosting (0.678), XGB (0.638), and LinearSVC (0.633).

CONCLUSION: This study shows that XGB and Decision Tree are the two best algorithms for predicting the risk of developing ileus after laparoscopic colon cancer surgery. It provides new insight and approaches to the field of postoperative intestinal obstruction in colorectal cancer through the application of machine learning techniques, thereby improving our understanding of the disease and offering strong support for clinical decision-making.

PMID:38439857 | PMC:PMC10909660 | DOI:10.1016/j.heliyon.2024.e26580

Categories: Literature Watch

Application of visual transformer in renal image analysis

Mon, 2024-03-04 06:00

Biomed Eng Online. 2024 Mar 5;23(1):27. doi: 10.1186/s12938-024-01209-z.

ABSTRACT

The Deep Self-Attention Network (Transformer) is an encoder-decoder architecture that excels at establishing long-distance dependencies and was first applied in natural language processing. Because it complements the inductive bias of convolutional neural networks (CNNs), the Transformer has gradually been applied to medical image processing, including kidney image processing, and has become a hot research topic in recent years. To explore new ideas and directions in the field of renal image processing, this paper outlines the characteristics of the Transformer network model; summarizes the application of Transformer-based models in renal image segmentation, classification, detection, electronic medical records, and decision-making systems; and compares them with CNN-based renal image processing algorithms, analyzing the advantages and disadvantages of this technique in renal image processing. In addition, this paper gives an outlook on the development trend of Transformers in renal image processing, providing a valuable reference for future renal image analysis.

PMID:38439100 | DOI:10.1186/s12938-024-01209-z

Categories: Literature Watch

Self-supervised pre-training for joint optic disc and cup segmentation via attention-aware network

Mon, 2024-03-04 06:00

BMC Ophthalmol. 2024 Mar 4;24(1):98. doi: 10.1186/s12886-024-03376-y.

ABSTRACT

Image segmentation is a fundamental task in deep learning that analyses the essential content of images for further processing. However, for supervised segmentation methods, collecting pixel-level labels is very time-consuming and labour-intensive. For optic disc and cup segmentation in medical image processing, we consider two challenging problems that remain unsolved. One is how to design an efficient network that captures the global context of the medical image and executes quickly in real applications. The other is how to train a deep segmentation network with few training data, owing to medical privacy constraints. In this paper, to address these issues, we first design a novel attention-aware segmentation model equipped with a multi-scale attention module in a pyramid-structured encoder-decoder network, which can efficiently learn the global semantics and long-range dependencies of the input images. We also inject the prior knowledge that the optic cup lies inside the optic disc through a novel loss function. We then propose a self-supervised contrastive learning method for optic disc and cup segmentation: an unsupervised feature representation is learned by matching an encoded query to a dictionary of encoded keys using a contrastive technique, and fine-tuning the pre-trained model with the proposed loss function achieves good performance on the task. Extensive evaluations on challenging public optic disc and cup benchmarks, including the DRISHTI-GS and REFUGE datasets, demonstrate the superiority of the proposed method, which achieves new state-of-the-art F1 scores of 0.9801 and 0.9087, respectively, with Dice coefficients of 0.9657 for the disc and 0.8976 for the cup. The code will be made publicly available.
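A minimal sketch of the contrastive matching step the abstract describes: an encoded query is scored against a dictionary of encoded keys, and an InfoNCE-style loss rewards matching the positive key. The vectors, the temperature, and putting the positive key at index 0 are all illustrative assumptions.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def info_nce(query, keys, temperature=0.07):
    """keys[0] is the positive key; the rest are negatives."""
    logits = [dot(query, k) / temperature for k in keys]
    max_logit = max(logits)                       # for numerical stability
    exps = [math.exp(l - max_logit) for l in logits]
    return -math.log(exps[0] / sum(exps))

q = [1.0, 0.0]
keys = [[0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]      # positive key first
loss_good = info_nce(q, keys)                      # query matches the positive
loss_bad = info_nce(q, [keys[1], keys[0], keys[2]])  # positive is a poor match
print(loss_good < loss_bad)  # True
```

In the pre-training scheme described above, queries and keys would be encoder outputs for augmented views of the same (positive) or different (negative) images, and the encoder weights are what this loss trains before fine-tuning on segmentation.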

PMID:38438876 | DOI:10.1186/s12886-024-03376-y

Categories: Literature Watch

Transfer learning-based PET/CT three-dimensional convolutional neural network fusion of image and clinical information for prediction of EGFR mutation in lung adenocarcinoma

Mon, 2024-03-04 06:00

BMC Med Imaging. 2024 Mar 4;24(1):54. doi: 10.1186/s12880-024-01232-5.

ABSTRACT

BACKGROUND: To introduce a three-dimensional convolutional neural network (3D CNN) leveraging transfer learning for fusing PET/CT images and clinical data to predict EGFR mutation status in lung adenocarcinoma (LADC).

METHODS: Retrospective data from 516 LADC patients, encompassing preoperative PET/CT images, clinical information, and EGFR mutation status, were divided into training (n = 404) and test (n = 112) sets. Several deep learning models were developed using transfer learning, including CT-only and PET-only models. A dual-stream model fusing PET and CT, and a three-stream transfer learning model (TS_TL) additionally integrating clinical data, were also developed. Image preprocessing included semi-automatic segmentation, resampling, and image cropping. Considering the impact of class imbalance, model performance was evaluated using ROC curves and AUC values.
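The three-stream idea can be sketched at the level of late feature fusion: per-stream feature vectors (CT backbone, PET backbone, clinical covariates) are concatenated and fed to a prediction head. The actual TS_TL architecture is not reproduced here; the features below are random stand-ins and every name is hypothetical:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def three_stream_predict(f_ct, f_pet, f_clin, w, b):
    """Late fusion sketch: concatenate per-stream feature vectors
    and apply a logistic head to get a mutation probability."""
    fused = np.concatenate([f_ct, f_pet, f_clin], axis=1)
    return sigmoid(fused @ w + b)

# toy batch of 4 patients: 8-d CT features, 8-d PET features,
# 3 clinical covariates (e.g. age, sex, smoking status)
rng = np.random.default_rng(42)
f_ct = rng.standard_normal((4, 8))
f_pet = rng.standard_normal((4, 8))
f_clin = rng.standard_normal((4, 3))
w, b = rng.standard_normal(19), 0.0
probs = three_stream_predict(f_ct, f_pet, f_clin, w, b)
```

In the paper's setting the CT and PET features would come from transfer-learned 3D CNN backbones rather than random projections.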

RESULTS: The TS_TL model demonstrated promising performance in predicting EGFR mutation status, with an AUC of 0.883 (95% CI = 0.849-0.917) in the training set and 0.730 (95% CI = 0.629-0.830) in the independent test set. Particularly in advanced LADC, the model achieved an AUC of 0.871 (95% CI = 0.823-0.919) in the training set and 0.760 (95% CI = 0.638-0.881) in the test set. The model identified distinct activation areas in solid or subsolid lesions associated with wild-type and mutant tumours. Additionally, the patterns captured by the model were significantly altered by effective tyrosine kinase inhibitor treatment, leading to notable changes in predicted mutation probabilities.

CONCLUSION: PET/CT deep learning model can act as a tool for predicting EGFR mutation in LADC. Additionally, it offers clinicians insights for treatment decisions through evaluations both before and after treatment.

PMID:38438844 | DOI:10.1186/s12880-024-01232-5

Categories: Literature Watch

Shallow and deep learning classifiers in medical image analysis

Mon, 2024-03-04 06:00

Eur Radiol Exp. 2024 Mar 5;8(1):26. doi: 10.1186/s41747-024-00428-2.

ABSTRACT

An increasingly strong connection between artificial intelligence and medicine has enabled the development of predictive models capable of supporting physicians' decision-making. Artificial intelligence encompasses much more than machine learning, which is nevertheless its most cited and used sub-branch of the last decade. Since most clinical problems can be modeled through machine learning classifiers, it is essential to discuss their main elements. This review aims to give primary educational insights on the most accessible and widely employed classifiers in the radiology field, distinguishing between "shallow" learning (i.e., traditional machine learning) algorithms, including support vector machines, random forests, and XGBoost, and "deep" learning architectures, including convolutional neural networks and vision transformers. In addition, the paper outlines the key steps for classifier training and highlights the differences between the most common algorithms and architectures. Although the choice of an algorithm depends on the task and the dataset at hand, general guidelines for classifier selection are proposed in relation to task analysis, dataset size, explainability requirements, and available computing resources. Considering the enormous interest in these innovative models and architectures, the problem of the interpretability of machine learning algorithms is finally discussed, providing a future perspective on trustworthy artificial intelligence.

Relevance statement: The growing synergy between artificial intelligence and medicine fosters predictive models aiding physicians. Machine learning classifiers, from shallow to deep learning, are offering crucial insights for the development of clinical decision support systems in healthcare. Explainability is a key feature of models that leads systems toward integration into clinical practice.

Key points:
• Training a shallow classifier requires extracting disease-related features from regions of interest (e.g., radiomics).
• Deep classifiers implement automatic feature extraction and classification.
• Classifier selection is based on the availability of data and computational resources, the task, and explanation needs.
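The "shallow" route the review describes — hand-extracted features fed to a classical classifier — can be sketched with scikit-learn. The synthetic features below merely stand in for radiomics features extracted from regions of interest; no real dataset is implied:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# synthetic stand-in for a radiomics feature matrix: 300 patients,
# 20 features per region of interest, binary outcome
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=8, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# random forest: one of the shallow classifiers named in the review
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_tr, y_tr)

# evaluate on held-out data with AUC, a standard metric for
# imbalanced clinical classification problems
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
```

A deep classifier would replace the hand-crafted `X` with features learned directly from the images, which is the key distinction the review draws.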

PMID:38438821 | DOI:10.1186/s41747-024-00428-2

Categories: Literature Watch

Optimizing Coronary Computed Tomography Angiography Using a Novel Deep Learning-Based Algorithm

Mon, 2024-03-04 06:00

J Imaging Inform Med. 2024 Mar 4. doi: 10.1007/s10278-024-01033-w. Online ahead of print.

ABSTRACT

Coronary computed tomography angiography (CCTA) is an essential part of the diagnosis of chronic coronary syndrome (CCS) in patients with low-to-intermediate pre-test probability. The minimum technical requirement is 64-row multidetector CT (64-MDCT), which is still frequently used, although it is prone to motion artifacts because of its limited temporal resolution and z-coverage. In this study, we evaluate the potential of a deep-learning-based motion correction algorithm (MCA) to eliminate these motion artifacts. 124 64-MDCT-acquired CCTA examinations with at least minor motion artifacts were included. Images were reconstructed using a conventional reconstruction algorithm (CA) and an MCA. Image quality (IQ), according to a 5-point Likert score, was evaluated per-segment, per-artery, and per-patient and was correlated with potentially disturbing factors (heart rate (HR), intra-cycle HR changes, BMI, age, and sex). Comparisons were made with the Wilcoxon signed-rank test, and correlations with Spearman's rho. Per-patient, insufficient IQ decreased by 5.26%, and sufficient IQ increased by 9.66% with the MCA. Per-artery, insufficient IQ of the right coronary artery (RCA) decreased by 18.18%, and sufficient IQ increased by 27.27%. Per-segment, insufficient IQ in segments 1 and 2 decreased by 11.51% and 24.78%, respectively, and sufficient IQ increased by 10.62% and 18.58%, respectively. Total artifacts per-artery decreased in the RCA from 3.11 ± 1.65 to 2.26 ± 1.52. The HR dependence of RCA IQ decreased to an intermediate correlation in images with MCA reconstruction. The applied MCA improves the IQ of 64-MDCT-acquired images and reduces the influence of HR on IQ, increasing the validity of 64-MDCT in the diagnosis of CCS.
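The statistical workflow in this abstract — a paired Wilcoxon signed-rank comparison of Likert image-quality scores and a Spearman correlation between heart rate and image quality — can be sketched with SciPy. The data below are invented for illustration only:

```python
import numpy as np
from scipy.stats import spearmanr, wilcoxon

# hypothetical paired 5-point Likert image-quality (IQ) scores for
# the same ten exams, reconstructed with the conventional algorithm
# (CA) and with the motion-correction algorithm (MCA)
iq_ca = np.array([2, 3, 3, 4, 2, 3, 4, 2, 3, 2])
iq_mca = np.array([4, 3, 4, 5, 3, 5, 4, 3, 4, 4])

# paired non-parametric comparison of the two reconstructions
stat, p_wilcoxon = wilcoxon(iq_ca, iq_mca)

# hypothetical heart rates (bpm) and CA image quality for ten exams:
# does IQ fall monotonically as HR rises?
hr = np.array([55, 60, 65, 70, 75, 80, 85, 90, 95, 100])
iq_hr = np.array([5, 5, 4, 4, 4, 3, 3, 2, 2, 1])
rho, p_rho = spearmanr(hr, iq_hr)  # strongly negative here
```

Spearman's rho suits this setting because Likert scores are ordinal, so only the rank relationship with heart rate is meaningful, not the raw differences.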

PMID:38438697 | DOI:10.1007/s10278-024-01033-w

Categories: Literature Watch

From CNN to Transformer: A Review of Medical Image Segmentation Models

Mon, 2024-03-04 06:00

J Imaging Inform Med. 2024 Mar 4. doi: 10.1007/s10278-024-00981-7. Online ahead of print.

ABSTRACT

Medical image segmentation is an important step in medical image analysis and, in particular, a crucial prerequisite for efficient disease diagnosis and treatment. The use of deep learning for image segmentation has become a prevalent trend. The most widely adopted approaches are currently U-Net and its variants. Moreover, with the remarkable success of pre-trained models in natural language processing tasks, transformer-based models such as TransUNet have achieved desirable performance on multiple medical image segmentation datasets. Recently, the Segment Anything Model (SAM) and its variants have also been applied to medical image segmentation. In this paper, we survey the seven most representative medical image segmentation models of recent years. We theoretically analyze the characteristics of these models and quantitatively evaluate their performance on tuberculosis chest X-ray, ovarian tumor, and liver segmentation datasets. Finally, we discuss the main challenges and future trends in medical image segmentation. Our work can help researchers in the field quickly establish medical segmentation models tailored to specific regions.
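Quantitative comparisons like the one this survey performs (and the registration paper above, which reports an average Dice of 0.828) typically rest on the Dice similarity coefficient. A minimal sketch on toy binary masks, not taken from either paper:

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary masks:
    2*|A∩B| / (|A| + |B|), with eps guarding the empty-mask case."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

# toy 4x4 masks: a 2x2 square (4 px) vs a 2x3 rectangle (6 px)
# overlapping on 4 px, so Dice = 2*4 / (4 + 6) = 0.8
a = np.zeros((4, 4), dtype=int); a[1:3, 1:3] = 1
b = np.zeros((4, 4), dtype=int); b[1:3, 1:4] = 1
score = dice(a, b)
```

Dice ranges from 0 (no overlap) to 1 (identical masks), which is why scores like 0.828 can be compared directly across models and datasets.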

PMID:38438696 | DOI:10.1007/s10278-024-00981-7

Categories: Literature Watch
