Deep learning
CNN-Based Neurodegenerative Disease Classification Using QR-Represented Gait Data
Brain Behav. 2024 Oct;14(10):e70100. doi: 10.1002/brb3.70100.
ABSTRACT
PURPOSE: The primary aim of this study is to develop an effective and reliable diagnostic system for neurodegenerative diseases by utilizing gait data transformed into QR codes and classified using convolutional neural networks (CNNs). The method aims to enhance the precision of diagnosing neurodegenerative diseases, including amyotrophic lateral sclerosis (ALS), Parkinson's disease (PD), and Huntington's disease (HD), by introducing a novel approach to analyzing gait patterns.
METHODS: The research evaluates the CNN-based classification approach using QR-represented gait data to address the diagnostic challenges associated with neurodegenerative diseases. The gait data of subjects were converted into QR codes, which were then classified using a CNN deep learning model. The dataset includes recordings from patients with Parkinson's disease (n = 15), Huntington's disease (n = 20), and amyotrophic lateral sclerosis (n = 13), and from 16 healthy controls.
RESULTS: The accuracy rates obtained through 10-fold cross-validation were as follows: 94.86% for neurodegenerative disease (NDD, all three diseases combined) versus control, 95.81% for PD versus control, 93.56% for HD versus control, 97.65% for ALS versus control, and 84.65% for PD versus HD versus ALS versus control. These results demonstrate the potential of the proposed system in distinguishing between different neurodegenerative diseases and control groups.
CONCLUSION: The results indicate that the designed system may serve as a complementary tool for the diagnosis of neurodegenerative diseases, particularly in individuals who already present with varying degrees of motor impairment. Further validation and research are needed to establish its wider applicability.
PMID:39465642 | DOI:10.1002/brb3.70100
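A minimal sketch of the pipeline this abstract describes, for readers who want the idea in code: a gait time series is serialized into a QR code and the resulting binary module grid is classified by a small CNN. The payload format, the qrcode package, and the toy architecture are illustrative assumptions; the paper does not specify its exact encoding or network.

```python
import numpy as np
import qrcode  # assumed encoder; the authors' QR conversion is not specified
import torch
import torch.nn as nn

def gait_series_to_qr(series: np.ndarray, scale: int = 4) -> torch.Tensor:
    """Serialize a 1-D gait signal and render it as a binary QR module grid."""
    qr = qrcode.QRCode(border=1)
    qr.add_data(",".join(f"{v:.3f}" for v in series))  # hypothetical payload
    qr.make(fit=True)
    grid = np.array(qr.get_matrix(), dtype=np.float32)        # (N, N) modules
    grid = np.kron(grid, np.ones((scale, scale), dtype=np.float32))  # upsample
    return torch.from_numpy(grid)[None, None]                 # (1, 1, H, W)

class QRGaitCNN(nn.Module):
    """Toy classifier; adaptive pooling absorbs size differences between codes."""
    def __init__(self, n_classes: int = 4):  # ALS, PD, HD, control
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes))
    def forward(self, x):
        return self.net(x)

x = gait_series_to_qr(np.random.rand(60))  # stand-in stride-interval series
logits = QRGaitCNN()(x)                    # scores for the four groups
```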
Evaluation of root canal filling length on periapical radiograph using artificial intelligence
Oral Radiol. 2024 Oct 27. doi: 10.1007/s11282-024-00781-3. Online ahead of print.
ABSTRACT
OBJECTIVES: This work proposes a novel method to evaluate root canal filling (RCF) success using artificial intelligence (AI) and image analysis techniques.
METHODS: 1121 teeth with root canal treatment in 597 periapical radiographs (PARs) were anonymized and manually labeled. First, RCFs were segmented using five state-of-the-art deep learning models based on convolutional neural networks, whose performances were compared by intersection over union (IoU), Dice score, and accuracy. Fivefold cross-validation was applied to the best-performing model, and its outputs were used for further analysis. Second, images were processed via a graphical user interface (GUI) that allows dental clinicians to mark the apex of the tooth; the marked apex was used to compute the distance between it and the nearest RCF prediction of the deep learning model. This distance indicates whether the RCF is normal, short, or long.
RESULTS: Model performance was evaluated with standard segmentation metrics (IoU, Dice score, and accuracy). The best-performing CNN-based model achieved an accuracy of 88%, an IoU of 79%, and a Dice score of 88% in segmenting root canal fillings.
CONCLUSIONS: Our study demonstrates that AI-based solutions can deliver accurate and reliable performance for root canal filling evaluation.
PMID:39465425 | DOI:10.1007/s11282-024-00781-3
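A hedged sketch of the distance step in METHODS: given a binary RCF segmentation mask and the clinician-marked apex, report the distance to the nearest predicted filling pixel. The pixel spacing is an illustrative assumption, and note that labeling a gap as short versus long additionally requires knowing on which side of the apex the filling ends.

```python
import numpy as np

def apex_to_rcf_distance_mm(mask: np.ndarray, apex: tuple[int, int],
                            mm_per_pixel: float = 0.1) -> float:
    """mask: (H, W) binary RCF prediction; apex: (row, col) marked in the GUI."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return float("inf")  # no filling predicted for this tooth
    return float(np.hypot(ys - apex[0], xs - apex[1]).min() * mm_per_pixel)

# Toy example: a vertical filling whose tip ends 16 px short of the apex.
mask = np.zeros((100, 100), dtype=np.uint8)
mask[20:60, 50] = 1
print(f"gap to apex: {apex_to_rcf_distance_mm(mask, (75, 50)):.1f} mm")
```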
Dual-attention transformer-based hybrid network for multi-modal medical image segmentation
Sci Rep. 2024 Oct 28;14(1):25704. doi: 10.1038/s41598-024-76234-y.
ABSTRACT
Accurate medical image segmentation plays a vital role in clinical practice. Convolutional neural networks (CNNs) and Transformers are the mainstream architectures for this task; however, CNNs lack the ability to model global dependencies, while Transformers cannot extract local details well. In this paper, we propose DATTNet (Dual ATTention Network), an encoder-decoder deep learning model for medical image segmentation. DATTNet is built in a hierarchical fashion with two novel components: (1) a Dual Attention module designed to model global dependencies in the spatial and channel dimensions, and (2) a Context Fusion Bridge that remixes feature maps across multiple scales and constructs their correlations. Experiments on the ACDC, Synapse, and Kvasir-SEG datasets were conducted to evaluate the performance of DATTNet. Our proposed model shows superior performance, effectiveness, and robustness compared with state-of-the-art methods, with mean Dice similarity coefficient scores of 92.2%, 84.5%, and 89.1% on cardiac, abdominal organ, and gastrointestinal polyp segmentation tasks, respectively. The quantitative and qualitative results demonstrate that DATTNet performs well across different modalities (MRI, CT, and endoscopy) and generalizes to various tasks, making it a promising candidate for practical clinical applications. The code has been released at https://github.com/MhZhang123/DATTNet/tree/main.
PMID:39465274 | DOI:10.1038/s41598-024-76234-y
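Because the Dual Attention module is the paper's central component, here is a generic PyTorch sketch of channel-plus-spatial attention in that spirit. The authors' exact design lives in their repository; this block only illustrates the pattern (squeeze-style channel reweighting followed by a 7x7 spatial gate, both common defaults).

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, c: int, r: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(c, c // r), nn.ReLU(),
                                 nn.Linear(c // r, c))
    def forward(self, x):                              # x: (B, C, H, W)
        w = self.mlp(x.mean(dim=(2, 3)))               # global average pool
        return x * torch.sigmoid(w)[:, :, None, None]  # reweight channels

class SpatialAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)
    def forward(self, x):
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(s))         # reweight positions

class DualAttention(nn.Module):
    def __init__(self, c: int):
        super().__init__()
        self.ca, self.sa = ChannelAttention(c), SpatialAttention()
    def forward(self, x):
        return self.sa(self.ca(x))                     # channel, then spatial

out = DualAttention(64)(torch.randn(2, 64, 32, 32))    # same shape, reweighted
```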
Bone scintigraphy based on deep learning model and modified growth optimizer
Sci Rep. 2024 Oct 27;14(1):25627. doi: 10.1038/s41598-024-73991-8.
ABSTRACT
Bone scintigraphy is recognized as an efficient diagnostic method for whole-body screening for bone metastases. At present, whole-body bone scan image analysis depends primarily on manual reading by nuclear medicine physicians; manual analysis, however, requires substantial experience and is both stressful and time-consuming. To address these issues, this work proposes a two-phase machine-learning technique for bone scintigraphy analysis. The first phase, feature extraction, integrates the Mobile Vision Transformer (MobileViT) model into our framework to capture highly complex representations from raw medical imagery; MobileViT combines two primary components, a ViT and a lightweight CNN with a limited number of parameters. The second phase, feature selection (FS), uses the Arithmetic Optimization Algorithm (AOA) to improve the Growth Optimizer (GO). We evaluate the performance of the proposed FS model, named GOAOA, on a set of 18 UCI datasets. Additionally, applicability to real-world bone scintigraphy is evaluated using 2800 bone scan images (1400 normal and 1400 abnormal). The results and statistical analysis reveal that the proposed GOAOA algorithm, used as an FS technique, outperforms the other FS algorithms employed in this study.
PMID:39465262 | DOI:10.1038/s41598-024-73991-8
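A sketch of the wrapper-style fitness that metaheuristic feature selectors such as the GOAOA hybrid typically optimize: cross-validated accuracy traded off against subset size. The GO/AOA update rules are omitted, with a random bit-flip search standing in for them; the kNN evaluator, the 0.99 weight, and the stand-in dataset are assumptions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)      # stand-in UCI-style dataset

def fitness(mask: np.ndarray, alpha: float = 0.99) -> float:
    """Reward accuracy, lightly penalize the number of kept features."""
    if mask.sum() == 0:
        return 0.0
    acc = cross_val_score(KNeighborsClassifier(),
                          X[:, mask.astype(bool)], y, cv=5).mean()
    return alpha * acc + (1 - alpha) * (1 - mask.mean())

best = rng.integers(0, 2, X.shape[1])            # random initial feature subset
best_fit = fitness(best)
for _ in range(20):                              # stand-in for GO/AOA updates
    cand = best.copy()
    cand[rng.integers(X.shape[1])] ^= 1          # flip one feature bit
    f = fitness(cand)
    if f > best_fit:
        best, best_fit = cand, f
print(f"kept {best.sum()} features, fitness {best_fit:.3f}")
```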
A robust deep learning approach for identification of RNA 5-methyluridine sites
Sci Rep. 2024 Oct 28;14(1):25688. doi: 10.1038/s41598-024-76148-9.
ABSTRACT
RNA 5-methyluridine (m5U) sites play a significant role in understanding RNA modifications, which influence numerous biological processes such as gene expression and cellular functioning. Identifying m5U sites is therefore vital to understanding the integrity, structure, and function of RNA molecules. This study introduces GRUpred-m5U, a novel deep learning framework based on gated recurrent units, evaluated on mature RNA and full-transcript RNA datasets. We used three descriptor groups, nucleic acid composition, pseudo nucleic acid composition, and physicochemical properties, comprising five feature extraction methods: ENAC, Kmer, DPCP, DPCP type 2, and PseDNC. We first aggregated all the feature extraction methods into a merged feature set. Three hybrid deep learning models were then developed and evaluated through 10-fold cross-validation with seven evaluation metrics. After a comprehensive evaluation, the GRUpred-m5U model outperformed the other applied models, obtaining 98.41% and 96.70% accuracy on the two datasets, respectively; to our knowledge, it outperforms all existing state-of-the-art methods. The supervised model was further examined with unsupervised techniques such as principal component analysis (PCA), which indicated that the proposed method provides valid performance for identifying m5U. Given its multi-layered construction and reliable pattern recognition on complex inputs, the GRUpred-m5U model has considerable potential for future applications in the biological industry.
PMID:39465261 | DOI:10.1038/s41598-024-76148-9
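To make the model family concrete, a minimal bidirectional-GRU site classifier in PyTorch. Plain one-hot nucleotide encoding stands in for the ENAC/Kmer/DPCP/PseDNC descriptors named above, so this illustrates only the recurrent architecture, not GRUpred-m5U itself.

```python
import torch
import torch.nn as nn

NT = {"A": 0, "C": 1, "G": 2, "U": 3}

def one_hot(seq: str) -> torch.Tensor:
    x = torch.zeros(len(seq), 4)
    for i, ch in enumerate(seq):
        x[i, NT[ch]] = 1.0
    return x

class GRUSiteClassifier(nn.Module):
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.gru = nn.GRU(4, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)
    def forward(self, x):                   # x: (B, L, 4)
        out, _ = self.gru(x)
        return self.head(out[:, -1])        # logit: m5U vs non-m5U

seq = "AUGGCUAGUCUGAUCGUAGCU"               # toy sequence window
p = torch.sigmoid(GRUSiteClassifier()(one_hot(seq).unsqueeze(0)))
```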
Radar-Based Fall Detection: A Survey
IEEE Robot Autom Mag. 2024 Sep;31(3):170-185. doi: 10.1109/MRA.2024.3352851. Epub 2024 Feb 5.
ABSTRACT
Fall detection, particularly critical for high-risk demographics such as the elderly, is a key public health concern where timely detection can greatly minimize harm. With advancements in radio frequency technology, radar has emerged as a powerful tool for human detection and tracking. Traditional machine learning algorithms, such as support vector machines (SVM) and k-nearest neighbors (kNN), have shown promising outcomes, but deep learning approaches, notably convolutional neural networks (CNN) and recurrent neural networks (RNN), have outperformed them in learning intricate features and managing large, unstructured datasets. This survey offers an in-depth analysis of radar-based fall detection, with emphasis on Micro-Doppler, Range-Doppler, and Range-Doppler-Angles techniques. We discuss the intricacies and challenges in fall detection and emphasize the necessity of a clear definition of falls and appropriate detection criteria, informed by diverse influencing factors. We present an overview of radar signal processing principles and the underlying technology of radar-based fall detection, providing an accessible introduction to the machine learning and deep learning algorithms involved. After examining 74 research articles on radar-based fall detection published since 2000, we aim to bridge current research gaps and underscore potential future research strategies, emphasizing the possibility of real-world applications and the unexplored potential of deep learning in improving radar-based fall detection.
PMID:39465183 | PMC:PMC11507471 | DOI:10.1109/MRA.2024.3352851
Integrated deep learning approach for generating cross-polarized images and analyzing skin melanin and hemoglobin distributions
Biomed Eng Lett. 2024 Jul 26;14(6):1355-1364. doi: 10.1007/s13534-024-00409-9. eCollection 2024 Nov.
ABSTRACT
Cross-polarized images are beneficial for skin pigment analysis due to the enhanced visualization of melanin and hemoglobin regions. However, the required imaging equipment can be bulky and optically complex, and preparing ground truths for training pigment analysis models is labor-intensive. This study introduces an integrated approach for generating cross-polarized images and creating skin melanin and hemoglobin maps without the need for ground-truth preparation of pigment distributions. We propose a two-component approach: a cross-polarized image generation module and a skin analysis module. Three generative adversarial networks (CycleGAN, pix2pix, and pix2pixHD) are compared for creating cross-polarized images. The regression network for skin analysis is trained on theoretically reconstructed ground truths based on the optical properties of the pigments. The methodology is evaluated against the VISIA VAESTRO clinical system. The cross-polarized image generation module achieved a peak signal-to-noise ratio of 35.514 dB. The skin analysis module demonstrated correlation coefficients of 0.942 for hemoglobin and 0.922 for melanin. The integrated approach yielded correlation coefficients of 0.923 for hemoglobin and 0.897 for melanin. The proposed approach achieves a reasonable correlation with the professional system on actually captured images, offering a promising alternative to existing professional equipment without the need for additional optical instruments or extensive ground-truth preparation.
PMID:39465115 | PMC:PMC11502720 | DOI:10.1007/s13534-024-00409-9
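The 35.514 dB figure above is a peak signal-to-noise ratio; for reference, a NumPy sketch of how PSNR between a generated and a captured cross-polarized image is computed, assuming 8-bit intensities.

```python
import numpy as np

def psnr(generated: np.ndarray, reference: np.ndarray, peak: float = 255.0) -> float:
    mse = np.mean((generated.astype(np.float64) - reference.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

ref = np.random.randint(0, 256, (256, 256, 3))                  # stand-in capture
gen = np.clip(ref + np.random.normal(0, 5, ref.shape), 0, 255)  # stand-in output
print(f"PSNR: {psnr(gen, ref):.2f} dB")
```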
A systematic review of deep learning-based denoising for low-dose computed tomography from a perceptual quality perspective
Biomed Eng Lett. 2024 Aug 30;14(6):1153-1173. doi: 10.1007/s13534-024-00419-7. eCollection 2024 Nov.
ABSTRACT
Low-dose computed tomography (LDCT) scans are essential in reducing radiation exposure but often suffer from significant image noise that can impair diagnostic accuracy. While deep learning approaches have enhanced LDCT denoising capabilities, the predominant reliance on objective metrics like PSNR and SSIM has resulted in over-smoothed images that lack critical detail. This paper explores advanced deep learning methods tailored specifically to improve perceptual quality in LDCT images, focusing on generating diagnostic-quality images preferred in clinical practice. We review and compare current methodologies, including perceptual loss functions and generative adversarial networks, addressing the significant limitations of current benchmarks and the subjective nature of perceptual quality evaluation. Through a systematic analysis, this study underscores the urgent need for developing methods that balance both perceptual and diagnostic quality, proposing new directions for future research in the field.
PMID:39465112 | PMC:PMC11502640 | DOI:10.1007/s13534-024-00419-7
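One concrete instance of the perceptual losses this review covers: an MSE between VGG-16 feature maps of the denoised and reference images rather than between the pixels themselves. The relu3_3 layer choice and channel replication for single-channel CT slices are common defaults assumed here, not prescriptions from the paper.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16, VGG16_Weights

class PerceptualLoss(nn.Module):
    def __init__(self, cut: int = 16):            # features[:16] ends at relu3_3
        super().__init__()
        vgg = vgg16(weights=VGG16_Weights.DEFAULT).features[:cut].eval()
        for p in vgg.parameters():
            p.requires_grad_(False)               # frozen feature extractor
        self.vgg, self.mse = vgg, nn.MSELoss()
    def forward(self, denoised, target):          # (B, 1, H, W) CT slices
        d = denoised.repeat(1, 3, 1, 1)           # VGG expects 3 channels
        t = target.repeat(1, 3, 1, 1)
        return self.mse(self.vgg(d), self.vgg(t))

loss = PerceptualLoss()(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64))
```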
CT synthesis with deep learning for MR-only radiotherapy planning: a review
Biomed Eng Lett. 2024 Sep 26;14(6):1259-1278. doi: 10.1007/s13534-024-00430-y. eCollection 2024 Nov.
ABSTRACT
MR-only radiotherapy planning is beneficial from the perspective of both time and safety, since it uses synthetic CT for radiotherapy dose calculation instead of real CT scans. To elevate the accuracy of treatment planning and apply the results in practice, various methods have been adopted, among which deep learning models for image-to-image translation have shown good performance by retaining domain-invariant structures while changing domain-specific details. In this paper, we present an overview of diverse deep learning approaches to MR-to-CT synthesis, divided into four classes: convolutional neural networks, generative adversarial networks, transformer models, and diffusion models. By comparing each model and analyzing the general approaches applied to this task, the potential of these models and ways to improve the current methods can be evaluated.
PMID:39465111 | PMC:PMC11502731 | DOI:10.1007/s13534-024-00430-y
A review of deep learning-based reconstruction methods for accelerated MRI using spatiotemporal and multi-contrast redundancies
Biomed Eng Lett. 2024 Sep 17;14(6):1221-1242. doi: 10.1007/s13534-024-00425-9. eCollection 2024 Nov.
ABSTRACT
Accelerated magnetic resonance imaging (MRI) has played an essential role in reducing data acquisition time for MRI. Acceleration can be achieved by acquiring fewer data points in k-space, which results in various artifacts in the image domain. Conventional reconstruction methods have resolved these artifacts by utilizing multi-coil information, but with limited robustness. Recently, numerous deep learning-based reconstruction methods have been developed, enabling outstanding reconstruction performance at higher acceleration. Advances in hardware and the development of specialized network architectures have produced such achievements. Moreover, MRI signals contain various forms of redundant information, including multi-coil, multi-contrast, and spatiotemporal redundancy. Utilizing this redundant information in combination with deep learning approaches allows not only higher acceleration but also well-preserved detail in the reconstructed images. Consequently, this review paper introduces the basic concepts of deep learning and conventional accelerated MRI reconstruction methods, followed by a review of recent deep learning-based reconstruction methods that exploit various redundancies. Lastly, the paper concludes by discussing the challenges, limitations, and potential directions of future developments.
PMID:39465106 | PMC:PMC11502678 | DOI:10.1007/s13534-024-00425-9
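To make "acquiring fewer data points in k-space" concrete, a NumPy sketch of 1-D Cartesian undersampling with a fully sampled low-frequency band, followed by the zero-filled reconstruction whose aliasing the reviewed networks learn to remove. The mask pattern is an illustrative choice.

```python
import numpy as np

img = np.random.rand(128, 128)                    # stand-in anatomical image
kspace = np.fft.fftshift(np.fft.fft2(img))

mask = np.zeros(128, dtype=bool)                  # sample every 4th line (4x)
mask[::4] = True
mask[60:68] = True                                # keep the k-space center
undersampled = kspace * mask[:, None]

zero_filled = np.abs(np.fft.ifft2(np.fft.ifftshift(undersampled)))
rel_err = np.linalg.norm(zero_filled - img) / np.linalg.norm(img)
print(f"kept {mask.mean():.0%} of lines; zero-filled error {rel_err:.3f}")
```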
Self-supervised learning for CT image denoising and reconstruction: a review
Biomed Eng Lett. 2024 Sep 12;14(6):1207-1220. doi: 10.1007/s13534-024-00424-w. eCollection 2024 Nov.
ABSTRACT
This article reviews the self-supervised learning methods for CT image denoising and reconstruction. Currently, deep learning has become a dominant tool in medical imaging as well as computer vision. In particular, self-supervised learning approaches have attracted great attention as a technique for learning CT images without clean/noisy references. After briefly reviewing the fundamentals of CT image denoising and reconstruction, we examine the progress of deep learning in CT image denoising and reconstruction. Finally, we focus on the theoretical and methodological evolution of self-supervised learning for image denoising and reconstruction.
PMID:39465103 | PMC:PMC11502646 | DOI:10.1007/s13534-024-00424-w
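A compact sketch of the blind-spot idea behind several of the reviewed self-supervised denoisers (in the Noise2Void/Noise2Self spirit): hide a few pixels of a noisy slice, predict them from their neighbors, and take the loss only on the hidden pixels, so no clean or paired-noisy reference is required. The two-layer toy network and zero-valued masking are simplifications.

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 1, 3, padding=1))        # toy denoiser
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

noisy = torch.rand(8, 1, 64, 64)                           # noisy slices only
for _ in range(10):
    hide = (torch.rand_like(noisy) < 0.05).float()         # mask 5% of pixels
    pred = net(noisy * (1 - hide))                         # blind-spot input
    loss = ((pred - noisy) ** 2 * hide).sum() / hide.sum() # loss on hidden px
    opt.zero_grad()
    loss.backward()
    opt.step()
```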
Deep learning using one-stop-shop CT scan to predict hemorrhagic transformation in stroke patients undergoing reperfusion therapy: A multicenter study
Acad Radiol. 2024 Oct 26:S1076-6332(24)00702-5. doi: 10.1016/j.acra.2024.09.052. Online ahead of print.
ABSTRACT
RATIONALE AND OBJECTIVES: Hemorrhagic transformation (HT) is one of the most serious complications in patients with acute ischemic stroke (AIS) following reperfusion therapy. The purpose of this study is to develop and validate deep learning (DL) models utilizing multiphase computed tomography angiography (CTA) and computed tomography perfusion (CTP) images for the fully automated prediction of HT.
MATERIALS AND METHODS: In this multicenter retrospective study, a total of 229 AIS patients who underwent reperfusion therapy from June 2019 to May 2022 were reviewed. Data set 1, comprising 183 patients from two hospitals, was utilized for training, tuning, and internal validation. Data set 2, consisting of 46 patients from a third hospital, was employed for external testing. DL models were trained to extract valuable information from multiphase CTA and CTP images. The DenseNet architecture was used to construct the DL models. We developed single-phase, single-parameter models, and combined models to predict HT. The models were evaluated using receiver operating characteristic curves.
RESULTS: Sixty-nine (30.1%) of 229 patients (mean age, 66.9 years ± 10.3; male, 144 [66.9%]) developed HT. Among the single-phase models, the arteriovenous phase model demonstrated the highest performance. For single-parameter models, the time-to-peak model was superior. When considering combined models, the CTA-CTP model provided the highest predictive accuracy.
CONCLUSIONS: DL models for predicting HT based on multiphase CTA and CTP images can be established and perform well, providing a reliable tool to support clinicians' treatment decisions.
PMID:39462736 | DOI:10.1016/j.acra.2024.09.052
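The abstract names DenseNet as the backbone; a minimal torchvision sketch of DenseNet-121 adapted to a binary HT output. Stacking the CTA/CTP phases as input channels is an assumption for illustration, since the abstract does not describe the exact input layout.

```python
import torch
import torch.nn as nn
from torchvision.models import densenet121

n_phases = 3                                       # assumed: phases as channels
model = densenet121(weights=None)
model.features.conv0 = nn.Conv2d(n_phases, 64, kernel_size=7, stride=2,
                                 padding=3, bias=False)        # accept n_phases
model.classifier = nn.Linear(model.classifier.in_features, 1)  # HT logit

logit = model(torch.rand(1, n_phases, 224, 224))   # sigmoid(logit) ~ P(HT)
```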
Clinical Pilot of a Deep Learning Elastic Registration Algorithm to Improve Misregistration Artifact and Image Quality on Routine Oncologic PET/CT
Acad Radiol. 2024 Oct 26:S1076-6332(24)00693-7. doi: 10.1016/j.acra.2024.09.044. Online ahead of print.
ABSTRACT
RATIONALE AND OBJECTIVES: Misregistration artifacts between the PET and attenuation correction CT (CTAC) exams can degrade image quality and cause diagnostic errors. Deep learning (DL)-warped elastic registration methods have been proposed to improve misregistration errors.
MATERIALS AND METHODS: 30 patients undergoing routine oncologic examination (20 18F-FDG PET/CT and 10 64Cu-DOTATATE PET/CT) were retrospectively identified, and reconstructions using the unmodified CTAC were compared with those using a DL-augmented spatially transformed CT attenuation map. Primary endpoints included differences in subjective image quality and standardized uptake values (SUV). Exams were randomized to reduce reader bias, and three radiologists rated image quality across six anatomic sites using a modified Likert scale. Measures of local bias and lesion SUV were also quantitatively evaluated.
RESULTS: The DL attenuation correction method was associated with higher image quality and reduced misregistration artifacts (mean 18F-FDG quality rating = 3.5-3.8 for DL vs 3.2-3.5 for standard reconstruction (STD); mean 64Cu-DOTATATE quality rating = 3.2-3.4 for DL vs 2.1-3.3 for STD; P < 0.05 for all except the 64Cu-DOTATATE inferior spleen). Percent changes in superior-liver SUVmean for 18F-FDG and 64Cu-DOTATATE were 5.3 ± 4.9% and 8.2 ± 4.1%, respectively. Measures of signal-to-noise ratio were significantly improved for DL over STD (hepatopulmonary index (HPI) [18F-FDG] = 4.5 ± 1.2 vs 4.0 ± 1.1, P < 0.001; HPI [64Cu-DOTATATE] = 16.4 ± 16.9 vs 12.5 ± 5.5, P = 0.039).
CONCLUSION: Deep learning elastic registration for CT attenuation correction maps on routine oncology PET/CT decreases misregistration artifacts, with a greater impact on PET scans with longer acquisition times.
PMID:39462735 | DOI:10.1016/j.acra.2024.09.044
Advances in the diagnosis of prostate cancer based on image fusion
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Oct 25;41(5):1078-1084. doi: 10.7507/1001-5515.202403054.
ABSTRACT
Image fusion currently plays an important role in the diagnosis of prostate cancer (PCa). Selecting and developing a good image fusion algorithm is the core task in achieving image fusion, as it determines whether the fused image is of good quality and can meet the practical needs of clinical application; in recent years, this has become one of the research hotspots of medical image fusion. To survey medical image fusion methods comprehensively, this paper reviewed the relevant domestic and international literature published in recent years. Image fusion technologies were classified, and image fusion algorithms were divided into traditional fusion algorithms and deep learning (DL) fusion algorithms. The principles and workflows of representative algorithms were analyzed and compared, their advantages and disadvantages were summarized, and relevant medical image datasets were introduced. Finally, future development trends of medical image fusion algorithms were discussed, pointing out directions for medical image fusion technology in the diagnosis of prostate cancer and other major diseases.
PMID:39462678 | DOI:10.7507/1001-5515.202403054
Research progress of breast pathology image diagnosis based on deep learning
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Oct 25;41(5):1072-1077. doi: 10.7507/1001-5515.202311061.
ABSTRACT
Breast cancer is a malignancy caused by the abnormal proliferation of breast epithelial cells, predominantly affecting female patients, and it is commonly diagnosed using histopathological images. Currently, deep learning techniques have made significant breakthroughs in medical image processing, outperforming traditional detection methods in breast cancer pathology classification tasks. This paper first reviewed the advances in applying deep learning to breast pathology images, focusing on three key areas: multi-scale feature extraction, cellular feature analysis, and classification. Next, it summarized the advantages of multimodal data fusion methods for breast pathology images. Finally, the study discussed the challenges and future prospects of deep learning in breast cancer pathology image diagnosis, providing important guidance for advancing the use of deep learning in breast diagnosis.
PMID:39462677 | DOI:10.7507/1001-5515.202311061
Research progress on electronic health records multimodal data fusion based on deep learning
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Oct 25;41(5):1062-1071. doi: 10.7507/1001-5515.202310011.
ABSTRACT
Deep learning-based multimodal learning is advancing rapidly and is widely used for artificial intelligence-generated content, such as image-text conversion and image-text generation. Electronic health records are digital information, such as numbers, charts, and text, generated by medical staff using information systems in the course of medical activities. Deep learning-based multimodal fusion of electronic health records can help medical staff comprehensively analyze the large volume of multimodal medical data generated during diagnosis and treatment, enabling accurate diagnosis and timely intervention for patients. In this article, we first introduce the methods and development trends of deep learning-based multimodal data fusion. Second, we summarize and compare the fusion of structured electronic medical records with other medical data such as images and text, focusing on clinical application types, sample sizes, and the fusion methods involved. From this analysis of the literature, two main deep learning approaches for fusing different medical data modalities emerge: selecting an appropriate pre-trained model for feature representation according to the data modality and then fusing the features downstream, or fusing based on the attention mechanism. Lastly, the difficulties encountered in multimodal medical data fusion and its future directions, including modeling methods and the evaluation and application of models, are discussed. Through this review, we expect to provide reference information for establishing models that can comprehensively utilize various modalities of medical data.
PMID:39462676 | DOI:10.7507/1001-5515.202310011
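A sketch of the attention-based fusion pattern the review highlights: modality-specific encoders (stubbed here as linear projections) feed a cross-attention step in which the structured-EHR representation queries text-derived token features. The dimensions and single-query design are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, d: int = 128):
        super().__init__()
        self.ehr_proj = nn.Linear(32, d)       # structured EHR vector encoder
        self.txt_proj = nn.Linear(768, d)      # e.g. clinical-note embeddings
        self.attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
        self.head = nn.Linear(d, 1)
    def forward(self, ehr, text_tokens):
        q = self.ehr_proj(ehr).unsqueeze(1)    # (B, 1, d) query
        kv = self.txt_proj(text_tokens)        # (B, T, d) keys/values
        fused, _ = self.attn(q, kv, kv)        # EHR attends over the text
        return self.head(fused.squeeze(1))     # e.g. diagnosis logit

out = AttentionFusion()(torch.rand(4, 32), torch.rand(4, 20, 768))
```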
A review on depth perception techniques in organoid images
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Oct 25;41(5):1053-1061. doi: 10.7507/1001-5515.202404036.
ABSTRACT
Organoids are an in vitro model that can simulate the complex structure and function of tissues in vivo. Functions such as classification, screening, and trajectory recognition have been realized through organoid image analysis, but problems remain, such as low accuracy in classification and in cell tracking. Combining deep learning algorithms with organoid image analysis is currently the most advanced approach to these tasks. This paper surveys and organizes depth perception technology for organoid images. It introduces the organoid culture mechanism and the concepts underlying its application in depth perception, reviews key progress in four classes of depth perception algorithms for organoid images (classification and recognition, pattern detection, image segmentation, and dynamic tracking), and compares the performance advantages of different deep models. In addition, the paper summarizes depth perception technology across various organoid images in terms of depth feature learning, model generalization, and multiple evaluation parameters, and anticipates future development trends of deep learning-based organoid analysis, so as to promote the application of depth perception technology in organoid images and to provide a useful reference for academic research and practical applications in this field.
PMID:39462675 | DOI:10.7507/1001-5515.202404036
Construction of a prediction model for induction of labor based on a small sample of clinical indicator data
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Oct 25;41(5):1012-1018. doi: 10.7507/1001-5515.202403033.
ABSTRACT
Because of the diversity and complexity of clinical indicators, it is difficult to establish a comprehensive and reliable prediction model for induction of labor (IOL) outcomes with existing methods. This study aims to analyze the clinical indicators related to IOL and to develop and evaluate a prediction model based on a small data sample. The study population consisted of 90 pregnant women who underwent IOL between February 2023 and January 2024 at the Shanghai First Maternity and Infant Healthcare Hospital, with a total of 52 clinical indicators recorded. The maximal information coefficient (MIC) was used to select features from the clinical indicators, reducing the risk of overfitting caused by high-dimensional features. Based on the MIC-selected features, a support vector machine (SVM) model suited to small samples was then compared with a fully connected neural network (FCNN), a deep learning model that typically requires larger samples, and receiver operating characteristic (ROC) curves were plotted. By calculating MIC scores, the feature dimension was reduced from 55 to 15, and the area under the curve (AUC) of the SVM model improved from 0.872 before feature selection to 0.923. Model comparison showed that the SVM had better predictive performance than the FCNN. This study demonstrates that the SVM successfully predicted IOL outcomes and that MIC feature selection effectively improves the model's generalization ability, making the predictions more stable. The study provides a reliable method for predicting the outcome of induced labor, with potential clinical applications.
PMID:39462670 | DOI:10.7507/1001-5515.202403033
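A sketch of the filter-then-classify pipeline described above. Scikit-learn's mutual information score stands in for MIC (which the study actually used) to keep the top 15 features before a small-sample SVM; the synthetic data mirrors the study's 90 subjects and 52 indicators.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=90, n_features=52, n_informative=10,
                           random_state=0)           # 90 subjects, 52 indicators
scores = mutual_info_classif(X, y, random_state=0)   # MIC stand-in
top = np.argsort(scores)[-15:]                       # keep 15 features, as reported

svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
auc = cross_val_score(svm, X[:, top], y, cv=5, scoring="roc_auc").mean()
print(f"cross-validated AUC on selected features: {auc:.3f}")
```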
Colon polyp detection based on multi-scale and multi-level feature fusion and lightweight convolutional neural network
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Oct 25;41(5):911-918. doi: 10.7507/1001-5515.202312014.
ABSTRACT
Early diagnosis and treatment of colorectal polyps are crucial for preventing colorectal cancer. This paper proposes a lightweight convolutional neural network for the automatic detection and auxiliary diagnosis of colorectal polyps. A 53-layer convolutional backbone network is used, incorporating a spatial pyramid pooling module to extract features with different receptive field sizes. A feature pyramid network then performs cross-scale fusion of feature maps from the backbone, a spatial attention module enhances the perception of polyp boundaries and details, and a positional pattern attention module automatically mines and integrates key features across different levels of the feature maps, achieving rapid, efficient, and accurate automatic detection of colorectal polyps. Evaluated on a clinical dataset, the proposed model achieves an accuracy of 0.9982, a recall of 0.9988, an F1 score of 0.9984, and a mean average precision (mAP) of 0.9953 at an intersection-over-union (IoU) threshold of 0.5, with a frame rate of 74 frames per second and a parameter count of 9.08 M. Compared with existing mainstream methods, the proposed method is lightweight, has modest hardware requirements, and offers high detection speed and accuracy, making it a feasible technique and an important tool for the early detection and diagnosis of colorectal cancer.
PMID:39462658 | DOI:10.7507/1001-5515.202312014
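A minimal sketch of the spatial pyramid pooling module used above to gather features with different receptive field sizes, in the common YOLO-style form: parallel max-pools concatenated with the input. The 5/9/13 kernel sizes are a conventional choice assumed here, not taken from the paper.

```python
import torch
import torch.nn as nn

class SPP(nn.Module):
    def __init__(self, kernels=(5, 9, 13)):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.MaxPool2d(k, stride=1, padding=k // 2) for k in kernels)
    def forward(self, x):                               # (B, C, H, W)
        return torch.cat([x] + [p(x) for p in self.pools], dim=1)  # (B, 4C, H, W)

feat = torch.randn(1, 256, 20, 20)                      # backbone feature map
print(SPP()(feat).shape)                                # torch.Size([1, 1024, 20, 20])
```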
Recurrence prediction of gastric cancer based on multi-resolution feature fusion and context information
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Oct 25;41(5):886-894. doi: 10.7507/1001-5515.202403014.
ABSTRACT
Pathological images of gastric cancer serve as the gold standard for diagnosing this malignancy. However, recurrence prediction encounters challenges such as subtle morphological features of the lesions, insufficient fusion of multi-resolution features, and the inability to leverage contextual information effectively. To address these issues, a three-stage recurrence prediction method based on gastric cancer pathological images is proposed. In the first stage, the self-supervised learning framework SimCLR is used to train on low-resolution patch images, diminishing the interdependence among diverse tissue images and yielding decoupled, enhanced features. In the second stage, the low-resolution enhanced features are fused with the corresponding high-resolution unenhanced features to achieve feature complementation across resolutions. In the third stage, to address the position-encoding difficulty caused by large differences in patch counts, position encoding is performed over multi-scale local neighborhoods and a self-attention mechanism is employed to obtain features with contextual information; the resulting contextual features are further combined with local features extracted by a convolutional neural network. Evaluation on clinically collected data showed that, compared with the best-performing traditional method, the proposed network improved accuracy and area under the curve (AUC) by 7.63% and 4.51%, respectively. These results validate the usefulness of this method for predicting gastric cancer recurrence.
PMID:39462655 | DOI:10.7507/1001-5515.202403014
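A sketch of stages two and three as described: concatenate low-resolution SimCLR-enhanced patch features with their high-resolution unenhanced counterparts, then apply self-attention so each patch embedding absorbs slide-level context before pooling to a recurrence score. The dimensions and the plain nn.TransformerEncoderLayer (in place of the paper's multi-scale neighborhood position encoding) are assumptions.

```python
import torch
import torch.nn as nn

n_patches, d_low, d_high = 50, 128, 128
low = torch.rand(1, n_patches, d_low)     # stage 1: SimCLR-enhanced features
high = torch.rand(1, n_patches, d_high)   # matching high-resolution features

fused = torch.cat([low, high], dim=-1)    # stage 2: cross-resolution fusion
encoder = nn.TransformerEncoderLayer(d_model=d_low + d_high, nhead=8,
                                     batch_first=True)
ctx = encoder(fused)                      # stage 3: contextual patch features
score = nn.Linear(d_low + d_high, 1)(ctx.mean(dim=1))   # recurrence logit
```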