Deep learning

Deep learning using one-stop-shop CT scan to predict hemorrhagic transformation in stroke patients undergoing reperfusion therapy: A multicenter study

Sun, 2024-10-27 06:00

Acad Radiol. 2024 Oct 26:S1076-6332(24)00702-5. doi: 10.1016/j.acra.2024.09.052. Online ahead of print.

ABSTRACT

RATIONALE AND OBJECTIVES: Hemorrhagic transformation (HT) is one of the most serious complications in patients with acute ischemic stroke (AIS) following reperfusion therapy. The purpose of this study is to develop and validate deep learning (DL) models utilizing multiphase computed tomography angiography (CTA) and computed tomography perfusion (CTP) images for the fully automated prediction of HT.

MATERIALS AND METHODS: In this multicenter retrospective study, a total of 229 AIS patients who underwent reperfusion therapy from June 2019 to May 2022 were reviewed. Data set 1, comprising 183 patients from two hospitals, was utilized for training, tuning, and internal validation. Data set 2, consisting of 46 patients from a third hospital, was employed for external testing. DL models were trained to extract valuable information from multiphase CTA and CTP images. The DenseNet architecture was used to construct the DL models. We developed single-phase, single-parameter models, and combined models to predict HT. The models were evaluated using receiver operating characteristic curves.
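
The abstract names only the architecture family; as a minimal sketch of how a DenseNet-based classifier might ingest multiphase CT inputs (the phase count, input size, and classification head here are assumptions for illustration, not the authors' implementation):

```python
# Hypothetical sketch of a DenseNet-based HT classifier over multiphase CT inputs.
import torch
import torch.nn as nn
from torchvision.models import densenet121

class MultiPhaseHTClassifier(nn.Module):
    def __init__(self, n_phases: int = 4):
        super().__init__()
        backbone = densenet121(weights=None)
        # Replace the RGB stem so it accepts one channel per CT phase / parameter map.
        backbone.features.conv0 = nn.Conv2d(
            n_phases, 64, kernel_size=7, stride=2, padding=3, bias=False
        )
        backbone.classifier = nn.Linear(backbone.classifier.in_features, 1)
        self.backbone = backbone

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_phases, H, W); output: one HT logit per study.
        return self.backbone(x)

model = MultiPhaseHTClassifier(n_phases=4)
logits = model(torch.randn(2, 4, 224, 224))  # two dummy studies
print(torch.sigmoid(logits))                 # HT probabilities
```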

RESULTS: Sixty-nine (30.1%) of 229 patients (mean age, 66.9 years ± 10.3; male, 144 [66.9%]) developed HT. Among the single-phase models, the arteriovenous phase model demonstrated the highest performance. For single-parameter models, the time-to-peak model was superior. When considering combined models, the CTA-CTP model provided the highest predictive accuracy.

CONCLUSIONS: DL models for predicting HT based on multiphase CTA and CTP images can be established and perform well, providing a reliable tool for clinicians to make treatment decisions.

PMID:39462736 | DOI:10.1016/j.acra.2024.09.052

Categories: Literature Watch

Clinical Pilot of a Deep Learning Elastic Registration Algorithm to Improve Misregistration Artifact and Image Quality on Routine Oncologic PET/CT

Sun, 2024-10-27 06:00

Acad Radiol. 2024 Oct 26:S1076-6332(24)00693-7. doi: 10.1016/j.acra.2024.09.044. Online ahead of print.

ABSTRACT

RATIONALE AND OBJECTIVES: Misregistration artifacts between the PET and attenuation correction CT (CTAC) exams can degrade image quality and cause diagnostic errors. Deep learning (DL)-warped elastic registration methods have been proposed to improve misregistration errors.

MATERIALS AND METHODS: 30 patients undergoing routine oncologic examination (20 18F-FDG PET/CT and 10 64Cu-DOTATATE PET/CT) were retrospectively identified, and reconstructions using the unmodified CTAC were compared with those using a DL-augmented, spatially transformed CT attenuation map. Primary endpoints included differences in subjective image quality and standardized uptake values (SUV). Exams were randomized to reduce reader bias, and three radiologists rated image quality across six anatomic sites using a modified Likert scale. Measures of local bias and lesion SUV were also quantitatively evaluated.

RESULTS: The DL attenuation correction method was associated with higher image quality and reduced misregistration artifacts (mean 18F-FDG quality rating = 3.5-3.8 for DL vs 3.2-3.5 for standard reconstruction (STD); mean 64Cu-DOTATATE quality rating = 3.2-3.4 for DL vs 2.1-3.3 for STD; P < 0.05 for all except the 64Cu-DOTATATE inferior spleen). Percent changes in superior liver SUVmean for 18F-FDG and 64Cu-DOTATATE were 5.3 ± 4.9% and 8.2 ± 4.1%, respectively. Measures of signal-to-noise ratio were significantly improved for DL over STD (hepatopulmonary index (HPI) [18F-FDG] = 4.5 ± 1.2 vs 4.0 ± 1.1, P < 0.001; HPI [64Cu-DOTATATE] = 16.4 ± 16.9 vs 12.5 ± 5.5, P = 0.039).
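
For illustration, the percent-change and paired-comparison arithmetic reported above can be sketched as follows; the arrays are dummy placeholders (not study data), and the authors' statistical software is not specified:

```python
# Illustrative computation of percent change in SUVmean and a paired
# nonparametric comparison; all values below are dummy stand-ins.
import numpy as np
from scipy.stats import wilcoxon

suv_std = np.array([2.1, 1.9, 2.4, 2.0])  # SUVmean, standard CTAC (dummy)
suv_dl = np.array([2.2, 2.0, 2.6, 2.1])   # SUVmean, DL-warped CTAC (dummy)

pct_change = 100 * (suv_dl - suv_std) / suv_std
print(f"percent change: {pct_change.mean():.1f} ± {pct_change.std(ddof=1):.1f}%")

stat, p = wilcoxon(suv_dl, suv_std)  # paired comparison of the two maps
print(f"Wilcoxon p = {p:.3f}")
```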

CONCLUSION: Deep learning elastic registration for CT attenuation correction maps on routine oncology PET/CT decreases misregistration artifacts, with a greater impact on PET scans with longer acquisition times.

PMID:39462735 | DOI:10.1016/j.acra.2024.09.044

Categories: Literature Watch

Advances in the diagnosis of prostate cancer based on image fusion

Sun, 2024-10-27 06:00

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Oct 25;41(5):1078-1084. doi: 10.7507/1001-5515.202403054.

ABSTRACT

Image fusion currently plays an important role in the diagnosis of prostate cancer (PCa). Selecting and developing a good fusion algorithm is the core task in image fusion, as it determines whether the fused image is of high quality and meets the practical needs of clinical application; in recent years, this has become one of the research hotspots of medical image fusion. To provide a comprehensive study of medical image fusion methods, this paper reviewed the relevant literature published in China and abroad in recent years. Image fusion technologies were classified, and fusion algorithms were divided into traditional algorithms and deep learning (DL) algorithms. The principles and workflows of representative algorithms were analyzed and compared, their advantages and disadvantages were summarized, and relevant medical image datasets were introduced. Finally, future development trends of medical image fusion algorithms were discussed, pointing out directions for medical image fusion technology in the diagnosis of prostate cancer and other major diseases.
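
As a toy illustration of the traditional, pixel-level end of the fusion-algorithm spectrum surveyed here, a weighted average of two co-registered modalities; real MRI-ultrasound fusion pipelines add registration and multi-scale decomposition on top of this, and the function below is purely a sketch:

```python
# Minimal pixel-level weighted fusion of two co-registered, intensity-normalized
# images; alpha trades off the contribution of each modality.
import numpy as np

def weighted_fusion(img_a: np.ndarray, img_b: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    assert img_a.shape == img_b.shape, "images must be co-registered and same size"
    return alpha * img_a + (1 - alpha) * img_b

mri = np.random.rand(256, 256)   # dummy stand-ins for registered MRI / ultrasound
trus = np.random.rand(256, 256)
fused = weighted_fusion(mri, trus, alpha=0.6)
```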

PMID:39462678 | DOI:10.7507/1001-5515.202403054

Categories: Literature Watch

Research progress of breast pathology image diagnosis based on deep learning

Sun, 2024-10-27 06:00

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Oct 25;41(5):1072-1077. doi: 10.7507/1001-5515.202311061.

ABSTRACT

Breast cancer is a malignancy caused by the abnormal proliferation of breast epithelial cells, predominantly affecting female patients, and it is commonly diagnosed using histopathological images. Currently, deep learning techniques have made significant breakthroughs in medical image processing, outperforming traditional detection methods in breast cancer pathology classification tasks. This paper first reviewed the advances in applying deep learning to breast pathology images, focusing on three key areas: multi-scale feature extraction, cellular feature analysis, and classification. Next, it summarized the advantages of multimodal data fusion methods for breast pathology images. Finally, the study discussed the challenges and future prospects of deep learning in breast cancer pathology image diagnosis, providing important guidance for advancing the use of deep learning in breast diagnosis.

PMID:39462677 | DOI:10.7507/1001-5515.202311061

Categories: Literature Watch

Research progress on electronic health records multimodal data fusion based on deep learning

Sun, 2024-10-27 06:00

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Oct 25;41(5):1062-1071. doi: 10.7507/1001-5515.202310011.

ABSTRACT

Currently, deep learning-based multimodal learning is advancing rapidly and is widely used in the field of artificial intelligence-generated content, such as image-text conversion and image-text generation. Electronic health records are digital information, such as numbers, charts, and text, generated by medical staff using information systems in the course of medical activities. Deep learning-based multimodal fusion of electronic health records can help medical staff comprehensively analyze the large volume of multimodal medical data generated during diagnosis and treatment, thereby enabling accurate diagnosis and timely intervention for patients. In this article, we first introduce the methods and development trends of deep learning-based multimodal data fusion. Second, we summarize and compare the fusion of structured electronic medical records with other medical data such as images and text, focusing on clinical application types, sample sizes, and the fusion methods involved. Our analysis of the literature shows that deep learning methods for fusing different medical data modalities follow two main patterns: selecting an appropriate pre-trained model according to the data modality for feature representation followed by late fusion, and fusion based on the attention mechanism. Lastly, we discuss the difficulties encountered in multimodal medical data fusion and its future directions, including modeling methods and the evaluation and application of models. Through this review, we hope to provide reference information for building models that can comprehensively utilize various modalities of medical data.
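
A minimal sketch of the attention-based fusion pattern the review identifies, with per-modality embeddings weighted by learned relevance scores; the dimensions, scoring head, and modality list are illustrative assumptions:

```python
# Attention-weighted fusion of per-modality embeddings (sketch).
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # scalar relevance score per modality

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (batch, n_modalities, dim), e.g. [structured EMR, image, text]
        weights = torch.softmax(self.score(embeddings), dim=1)  # (B, M, 1)
        return (weights * embeddings).sum(dim=1)                # (B, dim)

fusion = AttentionFusion(dim=128)
emr, img, txt = (torch.randn(4, 128) for _ in range(3))  # dummy encoder outputs
fused = fusion(torch.stack([emr, img, txt], dim=1))      # (4, 128)
```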

PMID:39462676 | DOI:10.7507/1001-5515.202310011

Categories: Literature Watch

A review on depth perception techniques in organoid images

Sun, 2024-10-27 06:00

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Oct 25;41(5):1053-1061. doi: 10.7507/1001-5515.202404036.

ABSTRACT

Organoids are an in vitro model that can simulate the complex structure and function of tissues in vivo. Organoid image analysis has enabled tasks such as classification, screening, and trajectory recognition, but problems remain, including low accuracy in recognition, classification, and cell tracking. Fusing deep learning algorithms with organoid image analysis is currently the most advanced approach to these problems. In this paper, depth perception techniques for organoid images are surveyed and organized: the organoid culture mechanism and its relevance to depth perception are introduced; key progress in four classes of depth perception algorithms for organoid images (classification and recognition, pattern detection, image segmentation, and dynamic tracking) is reviewed; and the performance advantages of different depth models are compared and analyzed. In addition, depth perception techniques across various organoid images are summarized in terms of feature learning, model generalization, and multiple evaluation parameters, and future trends for deep learning-based organoid analysis are discussed, so as to promote the application of depth perception technology in organoid images. This survey provides an important reference for academic research and practical application in this field.

PMID:39462675 | DOI:10.7507/1001-5515.202404036

Categories: Literature Watch

Construction of a prediction model for induction of labor based on a small sample of clinical indicator data

Sun, 2024-10-27 06:00

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Oct 25;41(5):1012-1018. doi: 10.7507/1001-5515.202403033.

ABSTRACT

Because of the diversity and complexity of clinical indicators, it is difficult to establish a comprehensive and reliable prediction model for induction of labor (IOL) outcomes with existing methods. This study aimed to analyze the clinical indicators related to IOL and to develop and evaluate a prediction model based on a small sample of data. The study population consisted of 90 pregnant women who underwent IOL between February 2023 and January 2024 at the Shanghai First Maternity and Infant Healthcare Hospital, and a total of 52 clinical indicators were recorded. The maximal information coefficient (MIC) was used to select features from the clinical indicators, reducing the risk of overfitting caused by high-dimensional features. Based on the features selected by MIC, a support vector machine (SVM) model suited to small samples was compared with a fully connected neural network (FCNN), a deep learning model suited to large samples, and receiver operating characteristic (ROC) curves were reported. After calculating MIC scores, the final feature dimension was reduced from 55 to 15, and the area under the curve (AUC) of the SVM model improved from 0.872 before feature selection to 0.923. Model comparison showed that the SVM had better predictive performance than the FCNN. This study demonstrates that the SVM successfully predicted IOL outcomes and that MIC feature selection effectively improves the model's generalization ability, making the predictions more stable. The study provides a reliable method for predicting IOL outcomes, with potential clinical applications.
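
A hedged sketch of the select-then-classify pipeline described above. The study uses MIC, which lives in the separate minepy package; here scikit-learn's mutual_info_classif stands in as a related dependence measure, and all data below are dummy placeholders:

```python
# Feature selection by a mutual-information score, then a cross-validated SVM.
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(90, 52))    # 90 patients x 52 indicators (dummy)
y = rng.integers(0, 2, size=90)  # IOL outcome (dummy)

scores = mutual_info_classif(X, y, random_state=0)
top15 = np.argsort(scores)[-15:]  # keep the 15 highest-scoring indicators

clf = make_pipeline(StandardScaler(), SVC(probability=True))
auc = cross_val_score(clf, X[:, top15], y, cv=5, scoring="roc_auc")
print(f"mean AUC: {auc.mean():.3f}")  # ~0.5 on this random dummy data
```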

PMID:39462670 | DOI:10.7507/1001-5515.202403033

Categories: Literature Watch

Colon polyp detection based on multi-scale and multi-level feature fusion and lightweight convolutional neural network

Sun, 2024-10-27 06:00

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Oct 25;41(5):911-918. doi: 10.7507/1001-5515.202312014.

ABSTRACT

Early diagnosis and treatment of colorectal polyps are crucial for preventing colorectal cancer. This paper proposes a lightweight convolutional neural network for the automatic detection and auxiliary diagnosis of colorectal polyps. Initially, a 53-layer convolutional backbone network is used, incorporating a spatial pyramid pooling module to achieve feature extraction with different receptive field sizes. Subsequently, a feature pyramid network is employed to perform cross-scale fusion of feature maps from the backbone network. A spatial attention module is utilized to enhance the perception of polyp image boundaries and details. Further, a positional pattern attention module is used to automatically mine and integrate key features across different levels of feature maps, achieving rapid, efficient, and accurate automatic detection of colorectal polyps. The proposed model is evaluated on a clinical dataset, achieving an accuracy of 0.9982, recall of 0.9988, F1 score of 0.9984, and mean average precision (mAP) of 0.9953 at an intersection-over-union (IoU) threshold of 0.5, with a frame rate of 74 frames per second and a parameter count of 9.08 M. Compared to existing mainstream methods, the proposed method is lightweight, has modest hardware requirements, and offers high detection speed and accuracy, making it a feasible technique and an important tool for the early detection and diagnosis of colorectal cancer.
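
One plausible form of the spatial attention module mentioned above is the CBAM-style block sketched below; the paper's exact design may differ, and the feature-map sizes are illustrative:

```python
# CBAM-style spatial attention: pool along channels, learn a per-pixel gate.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)        # channel-average map
        mx, _ = x.max(dim=1, keepdim=True)       # channel-max map
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn  # emphasize polyp boundaries/details, suppress background

feats = torch.randn(1, 256, 52, 52)  # dummy backbone feature map
out = SpatialAttention()(feats)
```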

PMID:39462658 | DOI:10.7507/1001-5515.202312014

Categories: Literature Watch

Recurrence prediction of gastric cancer based on multi-resolution feature fusion and context information

Sun, 2024-10-27 06:00

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Oct 25;41(5):886-894. doi: 10.7507/1001-5515.202403014.

ABSTRACT

Pathological images of gastric cancer serve as the gold standard for diagnosing this malignancy. However, the recurrence prediction task often encounters challenges such as insignificant morphological features of the lesions, insufficient fusion of multi-resolution features, and inability to leverage contextual information effectively. To address these issues, a three-stage recurrence prediction method based on pathological images of gastric cancer is proposed. In the first stage, the self-supervised learning framework SimCLR was adopted to train low-resolution patch images, aiming to diminish the interdependence among diverse tissue images and yield decoupled enhanced features. In the second stage, the obtained low-resolution enhanced features were fused with the corresponding high-resolution unenhanced features to achieve feature complementation across multiple resolutions. In the third stage, to address the position encoding difficulty caused by the large difference in the number of patch images, we performed position encoding based on multi-scale local neighborhoods and employed self-attention mechanism to obtain features with contextual information. The resulting contextual features were further combined with the local features extracted by the convolutional neural network. The evaluation results on clinically collected data showed that, compared with the best performance of traditional methods, the proposed network provided the best accuracy and area under curve (AUC), which were improved by 7.63% and 4.51%, respectively. These results have effectively validated the usefulness of this method in predicting gastric cancer recurrence.
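
The first stage relies on SimCLR, whose core is the NT-Xent contrastive loss; below is a standard-recipe sketch of that loss (temperature and batch handling follow the common formulation, not necessarily the paper's):

```python
# NT-Xent (normalized temperature-scaled cross-entropy) loss, as in SimCLR.
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """z1, z2: (N, d) projections of two augmented views of the same patches."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2N, d)
    sim = z @ z.t() / tau                               # scaled cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float("-inf"))               # exclude self-similarity
    # positives: view i <-> view i+n
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

loss = nt_xent(torch.randn(8, 128), torch.randn(8, 128))  # dummy projections
```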

PMID:39462655 | DOI:10.7507/1001-5515.202403014

Categories: Literature Watch

Reinforcement learning-based method for type B aortic dissection localization

Sun, 2024-10-27 06:00

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Oct 25;41(5):878-885. doi: 10.7507/1001-5515.202309047.

ABSTRACT

In the segmentation of aortic dissection, there are issues such as low contrast between the dissection and surrounding organs and vessels, large variation in dissection morphology, and high background noise. To address these issues, this paper proposed a reinforcement learning-based method for type B aortic dissection localization. Within a two-stage segmentation model, deep reinforcement learning was utilized to perform the first-stage localization task, ensuring the integrity of the localization target. In the second stage, the coarse segmentation results from the first stage were used as input to obtain refined segmentation results. To improve the recall of the first-stage results and include the segmentation target more completely in the localization results, this paper designed a reinforcement learning reward function based on the direction of recall changes. Additionally, the localization window was separated from the field-of-view window to reduce the occurrence of segmentation-target loss. Unet, TransUnet, SwinUnet, and MT-Unet were selected as benchmark segmentation models. Experiments verified that most metrics of the proposed two-stage segmentation process surpassed the benchmark results; specifically, the Dice index improved by 1.34%, 0.89%, 27.66%, and 7.37% for the respective models. In conclusion, incorporating the proposed type B aortic dissection localization method into the segmentation process improves overall segmentation accuracy compared with the benchmark models, with particularly large gains for models with weaker segmentation performance.
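
A minimal sketch of a reward keyed to the direction of recall change, in the spirit of the reward function described above; the bonus and step-cost values are illustrative assumptions, not the paper's:

```python
# Reward shaped by whether the localization window covers more of the target.
def recall_reward(recall_prev: float, recall_curr: float, step_cost: float = 0.01) -> float:
    delta = recall_curr - recall_prev
    if delta > 0:
        return 1.0 - step_cost   # window moved toward covering the dissection
    if delta < 0:
        return -1.0 - step_cost  # window lost part of the target
    return -step_cost            # no change: discourage dithering

# e.g. the agent adjusts the window and recall rises from 0.82 to 0.90
print(recall_reward(0.82, 0.90))  # 0.99
```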

PMID:39462654 | DOI:10.7507/1001-5515.202309047

Categories: Literature Watch

Deep learning-based whole-brain B1+-mapping at 7T

Sun, 2024-10-27 06:00

Magn Reson Med. 2024 Oct 27. doi: 10.1002/mrm.30359. Online ahead of print.

ABSTRACT

PURPOSE: This study investigates the feasibility of using complex-valued neural networks (NNs) to estimate quantitative transmit magnetic RF field (B1+) maps from multi-slice localizer scans with different slice orientations in the human head at 7T, aiming to accelerate subject-specific B1+-calibration using parallel transmission (pTx).

METHODS: Datasets containing channel-wise B1+-maps and corresponding multi-slice localizers were acquired in axial, sagittal, and coronal orientation in 15 healthy subjects utilizing an eight-channel pTx transceiver head coil. Training included five-fold cross-validation for four network configurations: $\mathrm{NN}_{\mathrm{cx}}^{\mathrm{tra}}$ used transversal, $\mathrm{NN}_{\mathrm{cx}}^{\mathrm{sag}}$ sagittal, and $\mathrm{NN}_{\mathrm{cx}}^{\mathrm{cor}}$ coronal data, while $\mathrm{NN}_{\mathrm{cx}}^{\mathrm{all}}$ was trained on all slice orientations. The resulting maps were compared to B1+-reference scans using different quality metrics. The proposed network was applied in-vivo at 7T in two unseen test subjects using dynamic kT-point pulses.
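
Complex-valued layers are commonly built from paired real-valued operations; a sketch of a complex convolution of this kind follows (the layer sizes are illustrative, not the paper's architecture):

```python
# Complex convolution via two real convolutions:
# (a+ib)*(w_r+i*w_i) = (a*w_r - b*w_i) + i(a*w_i + b*w_r)
import torch
import torch.nn as nn

class ComplexConv2d(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        self.conv_r = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.conv_i = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: complex tensor (B, C, H, W), e.g. channel-wise coil images
        xr, xi = x.real, x.imag
        real = self.conv_r(xr) - self.conv_i(xi)
        imag = self.conv_r(xi) + self.conv_i(xr)
        return torch.complex(real, imag)

x = torch.randn(1, 8, 64, 64, dtype=torch.cfloat)  # 8 transmit channels (dummy)
y = ComplexConv2d(8, 16)(x)
```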

RESULTS: Predicted B1+-maps demonstrated a high similarity with measured B1+-maps across multiple orientations. The estimation matched the reference with a mean relative error in the magnitude of (2.70 ± 2.86)% and a mean absolute phase difference of (6.70 ± 1.99)° for transversal ($\mathrm{NN}_{\mathrm{cx}}^{\mathrm{tra}}$), (1.82 ± 0.69)% and (4.25 ± 1.62)° for sagittal ($\mathrm{NN}_{\mathrm{cx}}^{\mathrm{sag}}$), as well as (1.33 ± 0.27)% and (2.66 ± 0.60)° for coronal slices ($\mathrm{NN}_{\mathrm{cx}}^{\mathrm{cor}}$), considering brain tissue. $\mathrm{NN}_{\mathrm{cx}}^{\mathrm{all}}$, trained on all orientations, enabled a robust prediction of B1+-maps across different orientations. Achieving a homogeneous excitation over the whole brain in an in-vivo application demonstrated the approach's feasibility.

CONCLUSION: This study demonstrates the feasibility of utilizing complex-valued NNs to estimate multi-slice B1+-maps in different slice orientations from localizer scans in the human brain at 7T.

PMID:39462473 | DOI:10.1002/mrm.30359

Categories: Literature Watch

Identification of lineage-specific cis-trans regulatory networks related to kiwifruit ripening initiation

Sun, 2024-10-27 06:00

Plant J. 2024 Oct 27. doi: 10.1111/tpj.17093. Online ahead of print.

ABSTRACT

Previous research on the ripening process of many fruit crop varieties typically involved analyses of the conserved genetic factors among species. However, even for seemingly identical ripening processes, the associated gene expression networks often evolved independently, as reflected by the diversity in the interactions between transcription factors (TFs) and the targeted cis-regulatory elements (CREs). In this study, explainable deep learning (DL) frameworks were used to predict expression patterns on the basis of CREs in promoter sequences. We initially screened potential lineage-specific CRE-TF interactions influencing the kiwifruit ripening process, which is triggered by ethylene, similar to the corresponding processes in other climacteric fruit crops. Some novel regulatory relationships affecting ethylene-induced fruit ripening were identified. Specifically, ABI5-like bZIP, G2-like, and MYB81-like TFs were revealed as trans-factors modulating the expression of representative ethylene signaling/biosynthesis-related genes (e.g., ACS1, ERT2, and ERF143). Transient reporter assays and DNA affinity purification sequencing (DAP-Seq) analyses validated these CRE-TF interactions and their regulatory relationships. A comparative analysis with co-expression networking suggested that this DL-based screening can identify regulatory networks independently of co-expression patterns. Our results highlight the utility of an explainable DL approach for identifying novel CRE-TF interactions. These findings imply that fruit crop species may have evolved lineage-specific, fruit ripening-related cis-trans regulatory networks.
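
As a generic sketch of the "expression from promoter sequence" setup behind such explainable DL frameworks: a 1D CNN over one-hot promoter sequence, with input gradients as a simple attribution stand-in for highlighting candidate CREs (the architecture and sizes are assumptions, not the authors' model):

```python
# 1D CNN over one-hot promoter sequence with gradient-based saliency.
import torch
import torch.nn as nn

class PromoterCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(4, 64, kernel_size=15, padding=7),  # motif/CRE detectors
            nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),
            nn.Flatten(),
            nn.Linear(64, 1),  # predicted expression level
        )

    def forward(self, x):  # x: (B, 4, L) one-hot A/C/G/T
        return self.net(x)

model = PromoterCNN()
# random dummy promoter, one-hot encoded
x = torch.eye(4)[torch.randint(0, 4, (1, 1000))].permute(0, 2, 1).requires_grad_()
model(x).sum().backward()
saliency = x.grad.abs().sum(dim=1)  # per-position importance: candidate CREs
```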

PMID:39462454 | DOI:10.1111/tpj.17093

Categories: Literature Watch

WiTUnet: A U-shaped architecture integrating CNN and Transformer for improved feature alignment and local information fusion

Sun, 2024-10-27 06:00

Sci Rep. 2024 Oct 26;14(1):25525. doi: 10.1038/s41598-024-76886-w.

ABSTRACT

Low-dose computed tomography (LDCT) has emerged as the preferred technology for diagnostic medical imaging due to the potential health risks associated with X-ray radiation and conventional computed tomography (CT) techniques. While LDCT utilizes a lower radiation dose compared to standard CT, it results in increased image noise, which can impair the accuracy of diagnoses. To mitigate this issue, advanced deep learning-based LDCT denoising algorithms have been developed. These primarily utilize Convolutional Neural Networks (CNNs) or Transformer Networks and often employ the Unet architecture, which enhances image detail by integrating feature maps from the encoder and decoder via skip connections. However, existing methods focus excessively on the optimization of the encoder and decoder structures while overlooking potential enhancements to the Unet architecture itself. This oversight can be problematic due to significant differences in feature map characteristics between the encoder and decoder, where simple fusion strategies may hinder effective image reconstruction. In this paper, we introduce WiTUnet, a novel LDCT image denoising method that utilizes nested, dense skip pathways in place of traditional skip connections to improve feature integration. Additionally, to address the high computational demands of conventional Transformers on large images, WiTUnet incorporates a windowed Transformer structure that processes images in smaller, non-overlapping segments, significantly reducing computational load. Moreover, our approach includes a Local Image Perception Enhancement (LiPe) module within both the encoder and decoder to replace the standard multi-layer perceptron (MLP) in Transformers, thereby improving the capture and representation of local image features. Through extensive experimental comparisons, WiTUnet has demonstrated superior performance over existing methods in critical metrics such as Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), and Root Mean Square Error (RMSE), significantly enhancing noise removal and image quality. The code is available on GitHub at https://github.com/woldier/WiTUNet.
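
The windowed Transformer's cost saving comes from partitioning feature maps into non-overlapping windows before attention is applied; below is a standard partitioning sketch (as used in Swin-style models), with illustrative sizes:

```python
# Partition a feature map into non-overlapping window tokens for local attention.
import torch

def window_partition(x: torch.Tensor, ws: int) -> torch.Tensor:
    """x: (B, H, W, C) -> (B * H//ws * W//ws, ws*ws, C) window tokens."""
    b, h, w, c = x.shape
    x = x.view(b, h // ws, ws, w // ws, ws, c)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, ws * ws, c)

feats = torch.randn(1, 64, 64, 96)       # dummy feature map
tokens = window_partition(feats, ws=8)   # (64 windows, 64 tokens, 96 channels)
# Self-attention cost drops from O((H*W)^2) to O(num_windows * (ws^2)^2).
```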

PMID:39462127 | DOI:10.1038/s41598-024-76886-w

Categories: Literature Watch

Graph masked self-distillation learning for prediction of mutation impact on protein-protein interactions

Sun, 2024-10-27 06:00

Commun Biol. 2024 Oct 26;7(1):1400. doi: 10.1038/s42003-024-07066-9.

ABSTRACT

Assessing mutation impact on the binding affinity change (ΔΔG) of protein-protein interactions (PPIs) plays a crucial role in unraveling structural-functional intricacies of proteins and developing innovative protein designs. In this study, we present a deep learning framework, PIANO, for improved prediction of ΔΔG in PPIs. The PIANO framework leverages a graph masked self-distillation scheme for protein structural geometric representation pre-training, which effectively captures the structural context representations surrounding mutation sites, and makes predictions using a multi-branch network consisting of multiple encoders for amino acids, atoms, and protein sequences. Extensive experiments demonstrated its superior prediction performance and the capability of the pre-trained encoder in capturing meaningful representations. Compared to previous methods, PIANO can be widely applied to both holo complex structures and apo monomer structures. Moreover, we illustrated the practical applicability of PIANO in highlighting pathogenic mutations and crucial proteins, and in distinguishing de novo mutations in disease cases and controls in PPI systems. Overall, PIANO offers a powerful deep learning tool, which may provide valuable insights into the study of drug design, therapeutic intervention, and protein engineering.
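
A toy sketch of the masked self-distillation idea, with a plain MLP standing in for PIANO's geometric graph encoders: mask node features and train a student to match an exponential-moving-average (EMA) teacher on the masked nodes. Everything below is an illustrative assumption, not the authors' code:

```python
# Masked self-distillation on node features (MLP stand-in for a graph encoder).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64))
teacher = copy.deepcopy(encoder)
for p in teacher.parameters():
    p.requires_grad_(False)  # teacher is updated only by EMA, not by gradients

nodes = torch.randn(100, 32)   # residue/atom features around a mutation site (dummy)
mask = torch.rand(100) < 0.15  # mask ~15% of nodes
masked = nodes.clone()
masked[mask] = 0.0             # replace masked features with a zero "mask token"

student_out = encoder(masked)
with torch.no_grad():
    target = teacher(nodes)
loss = F.mse_loss(student_out[mask], target[mask])  # distill on masked nodes only
loss.backward()

with torch.no_grad():  # teacher tracks the student via EMA
    for ps, pt in zip(encoder.parameters(), teacher.parameters()):
        pt.mul_(0.99).add_(0.01 * ps)
```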

PMID:39462102 | DOI:10.1038/s42003-024-07066-9

Categories: Literature Watch

A hybrid container throughput forecasting approach using bi-directional hinterland data of port

Sun, 2024-10-27 06:00

Sci Rep. 2024 Oct 26;14(1):25502. doi: 10.1038/s41598-024-77376-9.

ABSTRACT

Accurate forecasting of port container throughput plays a crucial role in optimising port operations, resource allocation, supply chain management, etc. However, existing studies only focus on the impact of port hinterland economic development on container throughput, ignoring the impact of port foreland. This study proposed a container throughput forecasting model based on deep learning, which considers the impact of port hinterland and foreland on container throughput. Real-world experimental results showed that the proposed model with multiple data sources outperformed other forecasting methods, achieving significantly higher accuracy. The implications of this study are significant for port authorities, logistics companies, and policymakers.

PMID:39462082 | DOI:10.1038/s41598-024-77376-9

Categories: Literature Watch

End-to-end multiple object tracking in high-resolution optical sensors of drones with transformer models

Sun, 2024-10-27 06:00

Sci Rep. 2024 Oct 26;14(1):25543. doi: 10.1038/s41598-024-75934-9.

ABSTRACT

Drone aerial imaging has become increasingly important across numerous fields as drone optical sensor technology continues to advance. One critical challenge in this domain is achieving both accurate and efficient multi-object tracking. Traditional deep learning methods often separate object identification from tracking, leading to increased complexity and potential performance degradation. Conventional approaches rely heavily on manual feature engineering and intricate algorithms, which can further limit efficiency. To overcome these limitations, we propose a novel Transformer-based end-to-end multi-object tracking framework. This innovative method leverages self-attention mechanisms to capture complex inter-object relationships, seamlessly integrating object detection and tracking into a unified process. By utilizing end-to-end training, our approach simplifies the tracking pipeline, leading to significant performance improvements. A key innovation in our system is the introduction of a trajectory detection label matching technique. This technique assigns labels based on a comprehensive assessment of object appearance, spatial characteristics, and Gaussian features, ensuring more precise and logical label assignments. Additionally, we incorporate cross-frame self-attention mechanisms to extract long-term object properties, providing robust information for stable and consistent tracking. We further enhance tracking performance through a newly developed self-characteristics module, which extracts semantic features from trajectory information across both current and previous frames. This module ensures that the long-term interaction modules maintain semantic consistency, allowing for more accurate and continuous tracking over time. The refined data and stored trajectories are then used as input for subsequent frame processing, creating a feedback loop that sustains tracking accuracy. Extensive experiments conducted on the VisDrone and UAVDT datasets demonstrate the superior performance of our approach in drone-based multi-object tracking.
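
A minimal sketch of attention between current-frame object queries and stored trajectory features, the kind of cross-frame mechanism described above for carrying long-term object properties; the dimensions and query counts are assumptions:

```python
# Current-frame track queries attend to stored trajectory embeddings.
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=256, num_heads=8, batch_first=True)

curr = torch.randn(1, 20, 256)  # 20 object queries in the current frame (dummy)
prev = torch.randn(1, 20, 256)  # the same tracks' embeddings from past frames

# Attending to trajectory history yields temporally consistent track features.
updated, weights = attn(query=curr, key=prev, value=prev)
```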

PMID:39461992 | DOI:10.1038/s41598-024-75934-9

Categories: Literature Watch

Detecting command injection attacks in web applications based on novel deep learning methods

Sun, 2024-10-27 06:00

Sci Rep. 2024 Oct 26;14(1):25487. doi: 10.1038/s41598-024-74350-3.

ABSTRACT

Web command injection attacks pose significant security threats to web applications, leading to potential server information leakage or severe server disruption. Traditional detection methods struggle with the increasing complexity and obfuscation of these attacks, resulting in poor identification of malicious code, complicated feature extraction processes, and low detection efficiency. To address these challenges, a novel detection model, the Convolutional Channel-BiLSTM Attention (CCBA) model, is proposed, leveraging deep learning techniques to enhance the identification of web command injection attacks. The model utilizes dual CNN convolutional channels for comprehensive feature extraction and employs a BiLSTM network for bidirectional recognition of temporal features. An attention mechanism is also incorporated to assign weights to critical features, improving the model's detection performance. Experimental results demonstrate that the CCBA model achieves 99.3% accuracy and 98.2% recall on a real-world dataset. To validate the robustness and generalization of the model, tests were conducted on two widely recognized public cybersecurity datasets, consistently achieving over 98% accuracy. Compared to existing methods, the proposed model offers a more effective solution for identifying web command injection attacks.
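
A rough sketch of the CCBA pattern as described: two parallel convolutional channels over the tokenized request, a BiLSTM, then additive attention. Kernel sizes, dimensions, and the tokenization are assumptions for illustration:

```python
# Dual-channel CNN -> BiLSTM -> attention classifier (CCBA-style sketch).
import torch
import torch.nn as nn

class CCBA(nn.Module):
    def __init__(self, vocab: int = 5000, emb: int = 64, hid: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.conv3 = nn.Conv1d(emb, hid, 3, padding=1)  # short n-gram channel
        self.conv5 = nn.Conv1d(emb, hid, 5, padding=2)  # longer n-gram channel
        self.lstm = nn.LSTM(2 * hid, hid, bidirectional=True, batch_first=True)
        self.attn = nn.Linear(2 * hid, 1)
        self.out = nn.Linear(2 * hid, 2)  # benign vs command injection

    def forward(self, tokens):  # tokens: (B, L) ids of request characters/words
        e = self.embed(tokens).transpose(1, 2)                    # (B, emb, L)
        feats = torch.cat([self.conv3(e), self.conv5(e)], dim=1)  # (B, 2*hid, L)
        h, _ = self.lstm(torch.relu(feats.transpose(1, 2)))       # (B, L, 2*hid)
        w = torch.softmax(self.attn(h), dim=1)                    # weight key tokens
        return self.out((w * h).sum(dim=1))

logits = CCBA()(torch.randint(0, 5000, (4, 120)))  # four dummy requests
```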

PMID:39461962 | DOI:10.1038/s41598-024-74350-3

Categories: Literature Watch

Advancing EEG prediction with deep learning and uncertainty estimation

Sun, 2024-10-27 06:00

Brain Inform. 2024 Oct 26;11(1):27. doi: 10.1186/s40708-024-00239-6.

ABSTRACT

Deep Learning (DL) has the potential to enhance patient outcomes in healthcare by implementing proficient systems for disease detection and diagnosis. However, the complexity and lack of interpretability impede their widespread adoption in critical high-stakes predictions in healthcare. Incorporating uncertainty estimations in DL systems can increase trustworthiness, providing valuable insights into the model's confidence and improving the explanation of predictions. Additionally, introducing explainability measures, recognized and embraced by healthcare experts, can help address this challenge. In this study, we investigate DL models' ability to predict sex directly from electroencephalography (EEG) data. While sex prediction has limited direct clinical application, its binary nature makes it a valuable benchmark for optimizing deep learning techniques in EEG data analysis. Furthermore, we explore DL ensembles both to improve performance over single models and to increase interpretability and performance through uncertainty estimation. Lastly, we use a data-driven approach to evaluate the relationship between frequency bands and sex prediction, offering insights into their relative importance. InceptionNetwork, a single DL model, achieved 90.7% accuracy and an AUC of 0.947, and the best-performing ensemble, combining variations of InceptionNetwork and EEGNet, achieved 91.1% accuracy in predicting sex from EEG data using five-fold cross-validation. Uncertainty estimation through deep ensembles led to increased prediction performance, and the models were able to classify sex in all frequency bands, indicating sex-specific features across all bands.
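
A sketch of deep-ensemble prediction with predictive entropy as the uncertainty signal, as described above; the member networks below are trivial stand-ins for InceptionNetwork/EEGNet variants trained with different seeds:

```python
# Deep ensemble: average member probabilities, use entropy as uncertainty.
import torch
import torch.nn as nn

models = [nn.Sequential(nn.Flatten(), nn.Linear(64 * 128, 2)) for _ in range(5)]

def ensemble_predict(models, x):
    probs = torch.stack([torch.softmax(m(x), dim=-1) for m in models]).mean(0)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)  # uncertainty
    return probs.argmax(-1), entropy

x = torch.randn(8, 64, 128)  # 8 EEG windows: 64 channels x 128 samples (dummy)
pred, unc = ensemble_predict(models, x)  # high-entropy windows can be flagged
```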

PMID:39461914 | DOI:10.1186/s40708-024-00239-6

Categories: Literature Watch

Patient-specific deep learning tracking framework for real-time 2D target localization in MRI-guided radiotherapy

Sat, 2024-10-26 06:00

Int J Radiat Oncol Biol Phys. 2024 Oct 24:S0360-3016(24)03508-9. doi: 10.1016/j.ijrobp.2024.10.021. Online ahead of print.

ABSTRACT

PURPOSE: We propose a tumor tracking framework for 2D cine MRI based on a pair of deep learning (DL) models relying on patient-specific (PS) training.

METHODS AND MATERIALS: The chosen DL models are: 1) an image registration transformer and 2) an auto-segmentation convolutional neural network (CNN). We collected over 1,400,000 cine MRI frames from 219 patients treated on a 0.35 T MRI-linac, plus 7,500 frames from an additional 35 patients, which were manually labelled and subdivided into fine-tuning, validation, and testing sets. The transformer was first trained on the unlabeled data (without segmentations). We then continued training (with segmentations) either on the fine-tuning set or, for PS models, on eight randomly selected frames from the first 5 s of each patient's cine MRI. The PS auto-segmentation CNN was trained from scratch with the same eight frames for each patient, without pre-training. Furthermore, we implemented B-spline image registration as a conventional model, as well as different baselines. Output segmentations of all models were compared on the testing set using the Dice similarity coefficient (DSC), the 50% and 95% Hausdorff distance (HD50%/HD95%), and the root-mean-square error of the target centroid in the superior-inferior direction (RMSE_SI).
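
The evaluation metrics named above can be sketched directly; below, a toy Dice and superior-inferior centroid-error computation on dummy masks (RMSE_SI would aggregate the per-frame SI error over a full cine sequence):

```python
# Dice similarity coefficient and SI centroid error between two binary masks.
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    inter = np.logical_and(a, b).sum()
    return 2 * inter / (a.sum() + b.sum())

def centroid_si(mask: np.ndarray) -> float:
    return np.nonzero(mask)[0].mean()  # image rows ~ superior-inferior axis

pred = np.zeros((128, 128), bool); pred[40:60, 50:70] = True  # dummy prediction
ref = np.zeros((128, 128), bool); ref[42:62, 50:70] = True    # dummy reference
print(dice(pred, ref), abs(centroid_si(pred) - centroid_si(ref)))  # 0.9, 2.0 px
```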

RESULTS: The PS transformer and CNN significantly outperformed all other models, achieving a median (inter-quartile range) DSC of 0.92 (0.03)/0.90 (0.04), HD50% of 1.0 (0.1)/1.0 (0.4) mm, HD95% of 3.1 (1.9)/3.8 (2.0) mm and RMSE_SI of 0.7 (0.4)/0.9 (1.0) mm on the testing set. Their inference time was about 36/8 ms per frame and PS fine-tuning required 3 min for labelling and 8/4 min for training. The transformer was better than the CNN in 9/12 patients, the CNN better in 1/12 patients and the two PS models achieved the same performance on the remaining 2/12 testing patients.

CONCLUSION: For targets in the thorax, abdomen and pelvis, we found two PS DL models to provide accurate real-time target localization during MRI-guided radiotherapy.

PMID:39461599 | DOI:10.1016/j.ijrobp.2024.10.021

Categories: Literature Watch

Detection of carotid plaques on panoramic radiographs using deep learning

Sat, 2024-10-26 06:00

J Dent. 2024 Oct 24:105432. doi: 10.1016/j.jdent.2024.105432. Online ahead of print.

ABSTRACT

OBJECTIVES: Panoramic radiographs (PRs) can reveal an incidental finding of atherosclerosis, or carotid artery calcification (CAC), in 3-15% of examined patients. However, limited training in identification of such calcifications among dental professionals results in missed diagnoses. This study aimed to detect CAC on PRs using an artificial intelligence (AI) model based on a vision transformer.

METHODS: 6,404 PRs were obtained from one hospital and screened for the presence of CAC based on electronic medical records. CAC was manually annotated with bounding boxes by an oral radiologist, then reviewed and revised by three experienced clinicians to achieve consensus. An AI approach based on Faster R-CNN and Swin Transformer was trained and evaluated on 185 PRs with CAC and 185 PRs without CAC. Reported and replicated diagnostic performances of published AI approaches based on convolutional neural networks (CNNs) were used for comparison. Quantitative evaluation of model performance included precision, recall, F1-score, area under the curve (AUC), and average precision (AP).
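
As a runnable stand-in for the detector pairing described above, the sketch below builds torchvision's stock Faster R-CNN and retargets its head to two classes (background, calcification); swapping the ResNet-FPN backbone for a Swin-FPN uses the same FasterRCNN API. This is not the authors' code:

```python
# Two-class Faster R-CNN detector sketch using torchvision's detection API.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

model = fasterrcnn_resnet50_fpn(weights=None)
in_feats = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_feats, num_classes=2)

model.eval()
with torch.no_grad():
    out = model([torch.rand(3, 512, 1024)])  # one panoramic-radiograph-like input
print(out[0]["boxes"].shape, out[0]["scores"].shape)  # predicted boxes + scores
```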

RESULTS: The proposed method based on Faster R-CNN and Swin Transformer achieved a precision of 0.895, recall of 0.881, F1-score of 0.888, AUC of 0.950, and AP of 0.942, surpassing models based on a CNN.

CONCLUSIONS: The detection performance of this newly developed and validated model was improved compared to previously reported models.

CLINICAL SIGNIFICANCE: Integrating AI models into dental imaging to assist dental professionals in the detection of CAC on PRs has the potential to significantly enhance the early detection of carotid artery atherosclerosis and its clinical management.

PMID:39461583 | DOI:10.1016/j.jdent.2024.105432

Categories: Literature Watch
