Deep learning
HiRENet: Novel convolutional neural network architecture using Hilbert-transformed and raw electroencephalogram (EEG) for subject-independent emotion classification
Comput Biol Med. 2024 Jun 27;178:108788. doi: 10.1016/j.compbiomed.2024.108788. Online ahead of print.
ABSTRACT
BACKGROUND AND OBJECTIVES: Convolutional neural networks (CNNs) are the most widely used deep-learning framework for decoding electroencephalograms (EEGs) due to their exceptional ability to extract hierarchical features from high-dimensional EEG data. Traditionally, CNNs have primarily utilized multi-channel raw EEG data as the input tensor; however, the performance of CNN-based EEG decoding may be enhanced by incorporating phase information alongside amplitude information.
METHODS: This study introduces a novel CNN architecture called the Hilbert-transformed (HT) and raw EEG network (HiRENet), which incorporates both raw and HT EEG as inputs. This concurrent use of HT and raw EEG aims to integrate phase information with existing amplitude information, potentially offering a more comprehensive reflection of functional connectivity across various brain regions. The HiRENet model was developed using two CNN frameworks: ShallowFBCSPNet and a CNN with a residual block (ResCNN). The performance of the HiRENet model was assessed using a lab-made EEG database to classify human emotions, comparing three input modalities: raw EEG, HT EEG, and a combination of both signals. Additionally, the computational complexity was evaluated to validate the computational efficiency of the ResCNN design.
RESULTS: The HiRENet model based on ResCNN achieved the highest classification accuracy, with 86.03% for valence and 84.01% for arousal classifications, surpassing traditional CNN methodologies. Considering computational efficiency, ResCNN demonstrated superiority over ShallowFBCSPNet in terms of speed and inference time, despite having a higher parameter count.
CONCLUSION: Our experimental results showed that the proposed HiRENet can be potentially used as a new option to improve the overall performance for deep learning-based EEG decoding problems.
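As a rough sketch of the dual-input construction described above (not the authors' code; the exact HiRENet tensor layout is an assumption here), the Hilbert-transformed channel can be obtained as the imaginary part of the analytic signal and stacked with the raw EEG:

```python
import numpy as np
from scipy.signal import hilbert

def build_hirenet_input(raw_eeg):
    """Stack raw EEG with its Hilbert transform along a new leading axis.

    raw_eeg: array of shape (channels, samples).
    Returns an array of shape (2, channels, samples): raw + HT signal.
    Illustrative sketch only; the paper's exact input format may differ.
    """
    analytic = hilbert(raw_eeg, axis=-1)   # analytic signal per channel
    ht_eeg = np.imag(analytic)             # Hilbert transform = imaginary part
    return np.stack([raw_eeg, ht_eeg], axis=0)

x = np.random.randn(32, 1000)              # e.g. 32 channels, 1000 samples
inp = build_hirenet_input(x)
print(inp.shape)                           # (2, 32, 1000)
```

The amplitude information lives in the raw signal while the Hilbert-transformed copy carries the 90°-shifted (phase) counterpart, which is the pairing the abstract attributes the performance gain to.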
PMID:38941902 | DOI:10.1016/j.compbiomed.2024.108788
Channel prior convolutional attention for medical image segmentation
Comput Biol Med. 2024 Jun 27;178:108784. doi: 10.1016/j.compbiomed.2024.108784. Online ahead of print.
ABSTRACT
Medical images often exhibit low contrast and significant variation in organ shape, and the generally limited adaptive capability of existing attention mechanisms constrains segmentation performance in medical imaging. This paper proposes an efficient Channel Prior Convolutional Attention (CPCA) method that supports the dynamic distribution of attention weights across both the channel and spatial dimensions. A multi-scale depth-wise convolutional module extracts spatial relationships effectively while preserving the channel prior, enabling CPCA to focus on informative channels and important regions. Based on CPCA, we propose a segmentation network, CPCANet, for medical image segmentation and validate it on two publicly available datasets. In comparisons with state-of-the-art algorithms, CPCANet achieves improved segmentation performance while requiring fewer computational resources. Our code is publicly available at https://github.com/Cuthbert-Huang/CPCANet.
PMID:38941900 | DOI:10.1016/j.compbiomed.2024.108784
Differentiation of glioblastoma from solitary brain metastasis using deep ensembles: Empirical estimation of uncertainty for clinical reliability
Comput Methods Programs Biomed. 2024 Jun 21;254:108288. doi: 10.1016/j.cmpb.2024.108288. Online ahead of print.
ABSTRACT
BACKGROUND AND OBJECTIVES: To develop a clinically reliable deep learning model to differentiate glioblastoma (GBM) from solitary brain metastasis (SBM) by providing predictive uncertainty estimates and interpretability.
METHODS: A total of 469 patients (300 GBM, 169 SBM) were enrolled in the institutional training set. Deep ensembles based on DenseNet121 were trained on multiparametric MRI. The model performance was validated in the external test set consisting of 143 patients (101 GBM, 42 SBM). Entropy values for each input were evaluated for uncertainty measurement; based on entropy values, the datasets were split to high- and low-uncertainty groups. In addition, entropy values of out-of-distribution (OOD) data from unknown class (257 patients with meningioma) were compared to assess uncertainty estimates of the model. The model interpretability was further evaluated by localization accuracy of the model.
RESULTS: On external test set, the area under the curve (AUC), accuracy, sensitivity and specificity of the deep ensembles were 0.83 (95 % confidence interval [CI] 0.76-0.90), 76.2 %, 54.8 % and 85.2 %, respectively. The performance was higher in the low-uncertainty group than in the high-uncertainty group, with AUCs of 0.91 (95 % CI 0.83-0.98) and 0.58 (95 % CI 0.44-0.71), indicating that assessment of uncertainty with entropy values ascertained reliable prediction in the low-uncertainty group. Further, deep ensembles classified a high proportion (90.7 %) of predictions on OOD data to be uncertain, showing robustness in dataset shift. Interpretability evaluated by localization accuracy provided further reliability in the "low-uncertainty and high-localization accuracy" subgroup, with an AUC of 0.98 (95 % CI 0.95-1.00).
CONCLUSIONS: Empirical assessment of uncertainty and interpretability in deep ensembles provides evidence for the robustness of prediction, offering a clinically reliable model in differentiating GBM from SBM.
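The entropy-based uncertainty measure used above can be sketched as follows (a minimal two-class illustration; the example probabilities, and any threshold for splitting high- vs. low-uncertainty groups, are assumptions rather than the study's data):

```python
import numpy as np

def ensemble_entropy(member_probs):
    """Predictive entropy of a deep ensemble for one input.

    member_probs: (n_members, n_classes) softmax outputs.
    Average the members' predictions, then compute Shannon entropy (nats).
    Higher entropy = more uncertain prediction.
    """
    p = member_probs.mean(axis=0)
    p = np.clip(p, 1e-12, 1.0)             # guard against log(0)
    return float(-(p * np.log(p)).sum())

# Members agree confidently -> low entropy; members disagree -> high entropy.
confident = np.array([[0.99, 0.01], [0.98, 0.02], [0.97, 0.03]])
conflicted = np.array([[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]])
print(ensemble_entropy(confident) < ensemble_entropy(conflicted))  # True
```

Out-of-distribution inputs (such as the meningioma cases above) tend to produce disagreeing members and therefore high entropy, which is what lets the model flag them as uncertain.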
PMID:38941861 | DOI:10.1016/j.cmpb.2024.108288
Automatic detection of pulmonary embolism on computed tomography pulmonary angiogram scan using a three-dimensional convolutional neural network
Eur J Radiol. 2024 Jun 21;177:111586. doi: 10.1016/j.ejrad.2024.111586. Online ahead of print.
ABSTRACT
OBJECTIVE: To propose a convolutional neural network (EmbNet) for automatic pulmonary embolism detection on computed tomography pulmonary angiogram (CTPA) scans and to assess its diagnostic performance.
METHODS: 305 consecutive CTPA scans between January 2019 and December 2021 were enrolled in this study (142 for training, 163 for internal validation), and 250 CTPA scans from a public dataset were used for external validation. The framework comprised a preprocessing step to segment the pulmonary vessels and the EmbNet to detect emboli. Emboli were divided into three location-based subgroups for detailed evaluation: central arteries, lobar branches, and peripheral regions. Ground truth was established by three radiologists.
RESULTS: The EmbNet's per-scan level sensitivity, specificity, positive predictive value (PPV), and negative predictive value were 90.9%, 75.4%, 48.4%, and 97.0% (internal validation) and 88.0%, 70.5%, 42.7%, and 95.9% (external validation). At the per-embolus level, the overall sensitivity and PPV of the EmbNet were 86.0% and 61.3% (internal validation), and 83.5% and 57.5% (external validation). The sensitivity and PPV of central emboli were 89.7% and 52.0% (internal validation), and 94.4% and 43.0% (external validation); of lobar emboli were 95.2% and 76.9% (internal validation), and 93.5% and 72.5% (external validation); and of peripheral emboli were 82.6% and 61.7% (internal validation), and 80.2% and 59.4% (external validation). The average false positive rate was 0.45 false emboli per scan (internal validation) and 0.69 false emboli per scan (external validation).
CONCLUSION: The EmbNet provides high sensitivity across embolus locations, suggesting its potential utility for initial screening in clinical practice.
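The per-scan metrics reported above follow the standard confusion-matrix definitions; a minimal sketch with illustrative counts (not the study's data):

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV, and NPV from confusion counts.

    tp/fp/tn/fn: true/false positives and negatives at the per-scan level.
    """
    sensitivity = tp / (tp + fn)   # recall on positive scans
    specificity = tn / (tn + fp)   # recall on negative scans
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    return sensitivity, specificity, ppv, npv

sens, spec, ppv, npv = diagnostic_metrics(tp=90, fp=30, tn=60, fn=10)
print(round(sens, 2), round(spec, 2), round(ppv, 2), round(npv, 2))
# 0.9 0.67 0.75 0.86
```

The pattern in the abstract (high sensitivity and NPV, lower PPV) is typical of a screening tool tuned to miss as few emboli as possible at the cost of more false positives.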
PMID:38941822 | DOI:10.1016/j.ejrad.2024.111586
Advancing deep learning-based acoustic leak detection methods towards application for water distribution systems from a data-centric perspective
Water Res. 2024 Jun 24;261:121999. doi: 10.1016/j.watres.2024.121999. Online ahead of print.
ABSTRACT
Against the backdrop of severe leakage problems in water distribution systems (WDSs), numerous researchers have focused on developing deep learning-based acoustic leak detection technologies. However, these studies often prioritize model development while neglecting the importance of data. This research explores the impact of data augmentation techniques on deep learning-based acoustic leak detection. Five random transformation-based methods are proposed: jittering, scaling, warping, iterated amplitude adjusted Fourier transform (IAAFT), and masking. Jittering, scaling, warping, and IAAFT directly process the original signals, while masking operates on time-frequency spectrograms. Acoustic signals from a real-world WDS are augmented, and the efficacy is validated using convolutional neural network classifiers that identify the spectrograms of acoustic signals. Results indicate the importance of implementing data augmentation after data splitting to prevent data leakage and overly optimistic outcomes. Among the techniques, IAAFT stands out, significantly increasing data volume and diversity and improving recognition accuracy by over 7%. Masking enhances performance mainly by compelling the classifier to learn global features of the spectrograms. Sequential application of IAAFT and masking further strengthens leak detection performance. Furthermore, when a complex model is applied to acoustic leak detection through transfer learning, data augmentation can also enhance the effectiveness of the transfer. These findings advance artificial intelligence-driven acoustic leak detection technology from a data-centric perspective towards more mature applications.
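Three of the random-transformation augmentations named above (jittering, scaling, and spectrogram masking) can be sketched as follows; the parameter values and array shapes are illustrative assumptions, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)

def jitter(signal, sigma=0.01):
    """Jittering: add small Gaussian noise to the raw acoustic signal."""
    return signal + rng.normal(0.0, sigma, size=signal.shape)

def scale(signal, sigma=0.1):
    """Scaling: multiply the whole signal by one random factor near 1."""
    return signal * rng.normal(1.0, sigma)

def time_mask(spectrogram, max_width=8):
    """Masking: zero out a random block of time frames on the spectrogram."""
    t = spectrogram.shape[1]
    width = int(rng.integers(1, max_width + 1))
    start = int(rng.integers(0, t - width + 1))
    out = spectrogram.copy()
    out[:, start:start + width] = 0.0
    return out

x = rng.standard_normal(16000)           # 1 s of audio at 16 kHz (illustrative)
spec = rng.standard_normal((64, 128))    # mel bins x time frames (illustrative)
print(jitter(x).shape, scale(x).shape, time_mask(spec).shape)
```

As the abstract notes, jittering and scaling act on the waveform before the spectrogram is computed, whereas masking acts on the spectrogram itself, forcing the classifier to rely on global rather than local features.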
PMID:38941677 | DOI:10.1016/j.watres.2024.121999
The human hypothalamus coordinates switching between different survival actions
PLoS Biol. 2024 Jun 28;22(6):e3002624. doi: 10.1371/journal.pbio.3002624. eCollection 2024 Jun.
ABSTRACT
Comparative research suggests that the hypothalamus is critical in switching between survival behaviors, yet it is unclear whether this is the case in humans. Here, we investigate the role of the human hypothalamus in survival switching by introducing a paradigm in which volunteers switch between hunting and escape in response to encounters with a virtual predator or prey. Given the small size and low tissue contrast of the hypothalamus, we used deep learning-based segmentation to identify the individual-specific hypothalamus and its subnuclei, as well as an imaging sequence optimized for hypothalamic signal acquisition. Across 2 experiments, we employed computational models with identical structures to explain the internal movement generation processes associated with hunting and escaping. Despite the shared structure, the models exhibited significantly different parameter values, such that escaping or hunting could be accurately decoded from the parameters of the internal movement generation processes alone. In experiment 2, multi-voxel pattern analyses (MVPA) showed that the hypothalamus, hippocampus, and periaqueductal gray encode switching of survival behaviors while not encoding simple motor switching outside of the survival context. Furthermore, multi-voxel connectivity analyses revealed a network including the hypothalamus that encodes survival switching, and showed how the hypothalamus is connected to other regions in this network. Finally, model-based fMRI analyses showed that a strong hypothalamic multi-voxel pattern of switching is predictive of optimal behavioral coordination after switching, especially when this signal is synchronized with the multi-voxel pattern of switching in the amygdala. Our study is the first to identify the role of the human hypothalamus in switching between survival behaviors and in action organization after switching.
PMID:38941452 | DOI:10.1371/journal.pbio.3002624
The analysis of the internet of things database query and optimization using deep learning network model
PLoS One. 2024 Jun 28;19(6):e0306291. doi: 10.1371/journal.pone.0306291. eCollection 2024.
ABSTRACT
This study explores the application of deep learning (DL) network models to Internet of Things (IoT) database query and optimization. It first analyzes the architecture of IoT database queries, then examines the DL network model, and finally optimizes the DL network model through optimization strategies. The advantages of the optimized model are verified through experiments. Experimental results show that the optimized model is more efficient than the other models in the model training and parameter optimization stages. In particular, when the data volume is 2000, the model training time and parameter optimization time of the optimized model are markedly lower than those of the traditional model. In terms of resource consumption, the Central Processing Unit and Graphics Processing Unit usage and memory usage of all models increase as the data volume rises; however, the optimized model performs better on energy consumption. In throughput analysis, the optimized model maintains high transaction counts and data volumes per second when handling large data requests, especially at a data volume of 4000, where its peak-time processing capacity exceeds that of the other models. Regarding latency, although the latency of all models increases with data volume, the optimized model performs better in database query response time and data processing latency. These results not only demonstrate the optimized model's superior performance in processing and optimizing IoT database queries but also provide a valuable reference for IoT data processing and DL model optimization. The findings help promote the application of DL technology in the IoT field, especially in scenarios that involve large-scale data and require efficient processing, and offer a useful reference for research and practice in related fields.
PMID:38941309 | DOI:10.1371/journal.pone.0306291
Interactive Isosurface Visualization in Memory Constrained Environments Using Deep Learning and Speculative Raycasting
IEEE Trans Vis Comput Graph. 2024 Jun 28;PP. doi: 10.1109/TVCG.2024.3420225. Online ahead of print.
ABSTRACT
New web technologies have enabled the deployment of powerful GPU-based computational pipelines that run entirely in the web browser, opening a new frontier for accessible scientific visualization applications. However, these new capabilities do not address the memory constraints of lightweight end-user devices encountered when attempting to visualize the massive data sets produced by today's simulations and data acquisition systems. We propose a novel implicit isosurface rendering algorithm for interactive visualization of massive volumes within a small memory footprint. We achieve this by progressively traversing a wavefront of rays through the volume and decompressing blocks of the data on demand to perform implicit ray-isosurface intersections, displaying intermediate results after each pass. We improve the quality of these intermediate results using a pretrained deep neural network that reconstructs the output of early passes, allowing for interactivity with better approximations of the final image. To accelerate rendering and increase GPU utilization, we introduce speculative ray-block intersection into our algorithm, where additional blocks are traversed and intersected speculatively along rays to exploit additional parallelism in the workload. Our algorithm is able to trade off image quality to greatly decrease rendering time for interactive rendering even on lightweight devices. Our entire pipeline is run in parallel on the GPU to leverage the parallel computing power that is available even on lightweight end-user devices. We compare our algorithm to the state of the art in low-overhead isosurface extraction and demonstrate that it achieves 1.7×-5.7× reductions in memory overhead and up to 8.4× reductions in data decompressed.
PMID:38941206 | DOI:10.1109/TVCG.2024.3420225
Single-Subject Deep-Learning Image Reconstruction with a Neural Optimization Transfer Algorithm for PET-enabled Dual-Energy CT Imaging
IEEE Trans Image Process. 2024 Jun 28;PP. doi: 10.1109/TIP.2024.3418347. Online ahead of print.
ABSTRACT
Combining dual-energy computed tomography (DECT) with positron emission tomography (PET) offers many potential clinical applications but typically requires expensive hardware upgrades or increases radiation doses on PET/CT scanners due to an extra X-ray CT scan. The recent PET-enabled DECT method allows DECT imaging on PET/CT without requiring a second X-ray CT scan. It combines the already existing X-ray CT image with a 511 keV γ-ray CT (gCT) image reconstructed from time-of-flight PET emission data. A kernelized framework has been developed for reconstructing the gCT image, but this method has not fully exploited the potential of prior knowledge. Deep neural networks could harness the power of deep learning in this application; however, common approaches require a large database for training, which is impractical for a new imaging method like PET-enabled DECT. Here, we propose a single-subject method that uses a neural-network representation as a deep coefficient prior to improve gCT image reconstruction without population-based pre-training. The resulting optimization problem becomes the tomographic estimation of nonlinear neural-network parameters from gCT projection data. This complicated problem can be solved efficiently by utilizing the optimization transfer strategy with quadratic surrogates. Each iteration of the proposed neural optimization transfer algorithm includes a PET activity image update, a gCT image update, and least-squares neural-network learning in the gCT image domain. The algorithm is guaranteed to monotonically increase the data likelihood. Results from computer simulations, real phantom data, and real patient data demonstrate that the proposed method can significantly improve gCT image quality and the consequent multi-material decomposition compared to other methods.
PMID:38941203 | DOI:10.1109/TIP.2024.3418347
DCDiff: Dual-Granularity Cooperative Diffusion Models for Pathology Image Analysis
IEEE Trans Med Imaging. 2024 Jun 28;PP. doi: 10.1109/TMI.2024.3420804. Online ahead of print.
ABSTRACT
Whole Slide Images (WSIs) are paramount in the medical field, with extensive applications in disease diagnosis and treatment. Recently, many deep-learning methods have been used to classify WSIs. However, these methods are inadequate for accurately analyzing WSIs as they treat regions in WSIs as isolated entities and ignore contextual information. To address this challenge, we propose a novel Dual-Granularity Cooperative Diffusion Model (DCDiff) for the precise classification of WSIs. Specifically, we first design a cooperative forward and reverse diffusion strategy, utilizing fine-granularity and coarse-granularity to regulate each diffusion step and gradually improve context awareness. To exchange information between granularities, we propose a coupled U-Net for dual-granularity denoising, which efficiently integrates dual-granularity consistency information using the designed Fine- and Coarse-granularity Cooperative Aware (FCCA) model. Ultimately, the cooperative diffusion features extracted by DCDiff can achieve cross-sample perception from the reconstructed distribution of training samples. Experiments on three public WSI datasets show that the proposed method can achieve superior performance over state-of-the-art methods. The code is available at https://github.com/hemo0826/DCDiff.
PMID:38941198 | DOI:10.1109/TMI.2024.3420804
Automatic Sleep Stage Classification Using Nasal Pressure Decoding Based on a Multi-Kernel Convolutional BiLSTM Network
IEEE Trans Neural Syst Rehabil Eng. 2024 Jun 28;PP. doi: 10.1109/TNSRE.2024.3420715. Online ahead of print.
ABSTRACT
Sleep quality is an essential parameter of a healthy human life, yet sleep disorders such as sleep apnea are widespread. In the investigation of sleep and its disorders, the gold standard is polysomnography, which utilizes an extensive range of variables for sleep stage classification. However, full polysomnography requires many sensors, making the setup cumbersome and sleep uncomfortable, and thus imposes a significant burden. In this study, sleep stage classification was performed using the single channel of nasal pressure, dramatically decreasing the complexity of the process; in turn, such simplification could increase the much-needed clinical applicability. Specifically, we propose a deep learning structure consisting of multi-kernel convolutional neural networks and bidirectional long short-term memory for sleep stage classification. Sleep stages of 25 healthy subjects were classified into 3 classes (wake, rapid eye movement (REM), and non-REM) and 4 classes (wake, REM, light, and deep sleep) based on nasal pressure. Following leave-one-subject-out cross-validation, in the 3-class setting the overall accuracy was 0.704, the F1-score was 0.490, and the kappa value was 0.283; in the 4-class setting, the overall accuracy was 0.604, the F1-score was 0.349, and the kappa value was 0.217. These results were higher than those of the four comparative models, including the class-wise F1-score. This demonstrates the feasibility of a sleep stage classification model using only easily applicable and highly practical nasal pressure recordings, which could also support interventions to help treat sleep-related diseases.
PMID:38941194 | DOI:10.1109/TNSRE.2024.3420715
Improvements of (177)Lu SPECT images from sparsely acquired projections by reconstruction with deep-learning-generated synthetic projections
EJNMMI Phys. 2024 Jun 28;11(1):53. doi: 10.1186/s40658-024-00655-x.
ABSTRACT
BACKGROUND: For dosimetry, the demand for whole-body SPECT/CT imaging, which requires long acquisition durations with dual-head Anger cameras, is increasing. Here we evaluated sparsely acquired projections and assessed whether the addition of deep-learning-generated synthetic intermediate projections (SIPs) could improve image quality while preserving dosimetric accuracy.
METHODS: This study included 16 patients treated with 177Lu-DOTATATE with SPECT/CT imaging (120 projections, 120P) at four time points. Deep neural networks (CUSIPs) were designed and trained to compile 90 SIPs from 30 acquired projections (30P). The 120P, 30P, and three different CUSIP sets (30P + 90 SIPs) were reconstructed using Monte Carlo-based OSEM reconstruction (yielding 120P_rec, 30P_rec, and CUSIP_recs). The noise levels were visually compared. Quantitative measures of normalised root mean square error, normalised mean absolute error, peak signal-to-noise ratio, and structural similarity were evaluated, and kidney and bone marrow absorbed doses were estimated for each reconstruction set.
RESULTS: The use of SIPs visually improved noise levels. All quantitative measures demonstrated high similarity between CUSIP sets and 120P. Linear regression showed nearly perfect concordance of the kidney and bone marrow absorbed doses for all reconstruction sets, compared to the doses of 120P_rec (R2 ≥ 0.97). Compared to 120P_rec, the mean relative difference in kidney absorbed dose, for all reconstruction sets, was within 3%. For bone marrow absorbed doses, there was a higher dissipation in relative differences, and CUSIP_recs outperformed 30P_rec in mean relative difference (within 4% compared to 9%). Kidney and bone marrow absorbed doses for 30P_rec were statistically significantly different from those of 120_rec, as opposed to the absorbed doses of the best performing CUSIP_rec, where no statistically significant difference was found.
CONCLUSION: The use of SIPs in SPECT/CT reconstruction can substantially reduce acquisition durations, enabling acquisition of multiple fields of view with high image quality and satisfactory dosimetric accuracy.
PMID:38941040 | DOI:10.1186/s40658-024-00655-x
Improved soft-tissue visibility on cone-beam computed tomography with an image-generating artificial intelligence model using a cyclic generative adversarial network
Oral Radiol. 2024 Jun 28. doi: 10.1007/s11282-024-00763-5. Online ahead of print.
ABSTRACT
OBJECTIVES: The objective of this study was to enhance the visibility of soft tissues on cone-beam computed tomography (CBCT) using a CycleGAN network trained on CT images.
METHODS: Training and evaluation of the CycleGAN were conducted using CT and CBCT images collected from Aichi Gakuin University (α facility) and Osaka Dental University (β facility). Synthesized images (sCBCT) output by the CycleGAN network were evaluated by comparing them with the original images (oCBCT) and CT images, and assessments were made using histogram analysis and human scoring of soft-tissue anatomical structures and cystic lesions.
RESULTS: The histogram analysis showed that on sCBCT, soft-tissue anatomical structures showed significant shifts in voxel intensity toward values resembling those on CT, with the mean values for all structures approaching those of CT and the specialists' visibility scores being significantly increased. However, improvement in the visibility of cystic lesions was limited.
CONCLUSIONS: Image synthesis using CycleGAN significantly improved the visibility of soft tissue on CBCT, with this improvement being particularly notable from the submandibular region to the floor of the mouth. Although the effect on the visibility of cystic lesions was limited, there is potential for further improvement through refinement of the training method.
PMID:38941003 | DOI:10.1007/s11282-024-00763-5
Epiretinal membranes in patients with uveitis: an update on the current state of management
Int Ophthalmol. 2024 Jun 28;44(1):291. doi: 10.1007/s10792-024-03199-2.
ABSTRACT
PURPOSE: This review aims to summarize the current knowledge concerning the clinical features, diagnostic work-up, and therapeutic approach of uveitic epiretinal membranes (ERM).
METHODS: A thorough investigation of the literature was conducted using the PubMed database. Additionally, a complementary search was carried out on Google Scholar to ensure the inclusion of all relevant items in the collection.
RESULTS: ERM is an abnormal layer at the vitreoretinal interface, resulting from myofibroblastic cell proliferation along the inner surface of the central retina, causing visual impairment. Known by various names, ERM has diverse causes, including idiopathic or secondary factors, with ophthalmic imaging techniques like OCT improving detection. In uveitis, ERM occurrence is common, and surgical intervention involves pars plana vitrectomy with ERM peeling, although debates persist on optimal approaches.
CONCLUSIONS: Histopathological studies and OCT advancements improved ERM understanding, revealing a diverse group of diseases without a unified model. Consensus supports surgery for uveitic ERM in progressive cases, but variability requires careful consideration and effective inflammation management. OCT biomarkers, deep learning, and surgical advances may enhance outcomes, and medical interventions and robotics show promise for early ERM intervention.
PMID:38940960 | DOI:10.1007/s10792-024-03199-2
Ultra-high resolution computed tomography with deep-learning-reconstruction: diagnostic ability in the assessment of gastric cancer and the depth of invasion
Abdom Radiol (NY). 2024 Jun 28. doi: 10.1007/s00261-024-04363-z. Online ahead of print.
ABSTRACT
PURPOSE: To evaluate the image quality of ultra-high-resolution CT (U-HRCT) images reconstructed using an improved deep-learning-reconstruction (DLR) method. Additionally, we assessed the utility of U-HRCT in visualizing gastric wall structure, detecting gastric cancer, and determining the depth of invasion.
METHODS: Forty-six patients with resected gastric cancer who underwent preoperative contrast-enhanced U-HRCT were included. The image quality of U-HRCT reconstructed using three different methods (standard DLR [AiCE], improved DLR-AiCE-Body Sharp [improved AiCE-BS], and hybrid-IR [AIDR3D]) was compared. Visualization of the gastric wall's three-layered structure in four regions and the visibility of gastric cancers were compared between U-HRCT and conventional HRCT (C-HRCT). The diagnostic ability of U-HRCT with the improved AiCE-BS for determining the depth of invasion of gastric cancers was assessed using postoperative pathology specimens.
RESULTS: The mean noise level of U-HRCT with the improved AiCE-BS was significantly lower than that of the other two methods (p < 0.001). The overall image quality scores of the improved AiCE-BS images were significantly higher (p < 0.001). U-HRCT demonstrated significantly better conspicuity scores for the three-layered structure of the gastric wall than C-HRCT in all regions (p < 0.001). In addition, U-HRCT was found to have superior visibility of gastric cancer in comparison to C-HRCT (p < 0.001). The correct diagnostic rates for determining the depth of invasion of gastric cancer using C-HRCT and U-HRCT were 80%.
CONCLUSIONS: U-HRCT reconstructed with the improved AiCE-BS provides clearer visualization of the three-layered gastric wall structure than other reconstruction methods. It is also valuable for detecting gastric cancer and assessing the depth of invasion.
PMID:38940910 | DOI:10.1007/s00261-024-04363-z
GLGFormer: Global Local Guidance Network for Mucosal Lesion Segmentation in Gastrointestinal Endoscopy Images
J Imaging Inform Med. 2024 Jun 28. doi: 10.1007/s10278-024-01162-2. Online ahead of print.
ABSTRACT
Automatic mucosal lesion segmentation is a critical component in computer-aided clinical support systems for endoscopic image analysis. Image segmentation networks currently rely mainly on convolutional neural networks (CNNs) and Transformers, which have demonstrated strong performance in various applications. However, they cannot cope with blurred lesion boundaries and lesions of different scales in gastrointestinal endoscopy images. To address these challenges, we propose a new Transformer-based network, named GLGFormer, for the task of mucosal lesion segmentation. Specifically, we design the global guidance module to guide single-scale features patch-wise, enabling them to incorporate global information from the global map without information loss. Furthermore, a partial decoder is employed to fuse these enhanced single-scale features, achieving single-scale to multi-scale enhancement. Additionally, the local guidance module is designed to refocus attention on the neighboring patch, thus enhancing local features and refining lesion boundary segmentation. We conduct experiments on a private atrophic gastritis segmentation dataset and four public gastrointestinal polyp segmentation datasets. Compared to the current lesion segmentation networks, our proposed GLGFormer demonstrates outstanding learning and generalization capabilities. On the public dataset ClinicDB, GLGFormer achieved a mean intersection over union (mIoU) of 91.0% and a mean dice coefficient (mDice) of 95.0%. On the private dataset Gastritis-Seg, GLGFormer achieved an mIoU of 90.6% and an mDice of 94.6%.
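The mIoU and mDice figures quoted above average per-image IoU and Dice scores; a minimal sketch of the binary definitions (illustrative masks, not the paper's evaluation code):

```python
import numpy as np

def iou_and_dice(pred, target):
    """Binary IoU and Dice coefficient for one segmentation mask.

    pred, target: boolean arrays of the same shape.
    IoU  = |intersection| / |union|
    Dice = 2 * |intersection| / (|pred| + |target|)
    """
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    iou = inter / union
    dice = 2 * inter / (pred.sum() + target.sum())
    return iou, dice

pred = np.array([[1, 1, 0, 0]], dtype=bool)
target = np.array([[1, 1, 1, 0]], dtype=bool)
iou, dice = iou_and_dice(pred, target)
print(round(iou, 2), round(dice, 2))  # 0.67 0.8
```

Dice is always at least as large as IoU on the same mask pair, which is why the reported mDice values (95.0%, 94.6%) sit above the corresponding mIoU values (91.0%, 90.6%).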
PMID:38940891 | DOI:10.1007/s10278-024-01162-2
Predicting EGFR Status After Radical Nephrectomy or Partial Nephrectomy for Renal Cell Carcinoma on CT Using a Self-attention-based Model: Variable Vision Transformer (vViT)
J Imaging Inform Med. 2024 Jun 28. doi: 10.1007/s10278-024-01180-0. Online ahead of print.
ABSTRACT
OBJECTIVE: To assess the effectiveness of the vViT model for predicting postoperative renal function decline by leveraging clinical data, medical images, and image-derived features; and to identify the most dominant factor influencing this prediction.
MATERIALS AND METHODS: We developed two models, eGFR10 and eGFR20, to identify patients with a postoperative reduction in eGFR of more than 10 and more than 20, respectively, among renal cell carcinoma patients. The eGFR10 model was trained on 75 patients and tested on 27, while the eGFR20 model was trained on 77 patients and tested on 24. The vViT model inputs included class token, patient characteristics (age, sex, BMI), comorbidities (peripheral vascular disease, diabetes, liver disease), habits (smoking, alcohol), surgical details (ischemia time, blood loss, type and procedure of surgery, approach, operative time), radiomics, and tumor and kidney imaging. We used permutation feature importance to evaluate each sector's contribution. The performance of vViT was compared with CNN models, including VGG16, ResNet50, and DenseNet121, using McNemar and DeLong tests.
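Permutation feature importance, used above to rank the input sectors, measures how much test accuracy drops when one sector's values are shuffled across samples. The sketch below illustrates the idea with a hypothetical stand-in model and dictionary-shaped inputs; it is not the authors' vViT code.

```python
# Hedged sketch of sector-wise permutation feature importance: shuffle one
# input sector across test samples, remeasure accuracy, and report the mean
# drop. `model` maps one sample dict to a predicted label (a toy stand-in).
import random

def permutation_importance(model, X, y, sector, n_repeats=10, seed=0):
    """X: list of dicts mapping sector name -> feature value(s)."""
    rng = random.Random(seed)

    def accuracy(samples):
        return sum(model(s) == t for s, t in zip(samples, y)) / len(y)

    base = accuracy(X)
    drops = []
    for _ in range(n_repeats):
        shuffled = [dict(s) for s in X]          # copy, then break one sector
        values = [s[sector] for s in shuffled]
        rng.shuffle(values)
        for s, v in zip(shuffled, values):
            s[sector] = v
        drops.append(base - accuracy(shuffled))
    return sum(drops) / n_repeats                # mean accuracy drop
```

A sector the model ignores scores exactly zero, while a sector the prediction depends on produces a positive drop, which is how the surgical and radiomics sectors were identified as most influential.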
RESULTS: The eGFR10 model achieved an accuracy of 0.741 and an AUC-ROC of 0.692, while the eGFR20 model attained an accuracy of 0.792 and an AUC-ROC of 0.812. The surgical and radiomics sectors were the most influential in both models. The vViT had higher accuracy and AUC-ROC than VGG16 and ResNet50, and higher AUC-ROC than DenseNet121 (p < 0.05). Specifically, the vViT did not have a statistically different AUC-ROC compared to VGG16 (p = 1.0) and ResNet50 (p = 0.7) but had a statistically different AUC-ROC compared to DenseNet121 (p = 0.87) for the eGFR10 model. For the eGFR20 model, the vViT did not have a statistically different AUC-ROC compared to VGG16 (p = 0.72), ResNet50 (p = 0.88), and DenseNet121 (p = 0.64).
CONCLUSION: The vViT model, a transformer-based approach for multimodal data, shows promise for preoperative CT-based prediction of eGFR status in patients with renal cell carcinoma.
PMID:38940889 | DOI:10.1007/s10278-024-01180-0
Challenging Complexity with Simplicity: Rethinking the Role of Single-Step Models in Computer-Aided Synthesis Planning
J Chem Inf Model. 2024 Jun 28. doi: 10.1021/acs.jcim.4c00432. Online ahead of print.
ABSTRACT
Computer-assisted synthesis planning has become increasingly important in drug discovery. While deep-learning models have shown remarkable progress in achieving high accuracies for single-step retrosynthetic predictions, their performance in full retrosynthetic route planning remains to be verified. This study compares intricate single-step models with a straightforward template enumeration approach for retrosynthetic route planning on a real-world drug molecule data set. Despite the superior single-step accuracy of advanced models, the template enumeration method with a heuristic-based retrosynthesis knowledge score was found to surpass them in the efficiency of searching the reaction space, achieving a higher or comparable solve rate within the same time frame. This counterintuitive result underscores the importance of efficiency and retrosynthesis knowledge in retrosynthetic route planning and suggests that future research should incorporate simple template enumeration as a benchmark. It also suggests that this simple yet effective strategy should be considered alongside more complex models to better cater to the practical needs of computer-assisted synthesis planning in drug discovery.
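The baseline the abstract advocates can be pictured as a best-first search over partial routes, where node priority comes from a cheap heuristic score rather than an expensive learned single-step model. The sketch below is a toy illustration of that search loop; the molecules, `expand` (template enumeration), and `score` (the "retrosynthesis knowledge" heuristic) are stand-ins, not the paper's implementation.

```python
# Hedged sketch: best-first retrosynthetic route search. A route is solved
# when every remaining "open" molecule is a purchasable building block.
import heapq

def plan_route(target, expand, score, is_purchasable, max_nodes=10_000):
    """expand(mol) -> list of precursor tuples (applicable templates);
    score(open_mols) -> priority, lower is better.
    Returns a list of (product, precursors) steps, or None."""
    frontier = [(score([target]), 0, [target], [])]
    counter = 1                                  # tiebreak for equal scores
    while frontier and counter < max_nodes:
        _, _, open_mols, route = heapq.heappop(frontier)
        if all(is_purchasable(m) for m in open_mols):
            return route                          # route fully solved
        mol = next(m for m in open_mols if not is_purchasable(m))
        rest = list(open_mols)
        rest.remove(mol)
        for precursors in expand(mol):            # enumerate templates
            new_open = rest + list(precursors)
            heapq.heappush(frontier,
                           (score(new_open), counter, new_open,
                            route + [(mol, precursors)]))
            counter += 1
    return None
```

Because each expansion is cheap, this kind of search can explore many more nodes per second than a neural single-step model, which is one plausible reading of why the simpler method reached a comparable solve rate in the same time budget.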
PMID:38940765 | DOI:10.1021/acs.jcim.4c00432
TA-RNN: an attention-based time-aware recurrent neural network architecture for electronic health records
Bioinformatics. 2024 Jun 28;40(Supplement_1):i169-i179. doi: 10.1093/bioinformatics/btae264.
ABSTRACT
MOTIVATION: Electronic health records (EHRs) represent a comprehensive resource of a patient's medical history. EHRs are essential for utilizing advanced technologies such as deep learning (DL), enabling healthcare providers to analyze extensive data, extract valuable insights, and make precise and data-driven clinical decisions. DL methods such as recurrent neural networks (RNNs) have been utilized to analyze EHRs to model disease progression and predict diagnosis. However, these methods do not address some inherent irregularities in EHR data, such as irregular time intervals between clinical visits. Furthermore, most DL models are not interpretable. In this study, we propose two interpretable DL architectures based on RNNs, namely time-aware RNN (TA-RNN) and TA-RNN-autoencoder (TA-RNN-AE), to predict a patient's clinical outcome at the next visit and multiple visits ahead, respectively. To mitigate the impact of irregular time intervals, we propose incorporating a time embedding of the elapsed times between visits. For interpretability, we propose employing a dual-level attention mechanism that operates between visits and over features within each visit.
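Two of the ideas above, a time embedding of elapsed intervals and attention weights that make visits interpretable, can be sketched compactly. The sinusoidal form of the embedding and all dimensions below are illustrative assumptions, not the authors' TA-RNN code.

```python
# Hedged sketch: (1) embed the elapsed time between visits as a vector that
# can be combined with a visit embedding; (2) visit-level attention whose
# softmax weights indicate which visits drive the prediction.
import math

def time_embedding(elapsed_days, dim=8):
    """Sinusoidal encoding of an elapsed-time scalar into a `dim`-vector."""
    return [math.sin(elapsed_days / (10_000 ** (2 * (i // 2) / dim)))
            if i % 2 == 0 else
            math.cos(elapsed_days / (10_000 ** (2 * (i // 2) / dim)))
            for i in range(dim)]

def visit_attention(visit_vecs, query):
    """Softmax over query-visit dot products -> per-visit weights + context."""
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in visit_vecs]
    m = max(scores)                              # stabilize the softmax
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    context = [sum(w * vec[i] for w, vec in zip(weights, visit_vecs))
               for i in range(len(query))]
    return weights, context
```

Because the attention weights sum to one, inspecting them directly identifies influential visits, which matches how the authors report interpreting their model's predictions.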
RESULTS: The results of the experiments conducted on Alzheimer's Disease Neuroimaging Initiative (ADNI) and National Alzheimer's Coordinating Center (NACC) datasets indicated the superior performance of proposed models for predicting Alzheimer's Disease (AD) compared to state-of-the-art and baseline approaches based on F2 and sensitivity. Additionally, TA-RNN showed superior performance on the Medical Information Mart for Intensive Care (MIMIC-III) dataset for mortality prediction. In our ablation study, we observed enhanced predictive performance by incorporating time embedding and attention mechanisms. Finally, investigating attention weights helped identify influential visits and features in predictions.
AVAILABILITY AND IMPLEMENTATION: https://github.com/bozdaglab/TA-RNN.
PMID:38940180 | DOI:10.1093/bioinformatics/btae264
Enhancing generalizability and performance in drug-target interaction identification by integrating pharmacophore and pre-trained models
Bioinformatics. 2024 Jun 28;40(Supplement_1):i539-i547. doi: 10.1093/bioinformatics/btae240.
ABSTRACT
MOTIVATION: In drug discovery, it is crucial to assess the drug-target binding affinity (DTA). Although molecular docking is widely used, its computational cost limits its application in large-scale virtual screening. Deep learning-based methods learn virtual scoring functions from labeled datasets and can quickly predict affinity. However, there are three limitations. First, existing methods consider only the atom-bond graph or one-dimensional sequence representations of compounds, ignoring information about functional groups (pharmacophores) with specific biological activities. Second, relying on limited labeled datasets fails to learn comprehensive embedding representations of compounds and proteins, resulting in poor generalization performance in complex scenarios. Third, existing feature fusion methods cannot adequately capture contextual interaction information.
RESULTS: Therefore, we propose a novel DTA prediction method named HeteroDTA. Specifically, a multi-view compound feature extraction module is constructed to model the atom-bond graph and pharmacophore graph. The residue contact graph and protein sequence are also utilized to model protein structure and function. Moreover, to enhance the generalization capability and reduce the dependence on task-specific labeled data, pre-trained models are utilized to initialize the atomic features of the compounds and the embedding representations of the protein sequence. A context-aware nonlinear feature fusion method is also proposed to learn interaction patterns between compounds and proteins. Experimental results on public benchmark datasets show that HeteroDTA significantly outperforms existing methods. In addition, HeteroDTA shows excellent generalization performance in cold-start experiments and superiority in the representation learning ability of drug-target pairs. Finally, the effectiveness of HeteroDTA is demonstrated in a real-world drug discovery study.
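One common shape for a context-aware nonlinear fusion of two embeddings is a learned gate that decides, per dimension, how much of each input to keep. The sketch below illustrates that pattern; the gating form and the absence of learned weights are illustrative simplifications, not HeteroDTA's actual fusion layer.

```python
# Hedged sketch: element-wise gated fusion of a compound embedding and a
# protein embedding. The gate is a nonlinear function of both inputs, so the
# blend is context-dependent rather than a fixed concatenation or sum.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fusion(compound_vec, protein_vec, gate_bias=0.0):
    """Return g*compound + (1-g)*protein, with g derived from both inputs."""
    fused = []
    for c, p in zip(compound_vec, protein_vec):
        g = sigmoid(c * p + gate_bias)       # interaction-aware gate in (0, 1)
        fused.append(g * c + (1 - g) * p)
    return fused
```

In a real model the gate would be computed by a small trained network over both embeddings; the point of the sketch is only that the fusion output depends nonlinearly on the compound-protein context, unlike plain concatenation.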
AVAILABILITY AND IMPLEMENTATION: The source code and data are available at https://github.com/daydayupzzl/HeteroDTA.
PMID:38940179 | DOI:10.1093/bioinformatics/btae240