Deep learning

SVM-RLF-DNN: A DNN with reliefF and SVM for automatic identification of COVID from chest X-ray and CT images

Thu, 2024-05-30 06:00

Digit Health. 2024 May 27;10:20552076241257045. doi: 10.1177/20552076241257045. eCollection 2024 Jan-Dec.

ABSTRACT

AIM: To develop an advanced determination technology for detecting COVID-19 patterns from chest X-ray and CT-scan films with distinct applications of deep learning and machine learning methods.

METHODS AND MATERIALS: The proposed hybrid classification network (SVM-RLF-DNN) comprises three phases: feature extraction, selection and classification. The in-depth features are extracted from a series of 3×3 convolution and 2×2 max pooling operations followed by a flattened and fully connected layer of the deep neural network (DNN). The ReLU activation function and the Adam optimizer are used in the model. ReliefF is an improved feature selection algorithm of Relief that uses Manhattan distance instead of Euclidean distance. Based on the significance of each feature, ReliefF assigns a weight to every extracted feature received from the fully connected layer. In multiclass problems, the weight of each feature is the average over the k closest hits and misses in each class for a neighbouring instance pair. ReliefF eliminates lower-weight features by setting the node value to zero; the higher-weight features are kept, yielding the feature selection. At the last layer of the neural network, a multiclass Support Vector Machine (SVM) classifies the patterns of COVID-19, viral pneumonia and healthy cases. The three classes are handled by three binary SVM classifiers, each with a linear kernel, following a one-versus-all approach. The hinge loss function and L2-norm regularization are selected for more stable results. The proposed method is assessed on publicly available chest X-ray and CT-scan image databases from Kaggle and GitHub. The performance of the proposed classification model is quantitatively evaluated under five-fold cross-validation using training, validation, and test accuracy, as well as sensitivity, specificity, and the confusion matrix.
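The ReliefF weighting described above can be sketched in a few lines. The snippet below is a minimal illustration (single nearest neighbour, binary labels) of the Manhattan-distance variant the abstract describes, not the authors' implementation; the toy data and the `relieff_weights` name are hypothetical.

```python
# Minimal ReliefF sketch (k = 1 neighbour, binary labels) using Manhattan
# distance; data and function names are illustrative only.

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def relieff_weights(X, y):
    """Return one relevance weight per feature."""
    n, m = len(X), len(X[0])
    w = [0.0] * m
    for i in range(n):
        xi, yi = X[i], y[i]
        # nearest hit (same class) and nearest miss (other class)
        hit = min((X[j] for j in range(n) if j != i and y[j] == yi),
                  key=lambda x: manhattan(xi, x))
        miss = min((X[j] for j in range(n) if y[j] != yi),
                   key=lambda x: manhattan(xi, x))
        for f in range(m):
            # features that separate classes gain weight,
            # features that vary within a class lose weight
            w[f] += abs(xi[f] - miss[f]) - abs(xi[f] - hit[f])
    return w

X = [[0.0, 1.0], [0.1, 0.0], [0.9, 1.0], [1.0, 0.1]]
y = [0, 0, 1, 1]
w = relieff_weights(X, y)
selected = [f for f, wf in enumerate(w) if wf > 0]  # drop low-weight features
```

Here feature 0 tracks the class label while feature 1 is noise, so only feature 0 survives the weight threshold, mirroring how the network prunes low-weight nodes before the SVM layer.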

RESULTS: Our proposed network achieved test accuracies of 98.48% and 95.34% on 2-class chest X-ray and CT classification, respectively. More importantly, the proposed model's test accuracy, sensitivity, and specificity are 87.9%, 86.32%, and 90.25% for 3-class classification (COVID-19, Pneumonia, Normal) on chest X-rays, and 95.34%, 94.12%, and 96.15% for 2-class classification (COVID-19, Non-COVID) on chest CT.

CONCLUSION: Experimental results indicate that our proposed classification network is competitive with existing neural networks. The proposed neural network can assist clinicians in detecting and monitoring the disease.

PMID:38812845 | PMC:PMC11135098 | DOI:10.1177/20552076241257045

Categories: Literature Watch

SPECT-MPI for Coronary Artery Disease: A Deep Learning Approach

Thu, 2024-05-30 06:00

Acta Med Philipp. 2024 May 15;58(8):67-75. doi: 10.47895/amp.vi0.7582. eCollection 2024.

ABSTRACT

BACKGROUND: Worldwide, coronary artery disease (CAD) is a leading cause of mortality and morbidity and remains a top health priority in many countries. A non-invasive imaging modality for the diagnosis of CAD, such as single photon emission computed tomography-myocardial perfusion imaging (SPECT-MPI), is usually requested by cardiologists as it displays the radiotracer distribution in the heart, reflecting myocardial perfusion. SPECT-MPI is interpreted visually by a nuclear medicine physician; the interpretation depends largely on the physician's clinical experience and shows significant inter-observer variability.

OBJECTIVE: The aim of the study is to apply a deep learning approach in the classification of SPECT-MPI for perfusion abnormalities using convolutional neural networks (CNN).

METHODS: A publicly available anonymized SPECT-MPI dataset from a machine learning repository (https://www.kaggle.com/selcankaplan/spect-mpi) was used in this study, involving 192 patients who underwent stress-test-rest Tc99m MPI. An exploratory approach to CNN hyperparameter selection was utilized to search for the optimal neural network model, with particular focus on various dropouts (0.2, 0.5, 0.7), batch sizes (8, 16, 32, 64), and numbers of dense nodes (32, 64, 128, 256). The base CNN model was also compared with pre-trained CNNs commonly used for medical images, such as VGG16, InceptionV3, DenseNet121 and ResNet50. All simulation experiments were performed in Kaggle using TensorFlow 2.6.0, Keras 2.6.0, and Python 3.7.10.
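The exploratory search above is a plain grid over three hyperparameters. The sketch below shows the shape of such a search; the `score` function is a hypothetical stand-in for an actual training-and-validation run (which would return, e.g., validation MCC), not anything from the paper.

```python
# Grid over the dropout / batch-size / dense-node values listed in the
# abstract; score() is a placeholder for a real CNN training run.
from itertools import product

dropouts = (0.2, 0.5, 0.7)
batch_sizes = (8, 16, 32, 64)
dense_nodes = (32, 64, 128, 256)

def score(dropout, batch, nodes):
    # hypothetical scoring rule, purely to make the example runnable;
    # in practice, train the CNN and return a validation metric
    return dropout - 0.01 * batch + 0.001 * nodes

grid = list(product(dropouts, batch_sizes, dense_nodes))  # 3 * 4 * 4 = 48
best = max(grid, key=lambda cfg: score(*cfg))
```

With the placeholder scoring rule the search trivially prefers the largest dropout, smallest batch, and most nodes; with a real validation metric the winner can differ, as it did in the study (0.7 dropout, batch size 8, 32 dense nodes).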

RESULTS: The best-performing base CNN model, with 0.7 dropout, batch size 8, and 32 dense nodes, generated the highest normalized Matthews Correlation Coefficient at 0.909 and obtained 93.75% accuracy, 96.00% sensitivity, 96.00% precision, and 96.00% F1-score. It also obtained higher classification performance compared with the pre-trained architectures.

CONCLUSIONS: The results suggest that deep learning approaches using CNN models can be deployed by nuclear medicine physicians in clinical practice to further augment their decision-making in the interpretation of SPECT-MPI tests. These CNN models can also serve as a dependable and valid second opinion, aiding physicians as a decision-support tool, and as teaching or learning material for less-experienced physicians, particularly those still in training. This highlights the clinical utility of deep learning approaches through CNN models in the practice of nuclear cardiology.

PMID:38812768 | PMC:PMC11132284 | DOI:10.47895/amp.vi0.7582

Categories: Literature Watch

Deep learning and minimally invasive inflammatory activity assessment: a proof-of-concept study for development and score correlation of a panendoscopy convolutional network

Thu, 2024-05-30 06:00

Therap Adv Gastroenterol. 2024 May 27;17:17562848241251569. doi: 10.1177/17562848241251569. eCollection 2024.

ABSTRACT

BACKGROUND: Capsule endoscopy (CE) is a valuable tool for assessing inflammation in patients with Crohn's disease (CD). The current standard for evaluating inflammation is validated scores (together with clinical laboratory values) such as the Lewis score (LS), the Capsule Endoscopy Crohn's Disease Activity Index (CECDAI), and the ELIAKIM score. Recent advances in artificial intelligence (AI) have made it possible to automatically select the most relevant frames in CE.

OBJECTIVES: In this proof-of-concept study, our objective was to develop an automated scoring system using CE images to objectively grade inflammation.

DESIGN: Pan-enteric CE videos (PillCam Crohn's) performed in CD patients between 09/2020 and 01/2023 were retrospectively reviewed and LS, CECDAI, and ELIAKIM scores were calculated.

METHODS: We developed a convolutional neural network-based automated score consisting of the percentage of positive frames selected by the algorithm (for small bowel and colon separately). We correlated clinical data and the validated scores with the artificial intelligence-generated score (AIS).

RESULTS: A total of 61 patients were included. The median LS was 225 (0-6006), CECDAI was 6 (0-33), ELIAKIM was 4 (0-38), and SB_AIS was 0.5659 (0-29.45). We found a strong correlation between SB_AIS and LS, CECDAI, and ELIAKIM scores (Spearman's r = 0.751, r = 0.707, r = 0.655, p = 0.001). We found a strong correlation between LS and ELIAKIM (r = 0.768, p = 0.001) and a very strong correlation between CECDAI and LS (r = 0.854, p = 0.001) and CECDAI and ELIAKIM scores (r = 0.827, p = 0.001).
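The correlations reported above are Spearman rank correlations. As a reference, the no-ties formula rho = 1 - 6·Σd²/(n(n²-1)) can be computed directly; the values below are illustrative, not the study data.

```python
# Spearman rank correlation via the no-ties formula; the ais/lewis values
# are made-up illustrations, not data from the study.

def rank(v):
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    for pos, i in enumerate(order):
        r[i] = pos + 1.0
    return r

def spearman(x, y):
    # rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid when there are no ties
    rx, ry = rank(x), rank(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

ais = [0.1, 0.5, 0.9, 1.8, 3.2]       # hypothetical AI-score values
lewis = [40, 150, 300, 900, 2400]     # monotone with ais -> rho = 1.0
rho = spearman(ais, lewis)
```

In practice `scipy.stats.spearmanr` handles ties and also returns the p-value; the point here is only that the statistic depends on ranks, not raw score magnitudes, which is why scores on very different scales (SB_AIS vs. LS) can still correlate strongly.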

CONCLUSION: Our study showed that the AI-generated score had a strong correlation with validated scores indicating that it could serve as an objective and efficient method for evaluating inflammation in CD patients. As a preliminary study, our findings provide a promising basis for future refining of a CE score that may accurately correlate with prognostic factors and aid in the management and treatment of CD patients.

PMID:38812708 | PMC:PMC11135072 | DOI:10.1177/17562848241251569

Categories: Literature Watch

PfgPDI: Pocket feature-enabled graph neural network for protein-drug interaction prediction

Thu, 2024-05-30 06:00

J Bioinform Comput Biol. 2024 Apr;22(2):2450004. doi: 10.1142/S0219720024500045. Epub 2024 May 27.

ABSTRACT

Biomolecular interaction recognition between ligands and proteins is an essential task that largely enhances safety and efficacy in the drug discovery and development stage. Studying the interaction between proteins and ligands can improve the understanding of disease pathogenesis and lead to more effective drug targets. Additionally, it can aid in determining drug parameters, ensuring proper absorption, distribution, and metabolism within the body. Due to incomplete feature representation or inadequate adaptation of the model to protein-ligand complexes, existing methodologies suffer from suboptimal predictive accuracy. To address these pitfalls, in this study we designed a new deep learning method based on a transformer and a graph convolutional network (GCN). We first utilized the transformer network to capture crucial information from the original protein sequences and the SMILES sequences, and connected them to prevent falling into a local optimum. Furthermore, a series of dilated convolutions is performed to obtain the pocket features and SMILES features, which are subsequently subjected to graph convolution to optimize the connections. The combined representations are fed into the proposed model for classification prediction. Experiments comparing various protein-ligand binding prediction methods prove the effectiveness of our proposed method. It is expected that PfgPDI can contribute to drug prediction and accelerate the development of new drugs, while also serving as a valuable partner for drug testing and Research and Development engineers.

PMID:38812467 | DOI:10.1142/S0219720024500045

Categories: Literature Watch

MoRF_ESM: Prediction of MoRFs in disordered proteins based on a deep transformer protein language model

Thu, 2024-05-30 06:00

J Bioinform Comput Biol. 2024 Apr;22(2):2450006. doi: 10.1142/S0219720024500069. Epub 2024 May 28.

ABSTRACT

Molecular recognition features (MoRFs) are particular functional segments of disordered proteins, which play crucial roles in regulating the phase transition of membrane-less organelles and frequently serve as central sites in cellular interaction networks. As the association between disordered proteins and severe diseases continues to be discovered, identifying MoRFs has gained growing significance. Due to the limited number of experimentally validated MoRFs, the performance of existing MoRF prediction algorithms remains insufficient and still needs to be improved. In this research, we present a model named MoRF_ESM, which utilizes deep-learning protein representations to predict MoRFs in disordered proteins. This approach employs a pretrained ESM-2 protein language model to generate embedding representations of residues in the form of attention map matrices. These representations are combined with a self-learned TextCNN model for feature extraction and prediction. In addition, an averaging step is incorporated at the end of the MoRF_ESM model to refine the output and generate the final prediction results. In comparison with other leading methods on benchmark datasets, the MoRF_ESM approach demonstrates state-of-the-art performance, achieving [Formula: see text] higher AUC than other methods when tested on TEST1 and [Formula: see text] higher AUC when tested on TEST2. These results imply that the combination of ESM-2 and TextCNN can effectively extract deep evolutionary features related to protein structure and function, along with capturing shallow pattern features located in protein sequences, and is well qualified for the prediction task of MoRFs. Given that ESM-2 is a highly versatile protein language model, the methodology proposed in this study can be readily applied to other tasks involving the classification of protein sequences.

PMID:38812466 | DOI:10.1142/S0219720024500069

Categories: Literature Watch

Research on identification method of peanut pests and diseases based on lightweight LSCDNet model

Wed, 2024-05-29 06:00

Phytopathology. 2024 May 29. doi: 10.1094/PHYTO-01-24-0013-R. Online ahead of print.

ABSTRACT

Timely and accurate identification of peanut pests and diseases, coupled with effective countermeasures, is pivotal for ensuring high-quality and efficient peanut production. Despite the prevalence of pests and diseases in peanut cultivation, challenges such as minute disease spots, the elusive nature of pests, and intricate environmental conditions often lead to diminished identification accuracy and efficiency. Moreover, continuous monitoring of peanut health in real-world agricultural settings demands solutions that are computationally efficient. Traditional deep learning models often require substantial computational resources, limiting their practical applicability. In response to these challenges, we introduce LSCDNet (Lightweight Sandglass and Coordinate Attention Network), a streamlined model derived from DenseNet. LSCDNet preserves only the transition layers to reduce feature map dimensionality, simplifying the model's complexity. The inclusion of a sandglass block bolsters feature extraction capabilities, mitigating potential information loss due to dimensionality reduction. Additionally, the incorporation of coordinate attention addresses issues related to positional information loss during feature extraction. Experimental results show that LSCDNet achieved impressive metrics with an accuracy, precision, recall, and F1 score of 96.67%, 98.05%, 95.56%, and 96.79%, respectively, while maintaining a compact parameter count of merely 0.59M. When compared with established models such as MobileNetV1, MobileNetV2, NASNetMobile, DenseNet-121, InceptionV3, and Xception, LSCDNet outperformed them with accuracy gains of 2.65%, 4.87%, 8.71%, 5.04%, 6.32%, and 8.2%, respectively, accompanied by substantially fewer parameters. Lastly, we deployed the LSCDNet model on a Raspberry Pi for practical testing and application, achieving an average recognition accuracy of 85.36%, thereby meeting real-world operational requirements.

PMID:38810273 | DOI:10.1094/PHYTO-01-24-0013-R

Categories: Literature Watch

Rapid and Precise Diagnosis of Retroperitoneal Liposarcoma with Deep-Learned Label-Free Molecular Microscopy

Wed, 2024-05-29 06:00

Anal Chem. 2024 May 29. doi: 10.1021/acs.analchem.3c05417. Online ahead of print.

ABSTRACT

Retroperitoneal liposarcoma (RLPS) is a rare malignancy whose only curative therapy is surgical resection. However, well-differentiated liposarcoma (WDLPS), one of its most common types, can hardly be distinguished from normal fat during an operation without an effective margin assessment method, severely jeopardizing the prognosis with a high recurrence risk. Here, we combined dual label-free nonlinear optical modalities, stimulated Raman scattering (SRS) microscopy and second harmonic generation (SHG) microscopy, to image two predominant tissue biomolecules, lipids and collagen fibers, in 35 RLPS and 34 normal fat samples collected from 35 patients. The resulting dual-modal tissue images were used for deep learning-based RLPS diagnosis. The images reflected dramatically decreasing lipids and increasing collagen fibers during tumor progression. A ResNeXt101-based model achieved 94.7% overall accuracy and a 0.987 mean area under the ROC curve (AUC) in differentiating among normal fat, WDLPSs, and dedifferentiated liposarcomas (DDLPSs). In particular, WDLPSs were detected with 94.1% precision and 84.6% sensitivity, superior to existing methods. An ablation experiment showed that this performance was attributable to both SRS and SHG microscopy, which increased the sensitivity of recognizing WDLPS by 16.0% and 3.6%, respectively. Furthermore, we applied this model to RLPS margins to identify tumor infiltration. Our method holds great potential for accurate intraoperative liposarcoma detection.

PMID:38810149 | DOI:10.1021/acs.analchem.3c05417

Categories: Literature Watch

Inferring gene regulatory networks from single-cell transcriptomics based on graph embedding

Wed, 2024-05-29 06:00

Bioinformatics. 2024 May 29:btae291. doi: 10.1093/bioinformatics/btae291. Online ahead of print.

ABSTRACT

MOTIVATION: Gene regulatory networks (GRNs) encode gene regulation in living organisms, and have become a critical tool to understand complex biological processes. However, due to the dynamic and complex nature of gene regulation, inferring GRNs from scRNA-seq data is still a challenging task. Existing computational methods usually focus on the close connections between genes, and ignore the global structure and distal regulatory relationships.

RESULTS: In this study, we develop a supervised deep learning framework, IGEGRNS, to infer gene regulatory networks from scRNA-seq data based on graph embedding. In the framework, contextual information of genes is captured by GraphSAGE, which aggregates gene features and neighborhood structures to generate low-dimensional embedding for genes. Then, the k most influential nodes in the whole graph are filtered through Top-k pooling. Finally, potential regulatory relationships between genes are predicted by stacking CNNs. Compared with nine competing supervised and unsupervised methods, our method achieves better performance on six time-series scRNA-seq datasets.
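The two graph operations named above, GraphSAGE-style neighbourhood aggregation and Top-k node pooling, can be sketched on a toy graph. This is a generic illustration of the operations, not the IGEGRNS code; the three-node graph and all values are hypothetical.

```python
# GraphSAGE-style mean aggregation followed by Top-k pooling on a toy
# three-node graph; features and structure are illustrative only.

def sage_layer(feats, neighbours):
    """Concatenate each node's feature with the mean of its neighbours'."""
    out = []
    for i, f in enumerate(feats):
        nbrs = neighbours[i]
        mean = ([sum(feats[j][d] for j in nbrs) / len(nbrs)
                 for d in range(len(f))] if nbrs else [0.0] * len(f))
        out.append(f + mean)  # concatenation, as in GraphSAGE
    return out

def top_k(feats, scores, k):
    """Keep the indices of the k most influential nodes by score."""
    keep = sorted(range(len(feats)), key=lambda i: -scores[i])[:k]
    return sorted(keep)

feats = [[1.0], [3.0], [5.0]]                 # one feature per gene/node
neighbours = {0: [1], 1: [0, 2], 2: [1]}      # undirected path 0-1-2
h = sage_layer(feats, neighbours)
kept = top_k(h, [sum(x) for x in h], k=2)     # keep the 2 top-scoring nodes
```

In IGEGRNS the embeddings from such aggregation layers are low-dimensional learned vectors and the pooled nodes feed stacked CNNs for link prediction; here the scores are just feature sums to keep the sketch self-contained.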

AVAILABILITY AND IMPLEMENTATION: Our method IGEGRNS is implemented in Python using the Pytorch machine learning library, and it is freely available at https://github.com/DHUDBlab/IGEGRNS.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

PMID:38810116 | DOI:10.1093/bioinformatics/btae291

Categories: Literature Watch

FMCA-DTI: A Fragment-oriented method based on a Multihead Cross Attention mechanism to improve Drug-Target Interaction prediction

Wed, 2024-05-29 06:00

Bioinformatics. 2024 May 29:btae347. doi: 10.1093/bioinformatics/btae347. Online ahead of print.

ABSTRACT

MOTIVATION: Identifying drug-target interactions (DTI) is crucial in drug discovery. Fragments are less complex and can accurately characterize local features, which is important in DTI prediction. Recently, deep learning (DL) -based methods predict DTI more efficiently. However, two challenges remain in existing DL-based methods: (i) some methods directly encode drugs and proteins into integers, ignoring the substructure representation; (ii) some methods learn the features of the drugs and proteins separately instead of considering their interactions.

RESULTS: In this paper, we propose a fragment-oriented method based on a multihead cross attention mechanism for predicting DTI, named FMCA-DTI. FMCA-DTI obtains multiple types of fragments of drugs and proteins by branch chain mining and category fragment mining. Importantly, FMCA-DTI utilizes the shared-weight-based multihead cross attention mechanism to learn the complex interaction features between different fragments. Experiments on three benchmark datasets show that FMCA-DTI achieves significantly improved performance by comparing it with four state-of-the-art baselines.

AVAILABILITY: The code for this workflow is available at: https://github.com/jacky102022/FMCA-DTI.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

PMID:38810106 | DOI:10.1093/bioinformatics/btae347

Categories: Literature Watch

Surface plasmons-phonons for mid-infrared hyperspectral imaging

Wed, 2024-05-29 06:00

Sci Adv. 2024 May 31;10(22):eado3179. doi: 10.1126/sciadv.ado3179. Epub 2024 May 29.

ABSTRACT

Surface plasmons have proven their ability to boost the sensitivity of mid-infrared hyperspectral imaging by enhancing light-matter interactions. Surface phonons, a counterpart technology to plasmons, present unclear contributions to hyperspectral imaging. Here, we investigate this by developing a plasmon-phonon hyperspectral imaging system that uses asymmetric cross-shaped nanoantennas composed of stacked plasmon-phonon materials. The phonon modes within this system, controlled by light polarization, capture molecular refractive index intensity and lineshape features, distinct from those observed with plasmons, enabling more precise and sensitive molecule identification. In a deep learning-assisted imaging demonstration of severe acute respiratory syndrome coronavirus (SARS-CoV), phonons exhibit enhanced identification capabilities (230,400 spectra/s), facilitating the de-overlapping and observation of the spatial distribution of two mixed SARS-CoV spike proteins. In addition, the plasmon-phonon system demonstrates increased identification accuracy (93%), heightened sensitivity, and enhanced detection limits (down to molecule monolayers). These findings extend phonon polaritonics to hyperspectral imaging, promising applications in imaging-guided molecule screening and pharmaceutical analysis.

PMID:38809968 | DOI:10.1126/sciadv.ado3179

Categories: Literature Watch

Leveraging conformal prediction to annotate enzyme function space with limited false positives

Wed, 2024-05-29 06:00

PLoS Comput Biol. 2024 May 29;20(5):e1012135. doi: 10.1371/journal.pcbi.1012135. Online ahead of print.

ABSTRACT

Machine learning (ML) is increasingly being used to guide biological discovery in biomedicine such as prioritizing promising small molecules in drug discovery. In those applications, ML models are used to predict the properties of biological systems, and researchers use these predictions to prioritize candidates as new biological hypotheses for downstream experimental validations. However, when applied to unseen situations, these models can be overconfident and produce a large number of false positives. One solution to address this issue is to quantify the model's prediction uncertainty and provide a set of hypotheses with a controlled false discovery rate (FDR) pre-specified by researchers. We propose CPEC, an ML framework for FDR-controlled biological discovery. We demonstrate its effectiveness using enzyme function annotation as a case study, simulating the discovery process of identifying the functions of less-characterized enzymes. CPEC integrates a deep learning model with a statistical tool known as conformal prediction, providing accurate and FDR-controlled function predictions for a given protein enzyme. Conformal prediction provides rigorous statistical guarantees to the predictive model and ensures that the expected FDR will not exceed a user-specified level with high probability. Evaluation experiments show that CPEC achieves reliable FDR control, better or comparable prediction performance at a lower FDR than existing methods, and accurate predictions for enzymes under-represented in the training data. We expect CPEC to be a useful tool for biological discovery applications where a high yield rate in validation experiments is desired but the experimental budget is limited.
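The conformal-prediction machinery underlying CPEC can be illustrated with generic split conformal prediction: calibrate a score threshold on held-out data so the true label lands in the predicted set with probability about 1 - alpha. This is a textbook sketch, not the CPEC implementation, and the EC-number labels and calibration scores are invented for illustration.

```python
# Generic split-conformal sketch: pick the (1 - alpha) quantile of
# calibration nonconformity scores (with the finite-sample correction),
# then include every label whose score falls within that threshold.
import math

def conformal_threshold(cal_scores, alpha):
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    return sorted(cal_scores)[min(k, n) - 1]

def prediction_set(label_scores, threshold):
    # keep every candidate label whose nonconformity score is small enough
    return {lbl for lbl, s in label_scores.items() if s <= threshold}

cal = [0.1, 0.2, 0.25, 0.3, 0.35, 0.4, 0.5, 0.6, 0.7]  # calibration scores
t = conformal_threshold(cal, alpha=0.2)
pred = prediction_set({"EC:1.1.1.1": 0.15, "EC:2.7.11.1": 0.55}, t)
```

Plain split conformal controls coverage of the prediction set; CPEC builds on this to bound the expected false discovery rate at a user-specified level, which requires the additional risk-control machinery described in the paper.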

PMID:38809942 | DOI:10.1371/journal.pcbi.1012135

Categories: Literature Watch

Spectral Tensor Layers for Communication-Free Distributed Deep Learning

Wed, 2024-05-29 06:00

IEEE Trans Neural Netw Learn Syst. 2024 May 29;PP. doi: 10.1109/TNNLS.2024.3394861. Online ahead of print.

ABSTRACT

In this article, we propose a novel spectral tensor layer for communication-free distributed deep learning. The overall framework is as follows: first, we represent the data in tensor form (instead of vector form) and replace the matrix product in conventional neural networks with the tensor product, which in effect imposes certain transformed-induced structure on the original weight matrices, e.g., a block-circulant structure; then, we apply a linear transform along a certain dimension to split the original dataset into multiple spectral subdatasets; as a result, the proposed spectral tensor network consists of parallel branches where each branch is a conventional neural network trained on a spectral subdataset with ZERO communication cost. The parallel branches are directly ensembled (i.e., the weighted sum of their outputs) to generate an overall network with substantially stronger generalization capability than that of each branch. Moreover, the proposed method enjoys a byproduct of decentralization gain in terms of memory and computation, compared with traditional networks. It is a natural yet elegant solution for heterogeneous data in federated learning (FL), where data at different nodes have different resolutions. Finally, we evaluate the proposed spectral tensor networks on the MNIST, CIFAR-10, ImageNet-1K, and ImageNet-21K datasets, respectively, to verify that they simultaneously achieve communication-free distributed learning, distributed storage reduction, parallel computation speedup, and learning with multiresolution data.
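The core trick above, a linear transform along one tensor dimension that splits a dataset into independent spectral sub-datasets whose branches never communicate, can be shown with the simplest possible two-point transform. This is a conceptual sketch under that simplification, not the paper's tensor-product layer; all names and values are illustrative.

```python
# Spectral split via a 2-point sum/difference transform (the simplest
# DFT-like transform): one dataset becomes two sub-datasets, each trained
# by its own branch with zero communication; outputs are ensembled.

def spectral_split(samples):
    """Each sample is a pair (a, b); return two spectral sub-datasets."""
    low = [(a + b) / 2 for a, b in samples]   # "DC" component -> branch 1
    high = [(a - b) / 2 for a, b in samples]  # difference component -> branch 2
    return low, high

def ensemble(branch_outputs, weights):
    # branches are trained independently; only their outputs are combined
    # by a weighted sum at inference time
    return [sum(w * o for w, o in zip(weights, outs))
            for outs in zip(*branch_outputs)]

data = [(2.0, 0.0), (4.0, 2.0)]
low, high = spectral_split(data)
out = ensemble([low, high], weights=[0.5, 0.5])
```

Because the transform is linear and invertible, no information is lost in the split; the paper's point is that each spectral branch can live on a separate machine, giving distributed storage and parallel compute for free.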

PMID:38809740 | DOI:10.1109/TNNLS.2024.3394861

Categories: Literature Watch

Brain-Inspired Learning, Perception, and Cognition: A Comprehensive Review

Wed, 2024-05-29 06:00

IEEE Trans Neural Netw Learn Syst. 2024 May 29;PP. doi: 10.1109/TNNLS.2024.3401711. Online ahead of print.

ABSTRACT

The progress of brain cognition and learning mechanisms has provided new inspiration for the next generation of artificial intelligence (AI) and provided the biological basis for the establishment of new models and methods. Brain science can effectively improve the intelligence of existing models and systems. Compared with other reviews, this article provides a comprehensive review of brain-inspired deep learning algorithms for learning, perception, and cognition from microscopic, mesoscopic, macroscopic, and super-macroscopic perspectives. First, this article introduces the brain cognition mechanism. Then, it summarizes the existing studies on brain-inspired learning and modeling from the perspectives of neural structure, cognitive module, learning mechanism, and behavioral characteristics. Next, this article introduces the potential learning directions of brain-inspired learning from four aspects: perception, cognition, understanding, and decision-making. Finally, the top-ten open problems that brain-inspired learning, perception, and cognition currently face are summarized, and the next generation of AI technology has been prospected. This work intends to provide a quick overview of the research on brain-inspired AI algorithms and to motivate future research by illuminating the latest developments in brain science.

PMID:38809737 | DOI:10.1109/TNNLS.2024.3401711

Categories: Literature Watch

Learning Attention in the Frequency Domain for Flexible Real Photograph Denoising

Wed, 2024-05-29 06:00

IEEE Trans Image Process. 2024 May 29;PP. doi: 10.1109/TIP.2024.3404253. Online ahead of print.

ABSTRACT

Recent advancements in deep learning techniques have pushed forward the frontiers of real photograph denoising. However, due to the inherent pooling operations in the spatial domain, current CNN-based denoisers are biased towards low-frequency representations and discard the high-frequency components. This leads to suboptimal visual quality, since image denoising aims to completely eliminate complex noise while recovering all fine-scale and salient information. In this work, we tackle this challenge from the frequency perspective and present a new solution pipeline, coined the frequency attention denoising network (FADNet). Our key idea is to build a learning-based frequency attention framework in which feature correlations across a broad frequency spectrum can be fully characterized, enhancing the representational power of the network across multiple frequency channels. Based on this, we design a cascade of adaptive instance residual modules (AIRMs). In each AIRM, we first transform the spatial-domain features into the frequency space. Then, a learning-based frequency attention framework is devised to explore the feature inter-dependencies in the frequency domain. In addition, we introduce an adaptive layer that leverages the guidance of the estimated noise map and intermediate features to meet the challenges of model generalization under noise discrepancy. The effectiveness of our method is demonstrated on several real camera benchmark datasets, with superior denoising performance, generalization capability, and efficiency versus the state of the art.
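The frequency-domain reweighting idea can be shown on a one-dimensional toy: transform a feature vector with the DFT, scale each frequency bin by an attention weight, and transform back. In FADNet these weights are learned; here they are fixed by hand, and the whole example is an illustration rather than the paper's module.

```python
# Toy frequency-domain attention: DFT -> per-bin reweighting -> inverse DFT.
# Attention weights are hand-fixed for illustration, not learned.
import cmath

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n for t in range(n)]

def frequency_attention(x, attn):
    X = dft(x)
    Xw = [a * v for a, v in zip(attn, X)]  # per-frequency reweighting
    return [v.real for v in idft(Xw)]

x = [1.0, 2.0, 3.0, 4.0]
keep_all = frequency_attention(x, [1.0, 1.0, 1.0, 1.0])  # identity
low_pass = frequency_attention(x, [1.0, 0.0, 0.0, 0.0])  # keep only DC
```

With all-ones attention the signal passes through unchanged; zeroing everything but the DC bin collapses the signal to its mean, the extreme form of the low-frequency bias the abstract says plain CNN pooling induces. A learned attention map sits between these extremes, amplifying informative bins instead of discarding them.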

PMID:38809730 | DOI:10.1109/TIP.2024.3404253

Categories: Literature Watch

A Novel Skip-Connection Strategy by Fusing Spatial and Channel Wise Features for Multi-Region Medical Image Segmentation

Wed, 2024-05-29 06:00

IEEE J Biomed Health Inform. 2024 May 29;PP. doi: 10.1109/JBHI.2024.3406786. Online ahead of print.

ABSTRACT

Recent methods often introduce attention mechanisms into the skip connections of U-shaped networks to capture features. However, these methods usually overlook spatial information extraction in skip connections and are inefficient at capturing spatial and channel information. This issue prompts us to reevaluate the design of the skip-connection mechanism and propose a new deep-learning network called the Fusing Spatial and Channel Attention Network, abbreviated FSCA-Net. FSCA-Net is a novel U-shaped network architecture that utilizes the Parallel Attention Transformer (PAT) to enhance the extraction of spatial and channel features in the skip-connection mechanism, further compensating for downsampling losses. We design the Cross-Attention Bridge Layer (CAB) to mitigate excessive feature and resolution loss when downsampling to the lowest level, ensuring meaningful information fusion during upsampling at that level. Finally, we construct the Dual-Path Channel Attention (DPCA) module to guide channel and spatial filtering of Transformer features, resolving ambiguities with decoder features and better reconciling the semantic inconsistencies between Transformer and U-Net decoder features before concatenation. FSCA-Net is designed explicitly for fine-grained segmentation of multiple organs and regions. Our approach achieves over a 48% reduction in FLOPs and over a 32% reduction in parameters compared with the state-of-the-art method. Moreover, FSCA-Net outperforms existing segmentation methods on seven public datasets, demonstrating exceptional performance. The code has been made available on GitHub: https://github.com/Henry991115/FSCA-Net.

PMID:38809722 | DOI:10.1109/JBHI.2024.3406786

Categories: Literature Watch

Convolutional Neural Network-Based Prediction of Axial Length Using Color Fundus Photography

Wed, 2024-05-29 06:00

Transl Vis Sci Technol. 2024 May 1;13(5):23. doi: 10.1167/tvst.13.5.23.

ABSTRACT

PURPOSE: To develop convolutional neural network (CNN)-based models for predicting the axial length (AL) using color fundus photography (CFP) and explore associated clinical and structural characteristics.

METHODS: This study enrolled 1105 fundus images from 467 participants with ALs ranging from 19.91 to 32.59 mm, obtained at National Taiwan University Hospital between 2020 and 2021. The AL measurements obtained from a scanning laser interferometer served as the gold standard. The accuracy of prediction was compared among CNN-based models with different inputs, including CFP, age, and/or sex. Heatmaps were interpreted by integrated gradients.

RESULTS: Using age, sex, and CFP as input, the mean absolute error (MAE, mean ± standard deviation) for AL prediction by the model was 0.771 ± 0.128 mm, outperforming models that used age and sex alone (1.263 ± 0.115 mm; P < 0.001) and CFP alone (0.831 ± 0.216 mm; P = 0.016) by 39.0% and 7.31%, respectively. The removal of relatively poor-quality CFPs resulted in a slight MAE reduction to 0.759 ± 0.120 mm without statistical significance (P = 0.24). The inclusion of age alongside CFP improved prediction accuracy by 5.59% (P = 0.043), while adding sex produced no significant improvement (P = 0.41). The optic disc and temporal peripapillary area were highlighted as the focused areas on the heatmaps.

CONCLUSIONS: Deep learning-based prediction of AL using CFP was fairly accurate and enhanced by age inclusion. The optic disc and temporal peripapillary area may contain crucial structural information for AL prediction in CFP.

TRANSLATIONAL RELEVANCE: This study might aid AL assessments and the understanding of the morphologic characteristics of the fundus related to AL.

PMID:38809531 | DOI:10.1167/tvst.13.5.23

Categories: Literature Watch

Signal separation of simultaneous dual-tracer PET imaging based on global spatial information and channel attention

Wed, 2024-05-29 06:00

EJNMMI Phys. 2024 May 29;11(1):47. doi: 10.1186/s40658-024-00649-9.

ABSTRACT

BACKGROUND: Simultaneous dual-tracer positron emission tomography (PET) imaging efficiently provides more complete information for disease diagnosis. Signal separation has long been a challenge in dual-tracer PET imaging. To predict the single-tracer images, we proposed a separation network based on global spatial information and channel attention and connected it to FBP-Net to form the FBPnet-Sep model.

RESULTS: Experiments using simulated dynamic PET data were conducted to: (1) compare the proposed FBPnet-Sep model to the Sep-FBPnet model and the existing Multi-task CNN, (2) verify the effectiveness of the modules incorporated in the FBPnet-Sep model, (3) investigate the generalization of the FBPnet-Sep model to low-dose data, and (4) investigate the application of the FBPnet-Sep model to multiple tracer combinations with decay corrections. Compared to the Sep-FBPnet model and the Multi-task CNN, the FBPnet-Sep model reconstructed single-tracer images with higher structural similarity and peak signal-to-noise ratio and lower mean squared error, and reconstructed time-activity curves with lower bias and variation in most regions. Excluding the Inception or channel attention module resulted in degraded image quality. The FBPnet-Sep model showed acceptable performance when applied to low-dose data. Additionally, it could handle multiple tracer combinations. The quality of the predicted images, as well as the accuracy of the derived time-activity curves and macro-parameters, was slightly improved by incorporating a decay correction module.
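For reference, the image-quality metrics named above (mean squared error and peak signal-to-noise ratio) can be computed for flattened image arrays as follows; this is a generic illustrative sketch, not the authors' evaluation code:

```python
import math

def mse(ref, pred):
    """Mean squared error between two equal-length flat image arrays."""
    return sum((r - p) ** 2 for r, p in zip(ref, pred)) / len(ref)

def psnr(ref, pred, max_val=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the
    reference. max_val is the maximum possible pixel intensity."""
    e = mse(ref, pred)
    return float('inf') if e == 0 else 10.0 * math.log10(max_val ** 2 / e)
```

Structural similarity (SSIM) additionally compares local luminance, contrast, and structure statistics, which is why it is usually reported alongside these pixel-wise metrics.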

CONCLUSIONS: The proposed FBPnet-Sep model was considered a potential method for the reconstruction and signal separation of simultaneous dual-tracer PET imaging.

PMID:38809438 | DOI:10.1186/s40658-024-00649-9

Categories: Literature Watch

Improving Laryngoscopy Image Analysis Through Integration of Global Information and Local Features in VoFoCD Dataset

Wed, 2024-05-29 06:00

J Imaging Inform Med. 2024 May 29. doi: 10.1007/s10278-024-01068-z. Online ahead of print.

ABSTRACT

The diagnosis and treatment of vocal fold disorders rely heavily on laryngoscopy. A comprehensive vocal fold diagnosis requires accurate identification of crucial anatomical structures and potential lesions during laryngoscopy observation. However, existing approaches have yet to explore joint optimization of the decision-making process, covering object detection and image classification tasks simultaneously. In this study, we provide a new dataset, VoFoCD, with 1724 laryngology images designed explicitly for object detection and image classification in laryngoscopy images. Images in the VoFoCD dataset are categorized into four classes and comprise six glottic object types. Moreover, we propose a novel Multitask Efficient trAnsformer network for Laryngoscopy (MEAL) to classify vocal fold images and detect glottic landmarks and lesions. To facilitate interpretability for clinicians, MEAL provides attention maps that visualize the learned regions most relevant to its predictions, supporting explainable clinical decision-making. We also analyze our model's effectiveness in simulated clinical scenarios in which the laryngoscope shakes during the procedure. The proposed model demonstrates outstanding performance on our VoFoCD dataset. The accuracy for image classification and the mean average precision at an intersection-over-union threshold of 0.5 (mAP50) for object detection are 0.951 and 0.874, respectively. Our MEAL method integrates global knowledge, encompassing general laryngoscopy image classification, into local features, which refer to distinct anatomical regions of the vocal fold, particularly abnormal regions, including benign and malignant lesions. Our contribution can effectively aid laryngologists in identifying benign or malignant lesions of the vocal folds and in classifying images during laryngeal endoscopy.
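The mAP50 figure counts a detection as correct only when its intersection-over-union (IoU) with a ground-truth box reaches 0.5; the criterion can be sketched as follows (our illustration, not the MEAL codebase):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle; clamp negative extents to zero when disjoint.
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(pred_box, gt_box, threshold=0.5):
    """mAP50-style match criterion: IoU with ground truth >= 0.5."""
    return iou(pred_box, gt_box) >= threshold
```

Mean average precision then averages the precision-recall trade-off over confidence thresholds and object classes, using this match rule to label each detection.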

PMID:38809338 | DOI:10.1007/s10278-024-01068-z

Categories: Literature Watch

Magnetic nanoparticles for magnetic particle imaging (MPI): design and applications

Wed, 2024-05-29 06:00

Nanoscale. 2024 May 29. doi: 10.1039/d4nr01195c. Online ahead of print.

ABSTRACT

Recent advancements in medical imaging have brought forth various techniques such as magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), and ultrasound, each contributing to improved diagnostic capabilities. Most recently, magnetic particle imaging (MPI) has become a rapidly advancing imaging modality with profound implications for medical diagnostics and therapeutics. By directly detecting the magnetization response of magnetic tracers, MPI surpasses conventional imaging modalities in sensitivity and quantifiability, particularly in stem cell tracking applications. This comprehensive review explores the fundamental principles, instrumentation, magnetic nanoparticle tracer design, and applications of MPI, offering insights into recent advancements and future directions. Novel tracer designs, such as zinc-doped iron oxide nanoparticles (Zn-IONPs), exhibit enhanced performance, broadening MPI's utility. Spatial encoding strategies, scanning trajectories, and instrumentation innovations are elucidated, illuminating the technical underpinnings of MPI's evolution. Moreover, integrating machine learning and deep learning methods enhances MPI's image processing capabilities, paving the way for more efficient segmentation, quantification, and reconstruction. Superferromagnetic iron oxide nanoparticle chains (SFMIOs), a new class of MPI tracers, have further advanced imaging quality and expanded clinical applications, underscoring the promising future of this emerging imaging modality.
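The magnetization response that MPI detects is commonly modeled, for ideal superparamagnetic tracers, with the Langevin function L(x) = coth(x) - 1/x; a small illustrative sketch follows (the lumped shape parameter k is our simplification, not from the review):

```python
import math

def langevin(x):
    """Langevin function L(x) = coth(x) - 1/x, with L(0) = 0 by limit."""
    if abs(x) < 1e-6:
        return x / 3.0  # small-argument expansion avoids 0/0
    return 1.0 / math.tanh(x) - 1.0 / x

def magnetization(h_field, m_sat=1.0, k=1.0):
    """Equilibrium magnetization of an ideal superparamagnetic tracer
    under applied field h_field: saturation m_sat scaled by the Langevin
    response, with k lumping particle moment and temperature."""
    return m_sat * langevin(k * h_field)
```

The steep, nonlinear transition of this curve near zero field is what lets MPI localize tracers with a moving field-free point: only particles near that point produce a strong change in magnetization as the drive field oscillates.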

PMID:38809214 | DOI:10.1039/d4nr01195c

Categories: Literature Watch

Enhancing multimodal deep learning for improved precision and efficiency in medical diagnostics

Wed, 2024-05-29 06:00

J Eur Acad Dermatol Venereol. 2024 May 29. doi: 10.1111/jdv.20166. Online ahead of print.

NO ABSTRACT

PMID:38808949 | DOI:10.1111/jdv.20166

Categories: Literature Watch
