Deep learning

Texture preserving low dose CT image denoising using Pearson divergence

Tue, 2024-04-30 06:00

Phys Med Biol. 2024 Apr 30. doi: 10.1088/1361-6560/ad45a4. Online ahead of print.

ABSTRACT

The mean squared error (MSE), also known as $L_2$ loss, has been widely used as a loss function to optimize image denoising models due to its strong performance as a mean estimator of the Gaussian noise model. Recently, various low-dose computed tomography (LDCT) image denoising methods using deep learning combined with the MSE loss have been developed; however, this approach has been observed to suffer from the regression-to-the-mean problem, leading to over-smoothed edges and degradation of texture in the image.
Approach: To overcome this issue, we propose adding a stochastic term to the loss function to improve the texture of the denoised CT images, rather than relying on complicated networks or feature-space losses. The proposed loss function combines the MSE loss, which learns the mean distribution, with a Pearson divergence loss, which learns feature textures. Specifically, the Pearson divergence loss is computed in image space to measure the distance between the intensity distributions of denoised low-dose and normal-dose CT images. The proposed model is evaluated with a multi-metric quantitative analysis based on relative texture feature distance.
Results: Our experimental results show that the proposed Pearson divergence loss leads to a significant improvement in texture compared to the conventional MSE loss and generative adversarial network (GAN), both qualitatively and quantitatively.
Significance: Achieving consistent texture preservation in LDCT is a challenge in conventional GAN-type methods due to adversarial aspects aimed at minimizing noise while preserving texture. By incorporating the Pearson regularizer in the loss function, we can easily achieve a balance between two conflicting properties. Consistent high-quality CT images can significantly help clinicians in diagnoses and supporting researchers in the development of AI-diagnostic models.
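As a sketch of the idea, the Pearson (chi-square) divergence between the intensity distributions of the denoised and normal-dose images can be added to the MSE term as a regularizer. The following is a minimal NumPy illustration, not the authors' implementation; the histogram binning, the weight `lam`, and the epsilon smoothing are all assumptions:

```python
import numpy as np

def pearson_divergence(p, q, eps=1e-8):
    """Pearson chi-square divergence sum((p - q)^2 / q) between two
    discrete intensity distributions (histograms normalized to sum to 1)."""
    return np.sum((p - q) ** 2 / (q + eps))

def combined_loss(denoised, target, lam=0.1, bins=64):
    """MSE term plus a Pearson regularizer on the intensity histograms."""
    mse = np.mean((denoised - target) ** 2)
    lo = min(denoised.min(), target.min())
    hi = max(denoised.max(), target.max())
    p, _ = np.histogram(denoised, bins=bins, range=(lo, hi))
    q, _ = np.histogram(target, bins=bins, range=(lo, hi))
    p = p / p.sum()
    q = q / q.sum()
    return mse + lam * pearson_divergence(p, q)
```

In a real training loop the histogram would need a differentiable surrogate (e.g., soft binning); this sketch only shows the form of the objective.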

PMID:38688292 | DOI:10.1088/1361-6560/ad45a4

Categories: Literature Watch

FlexDTI: flexible diffusion gradient encoding scheme-based highly efficient diffusion tensor imaging using deep learning

Tue, 2024-04-30 06:00

Phys Med Biol. 2024 Apr 30. doi: 10.1088/1361-6560/ad45a5. Online ahead of print.

ABSTRACT


Most deep neural network-based diffusion tensor imaging (DTI) methods require the number and directions of the diffusion gradients in the data being reconstructed to match those in the training data. This work aims to develop and evaluate a novel dynamic-convolution-based method, FlexDTI, for highly efficient diffusion tensor reconstruction with flexible diffusion-encoding gradient schemes.
Approach:
FlexDTI was developed to achieve high-quality DTI parametric mapping with flexible number and directions of diffusion encoding gradients. The method used dynamic convolution kernels to embed diffusion gradient direction information into feature maps of the corresponding diffusion signal. Furthermore, it realized the generalization of a flexible number of diffusion gradient directions by setting the maximum number of input channels of the network. The network was trained and tested using datasets from the Human Connectome Project and local hospitals. Results from FlexDTI and other advanced tensor parameter estimation methods were compared.
Main results:
Compared to other methods, FlexDTI successfully achieves high-quality diffusion tensor-derived parameters even if the number and directions of diffusion encoding gradients change. It reduces normalized root mean squared error (NRMSE) by about 50% on fractional anisotropy (FA) and 15% on mean diffusivity (MD), compared with the state-of-the-art deep learning method with flexible diffusion encoding gradient scheme.
Significance:
FlexDTI can effectively learn diffusion gradient direction information to achieve generalized DTI reconstruction with a flexible diffusion gradient scheme. Both flexibility and reconstruction quality are taken into account in this network.
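The core idea of embedding gradient-direction information via dynamic convolution can be sketched as follows. This is a hypothetical NumPy illustration, not the FlexDTI code: a linear kernel generator (`W`, `b`) maps each unit gradient direction to per-channel 1x1 weights, and averaging over directions makes the output invariant to their number and ordering:

```python
import numpy as np

def dynamic_kernel(direction, W, b):
    """Generate a per-direction 1x1 convolution kernel from the (3,)
    gradient direction via a learned linear map (W: (C, 3), b: (C,))."""
    return W @ direction + b                    # (C,) channel weights

def embed_directions(signals, directions, W, b):
    """signals: (N_dir, H, W_img) diffusion-weighted images;
    directions: (N_dir, 3). Each image is modulated by its own
    direction-conditioned kernel and the results are averaged, so the
    output no longer depends on the number or ordering of gradients."""
    C = W.shape[0]
    H, Wd = signals.shape[1:]
    out = np.zeros((C, H, Wd))
    for s, d in zip(signals, directions):
        k = dynamic_kernel(d, W, b)             # (C,)
        out += k[:, None, None] * s[None]       # broadcast 1x1 conv
    return out / len(signals)
```

The permutation invariance of the averaged embedding is what allows a flexible gradient scheme at inference time.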

PMID:38688288 | DOI:10.1088/1361-6560/ad45a5

Categories: Literature Watch

Automatic detection of bumblefoot in cage-free hens using computer vision technologies

Tue, 2024-04-30 06:00

Poult Sci. 2024 Apr 20;103(7):103780. doi: 10.1016/j.psj.2024.103780. Online ahead of print.

ABSTRACT

Cage-free (CF) housing systems are expected to be the dominant egg production system in North America and European Union countries by 2030. Within these systems, bumblefoot (a common bacterial infection and chronic inflammatory reaction) is mostly observed in hens reared on litter floors. It causes pain and stress in hens and is detrimental to their welfare. For instance, hens with bumblefoot have difficulty moving freely, thus hindering access to feeders and drinkers. However, it is technically challenging to detect hens with bumblefoot, and no automatic methods have been applied for hens' bumblefoot detection (BFD), especially in its early stages. This study aimed to develop and test artificial intelligence methods (i.e., deep learning models) to detect hens' bumblefoot condition in a CF environment under various settings such as epochs (number of times the entire dataset passes through the network during training), batch size (number of data samples processed per iteration during training), and camera height. The performance of 3 newly developed deep learning models (i.e., YOLOv5s-BFD, YOLOv5m-BFD, & YOLOv5x-BFD) was compared in detecting hens with bumblefoot in CF environments. The results show that the YOLOv5m-BFD model had the highest precision (93.7%), recall (84.6%), mAP@0.50 (90.9%), mAP@0.50:0.95 (51.8%), and F1-score (89.0%) compared with the other models. The YOLOv5m-BFD model trained at 400 epochs with a batch size of 16 is recommended for bumblefoot detection in laying hens. This study provides a basis for developing an automatic bumblefoot detection system in commercial CF houses. This model will be modified and trained to detect the occurrence of bumblefoot in broilers in the future.
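As a quick consistency check, the reported F1-score follows from the reported precision and recall via the harmonic mean:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Reported YOLOv5m-BFD metrics: precision 93.7%, recall 84.6%
f1 = f1_score(0.937, 0.846)   # ~0.889, i.e. the reported 89.0% after rounding
```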

PMID:38688138 | DOI:10.1016/j.psj.2024.103780

Categories: Literature Watch

Nodule-CLIP: Lung nodule classification based on multi-modal contrastive learning

Tue, 2024-04-30 06:00

Comput Biol Med. 2024 Apr 26;175:108505. doi: 10.1016/j.compbiomed.2024.108505. Online ahead of print.

ABSTRACT

The latest developments in deep learning have demonstrated the importance of CT medical imaging for the classification of pulmonary nodules. However, challenges remain in fully leveraging the relevant medical annotations of pulmonary nodules and distinguishing between the benign and malignant labels of adjacent nodules. Therefore, this paper proposes the Nodule-CLIP model, which deeply mines the relationships between CT images, complex attributes of lung nodules, and benign/malignant attributes through a contrastive learning method, and uses their similarities and differences to optimize the image feature extraction network and improve its ability to distinguish similar lung nodules. Firstly, we segment the 3D lung nodule information by U-Net to reduce the interference caused by the background of lung nodules and focus on the lung nodule images. Secondly, the image features, class features, and complex attribute features are aligned by contrastive learning and a loss function in Nodule-CLIP to optimize lung nodule image representations and improve classification ability. A series of testing and ablation experiments were conducted on the public dataset LIDC-IDRI, yielding a final benign/malignant classification rate of 90.6% and a recall rate of 92.81%. The experimental results show the advantages of this method in terms of lung nodule classification as well as interpretability.
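The alignment of image and attribute features by contrastive learning can be sketched with a standard symmetric InfoNCE objective, a common choice in CLIP-style models; the abstract does not specify the exact loss, so this form is an assumption:

```python
import numpy as np

def info_nce(img_emb, attr_emb, temperature=0.07):
    """Symmetric InfoNCE loss aligning L2-normalized image and attribute
    embeddings; matched pairs share the same row index."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    attr = attr_emb / np.linalg.norm(attr_emb, axis=1, keepdims=True)
    logits = img @ attr.T / temperature           # (N, N) similarity matrix

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)      # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))            # targets on the diagonal

    return 0.5 * (xent(logits) + xent(logits.T))
```

Minimizing this loss pulls each image embedding toward its own attribute description and pushes it away from those of other nodules.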

PMID:38688129 | DOI:10.1016/j.compbiomed.2024.108505

Categories: Literature Watch

A dual data stream hybrid neural network for classifying pathological images of lung adenocarcinoma

Tue, 2024-04-30 06:00

Comput Biol Med. 2024 Apr 24;175:108519. doi: 10.1016/j.compbiomed.2024.108519. Online ahead of print.

ABSTRACT

Lung cancer has seriously threatened human health due to its high lethality and morbidity. Lung adenocarcinoma, in particular, is one of the most common subtypes of lung cancer. Pathological diagnosis is regarded as the gold standard for cancer diagnosis. However, the traditional manual screening of lung cancer pathology images is time-consuming and error-prone. Computer-aided diagnostic systems have emerged to solve this problem. Current research methods are unable to fully exploit the beneficial features inherent within patches, and they are characterized by high model complexity and significant computational effort. In this study, a deep learning framework called Multi-Scale Network (MSNet) is proposed for the automatic detection of lung adenocarcinoma pathology images. MSNet is designed to efficiently harness the valuable features within data patches, while simultaneously reducing model complexity, computational demands, and storage space requirements. The MSNet framework employs a dual data stream input method. In this input method, MSNet combines Swin Transformer and MLP-Mixer models to address global information between patches and the local information within each patch. Subsequently, MSNet uses the Multilayer Perceptron (MLP) module to fuse local and global features and perform classification to output the final detection results. In addition, a dataset of lung adenocarcinoma pathology images containing three categories is created for training and testing the MSNet framework. Experimental results show that the diagnostic accuracy of MSNet for lung adenocarcinoma pathology images is 96.55%. In summary, MSNet has high classification performance and shows effectiveness and potential in the classification of lung adenocarcinoma pathology images.
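The dual-stream fusion step can be sketched as a small MLP head over the concatenated global (Swin Transformer branch) and local (MLP-Mixer branch) features. This is a toy NumPy forward pass under assumed dimensions, not the MSNet code:

```python
import numpy as np

def fuse_and_classify(global_feat, local_feat, W1, b1, W2, b2):
    """Concatenate the global (transformer-branch) and local (mixer-branch)
    feature vectors and classify with a two-layer MLP head; the output is
    a softmax distribution over the three pathology categories."""
    x = np.concatenate([global_feat, local_feat])
    h = np.maximum(0.0, W1 @ x + b1)              # ReLU hidden layer
    logits = W2 @ h + b2
    e = np.exp(logits - logits.max())             # stable softmax
    return e / e.sum()
```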

PMID:38688128 | DOI:10.1016/j.compbiomed.2024.108519

Categories: Literature Watch

Triple-task mutual consistency for semi-supervised 3D medical image segmentation

Tue, 2024-04-30 06:00

Comput Biol Med. 2024 Apr 23;175:108506. doi: 10.1016/j.compbiomed.2024.108506. Online ahead of print.

ABSTRACT

Semi-supervised deep learning algorithm is an effective means of medical image segmentation. Among these methods, multi-task learning with consistency regularization has achieved outstanding results. However, most of the existing methods usually simply embed the Signed Distance Map (SDM) task into the network, which underestimates the potential ability of SDM in edge awareness and leads to excessive dependence between tasks. In this work, we propose a novel triple-task mutual consistency (TTMC) framework to enhance shape and edge awareness capabilities, and overcome the task dependence problem underestimated in previous work. Specifically, we innovatively construct the Signed Attention Map (SAM), a novel fusion image with attention mechanism, and use it as an auxiliary task for segmentation to enhance the edge awareness ability. Then we implement a triple-task deep network, which jointly predicts the voxel-wise classification map, the Signed Distance Map and the Signed Attention Map. In our proposed framework, an optimized differentiable transformation layer associates SDM with voxel-wise classification map and SAM prediction, while task-level consistency regularization utilizes unlabeled data in an unsupervised manner. Evaluated on the public Left Atrium dataset and NIH Pancreas dataset, our proposed framework achieves significant performance gains by effectively utilizing unlabeled data, outperforming recent state-of-the-art semi-supervised segmentation methods. Code is available at https://github.com/Saocent/TTMC.
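A common way to make the SDM-to-segmentation transformation differentiable is a steep sigmoid of the signed distance; the abstract does not give the exact layer, so the form and the steepness `k` below are assumptions:

```python
import numpy as np

def sdm_to_probability(sdm, k=1500):
    """Smooth, differentiable mapping from a signed distance map to a
    foreground probability: large negative distances (inside the object,
    by the common convention) map to ~1, positive distances to ~0."""
    return 1.0 / (1.0 + np.exp(np.clip(k * sdm, -60, 60)))

def consistency_loss(prob_from_sdm, prob_direct):
    """Task-level consistency between the transformed SDM prediction and
    the direct voxel-wise classification output; usable on unlabeled data."""
    return np.mean((prob_from_sdm - prob_direct) ** 2)
```

Because the consistency term needs no labels, it is what lets the framework exploit unlabeled volumes.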

PMID:38688127 | DOI:10.1016/j.compbiomed.2024.108506

Categories: Literature Watch

MulStack: An ensemble learning prediction model of multilabel mRNA subcellular localization

Tue, 2024-04-30 06:00

Comput Biol Med. 2024 Mar 16;175:108289. doi: 10.1016/j.compbiomed.2024.108289. Online ahead of print.

ABSTRACT

Subcellular localization of mRNA is related to protein synthesis, cell polarity, cell movement, and other biological regulation mechanisms. The distribution of mRNAs across subcellular compartments is similar to that of proteins, and most mRNAs are distributed in multiple compartments. Recently, some computational methods have been designed to predict the subcellular localization of mRNA. However, these methods employed only a single level of mRNA features and did not use the position encoding of nucleotides in mRNA. In this paper, an ensemble learning prediction model named MulStack is proposed, based on random forest and deep learning, for multilabel mRNA subcellular localization. The proposed method employs two levels of mRNA features, sequence-level and residue-level, and position encoding is employed for the first time in the field of mRNA subcellular localization. A random forest is employed to learn sequence-level features, while a deep learning model learns sequence-level features together with residue-level features combined with position encoding. The outputs of the random forest and the deep learning model are combined by a weighted sum to produce the prediction probability. Compared with existing methods, the results show that MulStack is the best at localizing the nucleus, cytosol, and exosome. In addition, position weight matrices (PWMs) extracted by the convolutional neural networks (CNNs) can be matched with known RNA-binding protein motifs. Gene Ontology (GO) enrichment analysis shows the biological processes, molecular functions, and cellular components of mRNA genes. The MulStack prediction web server is freely accessible at http://bliulab.net/MulStack.
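The final ensembling step, as described, is a weighted sum of the random-forest and deep-model probabilities followed by per-label thresholding; a minimal sketch (the weight `w` and the threshold are assumptions):

```python
import numpy as np

def ensemble_predict(p_rf, p_dl, w=0.5, threshold=0.5):
    """Weighted sum of random-forest and deep-model probabilities for
    multilabel subcellular localization; each row is one mRNA, each
    column one compartment. Returns the fused probabilities and the
    thresholded multilabel prediction."""
    p_rf = np.asarray(p_rf)
    p_dl = np.asarray(p_dl)
    p = w * p_rf + (1.0 - w) * p_dl
    return p, (p >= threshold).astype(int)
```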

PMID:38688123 | DOI:10.1016/j.compbiomed.2024.108289

Categories: Literature Watch

DISSECT: deep semi-supervised consistency regularization for accurate cell type fraction and gene expression estimation

Tue, 2024-04-30 06:00

Genome Biol. 2024 Apr 30;25(1):112. doi: 10.1186/s13059-024-03251-5.

ABSTRACT

Cell deconvolution is the estimation of cell type fractions and cell type-specific gene expression from mixed data. An unmet challenge in cell deconvolution is the scarcity of realistic training data and the domain shift often observed in synthetic training data. Here, we show that two novel deep neural networks with simultaneous consistency regularization of the target and training domains significantly improve deconvolution performance. Our algorithm, DISSECT, outperforms competing algorithms in cell fraction and gene expression estimation by up to 14 percentage points. DISSECT can be easily adapted to other biomedical data types, as exemplified by our proteomic deconvolution experiments.

PMID:38689377 | DOI:10.1186/s13059-024-03251-5

Categories: Literature Watch

One-shot neuroanatomy segmentation through online data augmentation and confidence aware pseudo label

Tue, 2024-04-30 06:00

Med Image Anal. 2024 Apr 25;95:103182. doi: 10.1016/j.media.2024.103182. Online ahead of print.

ABSTRACT

Recently, deep learning-based brain segmentation methods have achieved great success. However, most approaches focus on supervised segmentation, which requires many high-quality labeled images. In this paper, we focus on one-shot segmentation, aiming to learn from one labeled image and a few unlabeled images. We propose an end-to-end unified network that jointly performs deformation modeling and segmentation. Our network consists of a shared encoder, a deformation modeling head, and a segmentation head. In the training phase, the atlas and unlabeled images are input to the encoder to obtain multi-scale features. The features are then fed to the multi-scale deformation modeling module to estimate the atlas-to-image deformation field. The deformation modeling module performs the estimation at the feature level in a coarse-to-fine manner. We then employ the field to generate an augmented image pair through online data augmentation. We do not apply any appearance transformations because the shared encoder can capture appearance variations. Finally, we adopt a supervised segmentation loss for the augmented image. Considering that the unlabeled images still contain rich information, we introduce confidence-aware pseudo labels for them to further boost segmentation performance. We validate our network on three benchmark datasets. Experimental results demonstrate that our network significantly outperforms other deep single-atlas-based and traditional multi-atlas-based segmentation methods. Notably, the second dataset was collected from multiple centers, and our network still achieves promising segmentation performance on both the seen and unseen test sets, revealing its robustness. The source code will be available at https://github.com/zhangliutong/brainseg.
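Online data augmentation with the estimated deformation field amounts to warping the atlas image (and, identically, its label map) by a dense displacement field. A minimal nearest-neighbour NumPy sketch, not the authors' (typically trilinear, differentiable) implementation:

```python
import numpy as np

def warp(image, field):
    """Warp a 2D image with a dense displacement field of shape (2, H, W)
    using nearest-neighbour sampling; applying the same field to the atlas
    label map yields the augmented (image, label) training pair."""
    H, W = image.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    src_y = np.clip(np.rint(ys + field[0]).astype(int), 0, H - 1)
    src_x = np.clip(np.rint(xs + field[1]).astype(int), 0, W - 1)
    return image[src_y, src_x]
```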

PMID:38688039 | DOI:10.1016/j.media.2024.103182

Categories: Literature Watch

Deciphering the Coevolutionary Dynamics of L2 beta-Lactamases via Deep Learning

Tue, 2024-04-30 06:00

J Chem Inf Model. 2024 Apr 30. doi: 10.1021/acs.jcim.4c00189. Online ahead of print.

ABSTRACT

L2 β-lactamases, serine-based class A β-lactamases expressed by Stenotrophomonas maltophilia, play a pivotal role in antimicrobial resistance (AMR). However, limited studies have been conducted on these important enzymes. To understand the coevolutionary dynamics of L2 β-lactamase, innovative computational methodologies, including adaptive sampling molecular dynamics simulations, and deep learning methods (convolutional variational autoencoders and BindSiteS-CNN) explored conformational changes and correlations within the L2 β-lactamase family together with other representative class A enzymes including SME-1 and KPC-2. This work also investigated the potential role of hydrophobic nodes and binding site residues in facilitating the functional mechanisms. The convergence of analytical approaches utilized in this effort yielded comprehensive insights into the dynamic behavior of the β-lactamases, specifically from an evolutionary standpoint. In addition, this analysis presents a promising approach for understanding how the class A β-lactamases evolve in response to environmental pressure and establishes a theoretical foundation for forthcoming endeavors in drug development aimed at combating AMR.

PMID:38687957 | DOI:10.1021/acs.jcim.4c00189

Categories: Literature Watch

Study on the differential diagnosis of benign and malignant breast lesions using a deep learning model based on multimodal images

Tue, 2024-04-30 06:00

J Cancer Res Ther. 2024 Apr 1;20(2):625-632. doi: 10.4103/jcrt.jcrt_1796_23. Epub 2024 Apr 30.

ABSTRACT

OBJECTIVE: To establish a multimodal model for distinguishing benign and malignant breast lesions.

MATERIALS AND METHODS: Clinical data, mammography, and MRI images (including T2WI, diffusion-weighted images (DWI), apparent diffusion coefficient (ADC), and DCE-MRI images) of 132 benign and breast cancer patients were analyzed retrospectively. The region of interest (ROI) in each image was marked and segmented using MATLAB software. The mammography, T2WI, DWI, ADC, and DCE-MRI models based on the ResNet34 network were trained. Using an integrated learning method, the five models were used as a basic model, and voting methods were used to construct a multimodal model. The dataset was divided into a training set and a prediction set. The accuracy, sensitivity, specificity, positive predictive value, and negative predictive value of the model were calculated. The diagnostic efficacy of each model was analyzed using a receiver operating characteristic curve (ROC) and an area under the curve (AUC). The diagnostic value was determined by the DeLong test with statistically significant differences set at P < 0.05.
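The voting step described above can be sketched as a simple majority vote over the five base models' binary predictions (the exact voting rule is not specified in the abstract, so majority voting is an assumption):

```python
import numpy as np

def majority_vote(predictions):
    """predictions: (n_models, n_cases) binary labels (0 = benign,
    1 = malignant) from the five single-modality ResNet34 models;
    the multimodal label is the majority class per case."""
    preds = np.asarray(predictions)
    votes = preds.sum(axis=0)
    return (votes > preds.shape[0] / 2).astype(int)
```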

RESULTS: We evaluated the ability of the model to classify benign and malignant tumors using the test set. The AUC values of the multimodal, mammography, T2WI, DWI, ADC, and DCE-MRI models were 0.943, 0.645, 0.595, 0.905, 0.900, and 0.865, respectively. The diagnostic ability of the multimodal model was significantly higher than that of the mammography and T2WI models; however, there was no significant difference between the multimodal model and the DWI, ADC, or DCE-MRI models.

CONCLUSION: Our deep learning model based on multimodal image training has practical value for the diagnosis of benign and malignant breast lesions.

PMID:38687933 | DOI:10.4103/jcrt.jcrt_1796_23

Categories: Literature Watch

Deep multiple instance learning versus conventional deep single instance learning for interpretable oral cancer detection

Tue, 2024-04-30 06:00

PLoS One. 2024 Apr 30;19(4):e0302169. doi: 10.1371/journal.pone.0302169. eCollection 2024.

ABSTRACT

The current medical standard for setting an oral cancer (OC) diagnosis is histological examination of a tissue sample taken from the oral cavity. This process is time-consuming and more invasive than an alternative approach of acquiring a brush sample followed by cytological analysis. Using a microscope, skilled cytotechnologists are able to detect changes due to malignancy; however, introducing this approach into clinical routine is associated with challenges such as a lack of resources and experts. To design a trustworthy OC detection system that can assist cytotechnologists, we are interested in deep learning based methods that can reliably detect cancer, given only per-patient labels (thereby minimizing annotation bias), and also provide information regarding which cells are most relevant for the diagnosis (thereby enabling supervision and understanding). In this study, we perform a comparison of two approaches suitable for OC detection and interpretation: (i) conventional single instance learning (SIL) approach and (ii) a modern multiple instance learning (MIL) method. To facilitate systematic evaluation of the considered approaches, we, in addition to a real OC dataset with patient-level ground truth annotations, also introduce a synthetic dataset-PAP-QMNIST. This dataset shares several properties of OC data, such as image size and large and varied number of instances per bag, and may therefore act as a proxy model of a real OC dataset, while, in contrast to OC data, it offers reliable per-instance ground truth, as defined by design. PAP-QMNIST has the additional advantage of being visually interpretable for non-experts, which simplifies analysis of the behavior of methods. For both OC and PAP-QMNIST data, we evaluate performance of the methods utilizing three different neural network architectures. 
Our study indicates, somewhat surprisingly, that on both synthetic and real data, the performance of the SIL approach is better than or equal to that of the MIL approach. Visual examination by a cytotechnologist indicates that both methods manage to identify cells that deviate from normality, including malignant cells as well as those suspicious for dysplasia. We share the code as open source.
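The difference between the two aggregation schemes can be sketched at the bag (patient) level: SIL classifies each cell independently and then aggregates the per-cell decisions, while a max-pooling MIL variant scores the bag by its most suspicious instance. Both functions below are simplified illustrations, not the study's models:

```python
import numpy as np

def sil_bag_prediction(instance_probs, threshold=0.5, min_positive=1):
    """SIL: classify each cell independently, then call the patient (bag)
    positive if enough instances exceed the threshold."""
    return int((np.asarray(instance_probs) >= threshold).sum() >= min_positive)

def mil_bag_prediction(instance_probs):
    """MIL (max-pooling variant): the bag score is the probability of the
    most suspicious instance; attention-based MIL would instead use a
    learned weighted average."""
    return float(np.asarray(instance_probs).max())
```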

PMID:38687694 | DOI:10.1371/journal.pone.0302169

Categories: Literature Watch

Radiation dose estimation with multiple artificial neural networks in dicentric chromosome assay

Tue, 2024-04-30 06:00

Int J Radiat Biol. 2024 Apr 30:1-10. doi: 10.1080/09553002.2024.2338531. Online ahead of print.

ABSTRACT

PURPOSE: The dicentric chromosome assay (DCA), often referred to as the 'gold standard' in radiation dose estimation, exhibits significant challenges as a consequence of its labor-intensive nature and dependency on expert knowledge. Existing automated technologies face limitations in accurately identifying dicentric chromosomes (DCs), resulting in decreased precision for radiation dose estimation. Furthermore, in the process of identifying DCs through automatic or semi-automatic methods, the resulting distribution could demonstrate under-dispersion or over-dispersion, which results in significant deviations from the Poisson distribution. In response to these issues, we developed an algorithm that employs deep learning to automatically identify chromosomes and perform fully automatic and accurate estimation of diverse radiation doses, adhering to a Poisson distribution.

MATERIALS AND METHODS: The dataset utilized for the dose estimation algorithm was generated from 30 healthy donors, with samples created across seven doses, ranging from 0 to 4 Gy. The procedure encompasses several steps: extracting images for dose estimation, counting chromosomes, and detecting DC and fragments. To accomplish these tasks, we utilize a diverse array of artificial neural networks (ANNs). The identification of DCs was accomplished using a detection mechanism that integrates both deep learning-based object detection and classification methods. Based on these detection results, dose-response curves were constructed. A dose estimation was carried out by combining a regression-based ANN with the Monte-Carlo method.

RESULTS: In the process of extracting images for dose analysis and identifying DCs, an under-dispersion tendency was observed. To rectify the discrepancy, a classification ANN was employed to verify the DC detection results. With this approach, 32 of the initial 35 data points satisfied the Poisson distribution criteria. In the subsequent stage, dose-response curves were constructed using data from 25 donors. Data from the remaining five donors were used for dose estimation, which was subsequently calibrated with a regression-based ANN. Of the 23 points, 22 fell within their respective confidence intervals at p < .05 (95%), except for those associated with doses below 0.5 Gy, where accurate calculation was obstructed by numerical issues. The accuracy of dose estimation was improved for all radiation levels, with the exception of 1 Gy.

CONCLUSIONS: This study successfully demonstrates a high-precision dose estimation method across a general range up to 4 Gy through fully automated detection of DCs, adhering strictly to Poisson distribution. Incorporating multiple ANNs confirms the ability to perform fully automated radiation dose estimation. This approach is particularly advantageous in scenarios such as large-scale radiological incidents, improving operational efficiency and speeding up procedures while maintaining consistency in assessments. Moreover, it reduces potential human error and enhances the reliability of results.
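Adherence to the Poisson distribution is commonly checked with the dispersion-index u-test from the biodosimetry literature; the exact formula convention below is an assumption and should be verified against the reference manual used:

```python
import numpy as np

def dispersion_u(counts):
    """Poisson dispersion u-test for dicentric counts per cell (one common
    form from the biodosimetry literature; check the exact convention
    before use). |u| > 1.96 suggests a departure from Poisson at p < .05:
    positive u indicates over-dispersion, negative u under-dispersion."""
    counts = np.asarray(counts, dtype=float)
    n = counts.size
    mean = counts.mean()
    var = counts.var(ddof=1)
    d = var / mean                   # dispersion index, ~1 for Poisson data
    return (d - 1.0) * np.sqrt((n - 1) / (2.0 * (1.0 - 1.0 / (n * mean))))
```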

PMID:38687685 | DOI:10.1080/09553002.2024.2338531

Categories: Literature Watch

A Colorectal Coordinate-Driven Method for Colorectum and Colorectal Cancer Segmentation in Conventional CT Scans

Tue, 2024-04-30 06:00

IEEE Trans Neural Netw Learn Syst. 2024 Apr 30;PP. doi: 10.1109/TNNLS.2024.3386610. Online ahead of print.

ABSTRACT

Automated colorectal cancer (CRC) segmentation in medical imaging is the key to achieving automation of CRC detection, staging, and treatment response monitoring. Compared with magnetic resonance imaging (MRI) and computed tomography colonography (CTC), conventional computed tomography (CT) has enormous potential because of its broad implementation, superiority for the hollow viscera (colon), and convenience without needing bowel preparation. However, the segmentation of CRC in conventional CT is more challenging due to the difficulties presenting with the unprepared bowel, such as distinguishing the colorectum from other structures with similar appearance and distinguishing the CRC from the contents of the colorectum. To tackle these challenges, we introduce DeepCRC-SL, the first automated segmentation algorithm for CRC and colorectum in conventional contrast-enhanced CT scans. We propose a topology-aware deep learning-based approach, which builds a novel 1-D colorectal coordinate system and encodes each voxel of the colorectum with a relative position along the coordinate system. We then induce an auxiliary regression task to predict the colorectal coordinate value of each voxel, aiming to integrate global topology into the segmentation network and thus improve the colorectum's continuity. Self-attention layers are utilized to capture global contexts for the coordinate regression task and enhance the ability to differentiate CRC and colorectum tissues. Moreover, a coordinate-driven self-learning (SL) strategy is introduced to leverage a large amount of unlabeled data to improve segmentation performance. We validate the proposed approach on a dataset including 227 labeled and 585 unlabeled CRC cases by fivefold cross-validation. 
Experimental results demonstrate that our method outperforms recent related segmentation methods, achieving a segmentation accuracy (DSC) of 0.669 for CRC and 0.892 for the colorectum, reaching the performance (0.639 and 0.890, respectively) of a medical resident with two years of specialized CRC imaging fellowship.
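The 1-D colorectal coordinate system can be sketched as a normalized arc-length parameterization of the colorectal centerline, with each voxel then assigned the coordinate of its nearest centerline point (that assignment is omitted here). A minimal NumPy sketch, not the authors' construction:

```python
import numpy as np

def centerline_coordinate(points):
    """Assign each centerline point a relative position in [0, 1] by
    normalized cumulative arc length, giving the 1-D coordinate that the
    auxiliary regression task would predict per voxel."""
    pts = np.asarray(points, dtype=float)
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)  # segment lengths
    s = np.concatenate([[0.0], np.cumsum(seg)])          # cumulative length
    return s / s[-1]
```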

PMID:38687670 | DOI:10.1109/TNNLS.2024.3386610

Categories: Literature Watch

Zero-Shot Neural Architecture Search: Challenges, Solutions, and Opportunities

Tue, 2024-04-30 06:00

IEEE Trans Pattern Anal Mach Intell. 2024 Apr 30;PP. doi: 10.1109/TPAMI.2024.3395423. Online ahead of print.

ABSTRACT

Recently, zero-shot (or training-free) Neural Architecture Search (NAS) approaches have been proposed to liberate NAS from the expensive training process. The key idea behind zero-shot NAS approaches is to design proxies that can predict the accuracy of some given networks without training the network parameters. The proxies proposed so far are usually inspired by recent progress in theoretical understanding of deep learning and have shown great potential on several datasets and NAS benchmarks. This paper aims to comprehensively review and compare the state-of-the-art (SOTA) zero-shot NAS approaches, with an emphasis on their hardware awareness. To this end, we first review the mainstream zero-shot proxies and discuss their theoretical underpinnings. We then compare these zero-shot proxies through large-scale experiments and demonstrate their effectiveness in both hardware-aware and hardware-oblivious NAS scenarios. Finally, we point out several promising ideas to design better proxies. Our source code and the list of related papers are available on https://github.com/SLDGroup/survey-zero-shot-nas.

PMID:38687659 | DOI:10.1109/TPAMI.2024.3395423

Categories: Literature Watch

A Deep Learning Approach to Estimate Multi-Level Mental Stress from EEG using Serious Games

Tue, 2024-04-30 06:00

IEEE J Biomed Health Inform. 2024 Apr 30;PP. doi: 10.1109/JBHI.2024.3395548. Online ahead of print.

ABSTRACT

Stress is revealed by the inability of individuals to cope with their environment, which is frequently evidenced by a failure to achieve their full potential in tasks or goals. This study aims to assess the feasibility of estimating the level of stress a user perceives during a specific task through an electroencephalographic (EEG) system. This system is integrated with a Serious Game consisting of a multi-level stress driving tool, and Deep Learning (DL) neural networks are used for classification. The game involves controlling a vehicle to dodge obstacles, with the number of obstacles increasing with difficulty. Assuming a direct correlation between the difficulty level of the game and the stress level of the user, a recurrent neural network (RNN) based on gated recurrent units (GRU) was used to classify the different levels of stress. The results show that the RNN model is able to predict stress levels above the current state of the art, with up to 94% accuracy in some cases, suggesting that the use of EEG systems in combination with Serious Games and DL represents a promising technique for the prediction and classification of mental stress levels.
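The GRU units underlying the classifier follow the standard gate equations; a single-step NumPy sketch (biases omitted for brevity, dimensions assumed):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    """One GRU step in the standard formulation. params holds the weight
    matrices (Wz, Uz, Wr, Ur, Wh, Uh); in the stress classifier these
    steps would run over the EEG sequence before a softmax head."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(Wz @ x + Uz @ h)               # update gate
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    return (1.0 - z) * h + z * h_tilde         # interpolated new state
```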

PMID:38687658 | DOI:10.1109/JBHI.2024.3395548

Categories: Literature Watch

PCNet: Prior Category Network for CT Universal Segmentation Model

Tue, 2024-04-30 06:00

IEEE Trans Med Imaging. 2024 Apr 30;PP. doi: 10.1109/TMI.2024.3395349. Online ahead of print.

ABSTRACT

Accurate segmentation of anatomical structures in Computed Tomography (CT) images is crucial for clinical diagnosis, treatment planning, and disease monitoring. The present deep learning segmentation methods are hindered by factors such as data scale and model size. Inspired by how doctors identify tissues, we propose a novel approach, the Prior Category Network (PCNet), that boosts segmentation performance by leveraging prior knowledge between different categories of anatomical structures. Our PCNet comprises three key components: prior category prompt (PCP), hierarchy category system (HCS), and hierarchy category loss (HCL). PCP utilizes Contrastive Language-Image Pretraining (CLIP), along with attention modules, to systematically define the relationships between anatomical categories as identified by clinicians. HCS guides the segmentation model in distinguishing between specific organs, anatomical structures, and functional systems through hierarchical relationships. HCL serves as a consistency constraint, fortifying the directional guidance provided by HCS to enhance the segmentation model's accuracy and robustness. We conducted extensive experiments to validate the effectiveness of our approach, and the results indicate that PCNet can generate a high-performance, universal model for CT segmentation. The PCNet framework also demonstrates a significant transferability on multiple downstream tasks. The ablation experiments show that the methodology employed in constructing the HCS is of critical importance. The prompt and HCS can be accessed at https://github.com/YixinChen-AI/PCNet.
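The abstract does not spell out the form of the hierarchy category loss (HCL), but the stated idea — a consistency constraint along hierarchical relationships — admits a simple illustration. The sketch below is an assumption, not the paper's formulation: it penalizes any category whose predicted probability exceeds that of its parent in a hypothetical system → organ → structure hierarchy:

```python
def hierarchy_consistency_loss(probs, parent):
    """Toy hierarchical consistency penalty (illustrative assumption, not
    the paper's exact HCL): a child class should not be predicted with
    higher probability than its ancestor in the category hierarchy."""
    loss = 0.0
    for cls, p in probs.items():
        anc = parent.get(cls)
        if anc is not None:
            loss += max(0.0, p - probs[anc]) ** 2   # hinge-squared violation
    return loss

# Hypothetical hierarchy and predictions for one voxel.
parent = {"liver": "digestive_system", "hepatic_vein": "liver"}
probs = {"digestive_system": 0.9, "liver": 0.8, "hepatic_vein": 0.85}
print(round(hierarchy_consistency_loss(probs, parent), 4))
```

Here only "hepatic_vein" violates the hierarchy (0.85 > 0.8 for its parent "liver"), so only that term contributes to the loss.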

PMID:38687654 | DOI:10.1109/TMI.2024.3395349

Categories: Literature Watch

Adaptive and Iterative Learning with Multi-perspective Regularizations for Metal Artifact Reduction

Tue, 2024-04-30 06:00

IEEE Trans Med Imaging. 2024 Apr 30;PP. doi: 10.1109/TMI.2024.3395348. Online ahead of print.

ABSTRACT

Metal artifact reduction (MAR) is important for clinical diagnosis with CT images. The existing state-of-the-art deep learning methods usually suppress metal artifacts in sinogram or image domains or both. However, their performance is limited by the inherent characteristics of the two domains, i.e., the errors introduced by local manipulations in the sinogram domain would propagate throughout the whole image during backprojection and lead to serious secondary artifacts, while it is difficult to distinguish artifacts from actual image features in the image domain. To alleviate these limitations, this study analyzes the desirable properties of wavelet transform in-depth and proposes to perform MAR in the wavelet domain. First, wavelet transform yields components that possess spatial correspondence with the image, thereby preventing the spread of local errors to avoid secondary artifacts. Second, using wavelet transform could facilitate identification of artifacts from image since metal artifacts are mainly high-frequency signals. Taking these advantages of the wavelet transform, this paper decomposes an image into multiple wavelet components and introduces multi-perspective regularizations into the proposed MAR model. To improve the transparency and validity of the model, all the modules in the proposed MAR model are designed to reflect their mathematical meanings. In addition, an adaptive wavelet module is also utilized to enhance the flexibility of the model. To optimize the model, an iterative algorithm is developed. The evaluation on both synthetic and real clinical datasets consistently confirms the superior performance of the proposed method over the competing methods.
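The abstract's two claimed advantages of the wavelet domain — spatial correspondence with the image and concentration of metal artifacts in high-frequency bands — are both visible in even the simplest wavelet. A minimal 1-D Haar sketch (illustrative only; the paper uses an adaptive wavelet module, not a fixed Haar basis):

```python
import math

def haar_1d(signal):
    """Single-level 1-D Haar transform: returns (approximation, detail).
    Each coefficient corresponds to a specific pair of samples, so local
    edits stay local; large detail coefficients flag high-frequency
    content, which is where streak-like metal artifacts concentrate."""
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    detail = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return approx, detail

smooth = [1.0, 1.0, 2.0, 2.0]   # slowly varying, tissue-like region
streak = [1.0, 5.0, 1.0, 5.0]   # rapid oscillation, artifact-like
for name, sig in [("smooth", smooth), ("streak", streak)]:
    _, detail = haar_1d(sig)
    print(name, [round(x, 3) for x in detail])
```

The smooth signal yields zero detail coefficients while the oscillating one yields large ones, so suppressing the detail band attenuates the artifact-like pattern while leaving the smooth region untouched.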

PMID:38687653 | DOI:10.1109/TMI.2024.3395348

Categories: Literature Watch

Assessment of the deep learning-based gamma passing rate prediction system for 1.5 T magnetic resonance-guided linear accelerator

Tue, 2024-04-30 06:00

Radiol Phys Technol. 2024 Apr 30. doi: 10.1007/s12194-024-00800-2. Online ahead of print.

ABSTRACT

Measurement-based verification is impossible for the patient-specific quality assurance (QA) of online adaptive magnetic resonance imaging-guided radiotherapy (oMRgRT) because the patient remains on the couch throughout the session. We assessed a deep learning (DL) system for oMRgRT to predict the gamma passing rate (GPR). This study collected 125 verification plans [reference plan (RP), 100; adapted plan (AP), 25] from patients with prostate cancer treated using Elekta Unity. Based on our previous study, we employed a convolutional neural network that predicted the GPRs of nine pairs of gamma criteria from 1%/1 mm to 3%/3 mm. First, we trained and tested the DL model using RPs (n = 75 and n = 25 for training and testing, respectively) for its optimization. Second, we tested the GPR prediction accuracy using APs to determine whether the DL model could be applied to APs. The mean absolute error (MAE) and correlation coefficient (r) of the RPs were 1.22 ± 0.27% and 0.29 ± 0.10 in 3%/2 mm, 1.35 ± 0.16% and 0.37 ± 0.15 in 2%/2 mm, and 3.62 ± 0.55% and 0.32 ± 0.14 in 1%/1 mm, respectively. The MAE and r of the APs were 1.13 ± 0.33% and 0.35 ± 0.22 in 3%/2 mm, 1.68 ± 0.47% and 0.30 ± 0.11 in 2%/2 mm, and 5.08 ± 0.29% and 0.15 ± 0.10 in 1%/1 mm, respectively. The time cost was within 3 s for the prediction. The results suggest the DL-based model has the potential for rapid GPR prediction in Elekta Unity.
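The quantity being predicted, the gamma passing rate, has a standard definition: a reference point passes if some evaluated point is simultaneously close in distance (DTA criterion) and in dose (dose-difference criterion). A minimal 1-D, globally normalized sketch with toy profiles (no interpolation, coarse grid — not the clinical 3-D implementation):

```python
def gamma_passing_rate(ref, eval_, spacing_mm, dd_pct, dta_mm):
    """Global 1-D gamma analysis: percentage of reference points whose
    minimum combined distance/dose discrepancy (gamma) is <= 1 for the
    given dose-difference (%) / distance-to-agreement (mm) criterion."""
    d_max = max(ref)                     # global normalization dose
    dd = dd_pct / 100.0 * d_max
    passed = 0
    for i, dr in enumerate(ref):
        gamma_sq = min(
            ((i - j) * spacing_mm / dta_mm) ** 2 + ((de - dr) / dd) ** 2
            for j, de in enumerate(eval_)
        )
        passed += gamma_sq <= 1.0
    return 100.0 * passed / len(ref)

ref = [0.0, 0.5, 1.0, 2.0, 1.0, 0.5, 0.0]   # toy reference dose profile
ev  = [0.0, 0.5, 1.1, 1.9, 1.0, 0.6, 0.0]   # slightly perturbed "delivery"
print(gamma_passing_rate(ref, ev, spacing_mm=1.0, dd_pct=3.0, dta_mm=2.0))
```

Each criterion pair in the abstract (3%/2 mm, 2%/2 mm, 1%/1 mm) just changes `dd_pct` and `dta_mm`; tighter criteria produce lower passing rates, which is consistent with the larger errors the study reports at 1%/1 mm.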

PMID:38687457 | DOI:10.1007/s12194-024-00800-2

Categories: Literature Watch

On Minimizers and Convolutional Filters: Theoretical Connections and Applications to Genome Analysis

Tue, 2024-04-30 06:00

J Comput Biol. 2024 Apr 30. doi: 10.1089/cmb.2024.0483. Online ahead of print.

ABSTRACT

Minimizers and convolutional neural networks (CNNs) are two quite distinct popular techniques that have both been employed to analyze categorical biological sequences. At face value, the methods seem entirely dissimilar. Minimizers use min-wise hashing on a rolling window to extract a single important k-mer feature per window. CNNs start with a wide array of randomly initialized convolutional filters, paired with a pooling operation, and then multiple additional neural layers to learn both the filters themselves and how they can be used to classify the sequence. In this study, our main result is a careful mathematical analysis of hash function properties showing that for sequences over a categorical alphabet, random Gaussian initialization of convolutional filters with max-pooling is equivalent to choosing a minimizer ordering such that selected k-mers are (in Hamming distance) far from the k-mers within the sequence but close to other minimizers. In empirical experiments, we find that this property manifests as decreased density in repetitive regions, both in simulation and on real human telomeres. We additionally train from scratch a CNN embedding of synthetic short-reads from the SARS-CoV-2 genome into 3D Euclidean space that locally recapitulates the linear sequence distance of the read origins, a modest step toward building a deep learning assembler, although it is at present too slow to be practical. In total, this article provides a partial explanation for the effectiveness of CNNs in categorical sequence analysis.

PMID:38687333 | DOI:10.1089/cmb.2024.0483

Categories: Literature Watch
