Deep learning

Improved DeepSORT-Based Object Tracking in Foggy Weather for AVs Using Semantic Labels and Fused Appearance Feature Network

Sat, 2024-07-27 06:00

Sensors (Basel). 2024 Jul 19;24(14):4692. doi: 10.3390/s24144692.

ABSTRACT

The presence of fog in the background can prevent small and distant objects from being detected, let alone tracked. Under safety-critical conditions, multi-object tracking models require faster tracking speed while maintaining high object-tracking accuracy. The original DeepSORT algorithm used YOLOv4 for the detection phase and a simple neural network for the deep appearance descriptor. Consequently, the feature map generated loses relevant details about the track being matched with a given detection in fog. Targets with a high degree of appearance similarity in the detection frame are more likely to be mismatched, resulting in identity switches or track failures in heavy fog. We propose an improved multi-object tracking model based on the DeepSORT algorithm to improve tracking accuracy and speed under foggy weather conditions. First, we employed our camera-radar fusion network (CR-YOLOnet) in the detection phase for faster and more accurate object detection. We proposed an appearance feature network to replace the basic convolutional neural network, incorporating GhostNet in place of the traditional convolutional layers to generate more features while reducing computational complexity and cost. We adopted a segmentation module and fed the semantic labels of the corresponding input frame to add rich semantic information to the low-level appearance feature maps. Our proposed method outperformed YOLOv5 + DeepSORT with a 35.15% increase in multi-object tracking accuracy, a 32.65% increase in multi-object tracking precision, a 37.56% increase in speed, and a 46.81% decrease in identity switches.
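As a rough illustration of the GhostNet substitution described above, the sketch below shows a minimal Ghost module in PyTorch: a few feature maps come from a standard convolution, and the rest are generated by cheap depthwise operations. The layer sizes and `ratio` parameter are illustrative assumptions, not the authors' configuration.

```python
# Hedged sketch of a GhostNet-style "Ghost module" of the kind used to
# replace standard convolutions in an appearance feature network.
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Generate many feature maps cheaply: a few 'primary' maps from a
    standard conv, plus 'ghost' maps from cheap depthwise operations."""
    def __init__(self, in_ch, out_ch, ratio=2, kernel=1, dw_kernel=3):
        super().__init__()
        primary_ch = out_ch // ratio          # expensive maps
        ghost_ch = out_ch - primary_ch        # cheap maps
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, primary_ch, kernel, padding=kernel // 2, bias=False),
            nn.BatchNorm2d(primary_ch),
            nn.ReLU(inplace=True),
        )
        self.cheap = nn.Sequential(
            nn.Conv2d(primary_ch, ghost_ch, dw_kernel, padding=dw_kernel // 2,
                      groups=primary_ch, bias=False),  # depthwise: one filter per map
            nn.BatchNorm2d(ghost_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        primary = self.primary(x)
        ghost = self.cheap(primary)
        return torch.cat([primary, ghost], dim=1)

x = torch.randn(1, 64, 32, 16)          # e.g. a pedestrian-crop feature map
print(GhostModule(64, 128)(x).shape)    # torch.Size([1, 128, 32, 16])
```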

PMID:39066088 | DOI:10.3390/s24144692

Categories: Literature Watch

Infrared Image Super-Resolution Network Utilizing the Enhanced Transformer and U-Net

Sat, 2024-07-27 06:00

Sensors (Basel). 2024 Jul 19;24(14):4686. doi: 10.3390/s24144686.

ABSTRACT

Infrared images hold significant value in applications such as remote sensing and fire safety. However, infrared detectors often face the problem of high hardware costs, which limits their widespread use. Advancements in deep learning have spurred innovative approaches to image super-resolution (SR), but comparatively few efforts have been dedicated to infrared images. To address this, we design the Residual Swin Transformer and Average Pooling Block (RSTAB) and propose SwinAIR, which can effectively extract and fuse the diverse frequency features in infrared images and achieve superior SR reconstruction performance. By further integrating SwinAIR with U-Net, we propose SwinAIR-GAN for real infrared image SR reconstruction. SwinAIR-GAN extends the degradation space to better simulate the degradation process of real infrared images. Additionally, it incorporates spectral normalization, dropout, and artifact discrimination loss to reduce potential image artifacts. Qualitative and quantitative evaluations on various datasets confirm the effectiveness of our proposed method in reconstructing realistic textures and details of infrared images.

PMID:39066083 | DOI:10.3390/s24144686

Categories: Literature Watch

Next-Gen Medical Imaging: U-Net Evolution and the Rise of Transformers

Sat, 2024-07-27 06:00

Sensors (Basel). 2024 Jul 18;24(14):4668. doi: 10.3390/s24144668.

ABSTRACT

The advancement of medical imaging has profoundly impacted our understanding of the human body and various diseases, leading to the continuous refinement of related technologies over many years. Despite these advancements, several challenges persist in medical imaging, including data shortages and images characterized by low contrast, high noise levels, and limited resolution. The U-Net architecture has evolved significantly to address these challenges, becoming a staple in medical imaging due to its effective performance and numerous updated versions. However, the emergence of Transformer-based models marks a new era in deep learning for medical imaging. These models and their variants promise substantial progress, necessitating a comparative analysis to comprehend recent advancements. This review begins by exploring the fundamental U-Net architecture and its variants, then examines the limitations encountered during its evolution. It next introduces the Transformer-based self-attention mechanism and investigates how modern models incorporate positional information. The review emphasizes the revolutionary potential of Transformer-based techniques, discusses their limitations, and outlines potential avenues for future research.
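For readers unfamiliar with the mechanism the review surveys, here is a minimal NumPy sketch of scaled dot-product self-attention with additive sinusoidal positional encoding, one common way of injecting positional information; the shapes and random weights are illustrative.

```python
# Hedged sketch: scaled dot-product self-attention over a token sequence,
# with sinusoidal positional encoding added to the inputs.
import numpy as np

def sinusoidal_positions(seq_len, d_model):
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000, (2 * (i // 2)) / d_model)
    enc = np.zeros((seq_len, d_model))
    enc[:, 0::2] = np.sin(angles[:, 0::2])
    enc[:, 1::2] = np.cos(angles[:, 1::2])
    return enc

def self_attention(x, Wq, Wk, Wv):
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])            # (seq, seq) affinities
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)          # softmax over keys
    return weights @ v                                 # each token mixes all others

seq_len, d = 16, 32                                    # e.g. 16 image patches
rng = np.random.default_rng(0)
tokens = rng.normal(size=(seq_len, d)) + sinusoidal_positions(seq_len, d)
out = self_attention(tokens, *(rng.normal(size=(d, d)) for _ in range(3)))
print(out.shape)  # (16, 32)
```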

PMID:39066065 | DOI:10.3390/s24144668

Categories: Literature Watch

Latent Space Representations for Marker-Less Realtime Hand-Eye Calibration

Sat, 2024-07-27 06:00

Sensors (Basel). 2024 Jul 18;24(14):4662. doi: 10.3390/s24144662.

ABSTRACT

Marker-less hand-eye calibration permits the acquisition of an accurate transformation between an optical sensor and a robot in unstructured environments. Single monocular cameras, despite their low cost and modest computation requirements, present difficulties for this purpose due to their incomplete correspondence of projected coordinates. In this work, we introduce a hand-eye calibration procedure based on the rotation representations inferred by an augmented autoencoder neural network. Learning-based models that attempt to directly regress the spatial transform of objects such as the links of robotic manipulators perform poorly in the orientation domain, but this can be overcome through the analysis of the latent space vectors constructed in the autoencoding process. This technique is computationally inexpensive and can be run in real time in markedly varied lighting and occlusion conditions. To evaluate the procedure, we use a color-depth camera and perform a registration step between the predicted and the captured point clouds to measure translation and orientation errors and compare the results to a baseline based on traditional checkerboard markers.

PMID:39066062 | DOI:10.3390/s24144662

Categories: Literature Watch

Tuberculosis research advances and future trends: A bibliometric knowledge mapping approach

Fri, 2024-07-26 06:00

Medicine (Baltimore). 2024 Jul 26;103(30):e39052. doi: 10.1097/MD.0000000000039052.

ABSTRACT

The Gulf Cooperation Council (GCC) countries are more vulnerable to many transmissible diseases, including tuberculosis (TB). This bibliometric analytic study identifies the scientific publications related to TB in the GCC countries using topic modeling and co-word analysis. The R package, VOSviewer software, IBM SPSS, and Scopus Analytics were used to analyze performance, hotspots, knowledge structure, thematic evolution, trend topics, and inter-gulf and international cooperation on TB over the past 30 years (1993-2022). A total of 1,999 publications associated with research on GCC-TB were published. The annual growth rate of documents was 7.76%. Saudi Arabia published the most, followed by the United Arab Emirates, Kuwait, Qatar, Oman, and Bahrain. The most-cited GCC country is the Kingdom of Saudi Arabia, followed by Kuwait. One hundred sixty research institutions contributed to the dissemination of TB-related knowledge in the GCC, with King Saud University (Kingdom of Saudi Arabia; n = 518) publishing the most. The number of publications related to TB is high in GCC countries. Current tendencies indicate that GCC scholars are increasingly focused on deep learning, chest X-ray, molecular docking, comorbid COVID-19, risk factors, and Mycobacterium bovis.

PMID:39058842 | DOI:10.1097/MD.0000000000039052

Categories: Literature Watch

Deep learning-based material decomposition of iodine and calcium in mobile photon counting detector CT

Fri, 2024-07-26 06:00

PLoS One. 2024 Jul 26;19(7):e0306627. doi: 10.1371/journal.pone.0306627. eCollection 2024.

ABSTRACT

Photon-counting detector (PCD)-based computed tomography (CT) offers several advantages over conventional energy-integrating detector-based CT. Among them, the ability to discriminate energy exhibits significant potential for clinical applications because it provides material-specific information; that is, material decomposition (MD) can be achieved through energy discrimination. In this study, deep learning-based material decomposition was performed using live animal data. We propose MD-Unet, a deep learning strategy for material decomposition based on a U-Net architecture trained with data from three energy bins. To mitigate data insufficiency, we developed a pretrained model incorporating various forms of simulation data and augmentation strategies. Incorporating these approaches into model training enhances the precision of material decomposition, thereby enabling the identification of distinct materials at individual pixel locations. The trained network was applied to the acquired animal data to evaluate material decomposition results. Compared with conventional methods, MD-Unet demonstrated more accurate material decomposition imaging, an improved material decomposition ability, and significantly reduced noise. In addition, the decomposed images can potentially offer an enhancement level similar to that of a typical contrast agent, implying that images of the same quality could be acquired with less contrast agent administered to patients, which demonstrates significant clinical value.
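A minimal sketch of the kind of network the abstract describes, mapping three energy-bin images to per-pixel material maps; the depth, widths, and the two-material (iodine/calcium) output are illustrative assumptions, not the published MD-Unet.

```python
# Hedged sketch: a tiny U-Net-style model from three PCD energy bins to
# two material maps, to show the input/output structure of the task.
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
                         nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True))

class TinyMDUnet(nn.Module):
    def __init__(self, bins=3, materials=2):
        super().__init__()
        self.enc1, self.enc2 = block(bins, 32), block(32, 64)
        self.pool, self.up = nn.MaxPool2d(2), nn.Upsample(scale_factor=2)
        self.dec = block(64 + 32, 32)                  # skip-connection concat
        self.head = nn.Conv2d(32, materials, 1)        # iodine & calcium maps

    def forward(self, x):
        e1 = self.enc1(x)                              # full resolution
        e2 = self.enc2(self.pool(e1))                  # half resolution
        d = self.dec(torch.cat([self.up(e2), e1], 1))  # decode with skip
        return self.head(d)

bins = torch.randn(1, 3, 128, 128)                     # three energy-bin images
print(TinyMDUnet()(bins).shape)                        # torch.Size([1, 2, 128, 128])
```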

PMID:39058758 | DOI:10.1371/journal.pone.0306627

Categories: Literature Watch

Deep learning-based respiratory muscle segmentation as a potential imaging biomarker for respiratory function assessment

Fri, 2024-07-26 06:00

PLoS One. 2024 Jul 26;19(7):e0306789. doi: 10.1371/journal.pone.0306789. eCollection 2024.

ABSTRACT

Respiratory diseases significantly affect respiratory function, making them a considerable contributor to global mortality. The respiratory muscles play an important role in disease prognosis; as such, quantitative analysis of the respiratory muscles is crucial to assess the status of the respiratory system and the quality of life in patients. In this study, we aimed to develop an automated approach for the segmentation and classification of three types of respiratory muscles from computed tomography (CT) images using artificial intelligence. With a dataset of approximately 600,000 thoracic CT images from 3,200 individuals, we trained the model using the Attention U-Net architecture, optimized for detailed and focused segmentation. Subsequently, we calculated the volumes and densities from the muscle masks segmented by our model and performed correlation analysis with pulmonary function test (PFT) parameters. The segmentation models for muscle tissue and respiratory muscles achieved Dice scores of 0.9823 and 0.9688, respectively. The classification model, achieving a generalized Dice score of 0.9900, also demonstrated high accuracy in classifying thoracic muscle types, as evidenced by its F1 scores: 0.9793 for the pectoralis muscle, 0.9975 for the erector spinae muscle, and 0.9839 for the intercostal muscle. In the correlation analysis, the volume of the respiratory muscles showed a strong correlation with PFT parameters, suggesting that respiratory muscle volume may serve as a potential novel biomarker for respiratory function. Although muscle density showed a weaker correlation with the PFT parameters, it may still hold significance for medical research.
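A hedged sketch of the downstream analysis the abstract describes: computing volume and mean density from a segmentation mask, then correlating volumes with a PFT parameter across a cohort. The voxel spacing, variable names, and synthetic FVC values are illustrative, not the study's data.

```python
# Hedged sketch: mask-derived volume/density plus a Pearson correlation step.
import numpy as np
from scipy.stats import pearsonr

def volume_and_density(ct_hu, mask, spacing_mm=(1.0, 0.7, 0.7)):
    voxel_ml = np.prod(spacing_mm) / 1000.0            # mm^3 -> mL
    volume_ml = mask.sum() * voxel_ml                  # muscle volume
    density_hu = ct_hu[mask > 0].mean()                # mean HU inside the mask
    return volume_ml, density_hu

rng = np.random.default_rng(1)
ct = rng.normal(40, 10, size=(40, 64, 64))             # fake HU volume
mask = (rng.random((40, 64, 64)) > 0.95).astype(int)   # fake muscle mask
print(volume_and_density(ct, mask))

# Fake cohort: per-patient (volume, FVC) pairs just to show the analysis step.
volumes = rng.normal(900, 120, size=50)
fvc = 0.003 * volumes + rng.normal(0, 0.2, size=50)
r, p = pearsonr(volumes, fvc)
print(f"r={r:.2f}, p={p:.3g}")
```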

PMID:39058719 | DOI:10.1371/journal.pone.0306789

Categories: Literature Watch

DeepDRA: Drug repurposing using multi-omics data integration with autoencoders

Fri, 2024-07-26 06:00

PLoS One. 2024 Jul 26;19(7):e0307649. doi: 10.1371/journal.pone.0307649. eCollection 2024.

ABSTRACT

Cancer treatment has become one of the biggest challenges in the world today. Different treatments are used against cancer, and drug-based treatments have shown better results. On the other hand, designing new drugs for cancer is costly and time-consuming. Computational methods such as machine learning and deep learning have been suggested to address these challenges through drug repurposing. Despite the promise of classical machine-learning methods in repurposing cancer drugs and predicting responses, deep-learning methods have performed better. This study aims to develop a deep-learning model that predicts cancer drug response based on multi-omics data, drug descriptors, and drug fingerprints, and facilitates the repurposing of drugs based on those responses. To reduce the dimensionality of the multi-omics data, we use autoencoders, which are connected to MLPs to form a multi-task learning model. To determine its efficacy, we extensively tested our model on three primary datasets: GDSC, CTRP, and CCLE. In multiple experiments, our model consistently outperforms existing state-of-the-art methods, achieving an impressive AUPRC of 0.99. Furthermore, in a cross-dataset evaluation, where the model is trained on GDSC and tested on CCLE, it surpasses the performance of three previous works, achieving an AUPRC of 0.72. In conclusion, we present a deep learning model that outperforms the current state of the art in generalization. Using this model, we can assess drug responses and explore drug repurposing, leading to the discovery of novel cancer drugs. Our study highlights the potential of advanced deep learning to advance cancer therapeutic precision.
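A minimal sketch of the general model shape the abstract describes: autoencoders compress each omics view, and the latent codes plus drug features feed an MLP response head trained jointly with the reconstructions. All dimensions and the reconstruction-loss weight are illustrative assumptions, not the published DeepDRA configuration.

```python
# Hedged sketch: per-view autoencoders + MLP head as one multi-task model.
import torch
import torch.nn as nn

class ViewAE(nn.Module):
    def __init__(self, dim_in, dim_z=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim_in, 256), nn.ReLU(), nn.Linear(256, dim_z))
        self.dec = nn.Sequential(nn.Linear(dim_z, 256), nn.ReLU(), nn.Linear(256, dim_in))

    def forward(self, x):
        z = self.enc(x)
        return z, self.dec(z)

expr_ae, mut_ae = ViewAE(5000), ViewAE(3000)           # two omics views
drug_dim = 512                                         # descriptors + fingerprints
head = nn.Sequential(nn.Linear(64 + 64 + drug_dim, 128), nn.ReLU(), nn.Linear(128, 1))

expr, mut = torch.randn(8, 5000), torch.randn(8, 3000)
drug = torch.randn(8, drug_dim)
z1, rec1 = expr_ae(expr)
z2, rec2 = mut_ae(mut)
pred = head(torch.cat([z1, z2, drug], dim=1))
# Multi-task objective: response prediction + reconstruction of each view.
loss = nn.functional.mse_loss(pred, torch.randn(8, 1)) \
     + 0.1 * (nn.functional.mse_loss(rec1, expr) + nn.functional.mse_loss(rec2, mut))
print(loss.item())
```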

PMID:39058696 | DOI:10.1371/journal.pone.0307649

Categories: Literature Watch

Matryoshka: Exploiting the Over-Parametrization of Deep Learning Models for Covert Data Transmission

Fri, 2024-07-26 06:00

IEEE Trans Pattern Anal Mach Intell. 2024 Jul 26;PP. doi: 10.1109/TPAMI.2024.3434417. Online ahead of print.

ABSTRACT

High-quality private machine learning (ML) data stored in local data centers has become a key competitive factor for AI corporations. In this paper, we present a novel insider attack called Matryoshka to reveal the possibility of breaking the privacy of ML data even with no exposed interface. Our attack employs a scheduled-to-publish DNN model as a carrier model for covert transmission of secret models which memorize the information of private ML data that otherwise has no interface to the outsider. At the core of our attack, we present a novel parameter sharing approach which exploits the learning capacity of the carrier model for information hiding. Our approach simultaneously achieves: (i) High Capacity - with almost no utility loss of the carrier model, Matryoshka can transmit over 10,000 real-world data samples within a carrier model which has 220× fewer parameters than the total size of the stolen data, and simultaneously transmit multiple heterogeneous datasets or models within a single carrier model under a trivial distortion rate, neither of which can be done with existing steganography techniques; (ii) Decoding Efficiency - once the published carrier model is downloaded, an outside colluder can exclusively decode the hidden models from the carrier model with only several integer secrets and knowledge of the hidden model architecture; (iii) Effectiveness - almost all the recovered models either perform similarly to models trained independently on the private data, or can be further used to extract memorized raw training data with low error; (iv) Robustness - information redundancy is naturally implemented to achieve resilience against common post-processing techniques on the carrier before publication; (v) Covertness - a model inspector with different levels of prior knowledge could hardly differentiate a carrier model from a normal model.
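A toy illustration of the parameter-sharing intuition: a secret parameter vector occupies positions of the carrier's flattened weights selected by an integer secret, so a colluder who knows the seed and the hidden architecture can read it back out. The actual attack trains the shared parameters jointly rather than overwriting them; this sketch only shows the seed-keyed indexing idea.

```python
# Hedged toy: seed-keyed embedding/extraction of a secret vector inside a
# carrier parameter vector. Not the paper's scheme, just the indexing idea.
import numpy as np

def hide(carrier, secret, seed):
    idx = np.random.default_rng(seed).choice(carrier.size, secret.size, replace=False)
    stego = carrier.copy()
    stego[idx] = secret                    # occupy the shared positions
    return stego

def reveal(carrier, secret_size, seed):
    idx = np.random.default_rng(seed).choice(carrier.size, secret_size, replace=False)
    return carrier[idx]                    # same seed -> same positions

rng = np.random.default_rng(0)
carrier = rng.normal(size=1_000_000)       # flattened carrier-model weights
secret = rng.normal(size=5_000)            # flattened hidden-model weights
stego = hide(carrier, secret, seed=42)
assert np.allclose(reveal(stego, secret.size, seed=42), secret)
```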

PMID:39058616 | DOI:10.1109/TPAMI.2024.3434417

Categories: Literature Watch

Adaptive Neural Message Passing for Inductive Learning on Hypergraphs

Fri, 2024-07-26 06:00

IEEE Trans Pattern Anal Mach Intell. 2024 Jul 26;PP. doi: 10.1109/TPAMI.2024.3434483. Online ahead of print.

ABSTRACT

Graphs are the most ubiquitous data structures for representing relational datasets and performing inferences in them. They model, however, only pairwise relations between nodes and are not designed for encoding the higher-order relations. This drawback is mitigated by hypergraphs, in which an edge can connect an arbitrary number of nodes. Most hypergraph learning approaches convert the hypergraph structure to that of a graph and then deploy existing geometric deep learning methods. This transformation leads to information loss, and sub-optimal exploitation of the hypergraph's expressive power. We present HyperMSG, a novel hypergraph learning framework that uses a modular two-level neural message passing strategy to accurately and efficiently propagate information within each hyperedge and across the hyperedges. HyperMSG adapts to the data and task by learning an attention weight associated with each node's degree centrality. Such a mechanism quantifies both local and global importance of a node, capturing the structural properties of a hypergraph. HyperMSG is inductive, allowing inference on previously unseen nodes. Further, it is robust and outperforms state-of-the-art hypergraph learning methods on a wide range of tasks and datasets. Finally, we demonstrate the effectiveness of HyperMSG in learning multimodal relations through detailed experimentation on a challenging multimedia dataset.
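A hedged sketch of two-level hypergraph message passing in the spirit of the framework described above: hyperedges first summarize their member nodes, then nodes aggregate incident-edge messages scaled by a function of degree centrality, standing in for the learned attention weight. Dimensions and the scaling rule are illustrative assumptions.

```python
# Hedged sketch: node -> hyperedge -> node message passing with a
# degree-centrality-based weight (a stand-in for learned attention).
import numpy as np

def hypergraph_message_pass(X, hyperedges, alpha=0.5):
    # Level 1: each hyperedge summarizes its member nodes.
    edge_msgs = [X[list(e)].mean(axis=0) for e in hyperedges]
    out = np.zeros_like(X)
    degree = np.zeros(len(X))
    for msg, e in zip(edge_msgs, hyperedges):
        for v in e:
            out[v] += msg
            degree[v] += 1
    # Level 2: nodes mix their own state with incident-edge messages,
    # scaled by (degree centrality)^(-alpha).
    scale = np.power(np.maximum(degree, 1), -alpha)[:, None]
    return X + scale * out

X = np.random.default_rng(0).normal(size=(6, 8))      # 6 nodes, 8-dim features
hyperedges = [{0, 1, 2}, {2, 3, 4, 5}, {0, 5}]        # edges of arbitrary size
print(hypergraph_message_pass(X, hyperedges).shape)   # (6, 8)
```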

PMID:39058615 | DOI:10.1109/TPAMI.2024.3434483

Categories: Literature Watch

3D DenseNet with temporal transition layer for heart rate estimation from real-life RGB videos

Fri, 2024-07-26 06:00

Technol Health Care. 2024 Jul 23. doi: 10.3233/THC-241104. Online ahead of print.

ABSTRACT

BACKGROUND: Deep learning has demonstrated superior performance over traditional methods for the estimation of heart rates in controlled contexts. However, in less controlled scenarios this performance seems to vary based on the training dataset and the architecture of the deep learning models.

OBJECTIVES: In this paper, we develop a deep learning-based model leveraging the power of 3D convolutional neural networks (3DCNN) to extract temporal and spatial features that lead to accurate heart rate estimation from RGB videos with no pre-defined region of interest (ROI).

METHODS: We propose a 3D DenseNet with a 3D temporal transition layer for the estimation of heart rates from a large-scale dataset of videos that appear more hospital-like and real-life than those in other existing facial video-based datasets.

RESULTS: Experimentally, our model was trained and tested on this less controlled dataset and showed heart rate estimation performance with root mean square error (RMSE) of 8.68 BPM and mean absolute error (MAE) of 3.34 BPM.

CONCLUSION: We further show that such a model can achieve better results than state-of-the-art models when tested on the public VIPL-HR dataset.

PMID:39058471 | DOI:10.3233/THC-241104

Categories: Literature Watch

Prediction of early-phase cytomegalovirus pneumonia in post-stem cell transplantation using a deep learning model

Fri, 2024-07-26 06:00

Technol Health Care. 2024 Jul 8. doi: 10.3233/THC-240597. Online ahead of print.

ABSTRACT

BACKGROUND: Diagnostic challenges exist for CMV pneumonia in post-hematopoietic stem cell transplantation (post-HSCT) patients, despite early-phase radiographic changes.

OBJECTIVE: The study aims to employ a deep learning model to distinguish CMV pneumonia from COVID-19 pneumonia, community-acquired pneumonia, and normal lungs post-HSCT.

METHODS: Initially, six neural network models were pre-trained with COVID-19 pneumonia, community-acquired pneumonia, and normal lung CT images from Kaggle's COVID multiclass dataset (Dataset A); Dataset A was then combined with the CMV pneumonia images from our center, forming Dataset B. We used a few-shot transfer learning strategy to fine-tune the pre-trained models and evaluated model performance on Dataset B.
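A minimal sketch of the transfer-learning step described here: freeze a pre-trained backbone and fine-tune a small classification head on the few available images. A torchvision ResNet stands in for the paper's Xception, and the class count and hyperparameters are illustrative assumptions.

```python
# Hedged sketch: few-shot fine-tuning of a frozen pre-trained backbone.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False                      # freeze pre-trained features
model.fc = nn.Linear(model.fc.in_features, 4)    # 4 classes incl. CMV pneumonia

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative optimization step on a tiny batch, as in few-shot tuning.
images, labels = torch.randn(8, 3, 224, 224), torch.randint(0, 4, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(loss.item())
```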

RESULTS: Thirty-four cases of CMV pneumonia were found between January 2018 and December 2022 post-HSCT. Dataset A contained 1,681 images for each subgroup from Kaggle. Dataset B was formed by adding 98 images of CMV pneumonia and normal lung to Dataset A. The optimal model (Xception) achieved an accuracy of 0.9034; precision, recall, and F1-score all reached 0.9091, with an AUC of 0.9668 on the test set of Dataset B.

CONCLUSIONS: This framework demonstrates the deep learning model's ability to distinguish rare pneumonia types utilizing a small volume of CT images, facilitating early detection of CMV pneumonia post-HSCT.

PMID:39058469 | DOI:10.3233/THC-240597

Categories: Literature Watch

Fine-grained automatic left ventricle segmentation via ROI-based Tri-Convolutional neural networks

Fri, 2024-07-26 06:00

Technol Health Care. 2024 Jul 6. doi: 10.3233/THC-240062. Online ahead of print.

ABSTRACT

BACKGROUND: Left ventricle segmentation (LVS) is crucial to the assessment of cardiac function. Globally, cardiovascular disease accounts for the majority of deaths, posing a significant health threat. In recent years, LVS has gained important attention due to its ability to measure vital parameters such as myocardial mass, end-diastolic volume, and ejection fraction. Manually segmenting these images to evaluate such parameters is labour-intensive and time-consuming, and may reduce diagnostic accuracy.

OBJECTIVE/METHODS: This paper proposes a combination of different deep neural networks for semantic segmentation of the left ventricle based on Tri-Convolutional Networks (Tri-ConvNets) to obtain highly accurate segmentation. Cardiac MRI images are initially pre-processed to remove noise artefacts and enhance image quality, then ROI-based extraction is done in three stages to accurately identify the LV. The extracted features are given as input to three different deep learning structures for segmenting the LV efficiently: the contour edges are processed in a standard ConvNet, the contour points are processed using a fully ConvNet, and finally the noise-free images are converted into patches to perform pixel-wise operations in ConvNets.

RESULTS/CONCLUSIONS: The proposed Tri-ConvNets model achieves Jaccard indices of 0.9491 ± 0.0188 for the Sunnybrook dataset and 0.9497 ± 0.0237 for the York dataset, and Dice indices of 0.9419 ± 0.0178 for the ACDC dataset and 0.9414 ± 0.0247 for the LVSC dataset, respectively. The experimental results also reveal that the proposed Tri-ConvNets model is faster and requires minimal resources compared to state-of-the-art models.

PMID:39058464 | DOI:10.3233/THC-240062

Categories: Literature Watch

Multi-dimensional dense attention network for pixel-wise segmentation of optic disc in colour fundus images

Fri, 2024-07-26 06:00

Technol Health Care. 2024 Jul 11. doi: 10.3233/THC-230310. Online ahead of print.

ABSTRACT

BACKGROUND: Segmentation of retinal fragments like blood vessels, Optic Disc (OD), and Optic Cup (OC) enables the early detection of different retinal pathologies like Diabetic Retinopathy (DR), Glaucoma, etc.

OBJECTIVE: Accurate segmentation of the OD remains challenging due to blurred boundaries, vessel occlusion, and other distractions and limitations. Deep learning is rapidly progressing in pixel-wise image segmentation, and a number of network models have been proposed for end-to-end image segmentation. However, certain limitations remain, such as a limited ability to represent context, inadequate feature processing, and a limited receptive field, which lead to the loss of local details and blurred boundaries.

METHODS: A multi-dimensional dense attention network, or MDDA-Net, is proposed for pixel-wise segmentation of the OD in retinal images in order to address the aforementioned issues and produce more thorough and accurate segmentation results. To acquire powerful contexts despite limited context-representation capability, a dense attention block is proposed. A triple-attention (TA) block is introduced to better extract the relationship between pixels and obtain more comprehensive information, with the goal of addressing insufficient feature processing. In addition, a multi-scale context fusion (MCF) module is proposed to aggregate contexts at multiple scales.

RESULTS: We provide a thorough assessment of the suggested approach on three difficult datasets. On the MESSIDOR and ORIGA datasets, the proposed MDDA-Net approach achieves accuracies of 99.28% and 98.95%, respectively.

CONCLUSION: The experimental results show that the MDDA-Net can obtain better performance than state-of-the-art deep learning models under the same environmental conditions.

PMID:39058458 | DOI:10.3233/THC-230310

Categories: Literature Watch

The role of radiomics in predicting lymph-vascular space invasion in cervical cancer patients based on artificial intelligence: a systematic review and meta-analysis

Fri, 2024-07-26 06:00

J Gynecol Oncol. 2024 Jul 19. doi: 10.3802/jgo.2025.36.e26. Online ahead of print.

ABSTRACT

The primary aim of this study was to conduct a methodical examination and assessment of the prognostic efficacy of magnetic resonance imaging (MRI)-derived radiomic models for the preoperative prediction of lymph-vascular space invasion (LVSI) in cervical cancer. A comprehensive exploration of the pertinent academic literature was undertaken by two investigators using the Embase, PubMed, Web of Science, and Cochrane Library databases, with a publication cutoff date of May 15, 2023. The inclusion criteria encompassed studies that utilized MRI-based radiomic models to predict preoperative LVSI in cervical cancer. The Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) framework and the Radiomic Quality Score metric were employed. This investigation included nine distinct studies enrolling a total of 1,406 patients. The diagnostic performance of MRI-based radiomic models in predicting preoperative LVSI among cervical cancer patients was as follows: sensitivity of 83% (95% confidence interval [CI]=77%-87%), specificity of 74% (95% CI=69%-79%), and an area under the summary receiver operating characteristic curve of 0.86 (95% CI=0.82-0.88). The synthesized meta-analysis did not reveal substantial heterogeneity. This meta-analysis suggests robust diagnostic proficiency of MRI-based radiomic models in predicting preoperative LVSI in cervical cancer patients. In the future, radiomics holds the potential to emerge as a widely applicable noninvasive modality for the early detection of LVSI in the context of cervical cancer.

PMID:39058366 | DOI:10.3802/jgo.2025.36.e26

Categories: Literature Watch

Automatically Detecting Pancreatic Cysts in Autosomal Dominant Polycystic Kidney Disease on MRI Using Deep Learning

Fri, 2024-07-26 06:00

Tomography. 2024 Jul 16;10(7):1148-1158. doi: 10.3390/tomography10070087.

ABSTRACT

BACKGROUND: Pancreatic cysts in autosomal dominant polycystic kidney disease (ADPKD) correlate with PKD2 mutations, which have a different phenotype than PKD1 mutations. However, pancreatic cysts are commonly overlooked by radiologists. Here, we automate the detection of pancreatic cysts on abdominal MRI in ADPKD.

METHODS: Eight nnU-Net-based segmentation models with 2D or 3D configuration and various loss functions were trained on positive-only or positive-and-negative datasets, comprising axial and coronal T2-weighted MR images from 254 scans on 146 ADPKD patients with pancreatic cysts labeled independently by two radiologists. Model performance was evaluated on test subjects unseen in training, comprising 40 internal, 40 external, and 23 test-retest reproducibility ADPKD patients.

RESULTS: Two radiologists agreed on 52% of the cysts labeled on the training data, and on 33%/25% on the internal/external test datasets. The 2D model trained on the dataset with both positive and negative cases, using a combined Dice similarity coefficient and cross-entropy loss, produced an optimal Dice score of 0.7 ± 0.5/0.8 ± 0.4 at the voxel level on internal/external validation and was thus selected as the best-performing model. In the test-retest, the optimal model showed superior reproducibility (83% agreement between scans A and B) in segmenting pancreatic cysts compared to six expert observers (77% agreement). In the internal/external validation, the optimal model showed high specificity of 94%/100% but limited sensitivity of 20%/24%.
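A minimal sketch of the combined Dice + cross-entropy loss used by the optimal configuration, written for binary cyst/background segmentation; the equal weighting of the two terms is a common nnU-Net default and an assumption here.

```python
# Hedged sketch: combined Dice similarity coefficient + cross-entropy loss.
import torch
import torch.nn.functional as F

def dice_ce_loss(logits, target, eps=1e-6):
    """logits: (B, 2, H, W) raw scores; target: (B, H, W) integer labels."""
    ce = F.cross_entropy(logits, target)
    probs = torch.softmax(logits, dim=1)[:, 1]        # foreground probability
    fg = (target == 1).float()
    inter = (probs * fg).sum()
    dice = (2 * inter + eps) / (probs.sum() + fg.sum() + eps)
    return ce + (1 - dice)                            # lower is better for both

logits = torch.randn(2, 2, 64, 64, requires_grad=True)
target = torch.randint(0, 2, (2, 64, 64))
print(dice_ce_loss(logits, target).item())
```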

CONCLUSIONS: Labeling pancreatic cysts on T2-weighted abdominal images of patients with ADPKD is challenging; deep learning can help automate the detection of pancreatic cysts, although further image quality improvement is warranted.

PMID:39058059 | DOI:10.3390/tomography10070087

Categories: Literature Watch

Reducing Manual Annotation Costs for Cell Segmentation by Upgrading Low-Quality Annotations

Fri, 2024-07-26 06:00

J Imaging. 2024 Jul 17;10(7):172. doi: 10.3390/jimaging10070172.

ABSTRACT

Deep-learning algorithms for cell segmentation typically require large data sets with high-quality annotations to be trained with. However, the annotation cost for obtaining such sets may prove to be prohibitively expensive. Our work aims to reduce the time necessary to create high-quality annotations of cell images by using a relatively small well-annotated data set for training a convolutional neural network to upgrade lower-quality annotations, produced at lower annotation costs. We investigate the performance of our solution when upgrading the annotation quality for labels affected by three types of annotation error: omission, inclusion, and bias. We observe that our method can upgrade annotations affected by high error levels from 0.3 to 0.9 Dice similarity with the ground-truth annotations. We also show that a relatively small well-annotated set enlarged with samples with upgraded annotations can be used to train better-performing cell segmentation networks compared to training only on the well-annotated set. Moreover, we present a use case where our solution can be successfully employed to increase the quality of the predictions of a segmentation network trained on just 10 annotated samples.
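A hedged sketch of the upgrading idea: a small network takes the image together with the low-quality mask as an extra input channel and predicts the corrected mask, supervised by the few high-quality labels. The architecture and channel counts are illustrative assumptions, not the paper's network.

```python
# Hedged sketch: training a mask-upgrading network on (image, rough mask)
# pairs against high-quality annotations.
import torch
import torch.nn as nn

upgrader = nn.Sequential(                    # image + rough mask -> refined mask
    nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 1),                     # logits for the upgraded annotation
)

image = torch.rand(4, 1, 128, 128)                    # grayscale cell images
rough = (torch.rand(4, 1, 128, 128) > 0.7).float()    # noisy, cheap annotation
good = (torch.rand(4, 1, 128, 128) > 0.5).float()     # high-quality label
loss = nn.functional.binary_cross_entropy_with_logits(
    upgrader(torch.cat([image, rough], dim=1)), good)
loss.backward()
print(loss.item())
```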

PMID:39057743 | DOI:10.3390/jimaging10070172

Categories: Literature Watch

Deep Efficient Data Association for Multi-Object Tracking: Augmented with SSIM-Based Ambiguity Elimination

Fri, 2024-07-26 06:00

J Imaging. 2024 Jul 16;10(7):171. doi: 10.3390/jimaging10070171.

ABSTRACT

Deep learning-based methods have recently been harnessed to address the multiple object tracking (MOT) problem. The tracking-by-detection approach to MOT involves two primary steps: object detection and data association. In the first step, objects of interest are detected in each frame of a video. The second step establishes the correspondence between these detected objects across different frames to track their trajectories. This paper proposes an efficient and unified data association method that utilizes a deep feature association network (deepFAN) to learn the associations. Additionally, the Structural Similarity Index Metric (SSIM) is employed to address uncertainties in the data association, complementing the deep feature association network. These combined association computations effectively link the current detections with the previous tracks, enhancing the overall tracking performance. To evaluate the efficiency of the proposed MOT framework, we conducted a comprehensive analysis on the popular MOT datasets MOT Challenge and UA-DETRAC. The results showed that our technique performs substantially better than current state-of-the-art methods in terms of standard MOT metrics.
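A hedged sketch of how SSIM can break ties when deep association scores are ambiguous, using scikit-image's SSIM; the margin threshold, crop sizes, and fallback rule are illustrative assumptions, not the paper's exact procedure.

```python
# Hedged sketch: fall back to SSIM between crops when the top two deep
# association scores are too close to call.
import numpy as np
from skimage.metrics import structural_similarity as ssim

def resolve_ambiguity(det_crop, track_crops, assoc_scores, margin=0.05):
    """If the top two association scores are within `margin`, use SSIM."""
    order = np.argsort(assoc_scores)[::-1]
    if assoc_scores[order[0]] - assoc_scores[order[1]] >= margin:
        return order[0]                       # network is confident enough
    sims = [ssim(det_crop, track_crops[i], data_range=1.0) for i in order[:2]]
    return order[:2][int(np.argmax(sims))]

rng = np.random.default_rng(0)
det = rng.random((64, 32))                    # grayscale detection crop
tracks = [rng.random((64, 32)) for _ in range(3)]
print(resolve_ambiguity(det, tracks, np.array([0.81, 0.80, 0.40])))
```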

PMID:39057742 | DOI:10.3390/jimaging10070171

Categories: Literature Watch

Automated Lung Cancer Diagnosis Applying Butterworth Filtering, Bi-Level Feature Extraction, and Sparse Convolutional Neural Network to LUNA16 CT Images

Fri, 2024-07-26 06:00

J Imaging. 2024 Jul 15;10(7):168. doi: 10.3390/jimaging10070168.

ABSTRACT

Accurate prognosis and diagnosis are crucial for selecting and planning lung cancer treatments. As a result of the rapid development of medical imaging technology, the use of computed tomography (CT) scans in pathology is becoming standard practice. An intricate interplay of requirements and obstacles characterizes computer-assisted diagnosis, which relies on the precise and effective analysis of pathology images. In recent years, pathology image analysis tasks such as tumor region identification, prognosis prediction, tumor microenvironment characterization, and metastasis detection have witnessed the considerable potential of artificial intelligence, especially deep learning techniques. In this context, an artificial intelligence (AI)-based methodology for lung cancer diagnosis is proposed in this research work. As a first processing step, filtering using the Butterworth smooth filter algorithm was applied to the input images from the LUNA16 lung cancer dataset to remove noise without significantly degrading image quality. Next, we performed bi-level feature selection using the Chaotic Crow Search Algorithm and Random Forest (CCSA-RF) approach to select features such as diameter, margin, spiculation, lobulation, subtlety, and malignancy. Feature extraction was then performed using the Multi-space Image Reconstruction (MIR) method with the Grey Level Co-occurrence Matrix (GLCM). Finally, Lung Tumor Severity Classification (LTSC) was implemented using a Sparse Convolutional Neural Network (SCNN) approach with a Probabilistic Neural Network (PNN). The developed method can detect benign, normal, and malignant lung cancer images using the PNN algorithm, which reduces complexity and efficiently provides classification results. Performance parameters, namely accuracy, precision, F-score, sensitivity, and specificity, were determined to evaluate the effectiveness of the implemented hybrid method and to compare it with other solutions already present in the literature.
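A minimal sketch of frequency-domain Butterworth low-pass smoothing of the kind the first processing step describes; the cutoff frequency and filter order are illustrative parameters, not the paper's settings.

```python
# Hedged sketch: Butterworth low-pass smoothing of a CT slice in the
# frequency domain.
import numpy as np

def butterworth_lowpass(img, cutoff=0.15, order=2):
    rows, cols = img.shape
    u = np.fft.fftfreq(rows)[:, None]
    v = np.fft.fftfreq(cols)[None, :]
    d = np.sqrt(u**2 + v**2)                         # radial frequency
    h = 1.0 / (1.0 + (d / cutoff) ** (2 * order))    # Butterworth transfer fn
    return np.real(np.fft.ifft2(np.fft.fft2(img) * h))

ct_slice = np.random.default_rng(0).random((256, 256))
smoothed = butterworth_lowpass(ct_slice)
print(smoothed.shape)  # (256, 256)
```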

PMID:39057739 | DOI:10.3390/jimaging10070168

Categories: Literature Watch

Few-Shot Conditional Learning: Automatic and Reliable Device Classification for Medical Test Equipment

Fri, 2024-07-26 06:00

J Imaging. 2024 Jul 13;10(7):167. doi: 10.3390/jimaging10070167.

ABSTRACT

The limited availability of specialized image databases (particularly in hospitals, where tools vary between providers) makes it difficult to train deep learning models. This paper presents a few-shot learning methodology that uses a pre-trained ResNet integrated with an encoder as a backbone to encode conditional shape information for the classification of neonatal resuscitation equipment from fewer than 100 natural images. The model is also strengthened by incorporating a reliability score, which enriches the prediction with an estimate of classification reliability. The model, whose performance was cross-validated, reached a median accuracy of over 99% (with a lower limit of 73.4% for the least accurate model/fold) using only 87 meta-training images. During the test phase on complex natural images, performance was slightly degraded due to a sub-optimal segmentation strategy (FastSAM) required to maintain real-time inference (median accuracy 87.25%). This methodology proves excellent for applying complex classification models to contexts (such as neonatal resuscitation) for which public image databases are unavailable. Improvements to the automatic segmentation strategy prior to the extraction of conditional information will allow natural application in simulation and hospital settings.

PMID:39057738 | DOI:10.3390/jimaging10070167

Categories: Literature Watch
