Deep learning

GC²: Generalizable Continual Classification of Medical Images

Wed, 2024-05-08 06:00

IEEE Trans Med Imaging. 2024 May 8;PP. doi: 10.1109/TMI.2024.3398533. Online ahead of print.

ABSTRACT

Deep learning models have achieved remarkable success in medical image classification. These models are typically trained once on the available annotated images and thus lack the ability to continually learn new tasks (i.e., new classes or data distributions) because of catastrophic forgetting. Recently, there has been growing interest in designing continual learning methods that learn different tasks presented sequentially over time while preserving previously acquired knowledge. However, these methods focus mainly on preventing catastrophic forgetting and are tested under a closed-world assumption, i.e., assuming the test data are drawn from the same distribution as the training data. In this work, we advance the state of the art in continual learning by proposing GC2 for medical image classification, which learns a sequence of tasks while simultaneously enhancing its out-of-distribution robustness. To alleviate forgetting, GC2 employs gradual culpability-based network pruning to identify an optimal subnetwork for each task. To improve generalization, GC2 incorporates adversarial image augmentation and knowledge distillation for learning generalized and robust representations for each subnetwork. Our extensive experiments on multiple benchmarks under task-agnostic inference demonstrate that GC2 significantly outperforms baselines and other continual learning methods in reducing forgetting and enhancing generalization. Our code is publicly available at the following link: https://github.com/nourhanb/TMI2024-GC2.

PMID:38717881 | DOI:10.1109/TMI.2024.3398533

Categories: Literature Watch

Simulating the cellular context in synthetic datasets for cryo-electron tomography

Wed, 2024-05-08 06:00

IEEE Trans Med Imaging. 2024 May 8;PP. doi: 10.1109/TMI.2024.3398401. Online ahead of print.

ABSTRACT

Cryo-electron tomography (cryo-ET) makes it possible to visualize the cellular context at the macromolecular level. To date, the impossibility of obtaining a reliable ground truth has limited the application of deep learning-based image processing algorithms in this field. As a consequence, there is a growing demand for realistic synthetic datasets for training deep learning algorithms. In addition, besides assisting the acquisition and interpretation of experimental data, synthetic tomograms are used as reference models for analyzing cellular organization in cellular tomograms. Current simulators in cryo-ET focus on reproducing distortions from image acquisition and tomogram reconstruction; however, they cannot generate many of the low-order features present in cellular tomograms. Here we propose several geometric and organization models to simulate low-order cellular structures imaged by cryo-ET: clusters of any known cytosolic or membrane-bound macromolecules, membranes with different geometries, and different filamentous structures such as microtubules or actin-like networks. Moreover, we use parametrizable stochastic models to generate a high diversity of geometries and organizations, yielding representative and generalized datasets that include very crowded environments like those observed in native cells. These models have been implemented in a multiplatform open-source Python package, including scripts to generate cryo-tomograms with adjustable sizes and resolutions. The scripts also provide distortion-free density maps, besides the ground truth, in different file formats for efficient access and advanced visualization. We show that such a realistic synthetic dataset can be readily used to train generalizable deep learning algorithms.

PMID:38717878 | DOI:10.1109/TMI.2024.3398401

Categories: Literature Watch

Child face detection on front passenger seat through deep learning

Wed, 2024-05-08 06:00

Traffic Inj Prev. 2024 May 8:1-10. doi: 10.1080/15389588.2024.2346811. Online ahead of print.

ABSTRACT

OBJECTIVE: Car crashes are one of the main causes of death worldwide among young people, and most of these fatalities occur to children seated in the front passenger seat who, at the time of an accident, receive a direct impact from the airbag, which is lethal for children under 13 years of age. The present study seeks to mitigate this risk through interior monitoring with a child face detection system that alerts the driver that the child should not be sitting in the front passenger seat.

METHODS: The system incorporates data processing and deep learning elements such as transfer learning, fine-tuning, and facial detection to identify the presence of children robustly, which was achieved by training on a dataset generated from scratch for this specific purpose. The MobileNetV2 architecture was chosen based on its good performance compared with the Inception architecture for this task and its low computational cost, which facilitates implementing the final model on a Raspberry Pi 4B.

RESULTS: The resulting image dataset consisted of 102 empty seats, 71 children (0-13 years), and 96 adults (14-75 years). After data augmentation, there were 2,496 images of adults and 2,310 of children. Classification of faces without a sliding window achieved 98% accuracy and 100% precision. Finally, using the proposed methodology, it was possible to detect children in the front passenger seat in real time, with a delay of 1 s per decision and a sliding-window criterion, reaching an accuracy of 100%.
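The reported accuracy and precision follow directly from the confusion matrix. A minimal sketch with hypothetical counts (not the study's own) shows why 100% precision requires zero false positives:

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """Fraction of positive (child) predictions that are correct."""
    return tp / (tp + fp)

# Hypothetical counts for a child-vs-not-child classifier:
tp, tn, fp, fn = 49, 49, 0, 2   # fp == 0 is what yields 100% precision
print(accuracy(tp, tn, fp, fn))  # 0.98
print(precision(tp, fp))         # 1.0
```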

CONCLUSIONS: Our 100% accuracy in an experimental environment is somewhat idealized, in that the sensor was neither blocked by direct sunlight nor partially or completely covered by dirt or other debris common in vehicles transporting children. Nevertheless, the present study showed that a robust, noninvasive classification system implemented on a Raspberry Pi 4 Model B can detect a child in the front seat of any automobile using deep learning methods such as deep CNNs.

PMID:38717829 | DOI:10.1080/15389588.2024.2346811

Categories: Literature Watch

An Automated Vertebrae Localization, Segmentation, and Osteoporotic Compression Fracture Detection Pipeline for Computed Tomographic Imaging

Wed, 2024-05-08 06:00

J Imaging Inform Med. 2024 May 8. doi: 10.1007/s10278-024-01135-5. Online ahead of print.

ABSTRACT

Osteoporosis is the most common chronic metabolic bone disease worldwide. Vertebral compression fracture (VCF) is the most common type of osteoporotic fracture. Approximately 700,000 osteoporotic VCFs are diagnosed annually in the USA alone, resulting in an annual economic burden of ~$13.8B. With an aging population, the rate of osteoporotic VCFs and their associated burdens are expected to rise. Those burdens include pain, functional impairment, and increased medical expenditure. It is therefore of utmost importance to develop an analytical tool to aid in the identification of VCFs. Computed tomography (CT) imaging is commonly used to detect occult injuries. Unlike existing CT-based VCF detection approaches, the standard clinical criteria for determining VCF rely on the shape of the vertebrae, such as loss of vertebral body height. To bridge this gap, we developed a novel automated vertebrae localization, segmentation, and osteoporotic VCF detection pipeline for CT scans using state-of-the-art deep learning models. To do so, we employed a publicly available dataset of spine CT scans with 325 scans annotated for segmentation, 126 of which were also graded for VCF (81 with VCFs and 45 without). Our approach attained 96% sensitivity and 81% specificity in detecting VCF at the vertebral level, and 100% accuracy at the subject level, outperforming deep learning counterparts tested for VCF detection without segmentation. Crucially, we showed that adding predicted vertebrae segments as inputs significantly improved VCF detection at both the vertebral and subject levels, by up to 14% sensitivity and 20% specificity (p-value = 0.028).

PMID:38717516 | DOI:10.1007/s10278-024-01135-5

Categories: Literature Watch

Comparison of Different Fusion Radiomics for Predicting Benign and Malignant Sacral Tumors: A Pilot Study

Wed, 2024-05-08 06:00

J Imaging Inform Med. 2024 May 8. doi: 10.1007/s10278-024-01134-6. Online ahead of print.

ABSTRACT

Differentiating between benign and malignant sacral tumors is crucial for determining appropriate treatment options. This study aims to develop two benchmark fusion models and a deep learning radiomic nomogram (DLRN) capable of distinguishing between benign and malignant sacral tumors using multiple imaging modalities. We reviewed axial T2-weighted imaging (T2WI) and non-contrast computed tomography (NCCT) of 134 patients with pathologically confirmed sacral tumors. The two benchmark fusion models were developed using fused deep learning (DL) features and fused classical machine learning (CML) features from multiple imaging modalities, employing logistic regression, K-nearest neighbor classification, and extremely randomized trees. The two benchmark models exhibiting the most robust predictive performance were merged with clinical data to formulate the DLRN. Performance assessment involved computing the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, accuracy, negative predictive value (NPV), and positive predictive value (PPV). The DL benchmark fusion model demonstrated superior performance compared with the CML fusion model. The DLRN, identified as the optimal model, exhibited the highest predictive performance, achieving an accuracy of 0.889 and an AUC of 0.961 in the test sets. Calibration curves were used to evaluate the predictive capability of the models, and decision curve analysis (DCA) was conducted to assess the clinical net benefit of the DLRN. The DLRN could serve as a practical predictive tool capable of distinguishing between benign and malignant sacral tumors, offering valuable information for risk counseling and aiding clinical treatment decisions.
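The AUC reported in such studies has a simple probabilistic reading: it is the probability that a randomly chosen malignant case receives a higher model score than a randomly chosen benign one. A minimal standard-library sketch, using illustrative scores rather than the study's data:

```python
def auc(pos_scores, neg_scores):
    """AUC as the normalized Mann-Whitney U statistic: the probability
    that a randomly chosen positive case scores higher than a randomly
    chosen negative case (ties count as half a win)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Illustrative model outputs (hypothetical, not from the paper):
malignant = [0.9, 0.8, 0.75, 0.6]
benign = [0.7, 0.4, 0.3]
print(auc(malignant, benign))  # ≈ 0.9167 (11 of 12 pairs ranked correctly)
```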

PMID:38717515 | DOI:10.1007/s10278-024-01134-6

Categories: Literature Watch

Predicting underwater acoustic transmission loss in the SOFAR channel from ray trajectories via deep learning

Wed, 2024-05-08 06:00

JASA Express Lett. 2024 May 1;4(5):056001. doi: 10.1121/10.0025976.

ABSTRACT

Predicting acoustic transmission loss in the SOFAR channel is challenging: classical methods rely on excessively complex algorithms and computationally intensive calculations. To address these challenges, a deep learning-based underwater acoustic transmission loss prediction method is proposed. By properly training a U-net-type convolutional neural network, the method provides an accurate mapping between ray trajectories and the transmission loss over the problem domain. Verifications are performed in a SOFAR channel with Munk's sound speed profile. The results suggest that the method has the potential to serve as a fast prediction model without sacrificing accuracy.

PMID:38717470 | DOI:10.1121/10.0025976

Categories: Literature Watch

Detecting Substance Use Disorder Using Social Media Data and the Dark Web: Time- and Knowledge-Aware Study

Wed, 2024-05-08 06:00

JMIRx Med. 2024 May 1;5:e48519. doi: 10.2196/48519.

ABSTRACT

BACKGROUND: Opioid and substance misuse has become a widespread problem in the United States, leading to the "opioid crisis." The relationship between substance misuse and mental health has been extensively studied, with one possible relationship being that substance misuse causes poor mental health. However, the lack of evidence on the relationship has resulted in opioids being largely inaccessible through legal means.

OBJECTIVES: This study aims to analyze social media posts related to substance use and to opioids sold through cryptomarket listings. It uses state-of-the-art deep learning models to derive sentiment and emotion from social media posts in order to understand users' perceptions as expressed on social media. The study also investigates questions such as which synthetic opioids people are optimistic, neutral, or negative about; what kinds of drugs induce fear and sorrow; what kinds of drugs people love or are thankful for; which drugs people think negatively about; and which opioids cause little to no sentimental reaction.

METHODS: The study used the drug abuse ontology and state-of-the-art deep learning models, including knowledge-aware models based on Bidirectional Encoder Representations from Transformers (BERT), to generate sentiment and emotion from social media posts related to substance use and opioids sold through cryptomarket listings. The study crawled cryptomarket data and extracted posts for fentanyl, fentanyl analogs, and other novel synthetic opioids. Topic analysis was performed on the generated sentiments and emotions to understand which topics correlate with people's responses to various drugs. Additionally, the study analyzed time-aware neural models built on these features while considering the historical sentiment and emotional activity of posts related to a drug.

RESULTS: The study found that the most effective model performed well (statistically significant, with a macro-F1-score of 82.12 and recall of 83.58) in identifying substance use disorder. The study also found that there were varying levels of sentiment and emotion associated with different synthetic opioids, with some drugs eliciting more positive or negative responses than others. The study identified topics that correlated with people's responses to various drugs, such as pain relief, addiction, and withdrawal symptoms.

CONCLUSIONS: The study provides insight into users' perceptions of synthetic opioids based on sentiment and emotion expressed in social media posts. The study's findings can be used to inform interventions and policies aimed at reducing substance misuse and addressing the opioid crisis. The study demonstrates the potential of deep learning models for analyzing social media data to gain insights into public health issues.

PMID:38717384 | DOI:10.2196/48519

Categories: Literature Watch

Improving Automated Hemorrhage Detection in Sparse-view CT via U-Net-based Artifact Reduction

Wed, 2024-05-08 06:00

Radiol Artif Intell. 2024 May 8:e230275. doi: 10.1148/ryai.230275. Online ahead of print.

ABSTRACT

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. Purpose To explore the potential benefits of deep learning-based artifact reduction in sparse-view cranial CT scans and its impact on automated hemorrhage detection. Materials and Methods In this retrospective study, a U-Net was trained for artifact reduction on simulated sparseview cranial CT scans from 3000 patients obtained from a public dataset and reconstructed with varying sparse-view levels. Additionally, the EfficientNetB2 was trained on full-view CT data from 17,545 patients for automated hemorrhage detection. Detection performance was evaluated using the area under the receiver operator characteristic curve (AUC), with differences assessed using the DeLong test, along with confusion matrices. A total variation (TV) postprocessing approach, commonly applied to sparse-view, served as the basis for comparison. A Bonferronicorrected significance level of 0.001/6 = 0.00017 was used to accommodate for multiple hypotheses testing. Results Images with U-Net postprocessing were better than unprocessed and TV-processed images with respect to image quality and automated hemorrhage detection. With U-Net postprocessing, the number of views could be reduced from 4096 (AUC: 0.97; 95% CI: 0.97-0.98) to 512 (0.97; 0.97-0.98; P < .00017) and to 256 views (0.97; 0.96-0.97; P < .00017) with minimal decrease in hemorrhage detection performance. This was accompanied by mean structural similarity index measure increases of 0.0210 (95% CI: 0.0210-0.0211) and 0.0560 (95% CI: 0.0559-0.0560) relative to unprocessed images. 
Conclusion U-Net based artifact reduction substantially enhances automated hemorrhage detection in sparse-view cranial CTs. ©RSNA, 2024.
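The Bonferroni threshold quoted in the abstract (0.001/6 = 0.00017) is simple arithmetic: the family-wise significance level is divided by the number of tests, and each individual p-value must clear that stricter bar. A minimal check:

```python
alpha = 0.001          # family-wise significance level
m = 6                  # number of hypothesis tests
threshold = alpha / m  # Bonferroni-corrected per-test threshold
print(round(threshold, 5))  # 0.00017

# A result is declared significant only if its p-value clears the
# corrected bar (hypothetical p-values, not from the paper):
p_values = [0.0001, 0.0002, 0.00017]
print([p < threshold for p in p_values])  # [True, False, False]
```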

PMID:38717293 | DOI:10.1148/ryai.230275

Categories: Literature Watch

Impact of Transfer Learning Using Local Data on Performance of a Deep Learning Model for Screening Mammography

Wed, 2024-05-08 06:00

Radiol Artif Intell. 2024 May 8:e230383. doi: 10.1148/ryai.230383. Online ahead of print.

ABSTRACT

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. Purpose To investigate the issues of generalizability and replication of deep learning (DL) models by assessing performance of a screening mammography DL system developed at New York University (NYU) on a local Australian dataset. Materials and Methods In this retrospective study, all individuals with biopsy and surgical pathology-proven lesions and age-matched controls were identified from a South Australian public mammography screening program (January 2010 to December 2016). The primary outcome was DL system performance, measured with the area under the receiver operating characteristic curve (AUC), in classifying invasive breast cancer or ductal carcinoma in situ (n = 425) from no malignancy (n = 490) or benign lesions (n = 44) in age-matched controls. The NYU system, including models without (NYU1) and with (NYU2) heatmaps, was tested in its original form, after training from scratch (without transfer learning; TL), after retraining with TL. Results The local test set comprised 959 individuals (mean age, 62.5 years [SD, 8.5]; all female). The original AUCs for the NYU1 and NYU2 models were 0.83 (95%CI = 0.82-0.84) and 0.89 (95%CI = 0.88-0.89), respectively. When applied in their original form to the local test set, the AUCs were 0.76 (95%CI = 0.73-0.79) and 0.84 (95%CI = 0.82-0.87), respectively. After local training without TL, the AUCs were 0.66 (95%CI = 0.62-0.69) and 0.86 (95%CI = 0.84-0.88). After retraining with TL, the AUCs were 0.82 (95%CI = 0.80-0.85) and 0.86 (95%CI = 0.84-0.88). Conclusion A deep learning system developed using a U.S. 
dataset showed reduced performance when applied 'out of the box' to an Australian dataset. Local retraining with transfer learning using available model weights improved model performance. ©RSNA, 2024.

PMID:38717291 | DOI:10.1148/ryai.230383

Categories: Literature Watch

Deep network fault diagnosis for imbalanced small-sized samples via a coupled adversarial autoencoder based on the Bayesian method

Wed, 2024-05-08 06:00

Rev Sci Instrum. 2024 May 1;95(5):055104. doi: 10.1063/5.0193162.

ABSTRACT

Deep network fault diagnosis methods rely heavily on abundant labeled data for effective model training. However, small and imbalanced sample sets often yield insufficient features, resulting in accuracy degradation and even instability in the diagnosis model. To address this challenge, this paper introduces a coupled adversarial autoencoder (CoAAE) based on the Bayesian method, which mitigates the problem of insufficient samples by generating fake samples and integrating them with the original ones. Within the CoAAE framework, an encoder captures the probability density distribution of the original data, and fake samples are generated by randomly sampling from this distribution and decoding the samples. Adversarial interaction between the encoder and a classifier yields the prior distribution of the encoder's parameters, which are then updated through the decoder's reconstruction process to obtain the posterior distribution. Concurrently, the decoder is trained to improve its ability to reconstruct samples accurately. To address the imbalance in the original samples, a parallel coupled network is employed; it shares the weights of the extraction layer in the encoder, enabling it to learn the joint distribution between fault-related and normal samples. To evaluate the effectiveness of the proposed data augmentation method, experiments were conducted on a bearing database from Case Western Reserve University using ResNet18 as a representative deep learning diagnosis model. The results demonstrate that CoAAE can effectively augment imbalanced datasets and outperforms other advanced methods.
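The underlying idea of "sample from an estimated distribution to rebalance a dataset" can be illustrated with a deliberately simplified stand-in: fitting independent per-feature Gaussians to a minority class and drawing synthetic examples from them. The paper's CoAAE learns this distribution with an adversarial encoder-decoder instead; the feature vectors below are hypothetical.

```python
import random
import statistics

def fit_gaussian(samples):
    """Per-feature mean and standard deviation of a feature matrix."""
    cols = list(zip(*samples))
    return [(statistics.mean(c), statistics.stdev(c)) for c in cols]

def generate(params, n, rng):
    """Draw n synthetic samples from the fitted per-feature Gaussians."""
    return [[rng.gauss(mu, sigma) for mu, sigma in params] for _ in range(n)]

rng = random.Random(0)
# Hypothetical minority-class feature vectors (e.g. fault signatures):
minority = [[0.9, 1.2], [1.1, 1.0], [1.0, 1.1], [0.8, 1.3]]
params = fit_gaussian(minority)
fake = generate(params, 100, rng)
balanced = minority + fake  # augmented training set for the classifier
print(len(balanced))  # 104
```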

PMID:38717264 | DOI:10.1063/5.0193162

Categories: Literature Watch

Underwater sound speed profile estimation from vessel traffic recordings and multi-view neural networks

Wed, 2024-05-08 06:00

J Acoust Soc Am. 2024 May 1;155(5):3015-3026. doi: 10.1121/10.0025920.

ABSTRACT

Sound speed is a critical parameter in ocean acoustic studies, as it determines the propagation and interpretation of recorded sounds. The potential for exploiting vessel noise as a sound source of opportunity to estimate the ocean sound speed profile is investigated. A deep learning-based inversion scheme is proposed, relying on the underwater radiated noise of moving vessels measured by a single hydrophone. The dataset used for this study consists of Automatic Identification System data and acoustic recordings of maritime vessels transiting through the Santa Barbara Channel between January 2015 and December 2017. The acoustic recordings and vessel descriptors are used as predictors for regressing sound speed at each meter in the top 200 m of the water column, where sound speeds are most variable. Multiple transits (typically between 4 and 10) were recorded each day; this dataset therefore provides an opportunity to investigate whether multiple acoustic observations can be leveraged together to improve inversion estimates. The proposed single-transit and multi-transit models resulted in depth-averaged root-mean-square errors of 1.79 and 1.55 m/s, respectively, compared with 2.80 m/s for seasonal-average predictions.
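The depth-averaged root-mean-square error used to score such inversions can be sketched as follows, with illustrative sound-speed profiles rather than the study's data:

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length depth profiles."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )

# Illustrative sound-speed profiles (m/s) sampled at four depths:
truth = [1500.0, 1495.0, 1490.0, 1488.0]
estimate = [1502.0, 1494.0, 1489.0, 1490.0]
print(rmse(estimate, truth))  # ≈ 1.581 m/s
```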

PMID:38717207 | DOI:10.1121/10.0025920

Categories: Literature Watch

Application of deep learning for the analysis of stomata: A review of current methods and future directions

Wed, 2024-05-08 06:00

J Exp Bot. 2024 May 8:erae207. doi: 10.1093/jxb/erae207. Online ahead of print.

ABSTRACT

Plant physiology and metabolism rely on the function of stomata, structures on the surface of above-ground organs that facilitate the exchange of gases with the atmosphere. The morphology of the guard cells and corresponding pore that make up the stoma, as well as stomatal density (number per unit area), are critical in determining overall gas exchange capacity. These characteristics can be quantified visually from images captured using microscopes, traditionally relying on time-consuming manual analysis. However, deep learning (DL) models provide a promising route to increasing the throughput and accuracy of plant phenotyping tasks, including stomatal analysis. Here we review the published literature on the application of DL to stomatal analysis. We discuss the variation in the pipelines used, from data acquisition, pre-processing, and DL architecture to output evaluation and post-processing. We introduce the most common network structures, the plant species that have been studied, and the measurements that have been performed. Through this review, we hope to promote the use of DL methods for plant phenotyping tasks and to highlight future requirements for optimising uptake, focusing predominantly on the sharing of datasets and the generalisation of models, as well as the caveats associated with using image data to infer physiological function.

PMID:38716775 | DOI:10.1093/jxb/erae207

Categories: Literature Watch

Trends in artificial intelligence, machine learning, and chemometrics applied to chemical data

Wed, 2024-05-08 06:00

Anal Sci Adv. 2021 Feb 2;2(3-4):128-141. doi: 10.1002/ansa.202000162. eCollection 2021 Apr.

ABSTRACT

Artificial intelligence-based methods such as chemometrics, machine learning, and deep learning are promising tools that lead to a clearer and better understanding of data. Only with these tools can data be used to their full extent, maximizing the knowledge gained about processes, interactions, and characteristics of the sample. Scientists are therefore developing the data science tools mentioned above to automatically and accurately extract information from data and to increase the application possibilities of the respective data in various fields. AI-based techniques have been applied to chemical data since the 1970s, and this review focuses on the recent trends in chemometrics, machine learning, and deep learning for chemical and spectroscopic data in 2020. In this regard, inverse modeling, preprocessing methods, and data modeling applied to spectra and image data from various measurement techniques are discussed.

PMID:38716450 | PMC:PMC10989568 | DOI:10.1002/ansa.202000162

Categories: Literature Watch

Brain MRI sequence and view plane identification using deep learning

Wed, 2024-05-08 06:00

Front Neuroinform. 2024 Apr 23;18:1373502. doi: 10.3389/fninf.2024.1373502. eCollection 2024.

ABSTRACT

Brain magnetic resonance imaging (MRI) scans are available in a wide variety of sequences, view planes, and magnet strengths. A necessary preprocessing step for any automated diagnosis is to identify the MRI sequence, view plane, and magnet strength of the acquired image. Automatic identification of the MRI sequence can be useful in labeling the massive online datasets used by data scientists in the design and development of computer-aided diagnosis (CAD) tools. This paper presents a deep learning (DL) approach for brain MRI sequence and view plane identification using scans of different data types as input. A 12-class classification system is presented for commonly used MRI scans, covering T1-weighted, T2-weighted, proton density (PD), and fluid-attenuated inversion recovery (FLAIR) sequences in axial, coronal, and sagittal view planes. Multiple publicly available online datasets were used to train the system on multiple infrastructures. MobileNet-v2 offers an adequate performance accuracy of 99.76% on unprocessed MRI scans and comparable accuracy on skull-stripped scans, and it has been deployed in a tool for public use. The tool has been tested on unseen data from online and hospital sources with satisfactory performance accuracies of 99.84% and 86.49%, respectively.

PMID:38716062 | PMC:PMC11074364 | DOI:10.3389/fninf.2024.1373502

Categories: Literature Watch

ContourTL-Net: Contour-Based Transfer Learning Algorithm for Early-Stage Brain Tumor Detection

Wed, 2024-05-08 06:00

Int J Biomed Imaging. 2024 Apr 29;2024:6347920. doi: 10.1155/2024/6347920. eCollection 2024.

ABSTRACT

Brain tumors are critical neurological ailments caused by uncontrolled cell growth in the brain or skull, often leading to death. Increasing patient longevity requires prompt detection; however, the complexity of brain tissue makes early diagnosis challenging. Hence, automated tools are necessary to aid healthcare professionals. This study aims to improve the efficacy of computerized brain tumor detection in a clinical setting through a deep learning model. A novel thresholding-based MRI image segmentation approach with a contour-based transfer learning model (ContourTL-Net) is suggested to facilitate the clinical detection of brain malignancies at an initial phase. The model utilizes contour-based analysis, which is critical for object detection, precise segmentation, and capturing subtle variations in tumor morphology. It employs a VGG-16 architecture pretrained on the ImageNet collection for feature extraction and categorization, using ten nontrainable and three trainable convolutional layers and three dropout layers. The proposed ContourTL-Net model is evaluated on two benchmark datasets in four ways, among which an unseen case is considered the clinical aspect. Validating a deep learning model on unseen data is crucial for determining the model's generalization capability, domain adaptation, robustness, and real-world applicability. Here, the presented model classifies the unseen data highly accurately, achieving a perfect sensitivity and negative predictive value (NPV) of 100%, 98.60% specificity, 99.12% precision, a 99.56% F1-score, and 99.46% accuracy. Additionally, the outcomes of the suggested model are compared with state-of-the-art methodologies. The proposed solution outperforms the existing solutions on both seen and unseen data, with the potential to significantly improve brain tumor detection efficiency and accuracy, leading to earlier diagnoses and improved patient outcomes.

PMID:38716037 | PMC:PMC11074715 | DOI:10.1155/2024/6347920

Categories: Literature Watch

MAN-C: A masked autoencoder neural cryptography based encryption scheme for CT scan images

Wed, 2024-05-08 06:00

MethodsX. 2024 Apr 28;12:102738. doi: 10.1016/j.mex.2024.102738. eCollection 2024 Jun.

ABSTRACT

Sharing medical images securely is very important for keeping patients' data confidential. In this paper we propose MAN-C: a Masked Autoencoder Neural Cryptography-based encryption scheme for sharing medical images. The proposed technique builds upon recently proposed masked autoencoders. In the original paper, masked autoencoders are used as scalable self-supervised learners for computer vision that reconstruct masked portions of input images. Here, the ability to obfuscate portions of an input image and then reconstruct the original is used as an encryption-decryption scheme. In its final form, the masked autoencoder is combined with neural cryptography consisting of a tree parity machine and Shamir's scheme for secret image sharing. MAN-C helps to recover the loss in the image due to noise during secret sharing.

• Uses recently proposed masked autoencoders, originally designed as scalable self-supervised learners for computer vision, in an encryption-decryption setup.

• Combines autoencoders with neural cryptography; the advantages over existing techniques are that (i) neural cryptography is a type of public key cryptography that is not based on number theory, requires less computing time and memory, and is non-deterministic in nature, and (ii) masked autoencoders provide an additional level of obfuscation through their deep learning architecture.

• The proposed scheme was evaluated on a dataset of CT scans made public by The Cancer Imaging Archive (TCIA). The proposed method produces better RMSE values between the input and the encrypted image, and comparable correlation values between the input and the output image, with respect to existing techniques.
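The Shamir secret-sharing primitive used here can be sketched in a few lines of standard-library Python. This is a toy (k, n) scheme over a prime field that shares a single integer rather than an image, an illustration of the primitive, not the paper's implementation:

```python
import random

PRIME = 2 ** 127 - 1  # a Mersenne prime, large enough for integer secrets

def split(secret, n, k, rng):
    """Shamir (k, n) sharing: evaluate a random degree-(k-1) polynomial
    with constant term `secret` at the points x = 1..n."""
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(k - 1)]
    def poly(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule, mod PRIME
            acc = (acc * x + c) % PRIME
        return acc
    return [(x, poly(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret from any
    k of the n shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

rng = random.Random(42)
shares = split(123456789, n=5, k=3, rng=rng)
print(reconstruct(shares[:3]))  # 123456789
print(reconstruct(shares[2:]))  # any 3 of the 5 shares suffice: 123456789
```

Note that `pow(den, -1, PRIME)` (modular inverse) requires Python 3.8+.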

PMID:38715952 | PMC:PMC11074963 | DOI:10.1016/j.mex.2024.102738

Categories: Literature Watch

A practical guide to the implementation of artificial intelligence in orthopaedic research-Part 2: A technical introduction

Wed, 2024-05-08 06:00

J Exp Orthop. 2024 May 7;11(3):e12025. doi: 10.1002/jeo2.12025. eCollection 2024 Jul.

ABSTRACT

Recent advances in artificial intelligence (AI) present a broad range of possibilities for medical research. However, orthopaedic researchers aiming to participate in research projects implementing AI-based techniques require a sound understanding of the technical fundamentals of this rapidly developing field. The initial sections of this technical primer provide an overview of the general and more detailed taxonomy of AI methods. Researchers are presented with the technical basics of the most frequently performed machine learning (ML) tasks, such as classification, regression, clustering, and dimensionality reduction. Additionally, the spectrum of supervision in ML, including supervised, unsupervised, semisupervised, and self-supervised learning, is explored. Recent advances in neural networks (NNs) and deep learning (DL) architectures have rendered them essential tools for the analysis of complex medical data, warranting a rudimentary technical introduction for orthopaedic researchers. Furthermore, the capability of natural language processing (NLP) to interpret patterns in human language is discussed, with potential applications in medical text classification, patient sentiment analysis, and clinical decision support. The technical discussion concludes with the transformative potential of generative AI and large language models (LLMs) for AI research. This second article of the series thus aims to equip orthopaedic researchers with the fundamental technical knowledge required to engage in interdisciplinary collaboration in AI-driven orthopaedic research.

LEVEL OF EVIDENCE: Level IV.

PMID:38715910 | PMC:PMC11076014 | DOI:10.1002/jeo2.12025

Categories: Literature Watch

Advances in artificial intelligence in thyroid-associated ophthalmopathy

Wed, 2024-05-08 06:00

Front Endocrinol (Lausanne). 2024 Apr 23;15:1356055. doi: 10.3389/fendo.2024.1356055. eCollection 2024.

ABSTRACT

Thyroid-associated ophthalmopathy (TAO), also referred to as Graves' ophthalmopathy, is a medical condition wherein ocular complications arise due to autoimmune thyroid illness. The diagnosis of TAO, reliant on imaging, typical ocular symptoms, and abnormalities in thyroid function or thyroid-associated antibodies, is generally graded and staged. In recent years, artificial intelligence (AI), particularly deep learning (DL) technology, has gained widespread use in the diagnosis and treatment of ophthalmic diseases. This paper presents a discussion on specific studies involving AI, specifically DL, in the context of TAO, highlighting their applications in TAO diagnosis, staging, grading, and treatment decisions. Additionally, it addresses certain limitations in AI research on TAO and potential future directions for the field.

PMID:38715793 | PMC:PMC11075148 | DOI:10.3389/fendo.2024.1356055

Categories: Literature Watch

Synthesizing 3D Multi-Contrast Brain Tumor MRIs Using Tumor Mask Conditioning

Wed, 2024-05-08 06:00

Proc SPIE Int Soc Opt Eng. 2024 Feb;12931:129310M. doi: 10.1117/12.3009331. Epub 2024 Apr 2.

ABSTRACT

Data scarcity and data imbalance are two major challenges in training deep learning models on medical images, such as brain tumor MRI data. Recent advancements in generative artificial intelligence have opened new possibilities for synthetically generating MRI data, including brain tumor MRI scans, offering a potential solution to the data scarcity problem and a way to enhance training data availability. This work focuses on adapting 2D latent diffusion models to generate 3D multi-contrast brain tumor MRI data with a tumor mask as the condition. The framework comprises two components: a 3D autoencoder model for perceptual compression and a conditional 3D Diffusion Probabilistic Model (DPM) for generating high-quality and diverse multi-contrast brain tumor MRI samples, guided by a conditioning tumor mask. Unlike existing works that generate either 2D multi-contrast or 3D single-contrast MRI samples, our models generate multi-contrast 3D MRI samples. We also integrated a conditional module within the UNet backbone of the DPM to capture the semantic, class-dependent data distribution driven by the provided tumor mask, so that brain tumor MRI samples are generated according to a specific tumor mask. We trained our models on two brain tumor datasets: the public dataset from The Cancer Genome Atlas (TCGA) and an internal dataset from the University of Texas Southwestern Medical Center (UTSW). The models were able to generate high-quality 3D multi-contrast brain tumor MRI samples with the tumor location aligned to the input condition mask. The quality of the generated images was evaluated using the Fréchet Inception Distance (FID) score. This work has the potential to mitigate the scarcity of brain tumor data and improve the performance of deep learning models trained on brain tumor MRI data.
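A common way to condition a denoising network on a segmentation mask is to append the mask as an extra input channel alongside the image contrasts. The following pure-Python sketch shows only that channel-concatenation idea; the shapes, the binary mask, and the function name are illustrative assumptions, not the paper's actual conditioning module.

```python
def concat_mask_channel(volume, mask):
    """Append a binary tumor mask as an extra channel so a conditional
    denoising network sees (num_contrasts + 1) input channels.
    volume: [C][D][H][W] nested lists; mask: [D][H][W]."""
    assert all(len(ch) == len(mask) for ch in volume), "depth mismatch"
    return volume + [mask]

# Toy multi-contrast volume: 2 contrasts, depth 2, 3x3 slices of zeros.
vol = [[[[0.0] * 3 for _ in range(3)] for _ in range(2)] for _ in range(2)]
# Binary "tumor" mask marking the diagonal of each slice.
msk = [[[1 if r == c else 0 for c in range(3)] for r in range(3)]
       for _ in range(2)]
conditioned = concat_mask_channel(vol, msk)  # now 3 channels
```

In a real pipeline the same operation would be a channel-wise tensor concatenation at the UNet input (or at intermediate layers), letting the network learn where to place tumor tissue from the mask.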

PMID:38715792 | PMC:PMC11075745 | DOI:10.1117/12.3009331

Categories: Literature Watch

Audio-Based Emotion Recognition Using Self-Supervised Learning on an Engineered Feature Space

Wed, 2024-05-08 06:00

AI (Basel). 2024 Mar;5(1):195-207. doi: 10.3390/ai5010011. Epub 2024 Jan 17.

ABSTRACT

Emotion recognition models using audio input data can enable the development of interactive systems with applications in mental healthcare, marketing, gaming, and social media analysis. While the field of affective computing using audio data is rich, a major barrier to achieving consistently high-performance models is the paucity of available training labels. Self-supervised learning (SSL) is a family of methods which can learn despite a scarcity of supervised labels by predicting properties of the data itself. To understand the utility of self-supervised learning for audio-based emotion recognition, we applied self-supervised pre-training to the classification of emotions from the acoustic data of the CMU Multimodal Opinion Sentiment and Emotion Intensity (CMU-MOSEI) dataset. Unlike prior papers that experimented with raw acoustic data, our technique is applied to encoded acoustic data with 74 parameters of distinctive audio features at discrete timesteps. Our model is first pre-trained to uncover the randomly masked timesteps of the acoustic data. The pre-trained model is then fine-tuned using a small sample of annotated data. The performance of the final model is evaluated via overall mean absolute error (MAE), MAE per emotion, overall four-class accuracy, and four-class accuracy per emotion. These metrics are compared against a baseline deep learning model with an identical backbone architecture. We find that self-supervised learning consistently improves the performance of the model across all metrics, especially when the number of annotated data points in the fine-tuning step is small. Furthermore, we quantify the behavior of the self-supervised model and its convergence as the amount of annotated data increases. This work characterizes the utility of self-supervised learning for affective computing, demonstrating that self-supervised learning is most useful when the number of training examples is small and that the effect is most pronounced for emotions which are easier to classify, such as happy, sad, and angry. It further demonstrates that self-supervised learning still improves performance when applied to embedded feature representations rather than the traditional approach of pre-training on the raw input space.
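The masked-timestep pretext task described in the abstract can be sketched as follows. The mask ratio, the zero-filling of hidden timesteps, and the use of MAE as the reconstruction target are illustrative assumptions; only the 74-dimensional feature vectors per timestep come from the abstract.

```python
import random

def mask_timesteps(sequence, ratio=0.3, seed=0):
    """Pretext task: hide a random subset of timesteps (each a feature
    vector) and keep the originals as reconstruction targets. A model
    pre-trained this way learns temporal structure without labels."""
    rng = random.Random(seed)
    n = len(sequence)
    hidden = sorted(rng.sample(range(n), max(1, int(n * ratio))))
    corrupted = [vec[:] for vec in sequence]
    targets = {}
    for t in hidden:
        targets[t] = corrupted[t]
        corrupted[t] = [0.0] * len(sequence[t])  # blank the timestep
    return corrupted, targets

def mae(pred, target):
    """Mean absolute error, one of the metrics reported in the paper."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)

# 10 timesteps of 74-dimensional acoustic features (toy values).
seq = [[float(t + d) for d in range(74)] for t in range(10)]
corrupted, targets = mask_timesteps(seq, ratio=0.3)
```

During pre-training, the model would predict the vectors in `targets` from `corrupted`, minimizing a reconstruction loss such as `mae`; fine-tuning then reuses the learned encoder on the small annotated subset.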

PMID:38715564 | PMC:PMC11076058 | DOI:10.3390/ai5010011

Categories: Literature Watch
