Deep learning

Effects of precise cardio sounds on the success rate of phonocardiography

Mon, 2024-07-15 06:00

PLoS One. 2024 Jul 15;19(7):e0305404. doi: 10.1371/journal.pone.0305404. eCollection 2024.

ABSTRACT

This work investigates whether including the low-frequency components of heart sounds can increase the accuracy, sensitivity, and specificity of cardiovascular disorder diagnosis. We standardized the measurement method to minimize changes in signal characteristics. We used the Continuous Wavelet Transform (CWT) to analyze frequency characteristics as they change over time and to allocate frequencies appropriately between the low-frequency and audible bands. For image classification we used a Convolutional Neural Network (CNN)-based deep-learning (DL) model, as well as a CNN equipped with long short-term memory (LSTM) to enable sequential feature extraction. The accuracy of the learning model was validated using the PhysioNet 2016 CinC dataset; we then used our own collected dataset to show that incorporating low-frequency components increased the DL model's accuracy by 2% and its sensitivity by 4%. Furthermore, the LSTM layer was 0.8% more accurate than the dense layer.
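The CWT step described above can be sketched in plain NumPy with a complex Morlet wavelet. The sampling rate, analysis frequencies, and toy two-tone "heart sound" below are illustrative assumptions, not the paper's actual measurement setup:

```python
import numpy as np

def morlet_cwt(signal, freqs, fs, w0=6.0):
    """Continuous Wavelet Transform with a complex Morlet wavelet.

    Returns the magnitude scalogram, shape (len(freqs), len(signal)).
    """
    n = len(signal)
    t = (np.arange(n) - n // 2) / fs              # time axis centred on zero
    scales = w0 / (2 * np.pi * np.asarray(freqs, float))
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        x = t / s
        # Scaled, amplitude-normalized Morlet wavelet
        wavelet = np.exp(1j * w0 * x) * np.exp(-x**2 / 2) / np.sqrt(s)
        # Convolving with the flipped conjugate kernel = correlation with the wavelet
        out[i] = np.abs(np.convolve(signal, np.conj(wavelet)[::-1], mode="same"))
    return out

fs = 1000.0                                       # 1 kHz sampling (assumed)
t = np.arange(0, 1, 1 / fs)
# Toy "heart sound": one low-frequency and one audible-band component
sig = np.sin(2 * np.pi * 25 * t) + 0.5 * np.sin(2 * np.pi * 150 * t)
freqs = [10.0, 25.0, 150.0, 400.0]
scalogram = morlet_cwt(sig, freqs, fs)
```

The per-row energy of the scalogram then shows which bands carry signal, which is the basis for splitting content between the low-frequency and audible ranges.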

PMID:39008512 | DOI:10.1371/journal.pone.0305404

Categories: Literature Watch

An assessment of the value of deep neural networks in genetic risk prediction for surgically relevant outcomes

Mon, 2024-07-15 06:00

PLoS One. 2024 Jul 15;19(7):e0294368. doi: 10.1371/journal.pone.0294368. eCollection 2024.

ABSTRACT

INTRODUCTION: Postoperative complications affect up to 15% of surgical patients, constituting a major part of the overall disease burden in a modern healthcare system. While several surgical risk calculators have been developed, none have so far been shown to decrease the associated mortality and morbidity. Combining deep neural networks and genomics with already established clinical predictors may hold promise for improvement.

METHODS: The UK Biobank was used to build linear and deep learning models for the prediction of surgery-relevant outcomes. A genome-wide association study (GWAS) for the relevant outcomes was first conducted to select the single nucleotide polymorphisms (SNPs) for inclusion in the models. Model performance was assessed with the area under the receiver operating characteristic curve (ROC-AUC) and optimum precision and recall. Feature importance was assessed with SHapley Additive exPlanations (SHAP).
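ROC-AUC, the main performance metric in the methods, can be computed directly as the Mann-Whitney U statistic: the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative one. A minimal sketch (the labels and scores in the tests are toy values):

```python
import numpy as np

def roc_auc(y_true, scores):
    """ROC-AUC via the Mann-Whitney U statistic: the probability that a random
    positive case scores higher than a random negative case (ties count half)."""
    y = np.asarray(y_true, dtype=bool)
    s = np.asarray(scores, dtype=float)
    pos, neg = s[y], s[~y]
    # Compare every positive score against every negative score
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))
```

Perfect separation yields 1.0 and an uninformative scorer yields 0.5, matching the near-chance SNP-only pneumonia results reported below.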

RESULTS: Models were generated for atrial fibrillation, venous thromboembolism and pneumonia as genetics-only, clinical-features-only and combined models. For venous thromboembolism, the ROC-AUCs were 60.1% [59.6%-60.4%], 63.4% [63.2%-63.4%] and 66.6% [66.2%-66.9%] for the linear models and 51.5% [49.4%-53.4%], 63.2% [61.2%-65.0%] and 62.6% [60.7%-64.5%] for the deep learning SNP, clinical and combined models, respectively. For atrial fibrillation, the ROC-AUCs were 60.3% [60.0%-60.4%], 78.7% [78.7%-78.7%] and 80.0% [79.9%-80.0%] for the linear models and 59.4% [58.2%-60.9%], 78.8% [77.8%-79.8%] and 79.8% [78.8%-80.9%] for the deep learning SNP, clinical and combined models, respectively. For pneumonia, the ROC-AUCs were 50.1% [49.6%-50.6%], 69.2% [69.1%-69.2%] and 68.4% [68.0%-68.5%] for the linear models and 51.0% [49.7%-52.4%], 69.7% [.5%-70.8%] and 69.7% [68.6%-70.8%] for the deep learning SNP, clinical and combined models, respectively.

CONCLUSION: In this report we presented linear and deep learning predictive models for surgery-relevant outcomes. Overall, predictive performance was similar between the linear and deep learning models, and the inclusion of genetics appeared to improve accuracy.

PMID:39008506 | DOI:10.1371/journal.pone.0294368

Categories: Literature Watch

Automated recognition of emotional states of horses from facial expressions

Mon, 2024-07-15 06:00

PLoS One. 2024 Jul 15;19(7):e0302893. doi: 10.1371/journal.pone.0302893. eCollection 2024.

ABSTRACT

Animal affective computing is an emerging field which has so far mainly focused on pain, while other emotional states remain uncharted territory, especially in horses. This study is the first to develop AI models that automatically recognize horse emotional states from facial expressions using data collected in a controlled experiment. We explore two types of pipeline: a deep learning one, which takes video footage as input, and a machine learning one, which takes EquiFACS annotations as input. The former outperforms the latter, with 76% accuracy in distinguishing among four emotional states: baseline, positive anticipation, disappointment, and frustration. Anticipation and frustration were difficult to separate, with only 61% accuracy.

PMID:39008504 | DOI:10.1371/journal.pone.0302893

Categories: Literature Watch

Extra-abdominal trocar and instrument detection for enhanced surgical workflow understanding

Mon, 2024-07-15 06:00

Int J Comput Assist Radiol Surg. 2024 Jul 15. doi: 10.1007/s11548-024-03220-0. Online ahead of print.

ABSTRACT

PURPOSE: Video-based intra-abdominal instrument tracking for laparoscopic surgeries is a common research area. However, the tracking can only be done with instruments that are actually visible in the laparoscopic image. By using extra-abdominal cameras to detect trocars and classify their occupancy state, additional information about the instrument location, whether an instrument is still in the abdomen or not, can be obtained. This can enhance laparoscopic workflow understanding and enrich already existing intra-abdominal solutions.

METHODS: A data set of four laparoscopic surgeries recorded with two time-synchronized extra-abdominal 2D cameras was generated. The preprocessed and annotated data were used to train a deep learning-based network architecture consisting of a trocar detection, a centroid tracker and a temporal model to provide the occupancy state of all trocars during the surgery.

RESULTS: The trocar detection model achieves an F1 score of 95.06 ± 0.88%. The prediction of the occupancy state yields an F1 score of 89.29 ± 5.29%, providing a first step towards enhanced surgical workflow understanding.
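For reference, the F1 scores reported above are the harmonic mean of detection precision and recall. A minimal sketch from raw true-positive/false-positive/false-negative counts (the counts used here are made up for illustration):

```python
def f1_score(tp, fp, fn):
    """F1 score from detection counts; algebraically 2*TP / (2*TP + FP + FN)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 90 trocars found, 10 spurious boxes, 10 missed
score = f1_score(90, 10, 10)
```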

CONCLUSION: The current method shows promising results for the extra-abdominal tracking of trocars and their occupancy state. Future advancements include the enlargement of the data set and incorporation of intra-abdominal imaging to facilitate accurate assignment of instruments to trocars.

PMID:39008232 | DOI:10.1007/s11548-024-03220-0

Categories: Literature Watch

Assessment of image quality and diagnostic accuracy for cervical spondylosis using T2w-STIR sequence with a deep learning-based reconstruction approach

Mon, 2024-07-15 06:00

Eur Spine J. 2024 Jul 15. doi: 10.1007/s00586-024-08409-0. Online ahead of print.

ABSTRACT

OBJECTIVES: To investigate the potential of deep learning-based reconstruction (DLR) processing to enhance image quality, maintain interobserver consensus, and improve diagnostic efficacy in 3.0 T cervical spine fast magnetic resonance imaging (MRI), compared with conventional images.

METHODS: The 3.0 T cervical spine MRI images of 71 volunteers were categorized into two groups: sagittal T2-weighted short T1 inversion recovery without DLR (Sag T2w-STIR) and with DLR (Sag T2w-STIR-DLR). The assessment covered artifacts, perceptual signal-to-noise ratio, clarity of tissue interfaces, fat suppression, overall image quality, and the delineation of the spinal cord, vertebrae, discs, dopamine, and joints. Spinal canal stenosis, neural foraminal stenosis, herniated discs, annular fissures, hypertrophy of the ligamentum flavum or vertebral facet joints, and intervertebral disc degeneration were evaluated by three independent readers.

RESULTS: Sag T2w-STIR-DLR images exhibited markedly superior performance across all quality indicators (median = 4 or 5) compared with Sag T2w-STIR sequences (median = 3 or 4) (p < 0.001). No statistically significant differences were observed between the two sequences in terms of diagnosis and grading (p > 0.05). Interobserver agreement for Sag T2w-STIR-DLR images (0.604-0.931) was higher than for Sag T2w-STIR images (0.545-0.853), and Sag T2w-STIR-DLR (0.747-1.000) showed greater concordance between reader 1 and reader 3 than Sag T2w-STIR (0.508-1.000). Acquisition time decreased from 364 to 197 s with the DLR scheme.

CONCLUSIONS: Our investigation establishes that 3.0 T fast MRI images subjected to DLR processing present heightened image quality, bolstered diagnostic performance, and reduced scanning durations for cervical spine MRI compared with conventional sequences.

PMID:39007984 | DOI:10.1007/s00586-024-08409-0

Categories: Literature Watch

Present and future of whole-body MRI in metastatic disease and myeloma: how and why you will do it

Mon, 2024-07-15 06:00

Skeletal Radiol. 2024 Jul 15. doi: 10.1007/s00256-024-04723-2. Online ahead of print.

ABSTRACT

Metastatic disease and myeloma present unique diagnostic challenges due to their multifocal nature. Accurate detection and staging are critical for determining appropriate treatment. Bone scintigraphy, skeletal radiographs and CT have long been the mainstay for the assessment of these diseases, but they have limitations, including reduced sensitivity and radiation exposure. Whole-body MRI has emerged as a highly sensitive and radiation-free alternative imaging modality. Initially developed for skeletal screening, it has extended tumor screening to all organs, providing morphological and physiological information on tumor tissue. Along with PET/CT, whole-body MRI is now accepted for staging and response assessment in many malignancies. It is the first choice in an ever-increasing number of cancers, such as myeloma, lobular breast cancer, advanced prostate cancer, myxoid liposarcoma, and bone sarcoma, among others. It has also been validated as the method of choice for cancer screening in patients with a predisposition to cancer and for staging cancers observed during pregnancy. The current and future challenges for WB-MRI are its availability in the face of this growing number of indications, and its acceptance by patients, radiologists, and health authorities. Guidelines have been developed to optimize image acquisition and reading, to assess lesion response to treatment, and to adapt examination designs to specific cancers. The implementation of 3D acquisition, the Dixon method, and deep learning-based image optimization further improve the diagnostic performance of the technique and reduce examination durations. Whole-body MRI screening is feasible in less than 30 min. This article reviews validated indications, recent developments, growing acceptance, and future perspectives of whole-body MRI.

PMID:39007948 | DOI:10.1007/s00256-024-04723-2

Categories: Literature Watch

Enhancement of cyber security in IoT based on ant colony optimized artificial neural adaptive Tensor flow

Mon, 2024-07-15 06:00

Network. 2024 Jul 15:1-17. doi: 10.1080/0954898X.2024.2336058. Online ahead of print.

ABSTRACT

The Internet of Things (IoT) is a network that connects various hardware, software, data storage, and applications. These interconnected devices provide services to businesses but can also serve as entry points for cyber-attacks. The privacy of IoT devices is increasingly vulnerable, particularly to threats such as viruses and illegal software distribution that lead to the theft of critical information. An Ant Colony-Optimized Artificial Neural-Adaptive TensorFlow (ACO-ANT) technique is proposed to detect malicious software illicitly disseminated through the IoT. To emphasize the significance of each token in duplicated source code, the noisy data undergo processing using tokenization and weighted-attribute techniques. Deep learning (DL) methods are then employed to identify source code duplication. Additionally, a Multi-Objective Recurrent Neural Network (M-RNN) is used to identify suspicious activities within an IoT environment. The performance of the proposed technique is examined using loss, accuracy, F-measure, and precision. The experimental outcomes demonstrate that on the Malimg dataset the proposed ACO-ANT method provides 12.35%, 14.75%, and 11.84% higher precision and 10.95%, 15.78%, and 13.89% higher F-measure compared with the existing methods. Further, leveraging blockchain for malware detection is a promising direction for future research, as it could enhance the security of the IoT and help identify malware threats.
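The abstract does not specify its tokenization and weighted-attribute scheme; a common generic stand-in is TF-IDF weighting, which assigns zero weight to tokens that appear in every source file and emphasizes distinctive ones. A sketch with hypothetical token streams:

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """Weight each token by term frequency x inverse document frequency, so
    tokens shared by all documents get zero weight and distinctive ones dominate."""
    df = Counter()                       # document frequency per token
    for doc in docs:
        df.update(set(doc))
    n = len(docs)
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({tok: (c / len(doc)) * math.log(n / df[tok])
                        for tok, c in tf.items()})
    return weights

# Token streams from two hypothetical source files (names are illustrative)
files = [["load", "decrypt", "send"], ["load", "print", "send"]]
w = tfidf_weights(files)
```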

PMID:39007930 | DOI:10.1080/0954898X.2024.2336058

Categories: Literature Watch

Colorimetric Analyses of the Optic Nerve Head and Retina Indicate Increased Blood Flow After Vitrectomy

Mon, 2024-07-15 06:00

Transl Vis Sci Technol. 2024 Jul 1;13(7):12. doi: 10.1167/tvst.13.7.12.

ABSTRACT

PURPOSE: The purpose of this study was to evaluate the impact of vitrectomy and posterior hyaloid (PH) peeling on color alteration of optic nerve head (ONH) and retina as a surrogate biomarker of induced perfusion changes.

METHODS: Masked morphometric and colorimetric analyses were conducted on preoperative (<1 month) and postoperative (<18 months) color fundus photographs of 54 patients undergoing vitrectomy, either with (44) or without (10) PH peeling, and 31 age- and gender-matched control eyes. Images were calibrated according to the hue and saturation values of the parapapillary venous blood column. Chromatic spectra of the retinal pigment epithelium and choroid were subtracted to avoid color aberrations. Red, green, and blue (RGB) bit values over the ONH and retina were plotted within the constructed RGB color space to analyze the vitrectomy-induced color shift. Vitrectomy-induced parapapillary vein caliber changes were also computed morphometrically.
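The hue angles reported in the results can be derived from RGB values with the standard RGB-to-HSV conversion; a minimal sketch using only the standard library (the exact calibration pipeline in the paper is more involved):

```python
import colorsys

def hue_deg(r, g, b):
    """Hue angle in degrees (0 = pure red) from 8-bit RGB values."""
    h, _s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return h * 360.0

def hue_shift(pre_rgb, post_rgb):
    """Smallest signed angular hue difference, post minus pre, in degrees."""
    d = hue_deg(*post_rgb) - hue_deg(*pre_rgb)
    return (d + 180.0) % 360.0 - 180.0
```

Since pure red sits at 0 degrees, a post-vitrectomy shift toward red appears as the hue angle moving closer to zero.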

RESULTS: A significant post-vitrectomy red hue shift was noted on the ONH (37.1 degrees ± 10.9 degrees vs. 4.1 degrees ± 17.7 degrees, P < 0.001), which indicates a 2.8-fold increase in blood perfusion compared to control (2.6 ± 1.9 vs. 0.9 ± 1.8, P < 0.001). A significant post-vitrectomy increase in the retinal vein diameter was also noticed (6.8 ± 6.4% vs. 0.1 ± 0.3%, P < 0.001), which was more pronounced with PH peeling (7.9 ± 6.6% vs. 3.1 ± 4.2%, P = 0.002).

CONCLUSIONS: Vitrectomy and PH peeling increase ONH and retinal blood flow. Colorimetric and morphometric analyses offer valuable insights for future artificial intelligence and deep learning applications in this field.

TRANSLATIONAL RELEVANCE: The methodology described herein can easily be applied in different clinical settings and may shed light on the beneficial effects of vitrectomy in several retinal vascular diseases.

PMID:39007833 | DOI:10.1167/tvst.13.7.12

Categories: Literature Watch

Efficiency of oral keratinized gingiva detection and measurement based on convolutional neural network

Mon, 2024-07-15 06:00

J Periodontol. 2024 Jul 15. doi: 10.1002/JPER.24-0151. Online ahead of print.

ABSTRACT

BACKGROUND: With recent advances in artificial intelligence, this technology has begun to facilitate comprehensive tissue evaluation and intervention planning. This study aimed to assess different convolutional neural networks (CNNs) in deep learning algorithms for detecting keratinized gingiva from intraoral photos, and to evaluate the networks' ability to measure keratinized gingiva width.

METHODS: Six hundred of 1200 photographs taken before and after applying a disclosing agent were used to compare the neural networks in segmenting the keratinized gingiva. The segmentation performance of each network was evaluated using accuracy, intersection over union (IoU), and F1 score. Keratinized gingiva width from a reference point was measured on ground truth images and compared with the measurements of clinicians and with those from the DeepLab image generated from the ResNet50 model. The effects of measurement operator, phenotype, and jaw on differences in measurements were evaluated by three-factor mixed-design analysis of variance (ANOVA).
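The three segmentation metrics named above (accuracy, IoU, F1) all follow from the pixel-wise confusion counts between a predicted and a ground-truth mask. A minimal sketch on toy binary masks:

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """Pixel accuracy, intersection over union, and F1 (Dice) for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = (pred & truth).sum()       # predicted gingiva, really gingiva
    fp = (pred & ~truth).sum()      # predicted gingiva, actually background
    fn = (~pred & truth).sum()      # missed gingiva
    tn = (~pred & ~truth).sum()     # correctly ignored background
    accuracy = (tp + tn) / pred.size
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return accuracy, iou, f1

acc, iou, f1 = segmentation_metrics([[1, 1], [0, 0]], [[1, 0], [0, 0]])
```

Note that F1 on masks is identical to the Dice coefficient, and IoU is always the stricter of the two.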

RESULTS: Among the compared networks, ResNet50 distinguished keratinized gingiva with the highest accuracy, 91.4%. The measurements by deep learning and by clinicians were in excellent agreement across jaw and phenotype. When analyzing the influence of measurement operator, phenotype, and jaw on the measurements performed against the ground truth, there were statistically significant differences for measurement operator and jaw (p < 0.05).

CONCLUSIONS: Automated keratinized gingiva segmentation with the ResNet50 model might be a feasible method for assisting professionals. The measurement results promise a potentially high performance of the model as it requires less time and experience.

PLAIN LANGUAGE SUMMARY: With recent advances in artificial intelligence (AI), it is now possible to use this technology to evaluate tissues and plan medical procedures thoroughly. This study focused on testing different AI models, specifically CNNs, to identify and measure a specific type of gum tissue called keratinized gingiva using photos taken inside the mouth. Out of 1200 photos, 600 were used in the study to compare the performance of different CNNs in identifying gingival tissue. The accuracy and effectiveness of these models were measured and compared with human clinicians' ratings. The study found that the ResNet50 model was the most accurate, correctly identifying gingival tissue 91.4% of the time. When the AI model's and clinicians' measurements of gum tissue width were compared, the results were very similar, especially when accounting for different jaws and gum structures. The study also analyzed the effect of various factors on the measurements and found significant differences based on who took the measurements and on jaw type. In conclusion, using the ResNet50 model to identify and measure gum tissue automatically could be a practical tool for dental professionals, saving time and requiring less expertise.

PMID:39007745 | DOI:10.1002/JPER.24-0151

Categories: Literature Watch

3DReact: Geometric Deep Learning for Chemical Reactions

Mon, 2024-07-15 06:00

J Chem Inf Model. 2024 Jul 15. doi: 10.1021/acs.jcim.4c00104. Online ahead of print.

ABSTRACT

Geometric deep learning models, which incorporate the relevant molecular symmetries within the neural network architecture, have considerably improved the accuracy and data efficiency of predictions of molecular properties. Building on this success, we introduce 3DReact, a geometric deep learning model to predict reaction properties from three-dimensional structures of reactants and products. We demonstrate that the invariant version of the model is sufficient for existing reaction data sets. We illustrate its competitive performance on the prediction of activation barriers on the GDB7-22-TS, Cyclo-23-TS, and Proparg-21-TS data sets in different atom-mapping regimes. We show that, compared to existing models for reaction property prediction, 3DReact offers a flexible framework that exploits atom-mapping information, if available, as well as geometries of reactants and products (in an invariant or equivariant fashion). Accordingly, it performs systematically well across different data sets, atom-mapping regimes, as well as both interpolation and extrapolation tasks.
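The "invariant" property emphasized above means the model's features do not change when a molecule is rotated or translated. The simplest such feature is the matrix of pairwise interatomic distances; a sketch verifying the invariance on a toy 3-atom fragment (coordinates are illustrative, not from the data sets named):

```python
import numpy as np

def pairwise_distances(coords):
    """All interatomic distances: a rotation- and translation-invariant
    description of a 3D structure."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def rotation_z(theta):
    """Rotation matrix about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Toy 3-atom fragment, then the same fragment rotated and translated
coords = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.2, 0.3]])
moved = coords @ rotation_z(0.7).T + np.array([2.0, -1.0, 0.5])
```

An equivariant model, by contrast, produces features that rotate along with the input rather than staying fixed.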

PMID:39007724 | DOI:10.1021/acs.jcim.4c00104

Categories: Literature Watch

Enhanced enchondroma detection from x-ray images using deep learning: A step towards accurate and cost-effective diagnosis

Mon, 2024-07-15 06:00

J Orthop Res. 2024 Jul 15. doi: 10.1002/jor.25938. Online ahead of print.

ABSTRACT

This study investigates the automated detection of enchondromas, benign cartilage tumors, from x-ray images using deep learning techniques. Enchondromas pose diagnostic challenges due to their potential for malignant transformation and overlapping radiographic features with other conditions. Leveraging a data set comprising 1645 x-ray images from 1173 patients, a deep-learning model implemented with Detectron2 achieved an accuracy of 0.9899 in detecting enchondromas. The study employed rigorous validation processes and compared its findings with the existing literature, highlighting the superior performance of the deep learning approach. Results indicate the potential of machine learning in improving diagnostic accuracy and reducing healthcare costs associated with advanced imaging modalities. The study underscores the significance of early and accurate detection of enchondromas for effective patient management and suggests avenues for further research in musculoskeletal tumor detection.

PMID:39007705 | DOI:10.1002/jor.25938

Categories: Literature Watch

A novel deep machine learning algorithm with dimensionality and size reduction approaches for feature elimination: thyroid cancer diagnoses with randomly missing data

Mon, 2024-07-15 06:00

Brief Bioinform. 2024 May 23;25(4):bbae344. doi: 10.1093/bib/bbae344.

ABSTRACT

Thyroid cancer incidence continues to increase even though a large number of diagnostic tools have been developed in recent years. Since there is no standard, definitive procedure to follow for thyroid cancer diagnosis, clinicians must conduct various tests. This scrutiny process yields multi-dimensional big data, and the lack of a common approach leads to randomly distributed missing (sparse) data, both of which are formidable challenges for machine learning algorithms. This paper aims to develop an accurate and computationally efficient deep learning algorithm to diagnose thyroid cancer. To this end, the singularity that randomly distributed missing data introduces into learning problems is treated, and dimensionality reduction with inner and target similarity approaches is developed to select the most informative input datasets. In addition, size reduction with a hierarchical clustering algorithm is performed to eliminate considerably similar data samples. Four machine learning algorithms are trained and then tested with unseen data to validate their generalization and robustness. The results yield 100% training accuracy and 83% testing accuracy on the unseen data. The computational time efficiency of the algorithms is also examined under equal conditions.
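The size-reduction idea, eliminating considerably similar samples, can be illustrated with a simple greedy distance filter. This is a stand-in for the hierarchical clustering the paper uses, not the authors' algorithm, and the sample matrix and tolerance are made up:

```python
import numpy as np

def reduce_size(X, tol):
    """Greedily keep a sample only if it is farther than `tol` (Euclidean)
    from every sample already kept; near-duplicates are dropped."""
    kept = []
    for i, x in enumerate(X):
        if all(np.linalg.norm(x - X[j]) > tol for j in kept):
            kept.append(i)
    return kept

# Rows 1 and 3 are near-duplicates of rows 0 and 2 and should be eliminated
samples = np.array([[0.0, 0.0], [0.05, 0.0], [1.0, 1.0], [1.01, 1.0], [5.0, 5.0]])
kept_rows = reduce_size(samples, tol=0.1)
```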

PMID:39007597 | DOI:10.1093/bib/bbae344

Categories: Literature Watch

DeepIDA-GRU: a deep learning pipeline for integrative discriminant analysis of cross-sectional and longitudinal multiview data with applications to inflammatory bowel disease classification

Mon, 2024-07-15 06:00

Brief Bioinform. 2024 May 23;25(4):bbae339. doi: 10.1093/bib/bbae339.

ABSTRACT

Biomedical research now commonly integrates diverse data types or views from the same individuals to better understand the pathobiology of complex diseases, but the challenge lies in meaningfully integrating these diverse views. Existing methods often require the same type of data from all views (cross-sectional data only or longitudinal data only) or do not consider any class outcome in the integration method, which presents limitations. To overcome these limitations, we have developed a pipeline that harnesses the power of statistical and deep learning methods to integrate cross-sectional and longitudinal data from multiple sources. In addition, it identifies key variables that contribute to the association between views and the separation between classes, providing deeper biological insights. This pipeline includes variable selection/ranking using linear and nonlinear methods, feature extraction using functional principal component analysis and Euler characteristics, and joint integration and classification using dense feed-forward networks for cross-sectional data and recurrent neural networks for longitudinal data. We applied this pipeline to cross-sectional and longitudinal multiomics data (metagenomics, transcriptomics and metabolomics) from an inflammatory bowel disease (IBD) study and identified microbial pathways, metabolites and genes that discriminate by IBD status, providing information on the etiology of IBD. We conducted simulations to compare the two feature extraction methods.
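The functional principal component analysis step used for feature extraction can be sketched, in its simplest discrete form, as projecting each subject's longitudinal curve onto the leading principal component functions obtained by SVD. The toy sine-wave curves below are illustrative, not the study's omics trajectories:

```python
import numpy as np

def fpc_scores(curves, n_components=2):
    """Project each subject's longitudinal curve onto the leading principal
    component functions -- a discrete stand-in for functional PCA.
    curves has shape (subjects, timepoints)."""
    centered = curves - curves.mean(axis=0)          # centre across subjects
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T            # (subjects, n_components)

# Toy longitudinal data: each subject's curve is a scaled sine wave
t = np.linspace(0.0, 1.0, 50)
loadings = np.array([0.5, -1.0, 2.0, 1.5, -0.3])
curves = np.outer(loadings, np.sin(2 * np.pi * t))
scores = fpc_scores(curves, n_components=1)
```

Each subject is thereby summarized by a handful of scores that can feed the downstream feed-forward or recurrent classifier.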

PMID:39007595 | DOI:10.1093/bib/bbae339

Categories: Literature Watch

Training immunophenotyping deep learning models with the same-section ground truth cell label derivation method improves virtual staining accuracy

Mon, 2024-07-15 06:00

Front Immunol. 2024 Jun 28;15:1404640. doi: 10.3389/fimmu.2024.1404640. eCollection 2024.

ABSTRACT

INTRODUCTION: Deep learning (DL) models predicting biomarker expression in images of hematoxylin and eosin (H&E)-stained tissues can improve access to multi-marker immunophenotyping, crucial for therapeutic monitoring, biomarker discovery, and personalized treatment development. Conventionally, these models are trained on ground truth cell labels derived from IHC-stained tissue sections adjacent to H&E-stained ones, which might be less accurate than labels from the same section. Although many such DL models have been developed, the impact of ground truth cell label derivation methods on their performance has not been studied.

METHODOLOGY: In this study, we assess the impact of cell label derivation on H&E model performance, with CD3+ T-cells in lung cancer tissues as a proof-of-concept. We compare two Pix2Pix generative adversarial network (P2P-GAN)-based virtual staining models: one trained with cell labels obtained from the same tissue section as the H&E-stained section (the 'same-section' model) and one trained on cell labels from an adjacent tissue section (the 'serial-section' model).

RESULTS: We show that the same-section model exhibited significantly improved prediction performance compared to the 'serial-section' model. Furthermore, the same-section model outperformed the serial-section model in stratifying lung cancer patients within a public lung cancer cohort based on survival outcomes, demonstrating its potential clinical utility.

DISCUSSION: Collectively, our findings suggest that employing ground truth cell labels obtained through the same-section approach boosts immunophenotyping DL solutions.

PMID:39007128 | PMC:PMC11239356 | DOI:10.3389/fimmu.2024.1404640

Categories: Literature Watch

Evaluating synthetic neuroimaging data augmentation for automatic brain tumour segmentation with a deep fully-convolutional network

Mon, 2024-07-15 06:00

IBRO Neurosci Rep. 2023 Dec 14;16:57-66. doi: 10.1016/j.ibneur.2023.12.002. eCollection 2024 Jun.

ABSTRACT

Gliomas observed in medical images require expert neuro-radiologist evaluation for treatment planning and monitoring, motivating development of intelligent systems capable of automating aspects of tumour evaluation. Deep learning models for automatic image segmentation rely on the amount and quality of training data. In this study we developed a neuroimaging synthesis technique to augment data for training fully-convolutional networks (U-nets) to perform automatic glioma segmentation. We used StyleGAN2-ada to simultaneously generate fluid-attenuated inversion recovery (FLAIR) magnetic resonance images and corresponding glioma segmentation masks. Synthetic data were successively added to real training data (n = 2751) in fourteen rounds of 1000 and used to train U-nets that were evaluated on held-out validation (n = 590) and test sets (n = 588). U-nets were trained with and without geometric augmentation (translation, zoom and shear), and Dice coefficients were computed to evaluate segmentation performance. We also monitored the number of training iterations before stopping, total training time, and time per iteration to evaluate computational costs associated with training each U-net. Synthetic data augmentation yielded marginal improvements in Dice coefficients (validation set +0.0409, test set +0.0355), whereas geometric augmentation improved generalization (standard deviation between training, validation and test set performances of 0.01 with, and 0.04 without geometric augmentation). Based on the modest performance gains for automatic glioma segmentation we find it hard to justify the computational expense of developing a synthetic image generation pipeline. Future work may seek to optimize the efficiency of synthetic data generation for augmentation of neuroimaging data.
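The geometric augmentation named above (translation, zoom and shear) can be sketched as an affine warp with inverse mapping and nearest-neighbour sampling. This is a generic illustration in NumPy, not the study's augmentation code, and the mask and parameters are toy values:

```python
import numpy as np

def affine_augment(img, zoom=1.0, shear=0.0, shift=(0, 0)):
    """Geometric augmentation (translation, zoom, shear) of a 2D image via
    inverse mapping with nearest-neighbour sampling, centred on the image."""
    h, w = img.shape
    cy, cx = (h - 1) / 2, (w - 1) / 2
    ys, xs = np.mgrid[0:h, 0:w]
    # For each output pixel, locate its source pixel under the inverse transform
    y = (ys - cy - shift[0]) / zoom
    x = (xs - cx - shift[1]) / zoom - shear * y
    src_y = np.clip(np.rint(y + cy).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(x + cx).astype(int), 0, w - 1)
    out = img[src_y, src_x]
    # Pixels whose source fell outside the frame are zero-filled
    inside = (y + cy >= 0) & (y + cy <= h - 1) & (x + cx >= 0) & (x + cx <= w - 1)
    return np.where(inside, out, 0)

mask = np.zeros((5, 5))
mask[2, 2] = 1.0
shifted = affine_augment(mask, shift=(1, 0))   # moves content down one row
```

Applying the identical transform to the FLAIR image and its segmentation mask keeps the pair consistent during training.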

PMID:39007088 | PMC:PMC11240293 | DOI:10.1016/j.ibneur.2023.12.002

Categories: Literature Watch

Behavioral profiling for adaptive video summarization: From generalization to personalization

Mon, 2024-07-15 06:00

MethodsX. 2024 Jun 14;13:102780. doi: 10.1016/j.mex.2024.102780. eCollection 2024 Dec.

ABSTRACT

In today's world of managing multimedia content, dealing with the amount of CCTV footage poses challenges related to storage, accessibility and efficient navigation. To tackle these issues, we suggest an encompassing technique, for summarizing videos that merges machine-learning techniques with user engagement. Our methodology consists of two phases, each bringing improvements to video summarization. In Phase I we introduce a method for summarizing videos based on keyframe detection and behavioral analysis. By utilizing technologies like YOLOv5 for object recognition, Deep SORT for object tracking, and Single Shot Detector (SSD) for creating video summaries. In Phase II we present a User Interest Based Video summarization system driven by machine learning. By incorporating user preferences into the summarization process we enhance techniques with personalized content curation. Leveraging tools such as NLTK, OpenCV, TensorFlow, and the EfficientDET model enables our system to generate customized video summaries tailored to preferences. This innovative approach not only enhances user interactions but also efficiently handles the overwhelming amount of video data on digital platforms. By combining these two methodologies we make progress in applying machine learning techniques while offering a solution to the complex challenges presented by managing multimedia data.

PMID:39007030 | PMC:PMC11239710 | DOI:10.1016/j.mex.2024.102780

Categories: Literature Watch

Masked pre-training of transformers for histology image analysis

Mon, 2024-07-15 06:00

J Pathol Inform. 2024 May 31;15:100386. doi: 10.1016/j.jpi.2024.100386. eCollection 2024 Dec.

ABSTRACT

In digital pathology, whole-slide images (WSIs) are widely used for applications such as cancer diagnosis and prognosis prediction. Vision transformer (ViT) models have recently emerged as a promising method for encoding large regions of WSIs while preserving spatial relationships among patches. However, due to the large number of model parameters and limited labeled data, applying transformer models to WSIs remains challenging. In this study, we propose a pretext task to train the transformer model in a self-supervised manner. Our model, MaskHIT, uses the transformer output to reconstruct masked patches, measured by a contrastive loss. We pre-trained the MaskHIT model using over 7000 WSIs from TCGA and extensively evaluated its performance in multiple experiments, covering survival prediction, cancer subtype classification, and grade prediction tasks. Our experiments demonstrate that the pre-training procedure enables context-aware understanding of WSIs, facilitates the learning of representative histological features based on patch positions and visual patterns, and is essential for the ViT model to achieve optimal results on WSI-level tasks. The pre-trained MaskHIT surpasses various multiple instance learning approaches by 3% and 2% on survival prediction and cancer subtype classification tasks, respectively, and also outperforms recent state-of-the-art transformer-based methods. Finally, a comparison between the attention maps generated by the MaskHIT model and pathologists' annotations indicates that the model can accurately identify clinically relevant histological structures on the whole slide for each task.
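The masking half of the pretext task can be sketched as follows: hide a random fraction of patch embeddings and record which rows the model must reconstruct. This is a generic illustration (zeros stand in for a learned mask token), not MaskHIT's implementation:

```python
import numpy as np

def mask_patches(patches, mask_ratio=0.5, rng=None):
    """Zero out a random fraction of patch embeddings; a pretext-task model is
    then trained to reconstruct the masked rows from the visible context."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = patches.shape[0]
    n_mask = int(n * mask_ratio)
    masked_idx = rng.permutation(n)[:n_mask]
    masked = patches.copy()
    masked[masked_idx] = 0.0       # stand-in for a learned [MASK] token
    return masked, masked_idx

patches = np.ones((10, 4))        # 10 patch embeddings of dimension 4
visible, hidden_idx = mask_patches(patches, mask_ratio=0.5)
```

The contrastive loss then compares each reconstructed row against the original patch embedding versus the other patches in the batch.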

PMID:39006998 | PMC:PMC11246055 | DOI:10.1016/j.jpi.2024.100386

Categories: Literature Watch

Discovering novel Cathepsin L inhibitors from natural products using artificial intelligence

Mon, 2024-07-15 06:00

Comput Struct Biotechnol J. 2024 Jun 7;23:2606-2614. doi: 10.1016/j.csbj.2024.06.009. eCollection 2024 Dec.

ABSTRACT

Cathepsin L (CTSL) is a promising therapeutic target for metabolic disorders. Current pharmacological interventions targeting CTSL have demonstrated potential in reducing body weight gain and serum insulin levels, and in improving glucose tolerance. However, the clinical application of CTSL inhibitors remains limited. In this study, we used a combination of artificial intelligence and experimental methods to identify new CTSL inhibitors from natural products. Through a robust deep learning model and molecular docking, we screened 150 molecules from natural products for experimental validation. At a concentration of 100 µM, we found that 36 of them exhibited more than 50% inhibition of CTSL. Notably, 13 molecules displayed over 90% inhibition and exhibited concentration-dependent effects. Molecular dynamics simulations of the two most potent inhibitors, Plumbagin and Beta-Lapachone, demonstrated stable interactions at the CTSL active site. Enzyme kinetics studies showed that these inhibitors exert an uncompetitive inhibitory effect on CTSL. In conclusion, our research identifies Plumbagin and Beta-Lapachone as potential CTSL inhibitors, offering promising candidates for the treatment of metabolic disorders and illustrating the effectiveness of artificial intelligence in drug discovery.
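The uncompetitive inhibition reported here has a standard kinetic form: the inhibitor binds only the enzyme-substrate complex, so both the apparent Vmax and the apparent Km are reduced by the same factor. A small sketch of the textbook rate law; the Vmax, Km, and Ki values are illustrative placeholders, not fitted CTSL parameters from the study.

```python
def velocity_uncompetitive(S, I, Vmax=100.0, Km=50.0, Ki=10.0):
    """Michaelis-Menten rate under uncompetitive inhibition:
        v = Vmax*[S] / (Km + alpha'*[S]),  with alpha' = 1 + [I]/Ki
    Equivalently v = (Vmax/alpha')*[S] / (Km/alpha' + [S]), i.e. both the
    apparent Vmax and the apparent Km fall by the factor alpha'.
    Vmax, Km, Ki are hypothetical example values, not measured constants."""
    alpha = 1.0 + I / Ki
    return Vmax * S / (Km + alpha * S)
```

At saturating substrate the rate plateaus at Vmax/alpha' rather than Vmax, which is the signature that distinguishes uncompetitive from competitive inhibition in Lineweaver-Burk plots.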

PMID:39006920 | PMC:PMC11245987 | DOI:10.1016/j.csbj.2024.06.009

Categories: Literature Watch

Explainable deep-learning framework: decoding brain states and prediction of individual performance in false-belief task at early childhood stage

Mon, 2024-07-15 06:00

Front Neuroinform. 2024 Jun 28;18:1392661. doi: 10.3389/fninf.2024.1392661. eCollection 2024.

ABSTRACT

Decoding of cognitive states aims to identify individuals' brain states and brain fingerprints to predict behavior. Deep learning provides an important platform for analyzing brain signals at different developmental stages to understand brain dynamics. Due to their internal architecture and feature extraction techniques, existing machine-learning and deep-learning approaches suffer from low classification performance and explainability issues that must be improved. In the current study, we hypothesized that even at the early childhood stage (as early as 3 years), connectivity between brain regions could decode brain states and predict behavioral performance in false-belief tasks. To this end, we proposed an explainable deep learning framework to decode brain states (Theory of Mind and Pain states) and predict individual performance on ToM-related false-belief tasks in a developmental dataset. We proposed an explainable spatiotemporal connectivity-based Graph Convolutional Neural Network (Ex-stGCNN) model for decoding brain states. We consider a developmental dataset of N = 155 participants (122 children, 3-12 years; 33 adults, 18-39 years) who watched a short, soundless animated movie shown to activate Theory-of-Mind (ToM) and pain networks. After scanning, the participants underwent a ToM-related false-belief task, leading to categorization into pass, fail, and inconsistent groups based on performance. We trained our proposed model using Functional Connectivity (FC) and Inter-Subject Functional Correlation (ISFC) matrices separately. We observed that the stimulus-driven feature set (ISFC) captured the ToM and Pain brain states more accurately, with an average accuracy of 94%, whereas the model achieved 85% accuracy using FC matrices. We also validated our results using five-fold cross-validation and achieved an average accuracy of 92%.
In addition, we applied the SHapley Additive exPlanations (SHAP) approach to identify the brain fingerprints that contributed most to the predictions. We further hypothesized that ToM-network brain connectivity could predict individual performance on false-belief tasks, and proposed an Explainable Convolutional Variational Auto-Encoder (Ex-Convolutional VAE) model for this prediction, again training the model using FC and ISFC matrices separately. ISFC matrices again outperformed FC matrices in predicting individual performance: we achieved 93.5% accuracy with an F1-score of 0.94 using ISFC matrices, and 90% accuracy with an F1-score of 0.91 using FC matrices.
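The ISFC features that drive these results can be illustrated with a toy leave-one-out computation: each subject's regional time series is correlated with the average time series of the remaining subjects, so only stimulus-locked signal survives. This NumPy sketch assumes a (subjects × timepoints × regions) array and does not reproduce the study's preprocessing or parcellation.

```python
import numpy as np

def isfc(ts):
    """Leave-one-out inter-subject functional correlation (ISFC):
    correlate each subject's regional time series with the average time
    series of the remaining subjects, then average over subjects.
    ts: array of shape (n_subjects, n_timepoints, n_regions)."""
    n_sub, n_time, _ = ts.shape
    mats = []
    for s in range(n_sub):
        others = np.delete(ts, s, axis=0).mean(axis=0)      # leave-one-out average
        a = (ts[s] - ts[s].mean(0)) / ts[s].std(0)          # z-score over time
        b = (others - others.mean(0)) / others.std(0)
        mats.append(a.T @ b / n_time)                       # region-by-region correlations
    return np.mean(mats, axis=0)

# Toy data: region 0 carries a shared stimulus-driven signal, regions 1-2 are noise.
rng = np.random.default_rng(1)
shared = rng.normal(size=200)
ts = rng.normal(size=(5, 200, 3))
ts[:, :, 0] = shared + 0.1 * rng.normal(size=(5, 200))
M = isfc(ts)
```

Because subject-specific noise is uncorrelated across subjects, it cancels in the leave-one-out average; this is why a stimulus-driven feature set such as ISFC can outperform within-subject FC for movie-watching data.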

PMID:39006894 | PMC:PMC11239353 | DOI:10.3389/fninf.2024.1392661

Categories: Literature Watch

3D Biological/Biomedical Image Registration with enhanced Feature Extraction and Outlier Detection

Mon, 2024-07-15 06:00

ACM BCB. 2023 Sep;2023:1. doi: 10.1145/3584371.3612965. Epub 2023 Oct 4.

ABSTRACT

In various applications, such as computer vision, medical imaging, and robotics, three-dimensional (3D) image registration is a significant step. It aligns different datasets into a single coordinate system, providing a consistent perspective that allows further analysis. By precisely aligning images, we can compare, analyze, and combine data collected in different situations. This paper presents a novel approach for 3D or z-stack microscopy and medical image registration, utilizing a combination of conventional and deep learning techniques for feature extraction and adaptive likelihood-based methods for outlier detection. The proposed method uses the Scale-Invariant Feature Transform (SIFT) and the Residual Network (ResNet50) deep neural network to extract effective features and obtain precise and exhaustive representations of image contents. The registration approach also employs the adaptive Maximum Likelihood Estimation SAmple Consensus (MLESAC) method, which optimizes outlier detection and increases resistance to noise and distortion, to improve the efficacy of these combined extracted features. This integrated approach demonstrates robustness, flexibility, and adaptability across a variety of imaging modalities, enabling the registration of complex images with higher precision. Experimental results show that the proposed algorithm outperforms state-of-the-art image registration methods, including conventional SIFT, SIFT with Random Sample Consensus (RANSAC), and Oriented FAST and Rotated BRIEF (ORB) methods, as well as registration software packages such as bUnwarpJ and TurboReg, in terms of Mutual Information (MI), Phase Congruency-Based (PCB) metrics, and Gradient-Based Metrics (GBM), using 3D MRI and 3D serial sections of multiplex microscopy images.
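The consensus-based outlier rejection underlying RANSAC and MLESAC can be illustrated on a toy 2D translation model: hypothesize a transform from a minimal sample, score it against all correspondences, and refit on the best consensus set. This NumPy sketch uses RANSAC's simple inlier count for scoring; MLESAC instead scores hypotheses by a mixture likelihood over residuals, which is omitted here, and the point sets below are synthetic.

```python
import numpy as np

def ransac_translation(src, dst, iters=200, thresh=1.0, seed=0):
    """Estimate a 2D translation between matched feature points while
    rejecting outliers via a RANSAC-style consensus loop: hypothesize a
    translation from one correspondence, count inliers within `thresh`,
    keep the largest consensus set, then refit on it by least squares."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        i = rng.integers(len(src))
        t = dst[i] - src[i]                                   # one-point hypothesis
        residuals = np.linalg.norm(dst - (src + t), axis=1)
        inliers = residuals < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    t = (dst[best_inliers] - src[best_inliers]).mean(axis=0)  # refit on consensus set
    return t, best_inliers

# Toy matches: a true translation of (5, -3) with 25% gross outliers.
rng = np.random.default_rng(42)
src = rng.uniform(0, 100, size=(60, 2))
dst = src + np.array([5.0, -3.0]) + 0.05 * rng.normal(size=src.shape)
dst[:15] = rng.uniform(0, 100, size=(15, 2))                  # corrupted matches
t, inliers = ransac_translation(src, dst)
```

A full registration pipeline would estimate a richer transform (affine or deformable, and in 3D), but the hypothesize-score-refit structure is the same idea that makes the combined SIFT/ResNet50 features usable in the presence of bad matches.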

PMID:39006863 | PMC:PMC11246549 | DOI:10.1145/3584371.3612965

Categories: Literature Watch
