Deep learning

Enhanced PRIM recognition using PRI sound and deep learning techniques

Wed, 2024-05-01 06:00

PLoS One. 2024 May 1;19(5):e0298373. doi: 10.1371/journal.pone.0298373. eCollection 2024.

ABSTRACT

Pulse repetition interval modulation (PRIM) is integral to radar identification in modern electronic support measure (ESM) and electronic intelligence (ELINT) systems. Various distortions, including missing pulses, spurious pulses, unintended jitters, and noise from radar antenna scans, often hinder the accurate recognition of PRIM. This research introduces a novel three-stage approach for PRIM recognition, emphasizing the innovative use of PRI sound. A transfer learning-aided deep convolutional neural network (DCNN) is initially used for feature extraction. This is followed by an extreme learning machine (ELM) for real-time PRIM classification. Finally, a gray wolf optimizer (GWO) refines the network's robustness. To evaluate the proposed method, we developed a real experimental dataset consisting of the sounds of six common PRI patterns. We utilized eight pre-trained DCNN architectures for evaluation, with VGG16 and ResNet50V2 notably achieving recognition accuracies of 97.53% and 96.92%. Integrating ELM and GWO further optimized the accuracy rates to 98.80% and 97.58%, respectively. This research advances radar identification by offering an enhanced method for PRIM recognition, emphasizing the potential of PRI sound to address real-world distortions in ESM and ELINT systems.
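The ELM stage described above admits a closed-form fit: the hidden-layer weights are random and fixed, and only the output weights are solved for, which is what makes it attractive for real-time classification. A minimal numpy sketch under stated assumptions (random toy features stand in for the DCNN embeddings; all sizes and the 6-class setup are illustrative, not the paper's actual configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for DCNN features: 120 samples, 64-dim vectors,
# 6 classes (one per PRI pattern in the paper's dataset).
X = rng.normal(size=(120, 64))
y = rng.integers(0, 6, size=120)
Y = np.eye(6)[y]                      # one-hot targets

# Extreme learning machine: random input weights, closed-form output weights.
n_hidden = 200
W = rng.normal(size=(64, n_hidden))   # fixed random projection
b = rng.normal(size=n_hidden)
H = np.tanh(X @ W + b)                # hidden-layer activations

# Output weights via a regularized normal-equation solve (ridge for stability).
lam = 1e-3
beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ Y)

pred = np.argmax(H @ beta, axis=1)
train_acc = (pred == y).mean()
```

Because training reduces to one linear solve, retraining the classifier head is cheap even when the DCNN feature extractor stays fixed.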

PMID:38691542 | DOI:10.1371/journal.pone.0298373

Categories: Literature Watch

Editorial Comment: Using Appropriate Training Data in Deep-Learning Tissue and Organ Segmentations on CT

Wed, 2024-05-01 06:00

AJR Am J Roentgenol. 2024 May 1. doi: 10.2214/AJR.24.31345. Online ahead of print.

NO ABSTRACT

PMID:38691412 | DOI:10.2214/AJR.24.31345

Categories: Literature Watch

Deep-Learning Models for Abdominal CT Organ Segmentation in Children: Development and Validation in Internal and Heterogeneous Public Datasets

Wed, 2024-05-01 06:00

AJR Am J Roentgenol. 2024 May 1. doi: 10.2214/AJR.24.30931. Online ahead of print.

ABSTRACT

Background: Deep-learning abdominal organ segmentation algorithms have shown excellent results in adults; validation in children is sparse. Objective: To develop and validate deep-learning models for liver, spleen, and pancreas segmentation on pediatric CT examinations. Methods: This retrospective study developed and validated deep-learning models for liver, spleen, and pancreas segmentation using 1731 CT examinations (1504 training, 221 testing), derived from three internal institutional pediatric (age ≤18) datasets (n=483) and three public datasets comprising pediatric and adult examinations with various pathologies (n=1248). Three deep-learning model architectures (SegResNet, DynUNet, and SwinUNETR) from the Medical Open Network for AI (MONAI) framework underwent training using native training (NT), relying solely on institutional datasets, and transfer learning (TL), incorporating pre-training on public datasets. For comparison, TotalSegmentator (TS), a publicly available segmentation model, was applied to test data without further training. Segmentation performance was evaluated using mean Dice similarity coefficient (DSC), with manual segmentations as reference. Results: For internal pediatric data, DSC for normal liver was 0.953 (TS), 0.964-0.965 (NT models), and 0.965-0.966 (TL models); normal spleen, 0.914 (TS), 0.942-0.945 (NT models), and 0.937-0.945 (TL models); normal pancreas, 0.733 (TS), 0.774-0.785 (NT models), and 0.775-0.786 (TL models); pancreas with pancreatitis, 0.703 (TS), 0.590-0.640 (NT models), and 0.667-0.711 (TL models). For public pediatric data, DSC for liver was 0.952 (TS), 0.876-0.908 (NT models), and 0.941-0.946 (TL models); spleen, 0.905 (TS), 0.771-0.827 (NT models), and 0.897-0.926 (TL models); pancreas, 0.700 (TS), 0.577-0.648 (NT models), and 0.693-0.736 (TL models). 
For public primarily adult data, DSC for liver was 0.991 (TS), 0.633-0.750 (NT models), and 0.926-0.952 (TL models); spleen, 0.983 (TS), 0.569-0.604 (NT models), and 0.923-0.947 (TL models); pancreas, 0.909 (TS), 0.148-0.241 (NT models), and 0.699-0.775 (TL models). DynUNet-TL was selected as the best-performing NT or TL model and was made available as an open-source MONAI bundle (https://github.com/cchmc-dll/pediatric_abdominal_segmentation_bundle.git). Conclusion: TL models trained on heterogeneous public datasets and fine-tuned using institutional pediatric data outperformed internal NT models and TotalSegmentator across internal and external pediatric test data. Segmentation performance was better for liver and spleen than for pancreas. Clinical Impact: The selected model may be used for various volumetry applications in pediatric imaging.
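The Dice similarity coefficient reported throughout this abstract is a simple overlap statistic between a predicted and a reference mask. A minimal sketch on toy 8x8 binary masks (not the paper's data):

```python
import numpy as np

def dice(pred: np.ndarray, ref: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    inter = np.logical_and(pred, ref).sum()
    denom = pred.sum() + ref.sum()
    return 2.0 * inter / denom if denom else 1.0  # both empty -> perfect

# Toy masks: a 4x4 square and the same square shifted by one pixel.
a = np.zeros((8, 8)); a[2:6, 2:6] = 1
b = np.zeros((8, 8)); b[3:7, 3:7] = 1
score = dice(a, b)   # overlap 3x3 = 9, sizes 16 + 16 -> 18/32 = 0.5625
```

DSC is insensitive to true negatives, which is why it is preferred over plain accuracy for small organs such as the pancreas.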

PMID:38691411 | DOI:10.2214/AJR.24.30931

Categories: Literature Watch

Deep Learning-Based Assessment of Built Environment From Satellite Images and Cardiometabolic Disease Prevalence

Wed, 2024-05-01 06:00

JAMA Cardiol. 2024 May 1. doi: 10.1001/jamacardio.2024.0749. Online ahead of print.

ABSTRACT

IMPORTANCE: Built environment plays an important role in development of cardiovascular disease. Large scale, pragmatic evaluation of built environment has been limited owing to scarce data and inconsistent data quality.

OBJECTIVE: To investigate the association between image-based built environment and the prevalence of cardiometabolic disease in urban cities.

DESIGN, SETTING, AND PARTICIPANTS: This cross-sectional study used features extracted from Google satellite images (GSI) to measure the built environment and link them with prevalence of cardiometabolic disease. Convolutional neural networks, light gradient-boosting machines, and activation maps were used to assess the association with health outcomes and identify feature associations with coronary heart disease (CHD), stroke, and chronic kidney disease (CKD). The study obtained aerial images from GSI covering census tracts in 7 cities (Cleveland, Ohio; Fremont, California; Kansas City, Missouri; Detroit, Michigan; Bellevue, Washington; Brownsville, Texas; and Denver, Colorado). The study used census tract-level data from the US Centers for Disease Control and Prevention's 500 Cities project. The data were originally collected from the Behavioral Risk Factor Surveillance System that surveyed people 18 years and older across the country. Analyses were conducted from February to December 2022.

EXPOSURES: GSI images of built environment and cardiometabolic disease prevalence.

MAIN OUTCOMES AND MEASURES: Census tract-level estimated prevalence of CHD, stroke, and CKD based on image-based built environment features.

RESULTS: The study obtained 31 786 aerial images from GSI covering 789 census tracts. Built environment features extracted from GSI using machine learning were associated with prevalence of CHD (R2 = 0.60), stroke (R2 = 0.65), and CKD (R2 = 0.64). The model performed better at distinguishing differences in cardiometabolic prevalence between cities than within cities (eg, highest within-city R2 = 0.39 vs between-city R2 = 0.64 for CKD). Adding GSI features improved on the model that included only age, sex, race, income, education, and composite indices for social determinants of health (R2 = 0.83 vs R2 = 0.76 for CHD; P <.001). Activation maps revealed certain health-related built environment features, such as roads, highways, and railroads, and recreational facilities, such as amusement parks, arenas, and baseball parks.
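The R2 values above are coefficients of determination between model-estimated and observed prevalence. A minimal sketch of the statistic (the prevalence values below are hypothetical, not from the study):

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical tract-level CHD prevalence (%): observed vs. model output.
obs  = np.array([5.0, 6.2, 7.1, 4.3, 8.0, 6.6])
pred = np.array([5.2, 6.0, 7.4, 4.1, 7.7, 6.9])
r2 = r_squared(obs, pred)
```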

CONCLUSIONS AND RELEVANCE: In this cross-sectional study, a significant portion of cardiometabolic disease prevalence was associated with GSI-based built environment using convolutional neural networks.

PMID:38691380 | DOI:10.1001/jamacardio.2024.0749

Categories: Literature Watch

Integrating the Physical Environment Within a Population Neuroscience Perspective

Wed, 2024-05-01 06:00

Curr Top Behav Neurosci. 2024 May 2. doi: 10.1007/7854_2024_477. Online ahead of print.

ABSTRACT

Population neuroscience recognises the role of the environment in shaping brain, behaviour, and mental health. An overview of current evidence from neuroscientific and epidemiological studies highlights the protective effects of nature on cognitive function and stress reduction, the detrimental effects of urban living on mental health, and emerging concerns relating to extreme weather events and eco-anxiety. Despite the growing body of evidence in this area, knowledge gaps remain due to inconsistent measures of exposure and a reliance on small samples. In this chapter, attention is given to the physical environment and population-level studies as a necessary starting point for exploring the long-term impacts of environmental exposures on mental health, and for informing future research that may capture immediate emotional and neural responses to the environment. Key data sources, including remote sensing imagery, administrative, sensor, and social media data, are outlined. Appropriate measures of exposure are advocated for, recognising the value of area-level measures for estimating exposure over large study samples and spatial and temporal scales. Although integrating data from multiple sources requires consideration for data quality and completeness, deep learning and the increasing availability of high-resolution data present opportunities to build a more complete picture of physical environments. Advances in leveraging detailed locational data are discussed as a subsequent approach for building upon initial observations from population studies and improving understanding of the mechanisms underlying behaviour and human-environment interactions.

PMID:38691314 | DOI:10.1007/7854_2024_477

Categories: Literature Watch

Parotid Gland Segmentation Using Purely Transformer-Based U-Shaped Network and Multimodal MRI

Wed, 2024-05-01 06:00

Ann Biomed Eng. 2024 May 1. doi: 10.1007/s10439-024-03510-3. Online ahead of print.

ABSTRACT

Parotid gland tumors account for approximately 2% to 10% of head and neck tumors. Segmentation of the parotid glands and tumors on magnetic resonance images is essential for accurate diagnosis and selection of appropriate surgical plans. However, segmentation of the parotid glands is particularly challenging due to their variable shape and low contrast with surrounding structures. Deep learning has developed rapidly in recent years, and Transformer-based networks have performed well on many computer vision tasks; however, they have yet to be widely applied to parotid gland segmentation. We collected a multi-center multimodal parotid gland MRI dataset and implemented parotid gland segmentation using a purely Transformer-based U-shaped segmentation network. We used both absolute and relative positional encoding to improve segmentation and achieved multimodal information fusion without increasing the network's computation. In addition, our novel training approach reduces the clinician's labeling workload by nearly half. Our method achieved good segmentation of both parotid glands and tumors. On the test set, our model achieved a Dice-Similarity Coefficient of 86.99%, a Pixel Accuracy of 99.19%, a Mean Intersection over Union of 81.79%, and a Hausdorff Distance of 3.87. The purely Transformer-based U-shaped segmentation network outperforms other convolutional neural networks. In addition, our method can effectively fuse information from the multi-center multimodal MRI dataset, thus improving parotid gland segmentation.
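Two of the metrics reported above, pixel accuracy and mean intersection over union, can be computed directly from multi-class label maps. A minimal sketch on toy 4x4 maps (the three-class layout below is illustrative, not the paper's data):

```python
import numpy as np

def pixel_accuracy(pred, ref):
    """Fraction of pixels whose predicted label matches the reference."""
    return float((pred == ref).mean())

def mean_iou(pred, ref, n_classes):
    """Mean per-class intersection over union, skipping absent classes."""
    ious = []
    for c in range(n_classes):
        p, r = pred == c, ref == c
        union = np.logical_or(p, r).sum()
        if union:
            ious.append(np.logical_and(p, r).sum() / union)
    return float(np.mean(ious))

# Toy label maps: 0 = background, 1 = gland, 2 = tumor.
ref  = np.array([[0,0,1,1],[0,1,1,1],[0,1,2,2],[0,0,2,2]])
pred = np.array([[0,0,1,1],[0,1,1,1],[0,1,1,2],[0,0,2,2]])
pa  = pixel_accuracy(pred, ref)   # 15/16 pixels correct
iou = mean_iou(pred, ref, 3)
```

Note how a single mislabeled tumor pixel barely moves pixel accuracy but visibly lowers the tumor-class IoU, which is why mIoU is reported alongside accuracy.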

PMID:38691234 | DOI:10.1007/s10439-024-03510-3

Categories: Literature Watch

Complement inhibition treatment for geographic atrophy (GA): functional and morphological efficacy and relevant biomarkers in clinical practice

Wed, 2024-05-01 06:00

Ophthalmologie. 2024 Apr 30. doi: 10.1007/s00347-024-02039-z. Online ahead of print.

ABSTRACT

The approval of complement inhibitory therapeutic agents for the treatment of geographic atrophy (GA) has highlighted the need for reliable and reproducible measurement of disease progression and therapeutic efficacy. Due to its availability and imaging characteristics optical coherence tomography (OCT) is the method of choice. Using OCT analysis based on artificial intelligence (AI), the therapeutic efficacy of pegcetacoplan was demonstrated at the levels of both the retinal pigment epithelium (RPE) and photoreceptors (PR). Cloud-based solutions that enable monitoring of GA are already available.

PMID:38691156 | DOI:10.1007/s00347-024-02039-z

Categories: Literature Watch

Faster, More Practical, but Still Accurate: Deep Learning for Diagnosis of Progressive Supranuclear Palsy

Wed, 2024-05-01 06:00

Radiol Artif Intell. 2024 May;6(3):e240181. doi: 10.1148/ryai.240181.

NO ABSTRACT

PMID:38691010 | DOI:10.1148/ryai.240181

Categories: Literature Watch

What predicts citation counts and translational impact in headache research? A machine learning analysis

Wed, 2024-05-01 06:00

Cephalalgia. 2024 May;44(5):3331024241251488. doi: 10.1177/03331024241251488.

ABSTRACT

BACKGROUND: We aimed to develop the first machine learning models to predict citation counts and the translational impact, defined as inclusion in guidelines or policy documents, of headache research, and assess which factors are most predictive.

METHODS: Bibliometric data and the titles, abstracts, and keywords from 8600 publications in three headache-oriented journals, from their inception to 31 December 2017, were used. A series of machine learning models were implemented to predict three classes of 5-year citation count intervals (0-5, 6-14, and >14 citations) and the translational impact of a publication. Models were evaluated out-of-sample with the area under the receiver operating characteristic curve (AUC).

RESULTS: The top performing gradient boosting model predicted correct citation count class with an out-of-sample AUC of 0.81. Bibliometric data such as page count, number of references, first and last author citation counts and h-index were among the most important predictors. Prediction of translational impact worked optimally when including both bibliometric data and information from the title, abstract and keywords, reaching an out-of-sample AUC of 0.71 for the top performing random forest model.

CONCLUSION: Citation counts are best predicted by bibliometric data, while models incorporating both bibliometric data and publication content identify the translational impact of headache research.
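The out-of-sample AUC used above has a convenient rank interpretation: it is the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. A minimal sketch via the Mann-Whitney U statistic (the labels and scores below are hypothetical):

```python
import numpy as np

def roc_auc(y_true, scores):
    """ROC AUC via the Mann-Whitney U statistic (ties get half credit)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true == 1], scores[y_true == 0]
    # Compare every positive score against every negative score.
    wins = (pos[:, None] > neg[None, :]).sum() \
         + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))

# Hypothetical scores for 'included in guidelines' (1) vs. not (0).
labels = np.array([1, 1, 1, 0, 0, 0, 0])
scores = np.array([0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1])
auc = roc_auc(labels, scores)   # 11 of 12 positive/negative pairs ranked correctly
```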

PMID:38690640 | DOI:10.1177/03331024241251488

Categories: Literature Watch

Machine learning for predicting Plasmodium liver stage development in vitro using microscopy imaging

Wed, 2024-05-01 06:00

Comput Struct Biotechnol J. 2024 Apr 18;24:334-342. doi: 10.1016/j.csbj.2024.04.029. eCollection 2024 Dec.

ABSTRACT

Malaria, a significant global health challenge, is caused by Plasmodium parasites. The Plasmodium liver stage plays a pivotal role in the establishment of the infection. This study focuses on the liver stage development of the model organism Plasmodium berghei, employing fluorescent microscopy imaging and convolutional neural networks (CNNs) for analysis. Convolutional neural networks have been recently proposed as a viable option for tasks such as malaria detection, prediction of host-pathogen interactions, or drug discovery. Our research aimed to predict the transition of Plasmodium-infected liver cells to the merozoite stage, a key development phase, 15 hours in advance. We collected and analyzed hourly imaging data over a span of at least 38 hours from 400 sequences, encompassing 502 parasites. Our method was compared to human annotations to validate its efficacy. Performance metrics, including the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity, were evaluated on an independent test dataset. The outcomes revealed an AUC of 0.873, a sensitivity of 84.6%, and a specificity of 83.3%, underscoring the potential of our CNN-based framework to predict liver stage development of P. berghei. These findings not only demonstrate the feasibility of our methodology but also could potentially contribute to the broader understanding of parasite biology.
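The sensitivity and specificity figures above come straight from the confusion matrix of binary predictions against human annotations. A minimal sketch (the per-parasite labels below are hypothetical, not the study's data):

```python
import numpy as np

def sens_spec(y_true, y_pred):
    """Sensitivity (TP rate) and specificity (TN rate) from binary labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical per-parasite predictions of merozoite-stage transition.
truth = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
pred  = np.array([1, 1, 1, 0, 0, 0, 0, 0, 1, 0])
sensitivity, specificity = sens_spec(truth, pred)   # 3/4 and 5/6
```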

PMID:38690550 | PMC:PMC11059334 | DOI:10.1016/j.csbj.2024.04.029

Categories: Literature Watch

Integrating predictive coding and a user-centric interface for enhanced auditing and quality in cancer registry data

Wed, 2024-05-01 06:00

Comput Struct Biotechnol J. 2024 Apr 7;24:322-333. doi: 10.1016/j.csbj.2024.04.007. eCollection 2024 Dec.

ABSTRACT

Data curation for a hospital-based cancer registry heavily relies on the labor-intensive manual abstraction process by cancer registrars to identify cancer-related information from free-text electronic health records. To streamline this process, a natural language processing system incorporating a hybrid of deep learning-based and rule-based approaches for identifying lung cancer registry-related concepts, along with a symbolic expert system that generates registry coding based on weighted rules, was developed. The system is integrated with the hospital information system at a medical center to provide cancer registrars with a patient journey visualization platform. The embedded system offers a comprehensive view of patient reports annotated with significant registry concepts to facilitate the manual coding process and elevate overall quality. Extensive evaluations, including comparisons with state-of-the-art methods, were conducted using a lung cancer dataset comprising 1428 patients from the medical center. The experimental results illustrate the effectiveness of the developed system, consistently achieving F1-scores of 0.85 and 1.00 across 30 coding items. Registrar feedback highlights the system's reliability as a tool for assisting and auditing the abstraction. By presenting key registry items along the timeline of a patient's reports with accurate code predictions, the system improves the quality of registrar outcomes and reduces the labor resources and time required for data abstraction. Our study highlights advancements in cancer registry coding practices, demonstrating that the proposed hybrid weighted neural-symbolic cancer registry system is reliable and efficient for assisting cancer registrars in the coding workflow and contributing to clinical outcomes.

PMID:38690549 | PMC:PMC11059324 | DOI:10.1016/j.csbj.2024.04.007

Categories: Literature Watch

Automatic Lenke classification of adolescent idiopathic scoliosis with deep learning

Wed, 2024-05-01 06:00

JOR Spine. 2024 Apr 30;7(2):e1327. doi: 10.1002/jsp2.1327. eCollection 2024 Jun.

ABSTRACT

PURPOSE: The Lenke classification system is widely utilized as the preoperative evaluation protocol for adolescent idiopathic scoliosis (AIS). However, manual measurement is susceptible to observer-induced variability, which consequently impacts the evaluation of progression. The goal of this investigation was to develop an automated Lenke classification system utilizing innovative deep learning algorithms.

METHODS: Using the database of the First Affiliated Hospital of Sun Yat-sen University, whole-spine x-ray images were retrospectively collected. The image collection was divided into an AIS group and a control group; the control group consisted of individuals who underwent routine health checks and did not have scoliosis. Afterwards, the relevant features of all images were annotated. Deep learning was implemented using a key-point-based detection method to realize vertebral detection, and Cobb angle measurement and scoliosis classification were performed according to the relevant standards. In addition, a segmentation method was employed to recognize the lumbar vertebral pedicles and determine the type of lumbar spine modifier. Finally, model performance was quantitatively analyzed.
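Given detected vertebral keypoints, the Cobb angle is conventionally the angle between the most-tilted endplate lines of the curve. A minimal geometric sketch under stated assumptions (the endplate keypoint coordinates below are hypothetical, and the paper's exact angle convention may differ):

```python
import numpy as np

def cobb_angle(upper_endplate, lower_endplate):
    """Angle (degrees) between two endplate lines, each given as two (x, y) keypoints."""
    def direction(p, q):
        v = np.asarray(q, float) - np.asarray(p, float)
        return v / np.linalg.norm(v)
    u = direction(*upper_endplate)
    l = direction(*lower_endplate)
    cos = np.clip(abs(np.dot(u, l)), 0.0, 1.0)  # abs() makes it orientation-independent
    return float(np.degrees(np.arccos(cos)))

# Hypothetical endplate keypoints from a vertebra detector (pixel coords).
upper = [(10, 20), (50, 12)]    # tilted about -11.3 degrees
lower = [(12, 90), (52, 104)]   # tilted about +19.3 degrees
angle = cobb_angle(upper, lower)   # roughly 30.6 degrees
```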

RESULTS: In the study, a total of 2082 spinal x-ray images were collected from 407 AIS patients and 227 individuals in the control group. The model for vertebral detection achieved an F1-score of 0.809 for curve type evaluation and an F1-score of 0.901 for the thoracic sagittal profile. The intraclass correlation coefficient (ICC) of the Cobb angle measurement was 0.925. In the performance analysis of the vertebral pedicle segmentation model, the F1-score of the lumbar modification profile was 0.942, the intersection over union (IOU) of the target pixels was 0.827, and the Hausdorff distance (HD) was 6.565 ± 2.583 mm. The F1-score for the ultimate Lenke type classifier was 0.885.

CONCLUSIONS: This study has constructed an automated Lenke classification system by employing the deep learning networks to achieve the recognition pattern and feature extraction. Our models require further validation in additional cases in the future.

PMID:38690524 | PMC:PMC11058480 | DOI:10.1002/jsp2.1327

Categories: Literature Watch

A data-driven approach for the partial reconstruction of individual human molar teeth using generative deep learning

Wed, 2024-05-01 06:00

Front Artif Intell. 2024 Apr 16;7:1339193. doi: 10.3389/frai.2024.1339193. eCollection 2024.

ABSTRACT

BACKGROUND AND OBJECTIVE: Due to the high prevalence of dental caries, fixed dental restorations are regularly required to restore compromised teeth or replace missing teeth while retaining function and aesthetic appearance. The fabrication of dental restorations, however, remains challenging due to the complexity of the human masticatory system as well as the unique morphology of each individual dentition. Adaptation and reworking are frequently required during the insertion of fixed dental prostheses (FDPs), which increase cost and treatment time. This article proposes a data-driven approach for the partial reconstruction of occlusal surfaces based on a data set that comprises 92 3D mesh files of full dental crown restorations.

METHODS: A Generative Adversarial Network (GAN) is considered for the given task in view of its ability to represent extensive data sets in an unsupervised manner with a wide variety of applications. Having demonstrated good capabilities in terms of image quality and training stability, StyleGAN-2 has been chosen as the main network for generating the occlusal surfaces. A 2D projection method is proposed in order to generate 2D representations of the provided 3D tooth data set for integration with the StyleGAN architecture. The reconstruction capabilities of the trained network are demonstrated by means of 4 common inlay types using a Bayesian Image Reconstruction method. This involves pre-processing the data in order to extract the necessary information of the tooth preparations required for the used method as well as the modification of the initial reconstruction loss.
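The abstract does not specify how the 3D tooth meshes are flattened for StyleGAN-2; one plausible scheme, sketched here purely as an assumption, is a z-buffer height map that rasterizes mesh vertices along the occlusal axis:

```python
import numpy as np

def depth_map(vertices, resolution=64):
    """Rasterize mesh vertices into a 2D height map along the z (occlusal) axis."""
    v = np.asarray(vertices, dtype=float)
    # Normalize x/y into [0, 1], keep z as the stored depth value.
    xy = (v[:, :2] - v[:, :2].min(0)) / np.ptp(v[:, :2], 0)
    ij = np.minimum((xy * resolution).astype(int), resolution - 1)
    img = np.zeros((resolution, resolution))
    for (i, j), z in zip(ij, v[:, 2]):
        img[j, i] = max(img[j, i], z)   # z-buffer: keep the highest surface point
    return img

# Hypothetical toy 'crown': a paraboloid point cloud standing in for a mesh.
g = np.linspace(-1, 1, 40)
X, Y = np.meshgrid(g, g)
Z = 1.0 - (X**2 + Y**2) / 2
pts = np.stack([X.ravel(), Y.ravel(), Z.ravel()], axis=1)
img = depth_map(pts, resolution=32)
```

A height map of this kind is invertible back to a surface, which is what makes 2D generative models usable for 3D occlusal reconstruction.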

RESULTS: The reconstruction process yields satisfactory visual and quantitative results for all preparations with a root mean square error (RMSE) ranging from 0.02 mm to 0.18 mm. When compared against a clinical procedure for CAD inlay fabrication, the group of dentists preferred the GAN-based restorations for 3 of the total 4 inlay geometries.

CONCLUSIONS: This article shows the effectiveness of the StyleGAN architecture with a downstream optimization process for the reconstruction of 4 different inlay geometries. The independence of the reconstruction process and the initial training of the GAN enables the application of the method for arbitrary inlay geometries without time-consuming retraining of the GAN.

PMID:38690195 | PMC:PMC11058210 | DOI:10.3389/frai.2024.1339193

Categories: Literature Watch

PED: a novel predictor-encoder-decoder model for Alzheimer drug molecular generation

Wed, 2024-05-01 06:00

Front Artif Intell. 2024 Apr 16;7:1374148. doi: 10.3389/frai.2024.1374148. eCollection 2024.

ABSTRACT

Alzheimer's disease (AD) is a gradually advancing neurodegenerative disorder characterized by a concealed onset. Acetylcholinesterase (AChE) is an efficient hydrolase that catalyzes the hydrolysis of acetylcholine (ACh), which regulates the concentration of ACh at synapses and thereby terminates ACh-mediated neurotransmission. Inhibitors of AChE are currently available, but their side effects are unavoidable. Among the many application fields where AI has gained prominence, neural network-based models for molecular design have recently emerged and demonstrate encouraging outcomes. However, in the conditional molecular generation task, most current generation models require additional optimization algorithms to generate molecules with intended properties, which makes molecular generation inefficient. Consequently, we introduce a conditional molecular design model, termed PED (predictor-encoder-decoder), which leverages the variational auto-encoder (VAE). Its primary function is to produce a molecular library tailored to specific properties, from which we can identify molecules that inhibit AChE activity without adverse effects; these molecules serve as lead compounds, hastening AD treatment. In this study, we fine-tune a VAE model pre-trained on the ZINC database using active compounds of AChE collected from BindingDB. Unlike other molecular generation models, PED can simultaneously perform both property prediction and molecule generation; consequently, it can generate molecules with intended properties without an additional optimization process. Evaluation experiments show that the proposed model performs better than other methods benchmarked on the same data sets, indicating that it learns a good representation of the potential chemical space and can generate molecules with the intended properties. Extensive experiments on benchmark datasets confirmed PED's efficiency and efficacy. Furthermore, we verified the binding ability of the generated molecules to AChE through molecular docking. The results showed that molecules within the generated library bind well to AChE and inhibit its activity, thus preventing the hydrolysis of ACh.

PMID:38690194 | PMC:PMC11058643 | DOI:10.3389/frai.2024.1374148

Categories: Literature Watch

Predicting rectal cancer prognosis from histopathological images and clinical information using multi-modal deep learning

Wed, 2024-05-01 06:00

Front Oncol. 2024 Apr 15;14:1353446. doi: 10.3389/fonc.2024.1353446. eCollection 2024.

ABSTRACT

OBJECTIVE: The objective of this study was to provide a multi-modal deep learning framework for forecasting the survival of rectal cancer patients by utilizing both digital pathological images data and non-imaging clinical data.

MATERIALS AND METHODS: The study included patients diagnosed with rectal cancer by pathological confirmation from January 2015 to December 2016. Patients were allocated to training and testing sets in a randomized manner, at a ratio of 4:1. Tissue microarrays (TMAs) and clinical indicators were obtained. The TMAs were scanned to convert them into digital pathology images, and the patients' clinical data were pre-processed. We then selected distinct deep learning algorithms to conduct survival prediction analysis using the patients' pathological images and clinical data, respectively.

RESULTS: A total of 292 patients with rectal cancer were randomly allocated into two groups: a training set consisting of 234 cases, and a testing set consisting of 58 instances. Initially, we make direct predictions about the survival status by using pre-processed Hematoxylin and Eosin (H&E) pathological images of rectal cancer. We utilized the ResNest model to extract data from histopathological images of patients, resulting in a survival status prediction with an AUC (Area Under the Curve) of 0.797. Furthermore, we employ a multi-head attention fusion (MHAF) model to combine image features and clinical features in order to accurately forecast the survival rate of rectal cancer patients. The findings of our experiment show that the multi-modal structure works better than directly predicting from histopathological images. It achieves an AUC of 0.837 in predicting overall survival (OS).
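The MHAF fusion step above combines image-derived and clinical feature vectors; its core building block is scaled dot-product attention over per-modality tokens. A single-head numpy sketch under stated assumptions (the paper's MHAF architecture is not detailed in the abstract; token dimensions and weights here are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(tokens, d_k=16, seed=0):
    """Single-head scaled dot-product attention over modality tokens."""
    rng = np.random.default_rng(seed)
    d = tokens.shape[1]
    Wq, Wk, Wv = (rng.normal(size=(d, d_k)) / np.sqrt(d) for _ in range(3))
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    A = softmax(Q @ K.T / np.sqrt(d_k))   # each token attends over both modalities
    return A @ V, A

# Hypothetical modality tokens: one image embedding, one clinical embedding.
img_feat  = np.random.default_rng(1).normal(size=32)
clin_feat = np.random.default_rng(2).normal(size=32)
fused, attn = attention_fuse(np.stack([img_feat, clin_feat]))
```

The attention weights let the model shift emphasis between modalities per patient, rather than concatenating features with fixed importance.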

CONCLUSIONS: Our study highlights the potential of multi-modal deep learning models in predicting survival status from histopathological images and clinical information, thus offering valuable insights for clinical applications.

PMID:38690169 | PMC:PMC11060749 | DOI:10.3389/fonc.2024.1353446

Categories: Literature Watch

Decentralized multi-agent reinforcement learning based on best-response policies

Wed, 2024-05-01 06:00

Front Robot AI. 2024 Apr 16;11:1229026. doi: 10.3389/frobt.2024.1229026. eCollection 2024.

ABSTRACT

Introduction: Multi-agent systems are an interdisciplinary research field that describes the concept of multiple decisive individuals interacting with a usually partially observable environment. Given the recent advances in single-agent reinforcement learning, multi-agent reinforcement learning (RL) has gained tremendous interest in recent years. Most research studies apply a fully centralized learning scheme to ease the transfer from the single-agent domain to multi-agent systems. Methods: In contrast, we claim that a decentralized learning scheme is preferable for applications in real-world scenarios as this allows deploying a learning algorithm on an individual robot rather than deploying the algorithm to a complete fleet of robots. Therefore, this article outlines a novel actor-critic (AC) approach tailored to cooperative MARL problems in sparsely rewarded domains. Our approach decouples the MARL problem into a set of distributed agents that model the other agents as responsive entities. In particular, we propose using two separate critics per agent to distinguish between the joint task reward and agent-based costs as commonly applied within multi-robot planning. On one hand, the agent-based critic intends to decrease agent-specific costs. On the other hand, each agent intends to optimize the joint team reward based on the joint task critic. As this critic still depends on the joint action of all agents, we outline two suitable behavior models based on Stackelberg games: a game against nature and a dyadic game against each agent. Following these behavior models, our algorithm allows fully decentralized execution and training. Results and Discussion: We evaluate our presented method using the proposed behavior models within a sparsely rewarded simulated multi-agent environment. Although our approach already outperforms the state-of-the-art learners, we conclude this article by outlining possible extensions of our algorithm that future research may build upon.

PMID:38690119 | PMC:PMC11059992 | DOI:10.3389/frobt.2024.1229026

Categories: Literature Watch

Adaptive habitat biogeography-based optimizer for optimizing deep CNN hyperparameters in image classification

Wed, 2024-05-01 06:00

Heliyon. 2024 Mar 21;10(7):e28147. doi: 10.1016/j.heliyon.2024.e28147. eCollection 2024 Apr 15.

ABSTRACT

Deep Convolutional Neural Networks (DCNNs) have shown remarkable success in image classification tasks, but optimizing their hyperparameters can be challenging due to their complex structure. This paper develops the Adaptive Habitat Biogeography-Based Optimizer (AHBBO) for tuning the hyperparameters of DCNNs in image classification tasks. In complicated optimization problems, BBO suffers from premature convergence and insufficient exploration. An adaptable habitat is presented as a solution to these problems: it permits variable habitat sizes and regulated mutation, increasing exploration and population diversity and thereby yielding better optimization performance and a greater chance of finding high-quality solutions across a wide range of problem domains. AHBBO is tested on 53 benchmark optimization functions and demonstrates its effectiveness in improving initial stochastic solutions and converging faster to the optimum. Furthermore, DCNN-AHBBO is compared to 23 well-known image classifiers on nine challenging image classification problems and shows superior performance, reducing the error rate by up to 5.14%. The proposed algorithm outperforms 13 benchmark classifiers in 87 out of 95 evaluations, providing a high-performance and reliable solution for optimizing DCNNs in image classification tasks. This research contributes to the field of deep learning by proposing a new optimization algorithm that can improve the efficiency of deep neural networks in image classification.
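The migration mechanism that AHBBO adapts comes from standard biogeography-based optimization: better habitats (solutions) emigrate features to worse ones at rank-derived rates. A sketch of one plain-BBO migration step on a toy objective (this is the baseline algorithm, not the adaptive-habitat variant, and the population sizes are illustrative):

```python
import numpy as np

def bbo_step(pop, fitness, rng):
    """One migration step of biogeography-based optimization (minimization).

    Better habitats emigrate features; worse habitats immigrate them.
    """
    n, d = pop.shape
    order = np.argsort(fitness)            # best first
    ranks = np.empty(n)
    ranks[order] = np.arange(n)
    mu = 1.0 - ranks / (n - 1)             # emigration rate: best habitat = 1
    lam = ranks / (n - 1)                  # immigration rate: worst habitat = 1
    new = pop.copy()
    for i in range(n):
        for j in range(d):
            if rng.random() < lam[i]:
                # Pick a source habitat proportionally to emigration rates.
                src = rng.choice(n, p=mu / mu.sum())
                new[i, j] = pop[src, j]
    return new

rng = np.random.default_rng(0)
pop = rng.normal(size=(10, 4))             # 10 habitats, 4 'hyperparameters'
fit = (pop ** 2).sum(axis=1)               # sphere function as a toy objective
next_pop = bbo_step(pop, fit, rng)
```

Note that the best habitat has immigration rate zero, so it passes through a migration step unchanged; the adaptive-habitat modification targets the exploration weaknesses of this fixed scheme.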

PMID:38689992 | PMC:PMC11059399 | DOI:10.1016/j.heliyon.2024.e28147

Categories: Literature Watch

Developing a multivariate time series forecasting framework based on stacked autoencoders and multi-phase feature

Wed, 2024-05-01 06:00

Heliyon. 2024 Mar 19;10(7):e27860. doi: 10.1016/j.heliyon.2024.e27860. eCollection 2024 Apr 15.

ABSTRACT

Time series forecasting across different domains has received massive attention as it eases intelligent decision-making activities. Recurrent neural networks and various deep learning algorithms have been applied to modeling and forecasting multivariate time series data. Due to intricate non-linear patterns and significant variations in the randomness of characteristics across various categories of real-world time series data, achieving effectiveness and robustness simultaneously poses a considerable challenge for specific deep-learning models. We have proposed a novel prediction framework with a multi-phase feature selection technique, a long short-term memory-based autoencoder, and a temporal convolution-based autoencoder to fill this gap. The multi-phase feature selection is applied to retrieve the optimal feature subset and the optimal lag window length for each feature. Moreover, a customized stacked autoencoder strategy is employed in the model. The first autoencoder is used to resolve the random weight initialization problem. The second autoencoder models the temporal relation between non-linear correlated features with convolution networks and recurrent neural networks. Finally, the model's ability to generalize, predict accurately, and perform effectively is validated through experiments on three distinct real-world time series datasets: Energy Appliances, Beijing PM2.5 Concentration, and Solar Radiation. The Energy Appliances dataset consists of 29 attributes with a training size of 15,464 instances and a testing size of 4239 instances. The Beijing PM2.5 Concentration dataset has 18 attributes, with 34,952 instances in the training set and 8760 instances in the testing set. The Solar Radiation dataset comprises 11 attributes, with 22,857 instances in the training set and 9797 instances in the testing set.
The forecasting models were evaluated using two error measures, root mean square error and mean absolute error; to ensure robust evaluation, the errors were calculated on the identical scale of the data. The experimental results demonstrate the superiority of the proposed model over existing models, with significant advantages in metrics such as mean squared error and mean absolute error. For the PM2.5 air quality data, the proposed model's mean absolute error improves from 12.45 to 7.51, a ∼40% improvement. Similarly, the mean squared error for this dataset improves from 23.75 to 11.62, a ∼51% improvement. For the solar radiation dataset, the proposed model yields a ∼34.7% improvement in mean squared error and a ∼75% improvement in mean absolute error. The recommended framework demonstrates outstanding generalization capabilities and performs well across datasets spanning multiple domains.
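The per-feature optimal-lag search mentioned in the abstract can be sketched with a simple autocorrelation criterion. The actual multi-phase procedure is not detailed in the abstract, so the selection rule below is an assumption for illustration:

```python
import numpy as np

def best_lag(series, max_lag):
    """Pick the lag with the strongest positive autocorrelation —
    a stand-in for a per-feature optimal lag-window search."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    corrs = []
    for lag in range(1, max_lag + 1):
        a, b = x[:-lag], x[lag:]
        denom = np.sqrt((a * a).sum() * (b * b).sum())
        corrs.append((a * b).sum() / denom if denom > 0 else 0.0)
    # lags are 1-based, argmax is 0-based
    return int(np.argmax(corrs)) + 1
```

Applied per feature, such a criterion yields a different lag window for each input series, matching the framework's idea of feature-specific windows before the stacked autoencoders.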

PMID:38689959 | PMC:PMC11059412 | DOI:10.1016/j.heliyon.2024.e27860

Categories: Literature Watch

Using UAV Images and Deep Learning in Investigating Potential Breeding Sites of Aedes albopictus

Tue, 2024-04-30 06:00

Acta Trop. 2024 Apr 28:107234. doi: 10.1016/j.actatropica.2024.107234. Online ahead of print.

ABSTRACT

Aedes albopictus (Diptera: Culicidae) plays a crucial role as a vector for mosquito-borne diseases such as dengue and Zika. Given the limited availability of effective vaccines, the prevention of Aedes-borne diseases relies mainly on extensive vector surveillance and control. Among mosquito control methods, the identification and elimination of potential breeding sites (PBS) for Aedes are recognized as effective means of population control. Previous studies utilizing unmanned aerial vehicles (UAVs) and deep learning to identify PBS have primarily focused on large, regularly shaped containers, and there has been little empirical research into their practical application in the field. We therefore constructed a PBS dataset specifically tailored for Ae. albopictus, including items such as buckets, bowls, bins, aquatic plants, jars, lids, pots, boxes, and sinks that are common in the Yangtze River Basin in China. A YOLO v7 model for identifying these PBS was then developed. Finally, we recognized and labeled the area with the highest PBS density, as well as the subarea with the most urgent need for source reduction in the empirical region, by calculating kernel density values. Based on the above, we propose a UAV-AI-based methodological framework to locate the spatial distribution of PBS and conduct empirical research on Jinhulu New Village, a typical model community. The results reveal that the YOLO v7 model achieved excellent F1 scores and mAP (both above 0.99), with 97% of PBS correctly located. The predicted distribution of PBS categories in each subarea was completely consistent with the true distribution, and the five houses with the most PBS were correctly located. The kernel density map identifies subarea 4 as having the highest density of PBS, where PBS need to be removed or destroyed immediately.
These results demonstrate the reliability of the predictions and the feasibility of the UAV-AI-based methodological framework, which can minimize repetitive labor, enhance efficiency, and guide the removal and destruction of PBS. This research can inform mosquito PBS investigation both methodologically and practically.
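The kernel-density step used to flag the densest subarea can be sketched as follows; the grid size, bandwidth, and map extent are illustrative assumptions, not values from the paper:

```python
import numpy as np

def density_hotspot(points, grid_size=20, bandwidth=0.5, extent=(0.0, 4.0)):
    """Gaussian kernel density over detected-container coordinates;
    returns the (x, y) of the densest grid cell (illustrative sketch)."""
    lo, hi = extent
    axis = np.linspace(lo, hi, grid_size)
    gx, gy = np.meshgrid(axis, axis)
    grid = np.stack([gx.ravel(), gy.ravel()], axis=1)
    pts = np.asarray(points, dtype=float)
    # squared distance from every grid cell to every detection
    d2 = ((grid[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    density = np.exp(-d2 / (2 * bandwidth ** 2)).sum(axis=1)
    return grid[density.argmax()]
```

In the framework described above, the points would be YOLO-detected container locations georeferenced from UAV imagery, and the peak of the density surface marks the subarea most in need of source reduction.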

PMID:38688444 | DOI:10.1016/j.actatropica.2024.107234

Categories: Literature Watch

Single particle mass spectral signatures from on-road and non-road vehicle exhaust particles and their application in refined source apportionment using deep learning

Tue, 2024-04-30 06:00

Sci Total Environ. 2024 Apr 28:172822. doi: 10.1016/j.scitotenv.2024.172822. Online ahead of print.

ABSTRACT

With advances in vehicle emission control technology, updating source profiles to meet the current requirements of source apportionment has become increasingly crucial. In this study, on-road and non-road vehicle particles were collected, and the chemical compositions of individual particles were analyzed using single particle aerosol mass spectrometry. The data were grouped using an adaptive resonance theory (ART) neural network to identify signatures and establish a mass spectral database of mobile sources. In addition, a deep learning-based model (DeepAerosolClassifier) for classifying aerosol particles was established, with the objective of accomplishing source apportionment. During training, the model achieved an accuracy of 98.49% on the validation set and 93.36% on the testing set. For model interpretation, ideal spectra were generated using the model, verifying its accurate recognition of the characteristic patterns in the mass spectra. In a practical application, the model performed hourly source apportionment at three field monitoring sites, and its effectiveness in field measurement was validated by combining traffic flow and spatial information with the model results. Compared with other machine learning methods, our model achieves highly automated source apportionment, eliminates the need for feature selection, and enables end-to-end operation. It can therefore be applied to refined, online source apportionment of particulate matter in the future.
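The adaptive-resonance-style grouping of spectra can be sketched with a simplified ART-like rule: each spectrum joins the first prototype whose cosine similarity exceeds a vigilance threshold, otherwise it seeds a new cluster. The exact network used in the paper is not specified beyond its family, so this is only a schematic sketch:

```python
import numpy as np

def art_cluster(spectra, vigilance=0.8):
    """Minimal ART-style grouping of (non-negative) mass spectra.
    Returns a cluster label per spectrum; prototypes are running,
    renormalized means of their members."""
    protos, labels = [], []
    for s in spectra:
        v = np.asarray(s, dtype=float)
        v = v / np.linalg.norm(v)
        for i, p in enumerate(protos):
            if float(v @ p) >= vigilance:       # resonance: join cluster i
                p_new = p + v                   # move prototype toward member
                protos[i] = p_new / np.linalg.norm(p_new)
                labels.append(i)
                break
        else:                                    # no resonance: new cluster
            protos.append(v)
            labels.append(len(protos) - 1)
    return labels
```

The vigilance parameter controls cluster granularity: higher values split the data into more, tighter signature groups, which is how such networks separate distinct exhaust-particle classes in the spectral database.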

PMID:38688364 | DOI:10.1016/j.scitotenv.2024.172822

Categories: Literature Watch
