Semantic Web

Transcranial direct current stimulation in semantic variant of primary progressive aphasia: a state-of-the-art review

Wed, 2023-11-29 06:00

Front Hum Neurosci. 2023 Nov 8;17:1219737. doi: 10.3389/fnhum.2023.1219737. eCollection 2023.

ABSTRACT

The semantic variant of primary progressive aphasia (svPPA), known also as "semantic dementia (SD)," is a neurodegenerative disorder that belongs to the frontotemporal lobar degeneration clinical syndromes. There is currently no approved pharmacological therapy for any frontotemporal dementia variant. Transcranial direct current stimulation (tDCS) is a promising non-invasive brain stimulation technique capable of modulating cortical excitability through a sub-threshold shift in neuronal resting potential. This technique has previously been applied as adjunct treatment in Alzheimer's disease, while data for frontotemporal dementia are controversial. In this scoping review, we summarize and critically appraise the currently available evidence regarding the use of tDCS for improving performance in naming and/or matching tasks in patients with svPPA. Clinical trials addressing this topic were identified through MEDLINE (accessed by PubMed) and Web of Science, as of November 2022, week 3. Clinical trials have been unable to show a significant benefit of tDCS in enhancing semantic performance in svPPA patients. The heterogeneity of the studies available in the literature might be a possible explanation. Nevertheless, the results of these studies are promising and may offer valuable insights into methodological differences and overlaps, raising interest among researchers in identifying new non-pharmacological strategies for treating svPPA patients. Further studies are therefore warranted to investigate the potential therapeutic role of tDCS in svPPA.

PMID:38021245 | PMC:PMC10663282 | DOI:10.3389/fnhum.2023.1219737

Categories: Literature Watch

Effects of short-term second language learning on the development of individual semantic networks in written and spoken language

Sat, 2023-11-25 06:00

Neurosci Lett. 2024 Jan 1;818:137558. doi: 10.1016/j.neulet.2023.137558. Epub 2023 Nov 23.

ABSTRACT

Previous studies have primarily focused on the relationship between native language (L1) and second language (L2) in the brain, specifically in one language modality, such as written or spoken language. However, there is limited research on how L2 proficiency impacts both modalities. This study aimed to investigate the functional networks involved in reading and speech comprehension for both L1 and L2, and observe changes in these networks as L2 proficiency improves. The dataset used in this study was obtained from a previous study conducted by Gurunandan et al., which involved Spanish-English bilingual participants undergoing a three-month English training program. Participants underwent fMRI scanning and performed a semantic animacy judgment task in both spoken and written language before and after training. Through analysis, distinct neural networks associated with spoken and written language were found between individuals' L1 and L2, both before and after training. Moreover, as L2 proficiency improved, the spoken and written networks for L2 remained distinct from those of the L1. These findings suggest that short-term L2 learning experiences can modify neural networks, but may not be enough to achieve native-like proficiency, supporting the accommodation hypothesis. These results have important implications for language learning and education, indicating that additional short-term training and exposure alone may not bridge the gap between L1 and L2 processing networks.

PMID:38007086 | DOI:10.1016/j.neulet.2023.137558

Categories: Literature Watch

Cox regression with linked data

Tue, 2023-11-21 06:00

Stat Med. 2024 Jan 30;43(2):296-314. doi: 10.1002/sim.9960. Epub 2023 Nov 20.

ABSTRACT

Record linkage is increasingly used, especially in medical studies, to combine data from different databases that refer to the same entities. The linked data can bring analysts novel and valuable knowledge that is impossible to obtain from a single database. However, linkage errors are usually unavoidable, regardless of record linkage methods, and ignoring these errors may lead to biased estimates. While different methods have been developed to deal with linkage errors in the generalized linear model, there has been little interest in the Cox regression model, although this is one of the most important statistical models in clinical and epidemiological research. In this work, we propose an adjusted estimating equation for secondary Cox regression analysis, where linked data have been prepared by a third-party operator, and no information on matching variables is available to the analyst. Through a Monte Carlo simulation study, the proposed method is shown to lead to substantial reductions in the bias caused by false links in the estimation of the parameters of the Cox model. An asymptotically unbiased variance estimator for the adjusted estimators of Cox regression coefficients is also proposed. Finally, the proposed method is applied to a linked database from the Brest stroke registry in France.

PMID:37985942 | DOI:10.1002/sim.9960

Categories: Literature Watch

ActionCLIP: Adapting Language-Image Pretrained Models for Video Action Recognition

Tue, 2023-11-21 06:00

IEEE Trans Neural Netw Learn Syst. 2023 Nov 21;PP. doi: 10.1109/TNNLS.2023.3331841. Online ahead of print.

ABSTRACT

The canonical approach to video action recognition dictates a neural network model to do a classic and standard 1-of-N majority vote task. Such models are trained to predict a fixed set of predefined categories, limiting their transferability to new datasets with unseen concepts. In this article, we provide a new perspective on action recognition by attaching importance to the semantic information of label texts rather than simply mapping them into numbers. Specifically, we model this task as a video-text matching problem within a multimodal learning framework, which strengthens the video representation with more semantic language supervision and enables our model to do zero-shot action recognition without any further labeled data or parameter requirements. Moreover, to handle the deficiency of label texts and make use of tremendous web data, we propose a new paradigm based on this multimodal learning framework for action recognition, which we dub "pre-train, adapt and fine-tune." This paradigm first learns powerful representations from pre-training on a large amount of web image-text or video-text data. Then, it makes the action recognition task act more like pre-training problems via adaptation engineering. Finally, it is fine-tuned end-to-end on target datasets to obtain strong performance. We give an instantiation of the new paradigm, ActionCLIP, which not only has superior and flexible zero-shot/few-shot transfer ability but also reaches top performance on the general action recognition task, achieving 83.8% top-1 accuracy on Kinetics-400 with a ViT-B/16 as the backbone. Code is available at https://github.com/sallymmx/ActionCLIP.git.
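
The video-text matching idea described in this abstract can be illustrated with a minimal sketch. It is not ActionCLIP's code: the embeddings are made-up toy vectors, and the function names and temperature value are assumptions chosen for illustration. The point is only that classification becomes "pick the label text whose embedding is most similar to the video embedding", so new labels can be added without retraining a classifier head.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def zero_shot_classify(video_emb, label_embs, temperature=0.07):
    """Match one video embedding against label-text embeddings.

    Returns the best-matching label and a softmax distribution over labels.
    The temperature is an illustrative value, not taken from the paper.
    """
    labels = list(label_embs)
    logits = [cosine(video_emb, label_embs[name]) / temperature for name in labels]
    top = max(logits)  # subtract the max before exponentiating for stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    probs = {name: e / total for name, e in zip(labels, exps)}
    return max(probs, key=probs.get), probs

# Hypothetical embeddings; a real system would get these from trained encoders.
label_embs = {
    "playing guitar": [0.9, 0.1, 0.0],
    "riding a bike": [0.1, 0.9, 0.2],
    "swimming": [0.0, 0.2, 0.9],
}
video_emb = [0.2, 0.8, 0.3]

best, probs = zero_shot_classify(video_emb, label_embs)
print(best)
```

Adding an unseen action in this setup only requires embedding its label text and adding it to `label_embs`.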

PMID:37988204 | DOI:10.1109/TNNLS.2023.3331841

Categories: Literature Watch

A knowledge graph-based data harmonization framework for secondary data reuse

Sun, 2023-11-19 06:00

Comput Methods Programs Biomed. 2023 Nov 10;243:107918. doi: 10.1016/j.cmpb.2023.107918. Online ahead of print.

ABSTRACT

BACKGROUND AND OBJECTIVE: The adoption of new technologies in clinical care systems has propitiated the availability of a great amount of valuable data. However, this data is usually heterogeneous, requiring its harmonization to be integrated and analysed. We propose a semantic-driven harmonization framework that (1) enables the meaningful sharing and integration of healthcare data across institutions and (2) facilitates the analysis and exploitation of the shared data.

METHODS: The framework includes an ontology-based common data model (i.e. SCDM), a data transformation pipeline and a semantic query system. Heterogeneous datasets, mapped to different terminologies, are integrated by using an ontology-based infrastructure rooted in a top-level ontology. A graph database is generated by using these mappings, and a web-based semantic query system facilitates data exploration.

RESULTS: Several datasets from different European institutions have been integrated by using the framework in the context of the European H2020 Precise4Q project. Through the query system, data scientists were able to explore data and use it for building machine learning models.

CONCLUSIONS: The flexible data representation using RDF, together with the formal semantic underpinning provided by the SCDM, has enabled the semantic integration, query and advanced exploitation of heterogeneous data in the context of the Precise4Q project.

PMID:37981455 | DOI:10.1016/j.cmpb.2023.107918

Categories: Literature Watch

Factors Influencing the Answerability and Popularity of a Health-Related Post in the Question-and-Answer Community: Infodemiology Study of Metafilter

Fri, 2023-11-17 06:00

J Med Internet Res. 2023 Nov 17;25:e48858. doi: 10.2196/48858.

ABSTRACT

BACKGROUND: The web-based health question-and-answer (Q&A) community has become a primary and convenient way for people to access health information and knowledge directly.

OBJECTIVE: The objective of our study is to investigate how content-related, context-related, and user-related variables influence the answerability and popularity of health-related posts based on a user-dynamic, social network, and topic-dynamic semantic network, respectively.

METHODS: Full-scale data on health consultations were acquired from the Metafilter Q&A community. These variables were designed in terms of context, content, and contributors. Negative binomial regression models were used to examine the influence of these variables on the favorite and comment counts of a health-related post.

RESULTS: A total of 18,099 post records were collected from a well-known Q&A community. The findings of this study include the following. Content-related variables have a strong impact on both the answerability and popularity of posts. Notably, sentiment values were positively related to favorite counts and negatively associated with comment counts. User-related variables significantly affected the answerability and popularity of posts. Specifically, participation intensity was positively related to comment count and negatively associated with favorite count. Sociability breadth only had a significant impact on comment count. Context-related variables have a more substantial influence on the popularity of posts than on their answerability. The topic diversity variable exhibits an inverse correlation with the comment count while manifesting a positive correlation with the favorite count. Nevertheless, topic intensity has a significant effect only on favorite count.

CONCLUSIONS: The research results not only reveal the factors influencing the answerability and popularity of health-related posts, which can help users obtain high-quality answers more efficiently, but also provide a theoretical basis for platform operators to enhance user engagement within health Q&A communities.

PMID:37976090 | DOI:10.2196/48858

Categories: Literature Watch

Digital Personal Health Coaching Platform for Promoting Human Papillomavirus Infection Vaccinations and Cancer Prevention: Knowledge Graph-Based Recommendation System

Wed, 2023-11-15 06:00

JMIR Form Res. 2023 Nov 15;7:e50210. doi: 10.2196/50210.

ABSTRACT

BACKGROUND: Health promotion can empower populations to gain more control over their well-being by using digital interventions that focus on preventing the root causes of diseases. Digital platforms for personalized health coaching can improve health literacy and information-seeking behavior, leading to better health outcomes. Personal health records have been designed to enhance patients' self-management of a disease or condition. Existing personal health records have been mostly designed and deployed as a supplementary service that acts as views into electronic health records.

OBJECTIVE: We aim to overcome some of the limitations of electronic health records. This study aims to design and develop a personal health library (PHL) that generates personalized recommendations for human papillomavirus (HPV) vaccine promotion and cancer prevention.

METHODS: We have designed a proof-of-concept prototype of the Digital Personal Health Librarian, which leverages machine learning; natural language processing; and several innovative technological infrastructures, including the Semantic Web, social linked data, web application programming interfaces, and hypermedia-based discovery, to generate a personal health knowledge graph.

RESULTS: We have designed and implemented a proof-of-concept prototype to showcase and demonstrate how the PHL can be used to store an individual's health data, for example, a personal health knowledge graph. This is integrated with web-scale knowledge to support HPV vaccine promotion and prevent HPV-associated cancers among adolescents and their caregivers. We also demonstrated how the Digital Personal Health Librarian uses the PHL to provide evidence-based insights and knowledge-driven explanations that are personalized and inform health decision-making.

CONCLUSIONS: Digital platforms such as the PHL can be instrumental in improving precision health promotion and education strategies that address population-specific needs (ie, health literacy, digital competency, and language barriers) and empower individuals by facilitating knowledge acquisition to make healthy choices.

PMID:37966885 | DOI:10.2196/50210

Categories: Literature Watch

Factors Associated With Transition From Community to Permanent Residential Aged Care Following Stroke: A Linked Registry Data Study

Mon, 2023-11-13 06:00

Stroke. 2023 Dec;54(12):3117-3127. doi: 10.1161/STROKEAHA.123.043972. Epub 2023 Nov 13.

ABSTRACT

BACKGROUND: Understanding factors that influence the transition to permanent residential aged care following a stroke or transient ischemic attack may inform strategies to support people to live at home longer. We aimed to identify the demographic, clinical, and system factors that may influence the transition from living in the community to permanent residential care in the 6 to 18 months following stroke/transient ischemic attack.

METHODS: Linked data cohort analysis of adults from Queensland and Victoria aged ≥65 years and registered in the Australian Stroke Clinical Registry (2012-2016) with a clinical diagnosis of stroke/transient ischemic attack and living in the community in the first 6 months post-hospital discharge. Participant data were linked with primary care, pharmaceutical, aged care, death, and hospital data. Multivariable survival analysis was performed to determine demographic, clinical, and system factors associated with the transition to permanent residential care in the 6 to 18 months following stroke, with death modeled as a competing risk.

RESULTS: Of 11 176 included registrants (median age, 77.2 years; 44% female), 520 (5%) transitioned to permanent residential care between 6 and 18 months. Factors most associated with transition included the history of urinary tract infections (subhazard ratio [SHR], 1.41 [95% CI, 1.16-1.71]), dementia (SHR, 1.66 [95% CI, 1.14-2.42]), increasing age (65-74 versus 85+ years; SHR, 1.75 [95% CI, 1.31-2.34]), living in regional Australia (SHR, 1.31 [95% CI, 1.08-1.60]), and aged care service approvals: respite (SHR, 4.54 [95% CI, 3.51-5.85]) and high-level home support (SHR, 1.80 [95% CI, 1.30-2.48]). Protective factors included being dispensed antihypertensive medications (SHR, 0.68 [95% CI, 0.53-0.87]), seeing a cardiologist (SHR, 0.72 [95% CI, 0.57-0.91]) following stroke, and less severe stroke (SHR, 0.71 [95% CI, 0.58-0.88]).

CONCLUSIONS: Our findings provide an improved understanding of factors that influence the transition from community to permanent residential care following stroke and can inform future strategies designed to delay this transition.

PMID:37955141 | DOI:10.1161/STROKEAHA.123.043972

Categories: Literature Watch

Normalization of drug and therapeutic concepts with Thera-Py

Mon, 2023-11-13 06:00

JAMIA Open. 2023 Nov 8;6(4):ooad093. doi: 10.1093/jamiaopen/ooad093. eCollection 2023 Dec.

ABSTRACT

OBJECTIVE: The diversity of nomenclature and naming strategies makes therapeutic terminology difficult to manage and harmonize. As the number and complexity of available therapeutic ontologies continue to increase, the need for harmonized cross-resource mappings is becoming increasingly apparent. This study creates harmonized concept mappings that enable the linking together of like-concepts despite source-dependent differences in data structure or semantic representation.

MATERIALS AND METHODS: For this study, we created Thera-Py, a Python package and web API that constructs searchable concepts for drugs and therapeutic terminologies using 9 public resources and thesauri. By using a directed graph approach, Thera-Py captures commonly used aliases, trade names, annotations, and associations for any given therapeutic and combines them under a single concept record.

RESULTS: We highlight the creation of 16 069 unique merged therapeutic concepts from 9 distinct sources using Thera-Py and observe an increase in overlap of therapeutic concepts in 2 or more knowledge bases after harmonization using Thera-Py (9.8%-41.8%).

CONCLUSION: We observe that Thera-Py tends to normalize therapeutic concepts to their underlying active ingredients (excluding nondrug therapeutics, eg, radiation therapy, biologics), and unifies all available descriptors regardless of ontological origin.
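
The graph-based merging described in the methods can be sketched as follows. This is not Thera-Py's implementation or API: the record identifiers (`srcA:1` and so on), the record layout, and the function name are hypothetical, and treating the directed cross-reference edges as undirected when grouping is an assumption made here for simplicity. Each source record points to records in other sources; records that are connected through such links are combined into one merged concept carrying all labels and aliases.

```python
from collections import defaultdict

def merge_concepts(records):
    """Group source records linked by cross-references into merged concepts.

    records maps a record id to {"label": str, "aliases": [str], "xrefs": [ids]}.
    Returns a list of {"members": [...], "names": [...]} dictionaries.
    """
    # Directed xref edges are stored in both directions so that grouping
    # finds everything reachable regardless of which source made the link.
    neighbors = defaultdict(set)
    for rid, rec in records.items():
        neighbors[rid]  # touch the key so unlinked records form their own group
        for target in rec["xrefs"]:
            if target in records:
                neighbors[rid].add(target)
                neighbors[target].add(rid)

    seen, merged = set(), []
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, group = [start], []
        while stack:  # depth-first traversal of one connected group
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.append(node)
            stack.extend(neighbors[node] - seen)
        names = set()
        for rid in group:
            names.add(records[rid]["label"].lower())
            names.update(alias.lower() for alias in records[rid]["aliases"])
        merged.append({"members": sorted(group), "names": sorted(names)})
    return merged

# Illustrative records with made-up identifiers.
records = {
    "srcA:1": {"label": "Imatinib", "aliases": ["STI-571"], "xrefs": ["srcB:7"]},
    "srcB:7": {"label": "imatinib", "aliases": ["Gleevec"], "xrefs": []},
    "srcC:3": {"label": "Cisplatin", "aliases": [], "xrefs": []},
}
merged = merge_concepts(records)
for concept in merged:
    print(concept["members"], concept["names"])
```

Lower-casing names before merging is a simplification; a production normalizer would likely need more careful string handling.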

PMID:37954974 | PMC:PMC10637840 | DOI:10.1093/jamiaopen/ooad093

Categories: Literature Watch

Self-supervised multi-modal training from uncurated images and reports enables monitoring AI in radiology

Sun, 2023-11-12 06:00

Med Image Anal. 2023 Nov 7;91:103021. doi: 10.1016/j.media.2023.103021. Online ahead of print.

ABSTRACT

The escalating demand for artificial intelligence (AI) systems that can monitor and supervise human errors and abnormalities in healthcare presents unique challenges. Recent advances in vision-language models reveal the challenges of monitoring AI by understanding both visual and textual concepts and their semantic correspondences. However, there has been limited success in the application of vision-language models in the medical domain. Current vision-language models and learning strategies for photographic images and captions call for a web-scale data corpus of image and text pairs, which is often not feasible in the medical domain. To address this, we present a model named medical cross-attention vision-language model (Medical X-VL), which leverages key components to be tailored for the medical domain. The model is based on the following components: self-supervised unimodal models in the medical domain and a fusion encoder to bridge them, momentum distillation, sentencewise contrastive learning for medical reports, and sentence similarity-adjusted hard negative mining. We experimentally demonstrated that our model enables various zero-shot tasks for monitoring AI, ranging from zero-shot classification to zero-shot error correction. Our model outperformed current state-of-the-art models in two medical image datasets, suggesting a novel clinical application of our monitoring AI model to alleviate human errors. Our method demonstrates a more specialized capacity for fine-grained understanding, which presents a distinct advantage particularly applicable to the medical domain.

PMID:37952385 | DOI:10.1016/j.media.2023.103021

Categories: Literature Watch

MMLKG: Knowledge Graph for Mathematical Definitions, Statements and Proofs

Fri, 2023-11-10 06:00

Sci Data. 2023 Nov 10;10(1):791. doi: 10.1038/s41597-023-02681-3.

ABSTRACT

Nowadays, Knowledge Graphs (KGs) are important and developing in different areas. However, there is a lack of genuinely interoperable datasets representing mathematics that allow for information exchange between datasets in the Web ecosystem. In this paper, we address this matter based on the Mizar Mathematical Library (MML), a collection of articles written in the Mizar language. MML includes definitions and theorems with proofs to which authors can easily refer from newly written Mizar articles. However, extracting information directly from Mizar scripts by external projects is not very straightforward. Therefore, we propose a new data storage and retrieval approach based on the Knowledge Organization System (KOS) model and the KG concept that provides a way to organize and access knowledge. We present Mizar Mathematical Library Knowledge Graph (MMLKG), a thesaurus for describing mathematical objects. MMLKG supports semantic interoperability and allows linking data from different sources, e.g., Wikidata. Moreover, it satisfies the FAIR data principles. The data is publicly available via a Cypher endpoint.

PMID:37949866 | DOI:10.1038/s41597-023-02681-3

Categories: Literature Watch

Ontology-driven analysis of marine metagenomics: what more can we learn from our data?

Thu, 2023-11-09 06:00

Gigascience. 2022 Dec 28;12:giad088. doi: 10.1093/gigascience/giad088.

ABSTRACT

BACKGROUND: The proliferation of metagenomic sequencing technologies has enabled novel insights into the functional genomic potentials and taxonomic structure of microbial communities. However, cyberinfrastructure efforts to manage and enable the reproducible analysis of sequence data have not kept pace. Thus, there is increasing recognition of the need to make metagenomic data discoverable within machine-searchable frameworks compliant with the FAIR (Findability, Accessibility, Interoperability, and Reusability) principles for data stewardship. Although a variety of metagenomic web services exist, none currently leverage the hierarchically structured terminology encoded within common life science ontologies to programmatically discover data.

RESULTS: Here, we integrate large-scale marine metagenomic datasets with community-driven life science ontologies into a novel FAIR web service. This approach enables the retrieval of data discovered by intersecting the knowledge represented within ontologies against the functional genomic potential and taxonomic structure computed from marine sequencing data. Our findings highlight various microbial functional and taxonomic patterns relevant to the ecology of prokaryotes in various aquatic environments.

CONCLUSIONS: In this work, we present and evaluate a novel Semantic Web architecture that can be used to ask novel biological questions of existing marine metagenomic datasets. Finally, the FAIR ontology searchable data products provided by our API can be leveraged by future research efforts.

PMID:37941395 | DOI:10.1093/gigascience/giad088

Categories: Literature Watch

Modulating Complex Sentence Processing in Aphasia Through Attention and Semantic Networks

Tue, 2023-11-07 06:00

J Speech Lang Hear Res. 2023 Dec 11;66(12):5011-5035. doi: 10.1044/2023_JSLHR-23-00298. Epub 2023 Nov 7.

ABSTRACT

PURPOSE: Lexical processing impairments such as delayed and reduced activation of lexical-semantic information have been linked to syntactic processing disruptions and sentence comprehension deficits in individuals with aphasia (IWAs). Lexical-level deficits can also preclude successful lexical encoding during sentence processing and amplify the processing costs of similarity-based interference during syntactic retrieval. We investigate whether two manipulations to engage attention and pre-activate semantic features of a target (to-be-retrieved) noun will (a) boost lexical activation during initial lexical encoding and (b) facilitate syntactic dependency linking through improved resolution of interference in IWAs and neurologically unimpaired age-matched controls (AMCs).

METHOD: Eye-tracking-while-listening with a visual world paradigm was used to investigate whether semantic and attentional manipulations modulated initial lexical processing and downstream syntactic retrieval of the direct-object noun in object-relative sentences.

RESULTS: In the attention and semantic manipulations, the AMC group showed no changes in initial lexical access levels; however, gaze patterns revealed clear facilitations in dependency linking and interference resolution. In the IWA group, the attentional cue increased and maintained activation of N1 with modest facilitations in dependency linking. In the semantic condition, IWA results showed a greater degree of facilitation during dependency linking.

CONCLUSIONS: The results suggest that attention and semantic activation are parameters that may be manipulated to strengthen encoding of lexical representations to facilitate retrieval (i.e., dependency linking) and mitigate similarity-based interference. In IWAs, these manipulations may help to reduce lexical processing deficits that can preclude successful encoding.

PMID:37934886 | DOI:10.1044/2023_JSLHR-23-00298

Categories: Literature Watch

Dimensions of equality in uptake of COVID-19 vaccination in Wales, UK: A multivariable linked data population analysis

Mon, 2023-11-06 06:00

Vaccine. 2023 Nov 30;41(49):7333-7341. doi: 10.1016/j.vaccine.2023.10.066. Epub 2023 Nov 4.

ABSTRACT

Vaccination has proven to be effective at preventing severe outcomes of COVID-19 infection, and uptake in the population has been high in Wales. However, there is a risk that high-level vaccination coverage statistics may mask hidden inequalities in under-served populations, many of whom may be at increased risk of severe outcomes of COVID-19 infection. The study population included 1,436,229 individuals aged 18 years and over, alive and resident in Wales as at 31st July 2022, and excluded immunosuppressed individuals and care home residents. We compared people who had received one or more vaccinations to those with no vaccination using linked data from nine datasets within the Secure Anonymised Information Linkage (SAIL) databank. Multivariable analysis was undertaken to determine the impact of a range of sociodemographic characteristics on vaccination uptake, including ethnicity, country of birth, severe mental illness, homelessness and substance use. We found that overall uptake of the first dose of COVID-19 vaccination was high in Wales (92.1%), with the highest uptake among those aged 80 years and over and females. Being aged under 40 years, household composition (aOR 0.38, 95% CI 0.35-0.41 for 10+ size household compared to two adult household) and being born outside the UK (aOR 0.44, 95% CI 0.43-0.46) had the strongest negative associations with vaccination uptake. This was followed by a history of substance misuse (aOR 0.45, 95% CI 0.44-0.46). Despite high-level population coverage in Wales, significant inequalities remain across several under-served groups. Factors associated with vaccination uptake should not be considered in isolation, to avoid drawing incorrect conclusions. Ensuring equitable access to vaccination is essential to protecting under-served groups from COVID-19 and further work needs to be done to address these gaps in coverage, with focus on tailored vaccination pathways and advocacy, using trusted partners and communities.

PMID:37932133 | DOI:10.1016/j.vaccine.2023.10.066

Categories: Literature Watch

CAENet: Contrast adaptively enhanced network for medical image segmentation based on a differentiable pooling function

Thu, 2023-11-02 06:00

Comput Biol Med. 2023 Dec;167:107578. doi: 10.1016/j.compbiomed.2023.107578. Epub 2023 Oct 17.

ABSTRACT

Pixel differences between classes with low contrast in medical image semantic segmentation tasks often lead to confusion in category classification, posing a typical challenge for recognition of small targets. To address this challenge, we propose a Contrastive Adaptive Augmented Semantic Segmentation Network with a differentiable pooling function. Firstly, an Adaptive Contrast Augmentation module is constructed to automatically extract local high-frequency information, thereby enhancing image details and accentuating the differences between classes. Subsequently, the Frequency-Efficient Channel Attention mechanism is designed to select useful features in the encoding phase, where multifrequency information is employed to extract channel features. One-dimensional convolutional cross-channel interactions are adopted to reduce model complexity. Finally, a differentiable approximation of max pooling is introduced in order to replace standard max pooling, strengthening the connectivity between neurons and reducing information loss caused by downsampling. We evaluated the effectiveness of our proposed method through several ablation experiments and comparison experiments under homogeneous conditions. The experimental results demonstrate that our method competes favorably with other state-of-the-art networks on five medical image datasets, including four public medical image datasets and one clinical image dataset. It can be effectively applied to medical image segmentation.
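
As a rough illustration of what a differentiable approximation of max pooling can look like, the sketch below uses the log-sum-exp smooth maximum. The abstract does not say which function CAENet uses, so this choice, the function names, and the `sharpness` parameter are assumptions. Unlike hard max pooling, where only the largest element in each window receives gradient, the gradient here is a softmax over the window, so every input contributes.

```python
import math

def smooth_max(values, sharpness=10.0):
    """Log-sum-exp approximation of max(values).

    Always >= the true max and approaches it as sharpness grows.
    """
    top = max(values)  # factor out the max for numerical stability
    total = sum(math.exp(sharpness * (v - top)) for v in values)
    return top + math.log(total) / sharpness

def smooth_max_grad(values, sharpness=10.0):
    """Gradient of smooth_max with respect to each input (a softmax)."""
    top = max(values)
    exps = [math.exp(sharpness * (v - top)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def smooth_max_pool_1d(signal, window=2, stride=2, sharpness=10.0):
    """Apply smooth_max over sliding windows of a 1-D signal."""
    return [
        smooth_max(signal[i:i + window], sharpness)
        for i in range(0, len(signal) - window + 1, stride)
    ]

signal = [0.1, 0.9, 0.4, 0.3]
pooled = smooth_max_pool_1d(signal)
grads = smooth_max_grad(signal[2:4])
print(pooled, grads)
```

Raising `sharpness` trades smoother gradients for a closer match to hard max pooling.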

PMID:37918260 | DOI:10.1016/j.compbiomed.2023.107578

Categories: Literature Watch

Shape Expressions (ShEx) Schemas for the FHIR R5 Specification

Thu, 2023-11-02 06:00

J Biomed Inform. 2023 Oct 31:104534. doi: 10.1016/j.jbi.2023.104534. Online ahead of print.

ABSTRACT

This work continues along a visionary path of using Semantic Web standards such as RDF and ShEx to make healthcare data easier to integrate for research and leading-edge patient care. The work extends the ability to use ShEx schemas to validate FHIR RDF data, thereby enhancing the semantic web ecosystem for working with FHIR and non-FHIR data using the same ShEx validation framework. It updates FHIR's ShEx schemas to fix outstanding issues and reflect changes in the definition of FHIR RDF. In addition, it experiments with expressing FHIRPath constraints (which are not captured in the XML or JSON schemas) in ShEx schemas. These extended ShEx schemas were incorporated into the FHIR R5 specification and used to successfully validate FHIR R5 examples that are included with the FHIR specification, revealing several errors in the examples.

PMID:37918622 | DOI:10.1016/j.jbi.2023.104534

Categories: Literature Watch

Data quality and patient characteristics in European ANCA-associated vasculitis registries: data retrieval by federated querying

Tue, 2023-10-31 06:00

Ann Rheum Dis. 2023 Oct 31:ard-2023-224571. doi: 10.1136/ard-2023-224571. Online ahead of print.

ABSTRACT

OBJECTIVES: This study aims to describe the data structure and harmonisation process, explore data quality and define characteristics, treatment, and outcomes of patients across six federated antineutrophil cytoplasmic antibody-associated vasculitis (AAV) registries.

METHODS: Through creation of the vasculitis-specific Findable, Accessible, Interoperable, Reusable, VASCulitis ontology, we harmonised the registries and enabled semantic interoperability. We assessed data quality across the domains of uniqueness, consistency, completeness and correctness. Aggregated data were retrieved using the semantic query language SPARQL Protocol and Resource Description Framework Query Language (SPARQL) and outcome rates were assessed through random effects meta-analysis.

RESULTS: A total of 5282 cases of AAV were identified. Uniqueness and data-type consistency were 100% across all assessed variables. Completeness and correctness ranged from 49% to 100% and from 60% to 100%, respectively. There were 2754 (52.1%) cases classified as granulomatosis with polyangiitis (GPA), 1580 (29.9%) as microscopic polyangiitis and 937 (17.7%) as eosinophilic GPA. The pattern of organ involvement included: lung in 3281 (65.1%), ear-nose-throat in 2860 (56.7%) and kidney in 2534 (50.2%). Intravenous cyclophosphamide was used as remission induction therapy in 982 (50.7%), rituximab in 505 (17.7%) and pulsed intravenous glucocorticoid use was highly variable (11%-91%). Overall mortality and incidence rates of end-stage kidney disease were 28.8 (95% CI 19.7 to 42.2) and 24.8 (95% CI 19.7 to 31.1) per 1000 patient-years, respectively.

CONCLUSIONS: In the largest reported AAV cohort study, we federated patient registries using semantic web technologies and highlighted concerns about data quality. The comparison of patient characteristics, treatment and outcomes was hampered by heterogeneous recruitment settings.

PMID:37907255 | DOI:10.1136/ard-2023-224571

Categories: Literature Watch

AIMS: An Automatic Semantic Machine Learning Microservice Framework to Support Biomedical and Bioengineering Research

Sat, 2023-10-28 06:00

Bioengineering (Basel). 2023 Sep 27;10(10):1134. doi: 10.3390/bioengineering10101134.

ABSTRACT

The fusion of machine learning and biomedical research offers novel ways to understand, diagnose, and treat various health conditions. However, the complexities of biomedical data, coupled with the intricate process of developing and deploying machine learning solutions, often pose significant challenges to researchers in these fields. Our pivotal achievement in this research is the introduction of the Automatic Semantic Machine Learning Microservice (AIMS) framework. AIMS addresses these challenges by automating various stages of the machine learning pipeline, with a particular emphasis on the ontology of machine learning services tailored to the biomedical domain. This ontology encompasses everything from task representation, service modeling, and knowledge acquisition to knowledge reasoning and the establishment of a self-supervised learning policy. Our framework has been crafted to prioritize model interpretability, integrate domain knowledge effortlessly, and handle biomedical data with efficiency. Additionally, AIMS boasts a distinctive feature: it leverages self-supervised knowledge learning through reinforcement learning techniques, paired with an ontology-based policy recording schema. This enables it to autonomously generate, fine-tune, and continually adapt to machine learning models, especially when faced with new tasks and data. Our work has two standout contributions demonstrating that machine learning processes in the biomedical domain can be automated, while integrating a rich domain knowledge base and providing a way for machines to have self-learning ability, ensuring they handle new tasks effectively. To showcase AIMS in action, we have highlighted its prowess in three case studies of biomedical tasks. These examples emphasize how our framework can simplify research routines, uplift the caliber of scientific exploration, and set the stage for notable advances.

PMID:37892864 | DOI:10.3390/bioengineering10101134

Categories: Literature Watch

Models and Approaches for Comprehension of Dysarthric Speech Using Natural Language Processing: Systematic Review

Fri, 2023-10-27 06:00

JMIR Rehabil Assist Technol. 2023 Oct 27;10:e44489. doi: 10.2196/44489.

ABSTRACT

BACKGROUND: Speech intelligibility and speech comprehension for dysarthric speech have attracted much attention recently. Dysarthria is characterized by irregularities in the speed, strength, pitch, breath control, range, steadiness, and accuracy of muscle movements required for articulatory aspects of speech production.

OBJECTIVE: This study examined the contributions made by other studies involved in dysarthric speech comprehension. We focused on the modes of meaning extraction used in generalizing speaker-listener underpinnings in light of semantic ontology extraction as a desired technique, applied method types, speech representations used, and databases sourced from.

METHODS: This study involved a systematic literature review using 7 electronic databases: Cochrane Database of Systematic Reviews, Web of Science Core Collection, Scopus, PubMed, ACM, IEEE Xplore, and Google Scholar. The main eligibility criterion was the extraction of meaning from dysarthric speech using natural language processing or understanding approaches to improve dysarthric speech comprehension. In total, out of 834 search results, 30 studies that matched the eligibility requirements were acquired following screening by 2 independent reviewers, with a lack of consensus being resolved through joint discussion or consultation with a third party. To evaluate the studies' methodological quality, the risk of bias assessment was based on the Cochrane risk-of-bias tool version 2 (RoB2), with 23 of the studies (77%) registering low risk of bias and 7 studies (23%) raising some concern over the risk of bias. The overall quality assessment of the study was done using TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis).

RESULTS: Following a review of 30 primary studies, this study revealed that the reviewed studies focused on natural language understanding or clinical approaches, with an increase in proposed solutions from 2020 onwards. Most studies relied on speaker-dependent speech features, while others used speech patterns, semantic knowledge, or hybrid approaches. The prevalent use of vector representation aligned with natural language understanding models, while Mel-frequency cepstral coefficient representation and no representation approaches were applied in neural networks. Hybrid representation studies aimed to reconstruct dysarthric speech or improve comprehension. Comprehensive databases, like TORGO and UA-Speech, were commonly used in combination with other curated databases, while primary data was preferred for specific or unique research objectives.

CONCLUSIONS: We found significant gaps in dysarthric speech comprehension characterized by the lack of inclusion of important listener or speech-independent features in the speech representations, mode of extraction, and data sources used. Further research is therefore proposed on formulating models that accommodate listener and speech-independent features through semantic ontologies, so that these key features can inform meaning extraction from dysarthric speech.

PMID:37889538 | DOI:10.2196/44489

Categories: Literature Watch

Chemical Species Ontology for Data Integration and Knowledge Discovery

Thu, 2023-10-26 06:00

J Chem Inf Model. 2023 Oct 26. doi: 10.1021/acs.jcim.3c00820. Online ahead of print.

ABSTRACT

Web ontologies are important tools in modern scientific research because they provide a standardized way to represent and manage web-scale amounts of complex data. In chemistry, a semantic database for chemical species is indispensable for its ability to interrelate and infer relationships, enabling a more precise analysis and prediction of chemical behavior. This paper presents OntoSpecies, a web ontology designed to represent chemical species and their properties. The ontology serves as a core component of The World Avatar knowledge graph chemistry domain and includes a wide range of identifiers, chemical and physical properties, chemical classifications and applications, and spectral information associated with each species. The ontology includes provenance and attribution metadata, ensuring the reliability and traceability of data. Most of the information about chemical species is sourced from PubChem and ChEBI data on the respective compound Web pages using a software agent, making OntoSpecies a comprehensive semantic database of chemical species able to solve novel types of problems in the field. Access to this reliable source of chemical data is provided through a SPARQL endpoint. The paper presents example use cases to demonstrate the contribution of OntoSpecies in solving complex tasks that require integrated semantically searchable chemical data. The approach presented in this paper represents a significant advancement in the field of chemical data management, offering a powerful tool for representing, navigating, and analyzing chemical information to support scientific research.

PMID:37883649 | DOI:10.1021/acs.jcim.3c00820

Categories: Literature Watch
