Semantic Web

Automating the extraction of information from a historical text and building a linked data model for the domain of ecology and conservation science

Thu, 2022-10-20 06:00

Heliyon. 2022 Oct 4;8(10):e10710. doi: 10.1016/j.heliyon.2022.e10710. eCollection 2022 Oct.

ABSTRACT

Data heterogeneity is a pressing issue and is further compounded if we have to deal with data from textual documents. The unstructured nature of such documents implies that collating, comparing and analysing the information contained therein can be a challenging task. Automating these processes can help to unlock insightful knowledge that otherwise remains buried in them. Moreover, integrating the information extracted from the documents with other related information can help to make more information-rich queries. In this context, the paper presents a comprehensive review of text extraction and data integration techniques to enable this automation process in an ecological context. The paper investigates the extraction of valuable floristic information from a historical Botany journal. The purpose behind this extraction is to bring to light relevant pieces of information contained within the document. In addition, the paper also explores the need to integrate the extracted information with other related information from disparate sources. All the information is then rendered into a queryable form in order to make unified queries. To achieve this, the paper makes use of a combination of Machine Learning, Natural Language Processing and Semantic Web techniques. The proposed approach is demonstrated through the information extracted from the journal and the information-rich queries made through the integration process. The paper shows that the approach has merit in extracting relevant information from the journal, discusses how the machine learning models have been designed to classify complex information and also gives a measure of their performance. The paper also shows that the approach has merit in terms of query time when querying floristic information from a multi-source linked data model.

PMID:36262290 | PMC:PMC9573881 | DOI:10.1016/j.heliyon.2022.e10710

Categories: Literature Watch

Using logical constraints to validate statistical information about disease outbreaks in collaborative knowledge graphs: the case of COVID-19 epidemiology in Wikidata

Thu, 2022-10-20 06:00

PeerJ Comput Sci. 2022 Sep 29;8:e1085. doi: 10.7717/peerj-cs.1085. eCollection 2022.

ABSTRACT

Urgent global research demands real-time dissemination of precise data. Wikidata, a collaborative and openly licensed knowledge graph available in RDF format, provides an ideal forum for exchanging structured data that can be verified and consolidated using validation schemas and bot edits. In this research article, we catalog an automatable task set necessary to assess and validate the portion of Wikidata relating to the COVID-19 epidemiology. These tasks assess statistical data and are implemented in SPARQL, a query language for semantic databases. We demonstrate the efficiency of our methods for evaluating structured non-relational information on COVID-19 in Wikidata, and its applicability in collaborative ontologies and knowledge graphs more broadly. We show the advantages and limitations of our proposed approach by comparing it to the features of other methods for the validation of linked web data as revealed by previous research.
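The idea of a logical constraint on outbreak statistics can be illustrated with a minimal sketch. The paper implements its checks in SPARQL against Wikidata; the Python below is only an analogy, and the record structure, field names and the three constraints are assumptions chosen for illustration, not the authors' task set.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class OutbreakRecord:
    """Hypothetical summary of one region's outbreak statistics."""
    region: str
    cases: int
    deaths: int
    recoveries: int


def constraint_violations(record: OutbreakRecord) -> List[str]:
    """Return the (assumed) logical constraints that a record breaks."""
    problems = []
    if min(record.cases, record.deaths, record.recoveries) < 0:
        problems.append("negative count")
    if record.deaths > record.cases:
        problems.append("deaths exceed cases")
    if record.deaths + record.recoveries > record.cases:
        problems.append("deaths plus recoveries exceed cases")
    return problems


# Invented example data, not taken from Wikidata.
records = [
    OutbreakRecord("Region A", cases=1000, deaths=20, recoveries=900),
    OutbreakRecord("Region B", cases=50, deaths=60, recoveries=0),
]

# Keep only the regions with at least one violation.
flagged: Dict[str, List[str]] = {}
for rec in records:
    issues = constraint_violations(rec)
    if issues:
        flagged[rec.region] = issues
```

Here only "Region B" is flagged, because its death count is larger than its case count.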

PMID:36262159 | PMC:PMC9575845 | DOI:10.7717/peerj-cs.1085

Categories: Literature Watch

Divergent semantic integration (DSI): Extracting creativity from narratives with distributional semantic modeling

Mon, 2022-10-17 06:00

Behav Res Methods. 2022 Oct 17. doi: 10.3758/s13428-022-01986-2. Online ahead of print.

ABSTRACT

We developed a novel conceptualization of one component of creativity in narratives by integrating creativity theory and distributional semantics theory. We termed the new construct divergent semantic integration (DSI), defined as the extent to which a narrative connects divergent ideas. Across nine studies, 27 different narrative prompts, and over 3500 short narratives, we compared six models of DSI that varied in their computational architecture. The best-performing model employed Bidirectional Encoder Representations from Transformers (BERT), which generates context-dependent numerical representations of words (i.e., embeddings). BERT DSI scores demonstrated impressive predictive power, explaining up to 72% of the variance in human creativity ratings, even approaching human inter-rater reliability for some tasks. BERT DSI scores showed equivalently high predictive power for expert and nonexpert human ratings of creativity in narratives. Critically, DSI scores generalized across ethnicity and English language proficiency, including individuals identifying as Hispanic and L2 English speakers. The integration of creativity and distributional semantics theory has substantial potential to generate novel hypotheses about creativity and novel operationalizations of its underlying processes and components. To facilitate new discoveries across diverse disciplines, we provide a tutorial with code (osf.io/ath2s) on how to compute DSI and a web app (osf.io/ath2s) to freely retrieve DSI scores.
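The abstract does not state the DSI formula. As a rough sketch of how "connecting divergent ideas" might be scored, the Python below assumes a mean pairwise cosine distance between word vectors. That operationalization, the function names and the toy two-dimensional vectors (standing in for BERT embeddings) are all assumptions; the authors' tutorial is the authoritative source.

```python
import math
from itertools import combinations
from typing import List, Sequence


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """1 minus cosine similarity; 0 for parallel vectors, 1 for orthogonal ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_pairwise_distance(vectors: List[Sequence[float]]) -> float:
    """Average cosine distance over all unordered pairs of word vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two vectors")
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Toy vectors: the first and third "words" are identical in meaning,
# the second is unrelated (orthogonal) to both.
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
score = mean_pairwise_distance(toy_vectors)
```

Under this assumption, a narrative whose words all point in the same direction scores 0, and the score rises as the words become semantically more distant from one another.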

PMID:36253596 | DOI:10.3758/s13428-022-01986-2

Categories: Literature Watch

Semantic Sentiment Classification for COVID-19 Tweets Using Universal Sentence Encoder

Mon, 2022-10-17 06:00

Comput Intell Neurosci. 2022 Oct 5;2022:6354543. doi: 10.1155/2022/6354543. eCollection 2022.

ABSTRACT

The spread of data on the web has increased in the last twenty years. One of the reasons is the appearance of social media. The data on social sites describe many real-life events in our daily lives. In the period of the COVID-19 pandemic, a lot of people and media organizations were writing and documenting their health status and the latest news about the coronavirus on social media. Using these tweets (sentiments) about the coronavirus and analyzing them in a computational model can help decision makers in measuring public opinion and yielding remarkable findings. In this research article, we introduce a deep learning sentiment analysis model based on Universal Sentence Encoder. The dataset used in this research was collected from Twitter, and it was classified as positive, neutral, and negative. The sentence embedding model determines the meaning of word sequences instead of individual words. The model divides the dataset into training and testing and depends on the sentence similarity in detecting sentiment class. The obtained accuracy results reached 78.062%, and this result outperforms many traditional ML classifiers based on TF-IDF applied on the same dataset and another model based on the CNN classifier.

PMID:36248924 | PMC:PMC9556213 | DOI:10.1155/2022/6354543

Categories: Literature Watch

Efficient recognition of dynamic user emotions based on deep neural networks

Mon, 2022-10-17 06:00

Front Neurorobot. 2022 Sep 29;16:1006755. doi: 10.3389/fnbot.2022.1006755. eCollection 2022.

ABSTRACT

The key issue at this stage is how to mine the large amount of valuable user sentiment information from the massive amount of web text and create a suitable dynamic user text sentiment analysis technique. Hence, this study offers a text feature extraction method based on ON-LSTM and an attention mechanism to address the problem that syntactic information is ignored in emotional text feature extraction. The study found that the Att-ON-LSTM improved the micro-average F1 value by 2.27% and the macro-average F1 value by 1.7% compared to the Bi-LSTM model with an added attention mechanism. This demonstrates that it can better extract semantic information and hierarchical structure information from emotional text and obtain more comprehensive emotional text features. In addition, the ON-LSTM-LS, a sentiment analysis model based on ON-LSTM and label semantics, is proposed to address the problem that label semantics are ignored in the process of text sentiment analysis. The experimental results showed that the accuracy of the ON-LSTM-LS model on the test set improved by 0.78% with the addition of label word vectors, compared to the Att-ON-LSTM model without label semantic information. The macro-averaged F1 value improved by 1.04%, which indicates that the sentiment analysis process based on ON-LSTM and label semantics can effectively perform the text sentiment analysis task and improve the sentiment classification effect to some extent. In conclusion, deep learning models for dynamic user sentiment analysis have strong application potential.
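Because the results above are reported as micro- and macro-averaged F1, a small sketch of how the two averages differ may help. This uses the standard definitions for single-label classification; the emotion labels and predictions are invented and have nothing to do with the study's data.

```python
from collections import Counter
from typing import List, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true: List[str], y_pred: List[str]) -> Tuple[float, float]:
    """Return (micro-averaged F1, macro-averaged F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    # Macro: average the per-class F1 scores, every class weighted equally.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    # Micro: pool the counts over all classes, then compute a single F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, macro


# Invented labels for illustration only.
y_true = ["joy", "joy", "joy", "anger"]
y_pred = ["joy", "joy", "anger", "anger"]
micro, macro = micro_macro_f1(y_true, y_pred)
```

In this toy case the micro average is 0.75 while the macro average is lower (about 0.733), because the rarer class "anger" has the weaker per-class score and counts equally in the macro average.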

PMID:36247360 | PMC:PMC9559588 | DOI:10.3389/fnbot.2022.1006755

Categories: Literature Watch

MHADTI: predicting drug-target interactions via multiview heterogeneous information network embedding with hierarchical attention mechanisms

Sat, 2022-10-15 06:00

Brief Bioinform. 2022 Oct 14:bbac434. doi: 10.1093/bib/bbac434. Online ahead of print.

ABSTRACT

MOTIVATION: Discovering drug-target interactions (DTIs) is a crucial step in drug development, for example in the identification of drug side effects and in drug repositioning. Since identifying DTIs by wet-lab biological experiments is time-consuming and costly, many computational approaches have been proposed and have become an efficient way to infer potential interactions. Although extensive effort has been invested in this task, the prediction accuracy still needs to be improved. In particular, heterogeneous network-based approaches do not fully consider the complex structure and rich semantic information in these heterogeneous networks. Therefore, it is still a challenge to predict DTIs efficiently.

RESULTS: In this study, we develop a novel method via Multiview heterogeneous information network embedding with Hierarchical Attention mechanisms to discover potential Drug-Target Interactions (MHADTI). Firstly, MHADTI constructs different similarity networks for drugs and targets by utilizing their multisource information. Combined with the known DTI network, three drug-target heterogeneous information networks (HINs) with different views are established. Secondly, MHADTI learns embeddings of drugs and targets from multiview HINs with hierarchical attention mechanisms, which include node-level, semantic-level and graph-level attention. Lastly, MHADTI employs a multilayer perceptron to predict DTIs with the learned deep feature representations. The hierarchical attention mechanisms can fully consider the importance of nodes, meta-paths and graphs in learning the feature representations of drugs and targets, which makes their embeddings more comprehensive. Extensive experimental results demonstrate that MHADTI performs better than other state-of-the-art (SOTA) prediction models. Moreover, analysis of prediction results for some drugs and targets of interest further indicates that MHADTI has advantages in discovering DTIs.

AVAILABILITY AND IMPLEMENTATION: https://github.com/pxystudy/MHADTI.

PMID:36242566 | DOI:10.1093/bib/bbac434

Categories: Literature Watch

COVID-19 Rapid Antigen Tests: Bibliometric Analysis of the Scientific Literature

Fri, 2022-10-14 06:00

Int J Environ Res Public Health. 2022 Sep 30;19(19):12493. doi: 10.3390/ijerph191912493.

ABSTRACT

As the COVID-19 pandemic continues to disrupt health systems worldwide, conducting Rapid Antigen Testing (RAT) at specified intervals has become an essential part of many people's lives around the world. We identified and analyzed the academic literature on COVID-19 RAT. The Web of Science electronic database was queried on 6 July 2022 to find relevant publications. Publication and citation data were retrieved directly from the database. VOSviewer, a bibliometric software tool, was then used to relate these data to the semantic content from the titles, abstracts, and keywords. The analysis was based on data from 1000 publications. The most productive authors were from Japan and the United States, led by Dr. Koji Nakamura from Japan (n = 10, 1.0%). The most academically productive countries were in North America, Europe and Asia, led by the United States of America (n = 266, 26.6%). Sensitivity (n = 32, 3.2%) and specificity (n = 23, 2.3%) were among the most frequently recurring author keywords. Regarding sampling methods, "saliva" (n = 54, 5.4%) was mentioned more frequently than "nasal swab" (n = 32, 3.2%) and "nasopharyngeal swab" (n = 22, 2.2%). Recurring scenarios that required RAT were identified: emergency department, healthcare worker, mass screening, airport, traveler, and workplace. Our bibliometric analysis revealed that COVID-19 RAT has been utilized in a range of studies. RAT results were cross-checked with RT-PCR tests for sensitivity and specificity. These results are consistent with comparable exchanges of methods, results or discussions among laboratorians, authors, institutions and publishers in the involved countries of the world.

PMID:36231789 | DOI:10.3390/ijerph191912493

Categories: Literature Watch

Use of linked data to assess the impact of including out-of-hospital deaths on 30-day in-hospital mortality indicators: a retrospective cohort study

Tue, 2022-10-11 06:00

CMAJ Open. 2022 Oct 11;10(4):E882-E888. doi: 10.9778/cmajo.20210264. Print 2022 Sep-Oct.

ABSTRACT

BACKGROUND: The Canadian Institute for Health Information (CIHI) annually reports on health system performance indicators, including various 30-day in-hospital mortality rates. We aimed to assess the impact of including out-of-hospital deaths on 3 CIHI indicators: 30-day acute myocardial infarction (AMI) in-hospital mortality, 30-day stroke in-hospital mortality and hospital deaths following major surgery.

METHODS: We followed national cohorts of patients admitted to hospital in 1 of 9 Canadian provinces for AMI, stroke and major surgery for 30-day all-cause mortality in 2 fiscal years (2011/12 and 2016/17). We calculated descriptive statistics to characterize the cohorts. The CIHI Discharge Abstract Database was linked with the Canadian Vital Statistics Death Database using a probabilistic algorithm to identify out-of-hospital deaths. We calculated absolute numbers, relative proportions and 30-day mortality rates for in-hospital, out-of-hospital and all deaths. We compared results between fiscal years.

RESULTS: We found that hospital admissions increased between fiscal years for each indicator; however, cohort characteristics remained consistent. In 2016/17, the number of out-of-hospital deaths that occurred was 325 for AMI, 545 for stroke and 820 for major surgery. The relative proportions of out-of-hospital deaths ranged from 12.3% for AMI to 14.9% for major surgery in 2016/17 (an increase from 10.6% and 13.1%, respectively, from 2011/12). In-hospital mortality rates improved over time for all 3 indicators, while out-of-hospital mortality rates remained consistent between fiscal years at 0.8% for AMI, 1.9%-2.0% for stroke and 0.2%-0.3% for major surgery.

INTERPRETATION: Improvements between fiscal years were attributable to reductions in in-hospital mortality, rather than deaths occurring outside of hospitals. Trends over time were the same for each indicator irrespective of whether in-hospital mortality or all deaths were measured.

PMID:36220181 | DOI:10.9778/cmajo.20210264

Categories: Literature Watch

ViSpa (Vision Spaces): A computer-vision-based representation system for individual images and concept prototypes, with large-scale evaluation

Thu, 2022-10-06 06:00

Psychol Rev. 2022 Oct 6. doi: 10.1037/rev0000392. Online ahead of print.

ABSTRACT

Quantitative, data-driven models for mental representations have long enjoyed popularity and success in psychology (e.g., distributional semantic models in the language domain), but have largely been missing for the visual domain. To overcome this, we present ViSpa (Vision Spaces), high-dimensional vector spaces that include vision-based representation for naturalistic images as well as concept prototypes. These vectors are derived directly from visual stimuli through a deep convolutional neural network trained to classify images and allow us to compute vision-based similarity scores between any pair of images and/or concept prototypes. We successfully evaluate these similarities against human behavioral data in a series of large-scale studies, including off-line judgments, namely visual similarity judgments for the referents of word pairs (Study 1) and for image pairs (Study 2) and typicality judgments for images given a label (Study 3), as well as online processing times and error rates in a discrimination task (Study 4) and a priming task (Study 5) with naturalistic image material. ViSpa similarities predict behavioral data across all tasks, which renders ViSpa a theoretically appealing model for vision-based representations and a valuable research tool for data analysis and the construction of experimental material: ViSpa allows for precise control over experimental material consisting of images and/or words denoting imageable concepts and introduces a specifically vision-based similarity for word pairs. To make ViSpa available to a wide audience, this article (a) includes (video) tutorials on how to use ViSpa in R and (b) presents a user-friendly web interface at http://vispa.fritzguenther.de. (PsycInfo Database Record (c) 2022 APA, all rights reserved).

PMID:36201829 | DOI:10.1037/rev0000392

Categories: Literature Watch

EAGS: An extracting auxiliary knowledge graph model in multi-turn dialogue generation

Wed, 2022-10-05 06:00

World Wide Web. 2022 Sep 30:1-22. doi: 10.1007/s11280-022-01100-8. Online ahead of print.

ABSTRACT

Multi-turn dialogue generation is an essential and challenging subtask of text generation in question answering systems. Existing methods focus on extracting latent topic-level relevance or utilizing relevant external background knowledge. However, they tend to ignore the fact that relying too much on latent aspects loses subjective key information. Furthermore, there is often little relevant external knowledge that can be used for reference, and rarely a graph that has complete entity links. A dependency tree is a special structure that can be extracted from sentences and that covers the explicit key information of those sentences. Therefore, in this paper, we propose the EAGS model, which combines the subjective pivotal information from the explicit dependency tree with implicit sentence-level semantic information. The EAGS model is a knowledge-graph-enabled multi-turn dialogue generation model that does not need extra external knowledge. It can not only extract and build a dependency knowledge graph from existing sentences, but also enhance the node representation, which is shared with the Bi-GRU word embedding at each time step at the node semantic level. We store the domain-specific subgraphs built by EAGS, which can be retrieved as an external knowledge graph in future multi-turn dialogue generation tasks. We design a multi-task training approach to enhance semantic and structural local feature extraction and to balance it with the global features. Finally, we conduct experiments on the large-scale English Ubuntu multi-turn dialogue community dataset and the English Daily dialogue dataset. Experimental results show that our EAGS model performs well on both automatic evaluation and human evaluation compared with the existing baseline models.

PMID:36196376 | PMC:PMC9523637 | DOI:10.1007/s11280-022-01100-8

Categories: Literature Watch

Net activism and whistleblowing on YouTube: a text mining analysis

Tue, 2022-10-04 06:00

Multimed Tools Appl. 2022 Sep 29:1-21. doi: 10.1007/s11042-022-13777-0. Online ahead of print.

ABSTRACT

Social media is more and more dominant in everyday life for people around the world. YouTube content is a resource that may be useful, in social computational science, for understanding key questions about society. Using this resource, we performed web scraping to create a dataset of 644,575 video transcriptions concerning net activism and whistleblowing. We automatically performed linguistic feature extraction to capture a representation of each video using its title, description and transcription (downloaded metadata). The next step was to clean the dataset using automatic clustering with linguistic representation to identify unmatched videos and noisy keywords. Using these keywords to exclude videos, we finally obtained a dataset that was reduced by 95%, i.e., it contained 35,730 video transcriptions. Then, we again automatically clustered the videos using a lexical representation and split the dataset into subsets, leading to hundreds of clusters that we interpreted manually to identify a hierarchy of topics of interest concerning whistleblowing. We used the dataset to learn a lexical representation for a specific topic and to detect unknown whistleblowing videos for this topic; the accuracy of this detection is 57.4%. We also used the dataset to identify interesting context linguistic markers around the names of whistleblowers. From a given list of names, we automatically extracted all 5-gram word sequences from the dataset and identified interesting markers in the left and right contexts for each name by manual interpretation. The results of our study are the following: a dataset (raw and cleaned collections) concerning whistleblowing, a hierarchy of topics about whistleblowing, the automatic prediction of whistleblowing and the semi-automatic semantic analysis of markers around whistleblower names. This text mining analysis can be exploited for digital sociology and e-democracy studies.
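The step of collecting 5-gram word sequences around names can be sketched as follows. This is one plausible reading of that step, not the authors' pipeline: the tokenizer, the function names and the placeholder name "Jane Doe" are assumptions, and the sample sentence is invented.

```python
import re
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split into simple word tokens (assumed tokenization)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens: List[str], n: int = 5) -> List[Tuple[str, ...]]:
    """All contiguous word n-grams; empty if the text is shorter than n."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngrams_with_name(text: str, name: str, n: int = 5) -> List[Tuple[str, ...]]:
    """Keep only the n-grams that contain the full (possibly multi-word) name."""
    name_tokens = tuple(tokenize(name))
    k = len(name_tokens)
    hits = []
    for gram in ngrams(tokenize(text), n):
        if any(gram[i:i + k] == name_tokens for i in range(n - k + 1)):
            hits.append(gram)
    return hits


# Invented transcript fragment with a placeholder name.
transcript = "The documents leaked by Jane Doe were published online last year"
hits = ngrams_with_name(transcript, "Jane Doe")
```

Each returned 5-gram places the name at a different offset, so the words to its left ("leaked by") and to its right ("were published") can then be inspected manually as candidate context markers.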

PMID:36193288 | PMC:PMC9520105 | DOI:10.1007/s11042-022-13777-0

Categories: Literature Watch

An examination of the association between marital status and prenatal mental disorders using linked health administrative data

Sat, 2022-10-01 06:00

BMC Pregnancy Childbirth. 2022 Oct 1;22(1):735. doi: 10.1186/s12884-022-05045-8.

ABSTRACT

BACKGROUND: International research shows marital status impacts the mental health of pregnant women, with prenatal depression and anxiety being higher among non-partnered women. However, there have been few studies examining the relationship between marital status and prenatal mental disorders among Australian women.

METHODS: This is a population-based retrospective cohort study using linked data from the New South Wales (NSW) Perinatal Data Collection (PDC) and Admitted Patients Data Collection (APDC). The cohort consists of a total of 598,599 pregnant women with 865,349 admissions. Identification of pregnant women for mental disorders was conducted using the 10th version International Classification of Diseases and Related Health Problems, Australian Modification (ICD-10-AM). A binary logistic regression model was used to estimate the relationship between marital status and prenatal mental disorder after adjusting for confounders.

RESULTS: Of the included pregnant women, 241 (0.04%), 107 (0.02%) and 4359 (0.5%) were diagnosed with depressive disorder, anxiety disorder, and self-harm, respectively. Non-partnered pregnant women had a higher likelihood of depressive disorder (Adjusted Odds Ratio (AOR) = 2.75; 95% CI: 2.04, 3.70) and anxiety disorder (AOR = 3.16, 95% CI: 2.03, 4.91), compared with partnered women. Furthermore, the likelihood of experiencing self-harm was two times higher among non-partnered pregnant women (AOR = 2.00; 95% CI: 1.82, 2.20) than partnered pregnant women.

CONCLUSIONS: Non-partnered marital status has a significant positive association with prenatal depressive disorder, anxiety disorder and self-harm. This suggests it would be highly beneficial for maternal health care professionals to screen non-partnered pregnant women for prenatal mental health problems such as depression, anxiety and self-harm.

PMID:36182904 | PMC:PMC9526285 | DOI:10.1186/s12884-022-05045-8

Categories: Literature Watch

RTX-KG2: a system for building a semantically standardized knowledge graph for translational biomedicine

Thu, 2022-09-29 06:00

BMC Bioinformatics. 2022 Sep 29;23(1):400. doi: 10.1186/s12859-022-04932-3.

ABSTRACT

BACKGROUND: Biomedical translational science is increasingly using computational reasoning on repositories of structured knowledge (such as UMLS, SemMedDB, ChEMBL, Reactome, DrugBank, and SMPDB) in order to facilitate discovery of new therapeutic targets and modalities. The NCATS Biomedical Data Translator project is working to federate autonomous reasoning agents and knowledge providers within a distributed system for answering translational questions. Within that project and the broader field, there is a need for a framework that can efficiently and reproducibly build an integrated, standards-compliant, and comprehensive biomedical knowledge graph that can be downloaded in standard serialized form or queried via a public application programming interface (API).

RESULTS: To create a knowledge provider system within the Translator project, we have developed RTX-KG2, an open-source software system for building (and hosting a web API for querying) a biomedical knowledge graph that uses an Extract-Transform-Load approach to integrate 70 knowledge sources (including the aforementioned core six sources) into a knowledge graph with provenance information including (where available) citations. The semantic layer and schema for RTX-KG2 follow the standard Biolink model to maximize interoperability. RTX-KG2 is currently being used by multiple Translator reasoning agents, both in its downloadable form and via its SmartAPI-registered interface. Serializations of RTX-KG2 are available for download in both the pre-canonicalized form and in canonicalized form (in which synonyms are merged). The current canonicalized version (KG2.7.3) of RTX-KG2 contains 6.4M nodes and 39.3M edges with a hierarchy of 77 relationship types from Biolink.

CONCLUSION: RTX-KG2 is the first knowledge graph that integrates UMLS, SemMedDB, ChEMBL, DrugBank, Reactome, SMPDB, and 64 additional knowledge sources within a knowledge graph that conforms to the Biolink standard for its semantic layer and schema. RTX-KG2 is publicly available for querying via its API at arax.rtx.ai/api/rtxkg2/v1.2/openapi.json. The code to build RTX-KG2 is publicly available at github:RTXteam/RTX-KG2.

PMID:36175836 | DOI:10.1186/s12859-022-04932-3

Categories: Literature Watch

Unlocking Potential within Health Systems Using Privacy-Preserving Record Linkage: Exploring Chronic Kidney Disease Outcomes through Linked Data Modelling

Wed, 2022-09-28 06:00

Appl Clin Inform. 2022 Aug;13(4):901-909. doi: 10.1055/s-0042-1757174. Epub 2022 Sep 28.

ABSTRACT

BACKGROUND: Chronic kidney disease (CKD) is a major global health problem that affects approximately one in 10 adults. Up to 90% of individuals with CKD go undetected until its progression to advanced stages, invariably leading to death in the absence of treatment. The project aims to fill information gaps around the burden of CKD in the Western Australian (WA) population, including incidence, prevalence, rate of progression, and economic cost to the health system.

METHODS: Given the sensitivity of the information involved, the project employed a privacy-preserving record linkage methodology to link data from four major pathology providers in WA to hospital records, to establish a CKD registry with a continuous medical record for individuals with biochemical specification for CKD. This method uses encrypted personal identifying information in a probability-based linkage framework (Bloom filters) to help mitigate risk while maximizing linkage quality.
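The Bloom-filter approach named in the methods can be illustrated with a minimal sketch. The abstract gives no parameters, so everything specific here is an assumption: the character-bigram tokenization, the filter size, the number of hash functions, the keyed SHA-256 hashing, the placeholder secret and the Dice coefficient used for comparison (a common choice in the record linkage literature, not confirmed for this project).

```python
import hashlib
from typing import Set


def bigrams(value: str) -> Set[str]:
    """Character bigrams of a normalized, padded string."""
    padded = f"_{value.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value: str, secret: str, size: int = 256, num_hashes: int = 4) -> Set[int]:
    """Return the set of bit positions switched on for this value.

    Each bigram is hashed num_hashes times with a shared secret, so parties
    holding the same secret produce comparable encodings without exchanging
    the plain-text identifier.
    """
    bits = set()
    for gram in bigrams(value):
        for k in range(num_hashes):
            digest = hashlib.sha256(f"{secret}|{k}|{gram}".encode("utf-8")).digest()
            bits.add(int.from_bytes(digest[:8], "big") % size)
    return bits


def dice(a: Set[int], b: Set[int]) -> float:
    """Dice coefficient between two encodings (1.0 means identical bit sets)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


SECRET = "shared-linkage-key"  # hypothetical placeholder, not a real key
same = dice(bloom_encode("Katherine", SECRET), bloom_encode("katherine ", SECRET))
close = dice(bloom_encode("katherine", SECRET), bloom_encode("catherine", SECRET))
far = dice(bloom_encode("katherine", SECRET), bloom_encode("li", SECRET))
```

Because similar strings share most of their bigrams, their encodings share most of their bits, which is what allows approximate matching on encoded identifiers; a probabilistic linkage framework would then combine such similarities across several fields.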

RESULTS: The project developed interoperable technology to create a transparent CKD data catalogue which is linkable to other datasets. This technology has been designed to support the aspirations of the research program to provide linked de-identified pathology, morbidity, and mortality data that can be used to derive insights to enable better CKD patient outcomes. The cohort includes over 1 million individuals with creatinine results over the period 2002 to 2021.

CONCLUSION: Using linked data from across the care continuum, researchers are able to evaluate the effectiveness of service delivery and provide evidence for policy and program development. The CKD registry will enable an innovative review of the epidemiology of CKD in WA. Linking pathology records can identify cases of CKD that are missed in the early stages due to disaggregation of results, enabling identification of at-risk populations that represent targets for early intervention and management.

PMID:36170880 | DOI:10.1055/s-0042-1757174

Categories: Literature Watch

Chronic diseases and compliance with provincial guidelines for outpatient antibiotic prescription in cases of otitis media and respiratory infections: a population-based study of linked data in Quebec, Canada, 2010-2017

Tue, 2022-09-27 06:00

CMAJ Open. 2022 Sep 27;10(3):E841-E847. doi: 10.9778/cmajo.20210257. Print 2022 Sep-Oct.

ABSTRACT

BACKGROUND: In Quebec, antibiotic use is higher among outpatients with chronic diseases. We sought to measure compliance with provincial guidelines for the treatment of otitis media and common respiratory infections, and to measure variations in compliance according to the presence of certain chronic diseases.

METHODS: We conducted a population-based study of linked data on antibiotic dispensing covered by the public drug insurance plan between April 2010 and March 2017. We included patients who had consulted a primary care physician within 2 days before being dispensed an antibiotic for an infection targeted by provincial guidelines, including bronchitis in patients with chronic obstructive pulmonary disease, otitis media, pharyngitis, pneumonia and sinusitis. We computed proportions of prescriptions compliant with guidelines (use of recommended antibiotic for children, and use of recommended antibiotic and dosage for adults) by age group (children or adults) and chronic disease (respiratory, cardiovascular, diabetes, mental disorder or none). We measured the impact of chronic diseases on compliance using robust Poisson regression.

RESULTS: We analyzed between 14 677 and 198 902 prescriptions for each infection under study. Compliance was greater than 87% among children, but was lower among children with asthma (proportion ratios between 0.97 and 1.00). In adults, the chosen antibiotic was compliant for at least 73% of prescriptions, except for pharyngitis (≤ 61%). Accounting for dosage lowered compliance to between 31% and 61%. Compliance was lower in the presence of chronic diseases (proportion ratios between 0.94 and 0.98).

INTERPRETATION: It is possible that prescribing noncompliant prescriptions was sometimes appropriate, but the high frequency of noncompliance suggests room for improvement. Given that variations associated with chronic diseases were small, disease-specific guidelines for antibiotic prescriptions are likely to have a limited impact on compliance.

PMID:36167419 | DOI:10.9778/cmajo.20210257

Categories: Literature Watch

Availability and readiness of healthcare facilities and their effects on long-acting modern contraceptive use in Bangladesh: analysis of linked data

Wed, 2022-09-21 06:00

BMC Health Serv Res. 2022 Sep 21;22(1):1180. doi: 10.1186/s12913-022-08565-3.

ABSTRACT

AIM: Increasing access to long-acting modern contraceptives (LAMC) is one of the key factors in preventing unintended pregnancy and protecting women's health rights. However, the availability and accessibility of health facilities and their impacts on LAMC utilisation (implant, intrauterine devices, sterilisation) in low- and middle-income countries is an understudied topic. This study aimed to examine the association between the availability and readiness of health facilities and the use of LAMC in Bangladesh.

METHODS: In this survey study, we linked the 2017/18 Bangladesh Demographic and Health Survey data with the 2017 Bangladesh Health Facility Survey data using the administrative-boundary linkage method. Mixed-effect multilevel logistic regressions were conducted. The sample comprised 10,938 married women aged 15-49 years who were fertile but did not desire a child within 2 years of the survey date. The outcome variable was the current use of LAMC (yes, no), and the explanatory variables were health facility-, individual-, household- and community-level factors.

RESULTS: Nearly 34% of participants used LAMCs, with significant variations across areas in Bangladesh. The average scores of the health facility management and health facility infrastructure were 0.79 and 0.83, respectively. Of the facilities where LAMCs were available, 69% were functional and ready to provide LAMCs to the respondents. Higher scores for the management (adjusted odds ratio (aOR), 1.59; 95% CI, 1.21-2.42) and infrastructure (aOR, 1.44; 95% CI, 1.01-1.69) of health facilities were positively associated with the overall uptake of LAMC. For each unit increase in the availability and readiness scores to provide LAMC at the nearest health facilities, the aORs for women to report using LAMC were 2.16 (95% CI, 1.18-3.21) and 1.74 (95% CI, 1.15-3.20), respectively. A nearly 27% decline in the likelihood of LAMC uptake was observed for every kilometre increase in the average regional-level distance between women's homes and the nearest health facilities.

CONCLUSION: The proximity of health facilities and their improved management, infrastructure, and readiness to provide LAMCs to women significantly increase their uptake. Policies and programs should prioritise improving health facility readiness to increase LAMC uptake.
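The per-kilometre effect reported above can be read as a multiplicative change in odds. The following is a minimal sketch, not the authors' analysis: it assumes the "nearly 27% decline" corresponds to an odds ratio of about 0.73 per kilometre and that, as in a logistic regression with a linear distance term, the effect compounds multiplicatively with distance. The function name `odds_multiplier` and the value 0.73 are illustrative assumptions.

```python
def odds_multiplier(per_km_odds_ratio, distance_km):
    """Multiplicative change in odds after `distance_km` extra kilometres,
    assuming a constant per-kilometre odds ratio (log-linear effect)."""
    return per_km_odds_ratio ** distance_km


# Assumed value: a "nearly 27% decline" read as an odds ratio of about 0.73.
PER_KM_ODDS_RATIO = 0.73

for km in (1, 2, 5):
    factor = odds_multiplier(PER_KM_ODDS_RATIO, km)
    print(f"{km} km farther: odds of LAMC use multiplied by {factor:.2f}")
```

Under these assumptions the decline is not additive: two extra kilometres multiply the odds by 0.73 squared rather than reducing them by 54%.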

PMID:36131314 | DOI:10.1186/s12913-022-08565-3

Categories: Literature Watch

Sensorimotor distance: A grounded measure of semantic similarity for 800 million concept pairs

Wed, 2022-09-21 06:00

Behav Res Methods. 2022 Sep 21. doi: 10.3758/s13428-022-01965-7. Online ahead of print.

ABSTRACT

Experimental design and computational modelling across the cognitive sciences often rely on measures of semantic similarity between concepts. Traditional measures of semantic similarity are typically derived from distance in taxonomic databases (e.g. WordNet), databases of participant-produced semantic features, or corpus-derived linguistic distributional similarity (e.g. CBOW), all of which are theoretically problematic in their lack of grounding in sensorimotor experience. We present a new measure of sensorimotor distance between concepts, based on multidimensional comparisons of their experiential strength across 11 perceptual and action-effector dimensions in the Lancaster Sensorimotor Norms. We demonstrate that, in modelling human similarity judgements, sensorimotor distance has comparable explanatory power to other measures of semantic similarity, explains variance in human judgements which is missed by other measures, and does so with the advantages of remaining both grounded and computationally efficient. Moreover, sensorimotor distance is equally effective for both concrete and abstract concepts. We further introduce a web-based tool ( https://lancaster.ac.uk/psychology/smdistance ) for easily calculating and visualising sensorimotor distance between words, featuring coverage of nearly 800 million word pairs. Supplementary materials are available at https://osf.io/d42q6/ .

PMID:36131199 | DOI:10.3758/s13428-022-01965-7

Categories: Literature Watch

Availability and readiness of health care facilities and their effects on under-five mortality in Bangladesh: Analysis of linked data

Fri, 2022-09-16 06:00

J Glob Health. 2022 Sep 17;12:04081. doi: 10.7189/jogh.12.04081.

ABSTRACT

BACKGROUND: Under-five mortality remains unacceptably high in Bangladesh despite government-level efforts to reduce its prevalence over the years. Increased availability and accessibility of health care facilities and their services can play a significant role in reducing its occurrence. We explored the associations of several forms of child mortality with health facility level factors.

METHODS: The 2017-18 Bangladesh Demographic and Health Survey (BDHS) data and 2017 Bangladesh Health Facility Survey (BHFS) data were linked and analysed. The outcome variables were neonatal mortality, infant mortality, and under-five mortality. Health facility level factors were considered as the major explanatory variables: the basic management and administrative system of the nearest health care facility where child health care services are available; the degree of availability of child health care services at the nearest health care facility; the degree of readiness of the nearest health care facility (where child health care services are available) to provide child health care services; and the average distance from mothers' homes to the nearest health care facility where child health care services are available. The associations between the outcome variables and explanatory variables were determined using the multilevel mixed-effect logistic regression model.

RESULTS: Reported under-five, infant and neonatal mortality were 40, 27, and 22 per 10 000 live births, respectively. The likelihood of neonatal mortality was found to decline by 15% for every unit increase in the score of the basic management and administrative system of the health care facility nearest to mothers' homes where child health care services are available. Similarly, the degree of availability and readiness of the health care facilities nearest to mothers' homes to provide child health care services was linked with an 18%-24% reduction in neonatal and infant mortality. Conversely, every kilometre increase in the distance between mothers' homes and the nearest health care facility was associated with a 15%-20% increase in the likelihood of neonatal, infant and under-five mortality.

CONCLUSIONS: The availability of health facilities providing child health care services close to mothers' residences and their readiness to provide these services play a significant role in reducing under-five mortality in Bangladesh. Policies and programs should be adopted to increase the availability and accessibility of health facilities that provide child health care services.

PMID:36112406 | DOI:10.7189/jogh.12.04081

Categories: Literature Watch

Context-Enriched Learning Models for Aligning Biomedical Vocabularies at Scale in the UMLS Metathesaurus

Thu, 2022-09-15 06:00

Proc Int World Wide Web Conf. 2022 Apr;2022:1037-1046. doi: 10.1145/3485447.3511946. Epub 2022 Apr 25.

ABSTRACT

The Unified Medical Language System (UMLS) Metathesaurus construction process mainly relies on lexical algorithms and manual expert curation for integrating over 200 biomedical vocabularies. A lexical-based learning model (LexLM) was developed to predict synonymy among Metathesaurus terms and largely outperforms a rule-based approach (RBA) that approximates the current construction process. However, the LexLM has the potential for being improved further because it only uses lexical information from the source vocabularies, while the RBA also takes advantage of contextual information. We investigate the role of multiple types of contextual information available to the UMLS editors, namely source synonymy (SS), source semantic group (SG), and source hierarchical relations (HR), for the UMLS vocabulary alignment (UVA) problem. In this paper, we develop multiple variants of context-enriched learning models (ConLMs) by adding to the LexLM the types of contextual information listed above. We represent these context types in context-enriched knowledge graphs (ConKGs) with four variants ConSS, ConSG, ConHR, and ConAll. We train these ConKG embeddings using seven KG embedding techniques. We create the ConLMs by concatenating the ConKG embedding vectors with the word embedding vectors from the LexLM. We evaluate the performance of the ConLMs using the UVA generalization test datasets with hundreds of millions of pairs. Our extensive experiments show a significant performance improvement from the ConLMs over the LexLM, namely +5.0% in precision (93.75%), +0.69% in recall (93.23%), +2.88% in F1 (93.49%) for the best ConLM. Our experiments also show that the ConAll variant including the three context types takes more time, but does not always perform better than other variants with a single context type. 
Finally, our experiments show that the pairs of terms with high lexical similarity benefit most from adding contextual information, namely +6.56% in precision (94.97%), +2.13% in recall (93.23%), +4.35% in F1 (94.09%) for the best ConLM. The pairs with lower degrees of lexical similarity also show performance improvement with +0.85% in F1 (96%) for low similarity and +1.31% in F1 (96.34%) for no similarity. These results demonstrate the importance of using contextual information in the UVA problem.
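Two steps in this abstract lend themselves to a short illustration: building a ConLM input by concatenating a KG embedding with a word embedding, and deriving F1 from precision and recall. The sketch below is hypothetical and is not the authors' code: the helper names and the toy vectors are invented for illustration, and only the precision and recall percentages are taken from the abstract.

```python
def concat_embeddings(word_vec, kg_vec):
    """Join a lexical (word) embedding and a context (KG) embedding
    into a single feature vector by simple concatenation."""
    return list(word_vec) + list(kg_vec)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (output uses the input units)."""
    return 2 * precision * recall / (precision + recall)


# Toy vectors, invented for illustration only.
word_vec = [0.12, -0.40, 0.33]
kg_vec = [0.05, 0.91]
combined = concat_embeddings(word_vec, kg_vec)
print(len(combined))  # dimensionality is the sum of the two input sizes

# Precision and recall (in %) as reported in the abstract.
best_overall_f1 = round(f1_score(93.75, 93.23), 2)
high_similarity_f1 = round(f1_score(94.97, 93.23), 2)
print(best_overall_f1, high_similarity_f1)
```

Recomputing F1 this way reproduces the reported 93.49% and 94.09%, which confirms that the quoted F1 values are consistent with the quoted precision and recall.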

PMID:36108322 | PMC:PMC9455675 | DOI:10.1145/3485447.3511946

Categories: Literature Watch

Bio-SODA UX: enabling natural language question answering over knowledge graphs with user disambiguation

Tue, 2022-09-13 06:00

Distrib Parallel Databases. 2022;40(2-3):409-440. doi: 10.1007/s10619-022-07414-w. Epub 2022 Jul 16.

ABSTRACT

The problem of natural language processing over structured data has become a growing research field, both within the relational database and the Semantic Web community, with significant efforts involved in question answering over knowledge graphs (KGQA). However, many of these approaches are either specifically targeted at open-domain question answering using DBpedia, or require large training datasets to translate a natural language question to SPARQL in order to query the knowledge graph. Hence, these approaches often cannot be applied directly to complex scientific datasets where no prior training data is available. In this paper, we focus on the challenges of natural language processing over knowledge graphs of scientific datasets. In particular, we introduce Bio-SODA, a natural language processing engine that does not require training data in the form of question-answer pairs for generating SPARQL queries. Bio-SODA uses a generic graph-based approach for translating user questions to a ranked list of SPARQL candidate queries. Furthermore, Bio-SODA uses a novel ranking algorithm that includes node centrality as a measure of relevance for selecting the best SPARQL candidate query. Our experiments with real-world datasets across several scientific domains, including the official bioinformatics Question Answering over Linked Data (QALD) challenge, as well as the CORDIS dataset of European projects, show that Bio-SODA outperforms publicly available KGQA systems by an F1-score of at least 20% and by an even higher factor on more complex bioinformatics datasets. Finally, we introduce Bio-SODA UX, a graphical user interface designed to assist users in the exploration of large knowledge graphs and in dynamically disambiguating natural language questions that target the data available in these graphs.

PMID:36097541 | PMC:PMC9458692 | DOI:10.1007/s10619-022-07414-w

Categories: Literature Watch
