Semantic Web
PCLiON: An Ontology for Data Standardization and Sharing of Prostate Cancer Associated Lifestyles.
Int J Med Inform. 2020 Nov 07;145:104332
Authors: Chen Y, Yu C, Liu X, Xi T, Xu G, Sun Y, Zhu F, Shen B
Abstract
BACKGROUND: Research on lifestyle medicine (LM) has emerged in recent years and garnered wide attention. Prostate cancer (PCa) could be prevented and treated by positive lifestyles, but the association between lifestyles and PCa is always personalized.
OBJECTIVES: To address the heterogeneity and diversity of data types related to PCa by establishing a standardized lifestyle ontology, to promote the exchange and sharing of disease lifestyle knowledge, and to support text mining and knowledge discovery.
METHODS: PCLiON was constructed in accordance with the principles and methodology of ontology construction. Following the principles of evidence-based medicine, we screened and integrated the lifestyles and their related attributes. Protégé was used to construct and validate the semantic framework. All annotations in PCLiON were based on SNOMED CT, NCI Thesaurus, the Cochrane Library and FooDB, etc. HTML5 and ASP.NET were used to develop the independent Web page platform and corresponding intelligent terminal application. PCLiON was also uploaded to the National Center for Biomedical Ontology BioPortal.
RESULTS: PCLiON integrates 397 lifestyles and lifestyle-related factors associated with PCa, and is the first of its kind for a specific disease. It contains 320 attribute annotations and 11 object attributes. The logical relationship and completeness meet the ontology requirements. Qualitative analysis was carried out for 329 terms in PCLiON, including factors classified as protective, risk, or associated but functionally unclear, among others. PCLiON is publicly available both at http://pcaontology.net/PCaLifeStyleDefault.aspx and https://bioportal.bioontology.org/ontologies/PCALION.
CONCLUSIONS: Through the bilingual online platforms, complex lifestyle research data can be transformed into standardized, reliable and responsive knowledge, which can promote shared decision-making (SDM) on lifestyle intervention and assist patients in lifestyle self-management toward the goal of PCa targeted prevention.
PMID: 33186790 [PubMed - as supplied by publisher]
Random survival forests using linked data to measure illness burden among individuals before or after a cancer diagnosis: Development and internal validation of the SEER-CAHPS illness burden index
Int J Med Inform. 2021 Jan;145:104305. doi: 10.1016/j.ijmedinf.2020.104305. Epub 2020 Oct 21.
ABSTRACT
PURPOSE: To develop and internally validate an illness burden index among Medicare beneficiaries before or after a cancer diagnosis.
METHODS: Data source: SEER-CAHPS, linking Surveillance, Epidemiology, and End Results (SEER) cancer registry, Medicare enrollment and claims, and Medicare Consumer Assessment of Healthcare Providers and Systems (Medicare CAHPS) survey data providing self-reported sociodemographic, health, and functional status information. To generate a score for everyone in the dataset, we tabulated 4 groups within each annual subsample (2007-2013): 1) Medicare Advantage (MA) beneficiaries or 2) Medicare fee-for-service (FFS) beneficiaries, surveyed before cancer diagnosis; 3) MA beneficiaries or 4) Medicare FFS beneficiaries surveyed after diagnosis. Random survival forests (RSFs) predicted 12-month all-cause mortality and drew predictor variables (mean per subsample = 44) from 8 domains: sociodemographic, cancer-specific, health status, chronic conditions, healthcare utilization, activity limitations, proxy, and location-based factors. Roughly two-thirds of the sample was held out for algorithm training. Error rates based on the validation ("out-of-bag," OOB) samples reflected the correctly classified percentage. Illness burden scores represented predicted cumulative mortality hazard.
RESULTS: The sample included 116,735 Medicare beneficiaries with cancer, of whom 73% were surveyed after their cancer diagnosis; overall mean mortality rate in the 12 months after survey response was 6%. SEER-CAHPS Illness Burden Index (SCIBI) scores were positively skewed (median range: 0.29 [MA, pre-diagnosis] to 2.85 [FFS, post-diagnosis]; mean range: 2.08 [MA, pre-diagnosis] to 4.88 [MA, post-diagnosis]). The highest decile of the distribution had a 51% mortality rate (range: 29-71%); the bottom decile had a 1% mortality rate (range: 0-2%). The error rate was 20% overall (range: 9% [among FFS enrollees surveyed after diagnosis] to 36% [MA enrollees surveyed before diagnosis]).
CONCLUSIONS: This new morbidity measure for Medicare beneficiaries with cancer may be useful to future SEER-CAHPS users who wish to adjust for comorbidity.
PMID:33188949 | PMC:PMC7736519 | DOI:10.1016/j.ijmedinf.2020.104305
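The SEER-CAHPS methods above mention that roughly two-thirds of the sample was used for training, with error rates taken from the "out-of-bag" (OOB) samples. The following is a minimal sketch of why that split arises from bootstrap sampling in general; it does not implement a random survival forest or reproduce the authors' analysis. The function name `bootstrap_split`, the row count, and the seed are assumptions made for illustration.

```python
import random


def bootstrap_split(n_rows, seed=0):
    """Draw one bootstrap sample (with replacement) and report which
    row indices were drawn (in-bag) and which were never drawn (OOB)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    in_bag = {rng.randrange(n_rows) for _ in range(n_rows)}
    oob = [i for i in range(n_rows) if i not in in_bag]
    return sorted(in_bag), oob


n_rows = 10_000  # hypothetical sample size, not the study's
in_bag, oob = bootstrap_split(n_rows)
in_bag_fraction = len(in_bag) / n_rows
oob_fraction = len(oob) / n_rows

# In expectation the in-bag share approaches 1 - 1/e (about 0.632),
# which is the "roughly two-thirds" used to grow each tree.
print(f"in-bag: {in_bag_fraction:.3f}, out-of-bag: {oob_fraction:.3f}")
```

Each tree in a forest is grown on its own in-bag rows, so the OOB rows act as a built-in validation set for that tree without a separate hold-out file.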
Suicide Risk Assessment Using Machine Learning and Social Networks: a Scoping Review.
J Med Syst. 2020 Nov 09;44(12):205
Authors: Castillo-Sánchez G, Marques G, Dorronzoro E, Rivera-Romero O, Franco-Martín M, De la Torre-Díez I
Abstract
According to the World Health Organization (WHO) report in 2016, around 800,000 individuals have committed suicide. Moreover, suicide is the second leading cause of unnatural death in people between 15 and 29 years of age. This paper reviews the state of the art in the literature concerning the use of machine learning methods for suicide detection on social networks. Consequently, the objectives, data collection techniques, development process and the validation metrics used for suicide detection on social networks are analyzed. The authors conducted a scoping review using the methodology proposed by Arksey and O'Malley, and the PRISMA protocol was adopted to select the relevant studies. This scoping review aims to identify the machine learning techniques used to predict suicide risk based on information posted on social networks. The databases used are PubMed, Science Direct, IEEE Xplore and Web of Science. In total, 50% of the included studies (8/16) explicitly report the use of data mining techniques for feature extraction, feature detection or entity identification. The most commonly reported method was the Linguistic Inquiry and Word Count (4/8, 50%), followed by Latent Dirichlet Allocation, Latent Semantic Analysis, and Word2vec (2/8, 25%). Non-negative Matrix Factorization and Principal Component Analysis were each used in only one of the included studies (12.5%). In total, 3 out of 8 research papers (37.5%) combined more than one of those techniques. Support Vector Machine was implemented in 10 out of the 16 included studies (62.5%). Finally, 75% of the analyzed studies implement machine learning-based models using Python.
PMID: 33165729 [PubMed - in process]
Explainable Artificial Intelligence Recommendation System by Leveraging the Semantics of Adverse Childhood Experiences: Proof-of-Concept Prototype Development.
JMIR Med Inform. 2020 Nov 04;8(11):e18752
Authors: Ammar N, Shaban-Nejad A
Abstract
BACKGROUND: The study of adverse childhood experiences and their consequences has emerged over the past 20 years. Although the conclusions from these studies are available, the same is not true of the data. Accordingly, it is a complex problem to build a training set and develop machine-learning models from these studies. Classic machine learning and artificial intelligence techniques cannot provide a full scientific understanding of the inner workings of the underlying models. This raises credibility issues due to the lack of transparency and generalizability. Explainable artificial intelligence is an emerging approach for promoting credibility, accountability, and trust in mission-critical areas such as medicine by combining machine-learning approaches with explanatory techniques that explicitly show what the decision criteria are and why (or how) they have been established. Hence, thinking about how machine learning could benefit from knowledge graphs that combine "common sense" knowledge as well as semantic reasoning and causality models is a potential solution to this problem.
OBJECTIVE: In this study, we aimed to leverage explainable artificial intelligence, and propose a proof-of-concept prototype for a knowledge-driven evidence-based recommendation system to improve mental health surveillance.
METHODS: We used concepts from an ontology that we have developed to build and train a question-answering agent using the Google DialogFlow engine. In addition to the question-answering agent, the initial prototype includes knowledge graph generation and recommendation components that leverage third-party graph technology.
RESULTS: To showcase the framework functionalities, we here present a prototype design and demonstrate the main features through four use case scenarios motivated by an initiative currently implemented at a children's hospital in Memphis, Tennessee. Ongoing development of the prototype requires implementing an optimization algorithm of the recommendations, incorporating a privacy layer through a personal health library, and conducting a clinical trial to assess both usability and usefulness of the implementation.
CONCLUSIONS: This semantic-driven explainable artificial intelligence prototype can enhance health care practitioners' ability to provide explanations for the decisions they make.
PMID: 33146623 [PubMed - as supplied by publisher]
Benefits of not smoking during pregnancy for Australian Aboriginal and Torres Strait Islander women and their babies: a retrospective cohort study using linked data.
BMJ Open. 2019 Nov 21;9(11):e032763
Authors: McInerney C, Ibiebele I, Ford JB, Randall D, Morris JM, Meharg D, Mitchell J, Milat A, Torvaldsen S
Abstract
OBJECTIVES: To provide evidence for targeted smoking cessation policy, the aim of this study was to compare pregnancy outcomes of Aboriginal mothers who reported not smoking during pregnancy with Aboriginal mothers who reported smoking during pregnancy.
DESIGN: Population based retrospective cohort study using linked data.
SETTING: New South Wales, the most populous Australian state.
POPULATION: 18 154 singleton babies born to 13 477 Aboriginal mothers between 2010 and 2014 were identified from routinely collected New South Wales datasets. Aboriginality was determined from birth records and from four linked datasets through an Enhanced Reporting of Aboriginality algorithm.
EXPOSURE: Not smoking at any time during pregnancy.
MAIN OUTCOME MEASURES: Unadjusted and adjusted relative risks (aRR) and 95% CIs from modified Poisson regression were used to examine associations between not smoking during pregnancy and maternal and perinatal outcomes including severe morbidity, inter-hospital transfer, perinatal death, preterm birth and small-for-gestational age. Population attributable fractions (PAFs) were calculated using adjusted relative risks.
RESULTS: Compared with babies born to mothers who smoked during pregnancy, babies born to non-smoking mothers had a lower risk of all adverse perinatal outcomes including perinatal death (aRR=0.58, 95% CI 0.44 to 0.76), preterm birth (aRR=0.58, 95% CI 0.53 to 0.64) and small-for-gestational age (aRR=0.35, 95% CI 0.32 to 0.39). PAFs (%) were 27% for perinatal death, 26% for preterm birth and 48% for small-for-gestational-age. Compared with women who smoked during pregnancy (n=8919), those who did not smoke (n=9235) had a lower risk of being transferred to another hospital (aRR=0.76, 95% CI 0.66 to 0.89).
CONCLUSIONS: Babies born to women who did not smoke during pregnancy had a lower risk of adverse perinatal outcomes. Rates of adverse outcomes among Aboriginal non-smokers were similar to those among the general population. These results quantify the proportion of adverse perinatal outcomes due to smoking and highlight why effective smoking cessation programmes are urgently required for this population.
PMID: 31753897 [PubMed - indexed for MEDLINE]
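The smoking-in-pregnancy abstract above reports population attributable fractions (PAFs) derived from adjusted relative risks but does not say which PAF formula was used. As a rough illustration only, the sketch below applies Levin's formula, PAF = p(RR - 1) / (1 + p(RR - 1)), which is an assumption and not necessarily the authors' method. It also assumes that the relative risk for smoking can be approximated by inverting the reported aRR for not smoking, and that the group sizes in the abstract (n=8919 and n=9235) give the exposure prevalence. The function name `levin_paf` is hypothetical.

```python
def levin_paf(exposure_prevalence, relative_risk):
    """Levin's population attributable fraction for a harmful exposure."""
    excess = exposure_prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Group sizes as reported in the abstract.
smoked, did_not_smoke = 8919, 9235
prevalence_smoking = smoked / (smoked + did_not_smoke)

# Reported aRRs compare NOT smoking with smoking, so they are inverted
# here to approximate the relative risk associated with smoking.
arr_not_smoking = {
    "perinatal death": 0.58,
    "preterm birth": 0.58,
    "small-for-gestational age": 0.35,
}

paf_by_outcome = {
    outcome: levin_paf(prevalence_smoking, 1.0 / arr)
    for outcome, arr in arr_not_smoking.items()
}

for outcome, paf in paf_by_outcome.items():
    print(f"{outcome}: PAF of about {paf:.0%}")
```

Under these assumptions the values land close to the reported PAFs, but they need not match exactly, since the published figures may rest on a different formula or on unrounded estimates.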
Shouting at each other into the void: A linguistic network analysis of vaccine hesitance and support in online discourse regarding California law SB277.
Soc Sci Med. 2020 Aug 28;266:113216
Authors: DeDominicis K, Buttenheim AM, Howa AC, Delamater PL, Salmon D, Omer SB, Klein NP
Abstract
In 2015, California passed Senate Bill 277 and became the third state in the United States to ban all nonmedical exemptions from school immunization requirements, effectively prohibiting religious and personal belief exemptions. This attracted grassroots opposition and considerable debate among vaccine-hesitant factions online. This mixed-methods study used quantitative linguistic analysis, semantic network analysis, and content analysis techniques to examine 2424 online documents drawn from newspapers, blogs, health websites, government information pages, web forums, personal websites, and Facebook groups, among others. The study examined which words and phrases were used most frequently by vaccine skeptics, vaccine defenders, and more neutral media accounts to illuminate how groups with different attitudes towards vaccination discuss and disseminate information about vaccines and vaccine policy online. We proposed an innovative methodology for examining online discourse surrounding vaccine hesitance, as well as for studying the online dissemination of misinformation about vaccines. Our findings highlighted discrepancies in the narratives between what vaccine supporters believe causes vaccine skepticism and the issues that vaccine skeptics actually discuss within their own digital spaces. For example, in these exchanges, the importance of parental rights overshadowed that of children's rights; supporters of vaccines brought up autism in more distinct documents than skeptics did; distrust of government regulators and researchers seemed to unite vaccine skeptics and defenders; and politicians, doctors, and even celebrities often served as proxies in heated exchanges about factual evidence, believability, and the importance of expertise in public discourse.
PMID: 33126093 [PubMed - as supplied by publisher]
The Semantic Data Dictionary - An Approach for Describing and Annotating Data.
Data Intell. 2020;2(4):443-486
Authors: Rashid SM, McCusker JP, Pinheiro P, Bax MP, Santos H, Stingone JA, Das AK, McGuinness DL
Abstract
It is common practice for data providers to include text descriptions for each column when publishing datasets in the form of data dictionaries. While these documents are useful in helping an end-user properly interpret the meaning of a column in a dataset, existing data dictionaries typically are not machine-readable and do not follow a common specification standard. We introduce the Semantic Data Dictionary, a specification that formalizes the assignment of a semantic representation of data, enabling standardization and harmonization across diverse datasets. In this paper, we present our Semantic Data Dictionary work in the context of our work with biomedical data; however, the approach can and has been used in a wide range of domains. The rendition of data in this form helps promote improved discovery, interoperability, reuse, traceability, and reproducibility. We present the associated research and describe how the Semantic Data Dictionary can help address existing limitations in the related literature. We discuss our approach, present an example by annotating portions of the publicly available National Health and Nutrition Examination Survey dataset, present modeling challenges, and describe the use of this approach in sponsored research, including our work on a large NIH-funded exposure and health data portal and in the RPI-IBM collaborative Health Empowerment by Analytics, Learning, and Semantics project. We evaluate this work in comparison with traditional data dictionaries, mapping languages, and data integration tools.
PMID: 33103120 [PubMed]
Cartolabe: A Web-Based Scalable Visualization of Large Document Collections.
IEEE Comput Graph Appl. 2020 Oct 23;PP:
Authors: Caillou P, Renault J, Fekete JD, Letournel AC, Sebag M
Abstract
We describe CARTOLABE, a web-based multi-scale system for visualizing and exploring large textual corpora based on topics, introducing a novel mechanism for the progressive visualization of filtering queries. Initially designed to represent and navigate through scientific publications in different disciplines, CARTOLABE has evolved to become a generic system and accommodate various corpora, ranging from Wikipedia (4.5M entries) to the French National Debate (4.3M entries). CARTOLABE is made of two modules: the first relies on Natural Language Processing methods, converting a corpus and its entities (documents, authors, concepts) into high-dimensional vectors, computing their projection on the 2D plane, and extracting meaningful labels for regions of the plane. The second module is a Web-based visualization, displaying tiles computed from the multidimensional projection of the corpus using the UMAP projection method. This visualization module aims at enabling users with no expertise in visualization and data analysis to get an overview of their corpus, and to interact with it: exploring, querying, filtering, panning and zooming on regions of semantic interest. Three use cases are discussed to illustrate CARTOLABE's versatility and ability to bring large scale textual corpus visualization and exploration to a wide audience.
PMID: 33095705 [PubMed - as supplied by publisher]
Automatically Assessing Quality of Online Health Articles.
IEEE J Biomed Health Inform. 2020 Oct 20;PP:
Authors: Afsana F, Kabir MA, Hassan N, Paul M
Abstract
Today, the World Wide Web is overwhelmed by an unprecedented quantity of data on diverse topics with varied quality. However, the quality of information disseminated in the field of medicine has been questioned, as the negative health consequences of health misinformation can be life-threatening. There is currently no generic automated tool for evaluating the quality of online health information spanning a broad range. To address this gap, in this paper, we applied a data mining approach to automatically assess the quality of online health articles based on 10 quality criteria. We have prepared a labelled dataset with 53012 features and applied different feature selection methods to identify the best feature subset, with which our trained classifier achieved an accuracy of 84%-90%, varying over the 10 criteria. Our semantic analysis of features shows the underpinning associations between the selected features and assessment criteria and further rationalizes our assessment approach. Our findings will help in identifying high-quality health articles, thus aiding users in forming their opinions and making the right choice when seeking health-related help online.
PMID: 33079686 [PubMed - as supplied by publisher]
Integrating Unified Medical Language System and Kleinberg's Burst Detection Algorithm into Research Topics of Medications for Post-Traumatic Stress Disorder.
Drug Des Devel Ther. 2020;14:3899-3913
Authors: Xu S, Xu D, Wen L, Zhu C, Yang Y, Han S, Guan P
Abstract
Background: The treatment of post-traumatic stress disorder (PTSD) has long been a challenge because the symptoms of PTSD are multifaceted. PTSD is primarily treated with psychotherapy and medication, or a combination of psychotherapy and medication. The present study was designed to analyze the literature on medications for PTSD, to explore high-frequency common drugs and low-frequency burst drugs using a burst detection algorithm combined with the Unified Medical Language System (UMLS), and to provide references for developing new drugs for PTSD.
Methods: Publications related to medications for PTSD from 2010 to 2019 were identified through PubMed, Web of Science Core Collection, and BIOSIS Previews. SemRep and the SemRep semantic result processing system were used to extract the set of drug concepts with a therapeutic relationship according to the semantic relationships of UMLS. Kleinberg's burst detection algorithm was applied to calculate the burst weight index of drug concepts using a Java-based program. These concepts were sorted according to the frequency and the burst weight index.
Results: Four hundred and fifty-nine treatment-related drug concepts were extracted. The drug with the highest burst weight index was "Psilocybine", a hallucinogen, which was more likely to be a hotspot for the pharmacotherapy of PTSD. The highest frequency concept was "prazosin", which was more likely to be the focus of research in the medications for PTSD.
Conclusion: The present study assessed the medication-related literature on PTSD treatment, providing a burst word detection-based framework, a baseline of information for future research, and a new attempt at the discovery of textual knowledge. The bibliometric analysis based on the burst detection algorithm combined with UMLS has shown certain feasibility in amplifying the microscopic changes of a specific research direction in a field; it can also be applied to other aspects of disease and used to explore the trends of various disciplines.
PMID: 33061296 [PubMed - in process]
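The PTSD medication study above ranks drug concepts by a burst weight index from Kleinberg's burst detection algorithm, computed with a Java-based program that the abstract does not describe further. The sketch below is a simplified two-state version of Kleinberg's batched model, written in Python for illustration; it is not the authors' program. The function name `two_state_bursts`, the parameters `s` and `gamma`, and the yearly counts are all hypothetical.

```python
import math


def two_state_bursts(relevant, totals, s=2.0, gamma=1.0):
    """Label each period as baseline (0) or burst (1) and return the
    labels plus a burst weight (fit improvement over the baseline)."""
    n = len(relevant)
    p0 = sum(relevant) / sum(totals)  # baseline rate of the concept
    p1 = min(p0 * s, 0.9999)          # elevated rate in the burst state
    rates = (p0, p1)

    def fit_cost(state, r, d):
        # Negative binomial log-likelihood of r hits among d documents.
        p = rates[state]
        return -(math.log(math.comb(d, r))
                 + r * math.log(p) + (d - r) * math.log(1 - p))

    up_cost = gamma * math.log(n)  # penalty for entering the burst state
    cost = [0.0, math.inf]         # the sequence starts in the baseline
    back = []
    for r, d in zip(relevant, totals):
        new_cost, pointers = [], []
        for j in (0, 1):
            options = [cost[i] + (up_cost if j > i else 0.0) for i in (0, 1)]
            best = 0 if options[0] <= options[1] else 1
            new_cost.append(options[best] + fit_cost(j, r, d))
            pointers.append(best)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest state sequence backwards (Viterbi decoding).
    state = 0 if cost[0] <= cost[1] else 1
    states = []
    for pointers in reversed(back):
        states.append(state)
        state = pointers[state]
    states.reverse()

    weight = sum(fit_cost(0, r, d) - fit_cost(1, r, d)
                 for st, r, d in zip(states, relevant, totals) if st == 1)
    return states, weight


# Hypothetical counts: papers mentioning one drug out of 100 per year.
mentions = [2, 1, 2, 3, 2, 12, 14, 11, 2, 1]
papers = [100] * 10
states, weight = two_state_bursts(mentions, papers)
print(states, round(weight, 2))
```

A concept with modest overall frequency can still earn a high burst weight if its mentions are concentrated in a few periods, which is the distinction the study draws between high-frequency drugs and low-frequency burst drugs.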
Creativity in temporal social networks: how divergent thinking is impacted by one's choice of peers.
J R Soc Interface. 2020 Oct;17(171):20200667
Authors: Baten RA, Bagley D, Tenesaca A, Clark F, Bagrow JP, Ghoshal G, Hoque E
Abstract
Creativity is viewed as one of the most important skills in the context of future-of-work. In this paper, we explore how the dynamic (self-organizing) nature of social networks impacts the fostering of creative ideas. We run six trials (N = 288) of a web-based experiment involving divergent ideation tasks. We find that network connections gradually adapt to individual creative performances, as the participants predominantly seek to follow high-performing peers for creative inspirations. We unearth both opportunities and bottlenecks afforded by such self-organization. While exposure to high-performing peers is associated with better creative performances of the followers, we see a counter-effect that choosing to follow the same peers introduces semantic similarities in the followers' ideas. We formulate an agent-based simulation model to capture these intuitions in a tractable manner, and experiment with corner cases of various simulation parameters to assess the generality of the findings. Our findings may help design large-scale interventions to improve the creative aptitude of people interacting in a social network.
PMID: 33050776 [PubMed - in process]
Protein ontology on the semantic web for knowledge discovery.
Sci Data. 2020 Oct 12;7(1):337
Authors: Chen C, Huang H, Ross KE, Cowart JE, Arighi CN, Wu CH, Natale DA
Abstract
The Protein Ontology (PRO) provides an ontological representation of protein-related entities, ranging from protein families to proteoforms to complexes. Protein Ontology Linked Open Data (LOD) exposes, shares, and connects knowledge about protein-related entities on the Semantic Web using Resource Description Framework (RDF), thus enabling integration with other Linked Open Data for biological knowledge discovery. For example, proteins (or variants thereof) can be retrieved on the basis of specific disease associations. As a community resource, we strive to follow the Findability, Accessibility, Interoperability, and Reusability (FAIR) principles, disseminate regular updates of our data, support multiple methods for accessing, querying and downloading data in various formats, and provide documentation both for scientists and programmers. PRO Linked Open Data can be browsed via a faceted browser interface and queried using SPARQL via YASGUI. RDF data dumps are also available for download. Additionally, we developed RESTful APIs to support programmatic data access. We also provide a W3C HCLS specification-compliant metadata description for our data. The PRO Linked Open Data is available at https://lod.proconsortium.org/ .
PMID: 33046717 [PubMed - in process]
The Synthetic Biology Open Language (SBOL) Version 3: Simplified Data Exchange for Bioengineering.
Front Bioeng Biotechnol. 2020;8:1009
Authors: McLaughlin JA, Beal J, Mısırlı G, Grünberg R, Bartley BA, Scott-Brown J, Vaidyanathan P, Fontanarrosa P, Oberortner E, Wipat A, Gorochowski TE, Myers CJ
Abstract
The Synthetic Biology Open Language (SBOL) is a community-developed data standard that allows knowledge about biological designs to be captured using a machine-tractable, ontology-backed representation that is built using Semantic Web technologies. While early versions of SBOL focused only on the description of DNA-based components and their sub-components, SBOL can now be used to represent knowledge across multiple scales and throughout the entire synthetic biology workflow, from the specification of a single molecule or DNA fragment through to multicellular systems containing multiple interacting genetic circuits. The third major iteration of the SBOL standard, SBOL3, is an effort to streamline and simplify the underlying data model with a focus on real-world applications, based on experience from the deployment of SBOL in a variety of scientific and industrial settings. Here, we introduce the SBOL3 specification both in comparison to previous versions of SBOL and through practical examples of its use.
PMID: 33015004 [PubMed]
A Web Resource for Exploring the CORD-19 Dataset Using Root- and Rule-Based Phrases.
J Indian Inst Sci. 2020 Sep 29;:1-7
Authors: Collard J, Bhat T, Subrahmanian E, Monarch I, Tash J, Sriram R, Elliot J
Abstract
This short paper describes a web resource, the NIST CORD-19 Web Resource, for community explorations of the COVID-19 Open Research Dataset (CORD-19). The tools for exploration in the web resource make use of the NIST-developed Root- and Rule-based method, which exploits underlying linguistic structures to create terms that represent phrases in a corpus. The method allows for auto-suggesting related terms that help refine the search of a heterogeneous COVID-19 document base. The method also produces taxonomic structures in the target domain and provides semantic information about the relationships between terms. This term structure can serve as a basis for creating topic modeling and trend analysis tools. In this paper, we describe the use of a novel search engine to demonstrate some of the capabilities above.
PMID: 33013023 [PubMed - as supplied by publisher]
Microblog topic identification using Linked Open Data.
PLoS One. 2020;15(8):e0236863
Authors: Yıldırım A, Uskudarli S
Abstract
Much valuable information is embedded in social media posts (microposts), which are contributed by a great variety of persons about subjects that are of interest to others. The automated utilization of this information is challenging due to the overwhelming quantity of posts and the distributed nature of the information related to subjects across several posts. Numerous approaches have been proposed to detect topics from collections of microposts, where the topics are represented by lists of terms such as words, phrases, or word embeddings. Such topics are used in tasks like classification and recommendations. The interpretation of topics is considered a separate task in such methods, although the topics are becoming increasingly human-interpretable. This work proposes an approach for identifying machine-interpretable topics of collective interest. We define topics as a set of related elements that are associated by having been posted in the same contexts. To represent topics, we introduce an ontology specified according to the W3C recommended standards. The elements of the topics are identified via linking entities to resources published on Linked Open Data (LOD). Such representation enables processing topics to provide insights that go beyond what is explicitly expressed in the microposts. The feasibility of the proposed approach is examined by generating topics from more than one million tweets collected from Twitter during various events. The utility of these topics is demonstrated with a variety of topic-related tasks along with a comparison of the effort required to perform the same tasks with words-list-based representations. Manual evaluation of 36 randomly selected sets of topics yielded 81.0% and 93.3% for the precision and F1 scores respectively.
PMID: 32780736 [PubMed - indexed for MEDLINE]
Developing an ontology for representing the domain knowledge specific to non-pharmacological treatment for agitation in dementia.
Alzheimers Dement (N Y). 2020;6(1):e12061
Authors: Zhang Z, Yu P, Chang HCR, Lau SK, Tao C, Wang N, Yin M, Deng C
Abstract
Introduction: A large volume of clinical care data has been generated for managing agitation in dementia. However, the valuable information in these data has not been used effectively to generate insights for improving the quality of care. Application of artificial intelligence technologies offers us enormous opportunities to reuse these data. For health data science to achieve this, this study focuses on using an ontology to encode clinical knowledge for non-pharmacological treatment of agitation in a machine-readable format.
Methods: The resultant ontology, the Dementia-Related Agitation Non-Pharmacological Treatment Ontology (DRANPTO), was developed using a method adopted from the NeOn methodology.
Results: DRANPTO consisted of 569 concepts and 48 object properties. It meets the standards for biomedical ontology.
Discussion: DRANPTO is the first comprehensive semantic representation of non-pharmacological management for agitation in dementia in the long-term care setting. As a knowledge base, it will play a vital role in facilitating the development of intelligent systems for managing agitation in dementia.
PMID: 32995470 [PubMed]
Intersection of the Web-Based Vaping Narrative With COVID-19: Topic Modeling Study
J Med Internet Res. 2020 Oct 30;22(10):e21743. doi: 10.2196/21743.
ABSTRACT
BACKGROUND: The COVID-19 outbreak was designated a global pandemic on March 11, 2020. The relationship between vaping and contracting COVID-19 is unclear, and information on the internet is conflicting. There is some scientific evidence that vaping cannabidiol (CBD), an active ingredient in cannabis that is obtained from the hemp plant, or other substances is associated with more severe manifestations of COVID-19. However, there is also inaccurate information that vaping can aid COVID-19 treatment, as well as expert opinion that CBD, possibly administered through vaping, can mitigate COVID-19 symptoms. Thus, it is necessary to study the spread of inaccurate information to better understand how to promote scientific knowledge and curb inaccurate information, which is critical to the health of vapers. Inaccurate information about vaping and COVID-19 may affect COVID-19 treatment outcomes.
OBJECTIVE: Using structural topic modeling, we aimed to map temporal trends in the web-based vaping narrative (a large data set comprising web-based vaping chatter from several sources) to indicate how the narrative changed from before to during the COVID-19 pandemic.
METHODS: We obtained data using a textual query that scanned a data pool of approximately 200,000 different domains (4,027,172 documents and 361,100,284 words) such as public internet forums, blogs, and social media, from August 1, 2019, to April 21, 2020. We then used structural topic modeling to understand changes in word prevalence and semantic structures within topics around vaping before and after December 31, 2019, when COVID-19 was reported to the World Health Organization.
RESULTS: Broadly, the web-based vaping narrative can be organized into the following groups or archetypes: Harms From Vaping; Vaping Regulation; Vaping as Harm Reduction or Treatment; and Vaping Lifestyle. Three archetypes were observed prior to the emergence of COVID-19; however, four archetypes were identified post-COVID-19 (Vaping as Harm Reduction or Treatment was the additional archetype). A topic related to CBD product preference emerged after COVID-19 was first reported, which may be related to the use of CBD by vapers as a COVID-19 treatment.
CONCLUSIONS: Our main finding is the emergence of a vape-administered CBD treatment narrative around COVID-19 when comparing the web-based vaping narratives before and during the COVID-19 pandemic. These results are key to understanding how vapers respond to inaccurate information about COVID-19, optimizing treatment of vapers who contract COVID-19, and possibly minimizing instances of inaccurate information. The findings have implications for the management of COVID-19 among vapers and the monitoring of web-based content pertinent to tobacco to develop targeted interventions to manage COVID-19 among vapers.
PMID:33001829 | PMC:PMC7641646 | DOI:10.2196/21743
PMO: A knowledge representation model towards precision medicine.
Math Biosci Eng. 2020 Jun 08;17(4):4098-4114
Authors: Hou L, Wu M, Kang HY, Zheng S, Shen L, Qian Q, Li J
Abstract
With the rapid development of biomedical technology, the amount of data in the field of precision medicine (PM) is growing exponentially. Valuable knowledge is included in scattered data in which meaningful biomedical entities and their semantic relationships are buried. Therefore, it is necessary to develop a knowledge representation model like ontology to formally represent the relationships among diseases, phenotypes, genes, mutations, drugs, etc. and achieve effective integration of heterogeneous data. On the basis of existing work, our study focuses on solving the following issues: (i) selecting the primary entities in the PM domain; (ii) collecting and integrating biomedical vocabularies related to the above entities; (iii) defining and normalizing semantic relationships among these entities. We proposed a semi-automated method which improved the original Ontology Development 101 method to build the Precision Medicine Ontology (PMO), including defining the scope of the PMO according to the definition of PM, collecting terms from different biomedical resources, integrating and normalizing the terms by a combination of machine and manual work, defining the annotation properties, reusing existing ontologies and taxonomies, defining semantic relationships, evaluating PMO and creating the PMO website. Finally, the Precision Medicine Vocabulary (PMV) contains 4.53 million terms collected from 62 biomedical vocabularies, and the PMO includes eleven branches of PM concepts such as disease, chemical and drug, phenotype, gene, mutation, gene product and cell, described by 93 semantic relationships among them. PMO is an open, extensible ontology of PM, all of the terms and relationships in which could be obtained from the PMO website (http://www.phoc.org.cn/pmo/).
Compared to existing projects, our work has brought a broader and deeper coverage of mutation, gene and gene product, which enriches the semantic types and vocabulary in the PM domain and benefits all users in terms of medical literature annotation, text mining and knowledge base construction.
PMID: 32987570 [PubMed - in process]
Semantic strategies in ubiquitous music: Deploying the sound sphere ecology in transitional settings.
Heliyon. 2020 Sep;6(9):e04843
Authors: Keller D, Freitas B, Bessa WRB, Simurra I, Farias FM
Abstract
We report the results of a study involving twenty subjects doing musical activities in transitional settings, supported by an ecology of tools based on the metaphor for creative action Sound Sphere. The Sound Sphere Ecology (SFS) is a set of web-based tools, loosely organized around audio mixing and processing tasks. It employs verbal strategies for knowledge transfer to provide support for lay participants and specialists. To understand how the stakeholders influence and are influenced by this design strategy, we carried out a series of experiments involving assessments of the participants' behaviours and of the sonic products during various creative musical tasks with SFS. The overall results were positive, indicating that the proposed metaphor provides effective support for casual interaction, highlighting the participants' level of engagement. As a downside, the assessments pointed to ease of use as the lowest and least consistent item among the rated creative factors. We discuss the implications of these results and propose various design enhancements to enable the usage of a larger pool of resources. Considering the heterogeneous profiles of casual stakeholders, methodological refinements are also proposed to assess the knowledge gained by the participants during the exploratory activities, while augmenting their ability to share knowledge. This is one of the first studies on creativity-action metaphors for casual interaction.
PMID: 32984585 [PubMed]
Indoor location identification of patients for directing virtual care: An AI approach using machine learning and knowledge-based methods.
Artif Intell Med. 2020 Aug;108:101931
Authors: Van Woensel W, Roy PC, Abidi SSR, Abidi SR
Abstract
In a digitally enabled healthcare setting, we posit that an individual's current location is pivotal for supporting many virtual care services, such as tailoring educational content towards an individual's current location, and, hence, current stage in an acute care process; improving activity recognition for supporting self-management in a home-based setting; and guiding individuals with cognitive decline through daily activities in their home. However, unobtrusively estimating an individual's indoor location in real-world care settings is still a challenging problem. Moreover, the needs of location-specific care interventions go beyond absolute coordinates and require the individual's discrete semantic location; i.e., it is the concrete type of an individual's location (e.g., exam vs. waiting room; bathroom vs. kitchen) that will drive the tailoring of educational content or recognition of activities. We utilized Machine Learning methods to accurately identify an individual's discrete location, together with knowledge-based models and tools to supply the associated semantics of identified locations. We considered clustering solutions to improve localization accuracy at the expense of granularity, and investigated sensor fusion-based heuristics to rule out false location estimates. We present an AI-driven indoor localization approach that integrates both data-driven and knowledge-based processes and artifacts. We illustrate the application of our approach in two compelling healthcare use cases, and empirically validate our localization approach at the emergency unit of a large Canadian pediatric hospital.
PMID: 32972660 [PubMed - in process]