Semantic Web

ISO-FOOD ontology: A formal representation of the knowledge within the domain of isotopes for food science.

Fri, 2018-12-07 12:32

Food Chem. 2019 Mar 30;277:382-390

Authors: Eftimov T, Ispirova G, Potočnik D, Ogrinc N, Koroušić Seljak B

Abstract
To link and harmonize different knowledge repositories with respect to isotopic data, we propose the ISO-FOOD ontology, a domain ontology for describing isotopic data within Food Science. The ISO-FOOD ontology consists of metadata and provenance data that need to be stored together with data elements in order to describe isotopic measurements with all the information required for future analysis. The new domain ontology has been linked with existing ontologies, such as the Units of Measurement Ontology, the Food and Nutrient ontologies, and the Bibliographic Ontology. To show how such an ontology can be used in practice, it was populated with 20 isotopic measurements of Slovenian food samples. Describing data in this way offers a powerful technique for organizing and sharing stable isotope data across Food Science.
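
As an illustration of the kind of description meant here, the sketch below uses rdflib (Python) to express one isotopic measurement of a food sample as RDF. The namespace and property names are hypothetical placeholders, not the published ISO-FOOD vocabulary.

    # Minimal sketch of describing one isotopic measurement as RDF with rdflib.
    # The ISOFOOD namespace and property names are hypothetical placeholders,
    # not the published ISO-FOOD vocabulary.
    from rdflib import Graph, Namespace, Literal
    from rdflib.namespace import RDF, XSD

    ISOFOOD = Namespace("http://example.org/iso-food#")   # placeholder namespace
    g = Graph()
    g.bind("isofood", ISOFOOD)

    sample = ISOFOOD["sample/slovenian-honey-001"]
    measurement = ISOFOOD["measurement/001"]

    g.add((sample, RDF.type, ISOFOOD.FoodSample))
    g.add((measurement, RDF.type, ISOFOOD.IsotopicMeasurement))
    g.add((measurement, ISOFOOD.measuredSample, sample))
    g.add((measurement, ISOFOOD.isotopeRatio, Literal("delta13C")))
    g.add((measurement, ISOFOOD.value, Literal(-25.3, datatype=XSD.decimal)))
    g.add((measurement, ISOFOOD.unit, Literal("permil")))            # would link to a units ontology
    g.add((measurement, ISOFOOD.analyticalMethod, Literal("IRMS")))  # provenance metadata

    print(g.serialize(format="turtle"))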

PMID: 30502161 [PubMed - in process]

Categories: Literature Watch

Design Methodology of Microservices to Support Predictive Analytics for IoT Applications.

Thu, 2018-12-06 11:57

Sensors (Basel). 2018 Dec 02;18(12):

Authors: Ali S, Jarwar MA, Chong I

Abstract
In the era of digital transformation, the Internet of Things (IoT) is emerging with improved data collection methods, advanced data processing mechanisms, enhanced analytic techniques, and modern service platforms. However, one of the major challenges is to deliver an integrated design that provides analytic capability for heterogeneous types of data and supports IoT applications with modular and robust services in an environment where requirements keep changing. Enhanced analytic functionality not only provides insights from IoT data but also improves the productivity of processes. Developing an efficient and easily maintainable IoT analytic system is a challenging endeavor for many reasons, such as heterogeneous data sources, growing data volumes, and monolithic service development approaches. In this view, the article proposes a design methodology that embeds analytic capabilities in modular microservices to realize efficient and scalable services that support adaptive IoT applications. Algorithms for analytic procedures are developed to underpin the model. We implement Web Objects to virtualize IoT resources, and semantic data modeling is used to promote interoperability across heterogeneous systems. We demonstrate a use case scenario and validate the proposed design with a prototype implementation.
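
As a rough illustration of "analytic capabilities embedded in modular microservices", the sketch below wraps a toy forecasting function in a small Flask service. The endpoint, payload, and model are invented for illustration and are not the authors' design.

    # Hypothetical sketch of an analytic capability exposed as a small microservice.
    # The endpoint, payload format, and "model" are illustrative only.
    from flask import Flask, request, jsonify

    app = Flask(__name__)

    def predict_next_reading(readings):
        # Toy predictive analytics: naive moving-average forecast of the next value.
        if not readings:
            return None
        window = readings[-3:]
        return sum(window) / len(window)

    @app.route("/analytics/forecast", methods=["POST"])
    def forecast():
        # e.g. {"sensor": "temp-01", "readings": [21.2, 21.5, 21.9]}
        payload = request.get_json(force=True)
        prediction = predict_next_reading(payload.get("readings", []))
        return jsonify({"sensor": payload.get("sensor"), "forecast": prediction})

    if __name__ == "__main__":
        app.run(port=8080)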

PMID: 30513822 [PubMed - in process]

Categories: Literature Watch

Does pre-activating domain knowledge foster elaborated online information search strategies? Comparisons between young and old web user adults.

Wed, 2018-12-05 11:17

Appl Ergon. 2019 Feb;75:201-213

Authors: Sanchiz M, Amadieu F, Fu WT, Chevalier A

Abstract
The present study investigated how pre-activating prior topic knowledge before browsing the web can support the information search performance and strategies of young and older users. The experiment focused on the extent to which prior knowledge pre-activation can help older users overcome difficulties when interacting with a search engine. Twenty-six older (age 60 to 77) and 22 young (age 18 to 32) adults performed 6 information search problems related to health and fantasy movies. Overall, results showed that pre-activating prior topic knowledge increased the time spent evaluating search engine results pages, fostered deeper processing of the navigational paths followed (and thus reduced the exploration of alternative paths), and improved the semantic specificity of queries. Pre-activating prior knowledge helped older adults produce semantically more specific queries when they had lower prior knowledge than young adults. Moderation analyses indicated that pre-activation supported older adults' search performance provided that participants generated semantically relevant keywords during the pre-activation task. These results suggest that pre-activating prior topic knowledge may be a promising way to leverage the beneficial role of prior knowledge in older users' search behavior and performance. Recommendations for designing pre-activation support tools are provided.

PMID: 30509528 [PubMed - in process]

Categories: Literature Watch

Agronomic Linked Data (AgroLD): A knowledge-based system to enable integrative biology in agronomy.

Tue, 2018-12-04 07:42

PLoS One. 2018;13(11):e0198270

Authors: Venkatesan A, Tagny Ngompe G, Hassouni NE, Chentli I, Guignon V, Jonquet C, Ruiz M, Larmande P

Abstract
Recent advances in high-throughput technologies have resulted in a tremendous increase in the amount of omics data produced in plant science. This increase, in conjunction with the heterogeneity and variability of the data, presents a major challenge to adopting an integrative research approach. We are facing an urgent need to effectively integrate and assimilate complementary datasets to understand the biological system as a whole. The Semantic Web offers technologies for the integration of heterogeneous data and their transformation into explicit knowledge thanks to ontologies. We have developed Agronomic Linked Data (AgroLD, www.agrold.org), a knowledge-based system relying on Semantic Web technologies and exploiting standard domain ontologies, to integrate data about plant species of high interest to the plant science community (e.g., rice, wheat, Arabidopsis). We present some integration results of the project, which initially focused on genomics, proteomics and phenomics. AgroLD is now an RDF (Resource Description Framework) knowledge base of 100M triples created by annotating and integrating more than 50 datasets from 10 data sources (such as Gramene.org and TropGeneDB) with 10 ontologies (such as the Gene Ontology and the Plant Trait Ontology). Our evaluation results show that users appreciate the multiple query modes, which support different use cases. AgroLD's objective is to offer a domain-specific knowledge platform for solving complex biological and agronomical questions related to the implication of genes/proteins in, for instance, plant disease resistance or high-yield traits. We expect the resolution of these questions to facilitate the formulation of new scientific hypotheses to be validated with a knowledge-oriented approach.
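
Since AgroLD is an RDF knowledge base, one natural query mode is SPARQL. The sketch below shows a generic label-based lookup from Python with SPARQLWrapper; the endpoint URL and the query vocabulary are assumptions for illustration, not the documented AgroLD interface (see agrold.org for the actual query services).

    # Sketch of querying an RDF knowledge base such as AgroLD over SPARQL.
    # The endpoint URL and the query below are illustrative assumptions.
    from SPARQLWrapper import SPARQLWrapper, JSON

    endpoint = SPARQLWrapper("http://www.agrold.org/sparql")   # assumed endpoint location
    endpoint.setQuery("""
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT ?entity ?label
        WHERE {
            ?entity rdfs:label ?label .
            FILTER(CONTAINS(LCASE(?label), "resistance"))
        }
        LIMIT 10
    """)
    endpoint.setReturnFormat(JSON)

    results = endpoint.query().convert()
    for row in results["results"]["bindings"]:
        print(row["entity"]["value"], row["label"]["value"])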

PMID: 30500839 [PubMed - in process]

Categories: Literature Watch

Developing a healthcare dataset information resource (DIR) based on Semantic Web.

Wed, 2018-11-21 09:02

BMC Med Genomics. 2018 Nov 20;11(Suppl 5):102

Authors: Shi J, Zheng M, Yao L, Ge Y

Abstract
BACKGROUND: The right dataset is essential to obtain the right insights in data science; therefore, it is important for data scientists to have a good understanding of the availability of relevant datasets as well as the content, structure, and existing analyses of these datasets. While a number of efforts are underway to integrate the large amount and variety of datasets, the lack of an information resource that focuses on specific needs of target users of datasets has existed as a problem for years. To address this gap, we have developed a Dataset Information Resource (DIR), using a user-oriented approach, which gathers relevant dataset knowledge for specific user types. In the present version, we specifically address the challenges of entry-level data scientists in learning to identify, understand, and analyze major datasets in healthcare. We emphasize that the DIR does not contain actual data from the datasets but aims to provide comprehensive knowledge about the datasets and their analyses.
METHODS: The DIR leverages Semantic Web technologies and the W3C Dataset Description Profile as the standard for knowledge integration and representation. To extract tailored knowledge for target users, we have developed methods for manual extractions from dataset documentations as well as semi-automatic extractions from related publications, using natural language processing (NLP)-based approaches. A semantic query component is available for knowledge retrieval, and a parameterized question-answering functionality is provided to facilitate the ease of search.
RESULTS: The DIR prototype is composed of four major components-dataset metadata and related knowledge, search modules, question answering for frequently-asked questions, and blogs. The current implementation includes information on 12 commonly used large and complex healthcare datasets. The initial usage evaluation based on health informatics novices indicates that the DIR is helpful and beginner-friendly.
CONCLUSIONS: We have developed a novel user-oriented DIR that provides dataset knowledge specialized for target user groups. Knowledge about datasets is effectively represented using Semantic Web technologies. At this initial stage, the DIR is already able to provide sophisticated and relevant knowledge of 12 datasets to help entry-level health informaticians learn healthcare data analysis using suitable datasets. Further development of both content and functionality is underway.
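
The DIR stores knowledge about datasets rather than the data itself, following the W3C dataset description profile mentioned above. The sketch below, using rdflib with DCAT and Dublin Core terms, shows what such a dataset-level description could look like; the example dataset and property choices are illustrative assumptions, not the DIR's actual schema.

    # Sketch of recording dataset-level knowledge (not the data itself) as RDF.
    # The example dataset and the chosen properties are illustrative assumptions.
    from rdflib import Graph, Namespace, Literal, URIRef
    from rdflib.namespace import RDF, DCTERMS

    DCAT = Namespace("http://www.w3.org/ns/dcat#")
    g = Graph()
    g.bind("dcat", DCAT)
    g.bind("dcterms", DCTERMS)

    ds = URIRef("http://example.org/dir/dataset/icu-ehr-sample")   # placeholder identifier
    g.add((ds, RDF.type, DCAT.Dataset))
    g.add((ds, DCTERMS.title, Literal("De-identified ICU EHR dataset (illustrative entry)")))
    g.add((ds, DCTERMS.description, Literal("Structured clinical records suitable for "
                                            "entry-level healthcare data analysis.")))
    g.add((ds, DCAT.theme, Literal("critical care")))
    g.add((ds, DCAT.landingPage, URIRef("http://example.org/dir/docs/icu-ehr-sample")))

    print(g.serialize(format="turtle"))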

PMID: 30453940 [PubMed - in process]

Categories: Literature Watch

Automated Characterization of Mobile Health Apps' Features by Extracting Information From the Web: An Exploratory Study.

Tue, 2018-11-20 08:33

Am J Audiol. 2018 Nov 19;27(3S):482-492

Authors: Paglialonga A, Schiavo M, Caiani EG

Abstract
Purpose: The aim of this study was to test the viability of a novel method for automated characterization of mobile health apps.
Method: In this exploratory study, we developed the basic modules of an automated method, based on text analytics, able to characterize the apps' medical specialties by extracting information from the web. We analyzed apps in the Medical and Health & Fitness categories on the U.S. iTunes store.
Results: We automatically crawled 42,007 Medical and 79,557 Health & Fitness apps' webpages. After removing duplicates and non-English apps, the database included 80,490 apps. We tested the accuracy of the automated method on a subset of 400 apps. We observed 91% accuracy for the identification of apps related to health or medicine, 95% accuracy for sensory systems apps, and an average of 82% accuracy for classification into medical specialties.
Conclusions: These preliminary results suggested the viability of automated characterization of apps based on text analytics and highlighted directions for improvement in terms of classification rules and vocabularies, analysis of semantic types, and extraction of key features (promoters, services, and users). The availability of automated tools for app characterization is important as it may support health care professionals in informed, aware selection of health apps to recommend to their patients.
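
The classification step described above maps app descriptions to medical specialties using text analytics. A minimal sketch of the general idea, assuming a simple keyword-vocabulary approach (the authors' actual classification rules and vocabularies are more elaborate):

    # Illustrative sketch of vocabulary-based classification of app descriptions
    # into medical specialties. The tiny vocabularies below are invented examples.
    SPECIALTY_VOCABULARY = {
        "audiology": {"hearing", "tinnitus", "audiogram", "hearing aid"},
        "cardiology": {"heart rate", "blood pressure", "ecg", "arrhythmia"},
        "dermatology": {"skin", "mole", "eczema", "psoriasis"},
    }

    def classify_app(description):
        text = description.lower()
        matches = []
        for specialty, keywords in SPECIALTY_VOCABULARY.items():
            if any(keyword in text for keyword in keywords):
                matches.append(specialty)
        return matches or ["unclassified"]

    print(classify_app("Track your hearing aid settings and daily tinnitus levels."))
    # -> ['audiology']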

PMID: 30452752 [PubMed - in process]

Categories: Literature Watch

Neo4j graph database realizes efficient storage performance of oilfield ontology.

Sun, 2018-11-18 07:25

PLoS One. 2018;13(11):e0207595

Authors: Gong F, Ma Y, Gong W, Li X, Li C, Yuan X

Abstract
The integration of multidisciplinary oilfield ontologies is increasingly important for the growth of the Semantic Web. However, current methods encounter performance bottlenecks, either when storing data or when searching for information, as data volumes grow. To overcome these challenges, we propose a domain-ontology storage approach based on the Neo4j graph database, focusing on the data storage and information retrieval of oilfield ontology. We have designed mapping rules from ontology files to the Neo4j database, which can greatly reduce the required storage space. A two-tier index architecture, including object and triad indexing, is used to keep loading times low and to match different patterns for accurate retrieval, and we propose a retrieval method based on this architecture. In our evaluation, the proposed approach saves 13.04% of storage space and improves retrieval efficiency by more than 30 times compared with relational-database methods.
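
The mapping from ontology files into Neo4j can be pictured as turning each subject-predicate-object triple into two nodes and one relationship. The sketch below, using the official Neo4j Python driver, illustrates that idea; the node label, property keys, and credentials are assumptions, not the paper's actual mapping rules.

    # Illustrative sketch of loading one ontology triple (subject, predicate, object)
    # into Neo4j with the official Python driver (version 5.x API). The node label,
    # property keys, and credentials are assumptions for illustration.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

    def load_triple(tx, subj, pred, obj):
        tx.run(
            "MERGE (s:Resource {uri: $subj}) "
            "MERGE (o:Resource {uri: $obj}) "
            "MERGE (s)-[r:RELATION {predicate: $pred}]->(o)",
            subj=subj, pred=pred, obj=obj,
        )

    with driver.session() as session:
        session.execute_write(load_triple,
                              "oil:Well_12", "oil:locatedIn", "oil:Block_A")
    driver.close()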

PMID: 30444913 [PubMed - in process]

Categories: Literature Watch

Association of Monoclonal Gammopathy with Progression to ESKD among US Veterans.

Sun, 2018-11-18 07:25

Clin J Am Soc Nephrol. 2018 Nov 15;:

Authors: Burwick N, Adams SV, Todd-Stenberg JA, Burrows NR, Pavkov ME, O'Hare AM

Abstract
BACKGROUND AND OBJECTIVES: Whether patients with monoclonal protein are at a higher risk for progression of kidney disease is not known. The goal of this study was to measure the association of monoclonal protein with progression to ESKD.
DESIGN, SETTING, PARTICIPANTS, & MEASUREMENTS: This was a retrospective cohort study of 2,156,317 patients who underwent serum creatinine testing between October 1, 2000 and September 30, 2001 at a Department of Veterans Affairs medical center, among whom 21,898 had paraprotein testing within 1 year before or after cohort entry. Progression to ESKD was measured using linked data from the US Renal Data System.
RESULTS: Overall, 1,741,707 cohort members had an eGFR≥60 ml/min per 1.73 m2, 283,988 had an eGFR of 45-59 ml/min per 1.73 m2, 103,123 had an eGFR of 30-44 ml/min per 1.73 m2, and 27,499 had an eGFR of 15-29 ml/min per 1.73 m2. The crude incidence of ESKD ranged from 0.7 to 80 per 1000 person-years from the highest to the lowest eGFR category. Patients with low versus preserved eGFR were more likely to be tested for monoclonal protein but no more likely to have a positive test result. In adjusted analyses, a positive versus negative test result was associated with a higher risk of ESKD among patients with an eGFR≥60 ml/min per 1.73 m2 (hazard ratio, 1.67; 95% confidence interval, 1.22 to 2.29) and those with an eGFR of 15-29 ml/min per 1.73 m2 (hazard ratio, 1.38; 95% confidence interval, 1.07 to 1.77), but not among those with an eGFR of 30-59 ml/min per 1.73 m2. Progression to ESKD was attributed to a monoclonal process in 21 of 76 versus 7 of 174 patients with monoclonal protein and preserved versus severely reduced eGFR at cohort entry.
CONCLUSIONS: The detection of monoclonal protein provides little information on ESKD risk for most patients with a low eGFR. Further study is required to better understand factors contributing to a positive association of monoclonal protein with ESKD risk in patients with preserved and severely reduced levels of eGFR.

PMID: 30442867 [PubMed - as supplied by publisher]

Categories: Literature Watch

A Genetic Circuit Compiler: Generating Combinatorial Genetic Circuits with Web Semantics and Inference.

Fri, 2018-11-09 07:57

ACS Synth Biol. 2018 Nov 08;:

Authors: Waites W, Misirli G, Cavaliere M, Danos V, Wipat A

Abstract
A central strategy of synthetic biology is to understand the basic processes of living creatures through engineering organisms using the same building blocks. Biological machines described in terms of parts can be studied by computer simulation in any of several languages or robotically assembled in vitro. In this paper we present a language, the Genetic Circuit Description Language (GCDL), and a compiler, the Genetic Circuit Compiler (GCC). This language describes genetic circuits at a level of granularity appropriate both for automated assembly in the laboratory and for deriving simulation code. The GCDL follows Semantic Web practice, and the compiler makes novel use of the logical inference facilities that are therefore available. We present the GCDL and the compiler structure as a study of a tool for generating κ-language simulations from semantic descriptions of genetic circuits.

PMID: 30408409 [PubMed - as supplied by publisher]

Categories: Literature Watch

SIFR annotator: ontology-based semantic annotation of French biomedical text and clinical notes.

Thu, 2018-11-08 07:23

BMC Bioinformatics. 2018 Nov 06;19(1):405

Authors: Tchechmedjiev A, Abdaoui A, Emonet V, Zevio S, Jonquet C

Abstract
BACKGROUND: Despite a wide adoption of English in science, a significant amount of biomedical data are produced in other languages, such as French. Yet a majority of natural language processing or semantic tools as well as domain terminologies or ontologies are only available in English, and cannot be readily applied to other languages, due to fundamental linguistic differences. However, semantic resources are required to design semantic indexes and transform biomedical (text)data into knowledge for better information mining and retrieval.
RESULTS: We present the SIFR Annotator ( http://bioportal.lirmm.fr/annotator ), a publicly accessible ontology-based annotation web service to process biomedical text data in French. The service, developed during the Semantic Indexing of French Biomedical Data Resources (2013-2019) project, is included in the SIFR BioPortal, an open platform to host French biomedical ontologies and terminologies based on the technology developed by the US National Center for Biomedical Ontology. The portal facilitates the use and fostering of ontologies by offering a set of services (search, mappings, metadata, versioning, visualization, recommendation), including for annotation purposes. We introduce the adaptations and improvements made in applying the technology to French, as well as a number of language-independent additional features (implemented by means of a proxy architecture), in particular annotation scoring and clinical context detection. We evaluate the performance of the SIFR Annotator on different biomedical data, using the available French corpora Quaero (titles from French MEDLINE abstracts and EMEA drug labels) and CépiDC (ICD-10 coding of death certificates), and discuss our results with respect to the CLEF eHealth information extraction tasks.
CONCLUSIONS: We show that the web service performs comparably to other knowledge-based annotation approaches in recognizing entities in biomedical text and reaches state-of-the-art levels in clinical context detection (negation, experiencer, temporality). Additionally, the SIFR Annotator is the first openly accessible web tool to annotate and contextualize French biomedical text with ontology concepts, leveraging a dictionary currently made of 28 terminologies and ontologies and 333,000 concepts. The code is openly available, and we also provide a Docker packaging for easy local deployment to process sensitive (e.g., clinical) data in-house ( https://github.com/sifrproject ).
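
The annotator is exposed as a web service in the style of the NCBO BioPortal technology it builds on. The sketch below shows what a call from Python could look like; the service URL, parameter names, and API-key requirement are assumptions based on NCBO-style annotators, so consult the SIFR documentation for the actual interface.

    # Sketch of calling an NCBO-style annotator REST service from Python.
    # The URL, parameters, and API-key requirement are assumptions; check the
    # SIFR documentation (http://bioportal.lirmm.fr/annotator) for the real interface.
    import requests

    ANNOTATOR_URL = "http://services.bioportal.lirmm.fr/annotator"   # assumed service URL
    params = {
        "text": "Le patient ne presente pas de diabete de type 2.",
        "apikey": "YOUR_API_KEY",      # placeholder
        "longest_only": "true",
    }

    response = requests.get(ANNOTATOR_URL, params=params, timeout=30)
    response.raise_for_status()
    for annotation in response.json():
        print(annotation)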

PMID: 30400805 [PubMed - in process]

Categories: Literature Watch

Analysis of Errors in Dictated Clinical Documents Assisted by Speech Recognition Software and Professional Transcriptionists.

Tue, 2018-10-30 09:02

JAMA Netw Open. 2018 Jul;1(3):

Authors: Zhou L, Blackley SV, Kowalski L, Doan R, Acker WW, Landman AB, Kontrient E, Mack D, Meteer M, Bates DW, Goss FR

Abstract
IMPORTANCE: Accurate clinical documentation is critical to health care quality and safety. Dictation services supported by speech recognition (SR) technology and professional medical transcriptionists are widely used by US clinicians. However, the quality of SR-assisted documentation has not been thoroughly studied.
OBJECTIVE: To identify and analyze errors at each stage of the SR-assisted dictation process.
DESIGN, SETTING, AND PARTICIPANTS: This cross-sectional study collected a stratified random sample of 217 notes (83 office notes, 75 discharge summaries, and 59 operative notes) dictated by 144 physicians between January 1 and December 31, 2016, at 2 health care organizations using Dragon Medical 360 | eScription (Nuance). Errors were annotated in the SR engine-generated document (SR), the medical transcriptionist-edited document (MT), and the physician's signed note (SN). Each document was compared with a criterion standard created from the original audio recordings and medical record review.
MAIN OUTCOMES AND MEASURES: Error rate; mean errors per document; error frequency by general type (eg, deletion), semantic type (eg, medication), and clinical significance; and variations by physician characteristics, note type, and institution.
RESULTS: Among the 217 notes, there were 144 unique dictating physicians: 44 female (30.6%) and 10 unknown sex (6.9%). Mean (SD) physician age was 52 (12.5) years (median [range] age, 54 [28-80] years). Among 121 physicians for whom specialty information was available (84.0%), 35 specialties were represented, including 45 surgeons (37.2%), 30 internists (24.8%), and 46 others (38.0%). The error rate in SR notes was 7.4% (ie, 7.4 errors per 100 words). It decreased to 0.4% after transcriptionist review and 0.3% in SNs. Overall, 96.3% of SR notes, 58.1% of MT notes, and 42.4% of SNs contained errors. Deletions were most common (34.7%), then insertions (27.0%). Among errors at the SR, MT, and SN stages, 15.8%, 26.9%, and 25.9%, respectively, involved clinical information, and 5.7%, 8.9%, and 6.4%, respectively, were clinically significant. Discharge summaries had higher mean SR error rates than other types (8.9% vs 6.6%; difference, 2.3%; 95% CI, 1.0%-3.6%; P < .001). Surgeons' SR notes had lower mean error rates than other physicians' (6.0% vs 8.1%; difference, 2.2%; 95% CI, 0.8%-3.5%; P = .002). One institution had a higher mean SR error rate (7.6% vs 6.6%; difference, 1.0%; 95% CI, -0.2% to 2.8%; P = .10) but lower mean MT and SN error rates (0.3% vs 0.7%; difference, -0.3%; 95% CI, -0.63% to -0.04%; P = .03 and 0.2% vs 0.6%; difference, -0.4%; 95% CI, -0.7% to -0.2%; P = .003).
CONCLUSIONS AND RELEVANCE: Seven in 100 words in SR-generated documents contain errors; many errors involve clinical information. That most errors are corrected before notes are signed demonstrates the importance of manual review, quality assurance, and auditing.

PMID: 30370424 [PubMed]

Categories: Literature Watch

The Semantic Student: Using Knowledge Modeling Activities to Enhance Enquiry-Based Group Learning in Engineering Education.

Tue, 2018-10-30 06:00

Stud Health Technol Inform. 2018;256:431-443

Authors: Stacey P

Abstract
This paper argues that training engineering students in basic knowledge modeling techniques, using linked data principles and semantic Web tools within an enquiry-based group learning environment, enables them to enhance their domain knowledge and their meta-cognitive skills. Knowledge modeling skills are in keeping with the principles of Universal Design for Instruction. Learners are empowered with the regulation of cognition as they become more aware of their own development. This semantic student approach was trialed with a group of third-year computer engineering students taking a module on computer architecture. An enquiry-based group learning activity was developed to help learners meet selected module learning outcomes. Learners were required to use semantic feature analysis and linked data principles to create a visual model of their knowledge structure. Results show that overall student attainment increased when knowledge modeling activities were included as part of the learning process. A recommendation for practice, incorporating knowledge modeling as a learning strategy within an overall engineering curriculum framework, is described. This can be achieved using semantic Web technologies such as semantic wikis and linked data tools.

PMID: 30371401 [PubMed - in process]

Categories: Literature Watch

An online tool for measuring and visualizing phenotype similarities using HPO.

Sun, 2018-10-28 07:57

BMC Genomics. 2018 Aug 13;19(Suppl 6):571

Authors: Peng J, Xue H, Hui W, Lu J, Chen B, Jiang Q, Shang X, Wang Y

Abstract
BACKGROUND: The Human Phenotype Ontology (HPO) is one of the most popular bioinformatics resources. Recently, HPO-based phenotype semantic similarity has been effectively applied to model patient phenotype data. However, the existing tools are adapted from Gene Ontology (GO)-based term similarity measures, and their design is not optimized for the unique features of HPO. In addition, existing tools only accept HPO terms as input and only provide plain text-based outputs.
RESULTS: We present PhenoSimWeb, a web application that allows researchers to measure HPO-based phenotype semantic similarities using four approaches borrowed from GO-based similarity measurements. In addition, we provide an approach that takes the unique properties of HPO into account. PhenoSimWeb also accepts text describing phenotypes as input, since clinical phenotype data are usually recorded as text, and provides a graphic visualization interface to display the resulting phenotype network.
CONCLUSIONS: PhenoSimWeb is an easy-to-use and functional online application. Researchers can use it to calculate phenotype similarity conveniently, predict phenotype-associated genes or diseases, and visualize the network of phenotype interactions. PhenoSimWeb is available at http://120.77.47.2:8080.
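
The GO-derived measures such tools adapt are typically based on information content, e.g., Resnik similarity, which scores two terms by the information content of their most informative common ancestor. A minimal sketch on an invented term hierarchy (the HPO identifiers and frequencies below are toy values, not real annotations):

    # Sketch of Resnik-style information-content similarity on a toy hierarchy.
    # The term IDs and annotation frequencies are invented for illustration.
    import math

    # child -> set of parents (toy fragment of a phenotype ontology)
    PARENTS = {
        "HP:B": {"HP:A"},
        "HP:C": {"HP:A"},
        "HP:D": {"HP:B"},
        "HP:E": {"HP:B", "HP:C"},
    }
    # how often each term (or a descendant) annotates an entity, out of 100 entities
    FREQUENCY = {"HP:A": 100, "HP:B": 40, "HP:C": 30, "HP:D": 10, "HP:E": 5}

    def ancestors(term):
        result = {term}
        for parent in PARENTS.get(term, set()):
            result |= ancestors(parent)
        return result

    def information_content(term):
        return -math.log(FREQUENCY[term] / 100)

    def resnik_similarity(t1, t2):
        common = ancestors(t1) & ancestors(t2)
        return max(information_content(t) for t in common)

    # IC of the most informative common ancestor of HP:D and HP:E, which is HP:B
    print(round(resnik_similarity("HP:D", "HP:E"), 3))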

PMID: 30367579 [PubMed - in process]

Categories: Literature Watch

An Ontology-Driven Approach for Integrating Intelligence to Manage Human and Ecological Health Risks in the Geospatial Sensor Web.

Sun, 2018-10-28 07:57

Sensors (Basel). 2018 Oct 25;18(11):

Authors: Meng X, Wang F, Xie Y, Song G, Ma S, Hu S, Bai J, Yang Y

Abstract
Due to the rapid installation of a massive number of fixed and mobile sensors, monitoring machines are intentionally or unintentionally involved in the production of a large amount of geospatial data. Environmental sensors and related software applications are rapidly altering human lifestyles and even impacting ecological and human health. However, there are few geospatial sensor web (GSW) applications designed for specific ecological and public health questions. In this paper, we propose an ontology-driven approach for integrating intelligence to manage human and ecological health risks in the GSW. We design a Human and Ecological health Risks Ontology (HERO) based on a semantic sensor network ontology template. We also illustrate a web-based prototype, the Human and Ecological Health Risk Management System (HaEHMS), which helps health experts and decision makers estimate human and ecological health risks. We demonstrate this intelligent system through a case study of automatic prediction of air quality and related health risks.
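
The "semantic sensor network ontology" referred to above is presumably the W3C SSN/SOSA vocabulary for sensors and observations. As an illustration of what an observation expressed with SOSA terms looks like, here is a minimal rdflib sketch; the sensor, observed property, and result values are placeholders, not content from the HERO ontology or the air-quality case study.

    # Sketch of describing one air-quality observation with the W3C SOSA vocabulary.
    # The sensor, property, and result values are placeholders for illustration.
    from rdflib import Graph, Namespace, Literal
    from rdflib.namespace import RDF, XSD

    SOSA = Namespace("http://www.w3.org/ns/sosa/")
    EX = Namespace("http://example.org/haehms/")          # placeholder namespace

    g = Graph()
    g.bind("sosa", SOSA)

    obs = EX["observation/pm25-2018-10-25T0800"]
    g.add((obs, RDF.type, SOSA.Observation))
    g.add((obs, SOSA.madeBySensor, EX["sensor/station-042-pm25"]))
    g.add((obs, SOSA.observedProperty, EX["property/PM2.5"]))
    g.add((obs, SOSA.hasSimpleResult, Literal(38.0, datatype=XSD.double)))
    g.add((obs, SOSA.resultTime, Literal("2018-10-25T08:00:00Z", datatype=XSD.dateTime)))

    print(g.serialize(format="turtle"))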

PMID: 30366399 [PubMed - in process]

Categories: Literature Watch

Automated ontology generation framework powered by linked biomedical ontologies for disease-drug domain.

Sat, 2018-10-20 07:22

Comput Methods Programs Biomed. 2018 Oct;165:117-128

Authors: Alobaidi M, Malik KM, Hussain M

Abstract
OBJECTIVE AND BACKGROUND: The exponential growth of the unstructured data available in the biomedical literature and in Electronic Health Records (EHRs) requires powerful novel technologies and architectures to unlock the information hidden in that data. The success of smart healthcare applications such as clinical decision support systems, disease diagnosis systems, and healthcare management systems depends on knowledge that machines can understand, interpret, and use to infer new knowledge. In this regard, ontological data models are expected to play a vital role in organizing, integrating, and making informative inferences with the knowledge implicit in unstructured data, and in representing the resultant knowledge in a form that machines can understand. However, constructing such models is challenging because they demand intensive labor, domain experts, and ontology engineers. Such requirements impose a limit on the scale and scope of ontological data models. We present a framework that mitigates the time and labor needed to build ontologies and achieves machine interoperability.
METHODS: Empowered by linked biomedical ontologies, our proposed Automated Ontology Generation Framework consists of five major modules: (a) text processing, using a compute-on-demand approach; (b) medical semantic annotation, using n-gram, ontology linking, and classification algorithms; (c) relation extraction, using graph methods and syntactic patterns; (d) semantic enrichment, using RDF mining; and (e) a domain inference engine to build the formal ontology.
RESULTS: Quantitative evaluations show 84.78% recall, 53.35% precision, and a 67.70% F-measure for disease-drug concept identification; 85.51% recall, 69.61% precision, and a 76.74% F-measure for taxonomic relation extraction; and 77.20% recall, 40.10% precision, and a 52.78% F-measure for biomedical non-taxonomic relation extraction.
CONCLUSION: We present an automated ontology generation framework that is empowered by linked biomedical ontologies. This framework integrates various natural language processing, semantic enrichment, syntactic pattern, and graph algorithm-based techniques. Moreover, it shows that using linked biomedical ontologies offers a promising solution to the problem of automating the process of disease-drug ontology generation.
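
The recall, precision, and F-measure figures above are related by the standard harmonic-mean formula. The short helper below shows that computation on invented counts; it does not reproduce the paper's data.

    # Standard precision / recall / F-measure computation, the metrics reported above.
    # The counts below are invented to show the mechanics, not the paper's data.
    def precision_recall_f1(true_positives, false_positives, false_negatives):
        precision = true_positives / (true_positives + false_positives)
        recall = true_positives / (true_positives + false_negatives)
        f1 = 2 * precision * recall / (precision + recall)
        return precision, recall, f1

    p, r, f1 = precision_recall_f1(true_positives=8, false_positives=2, false_negatives=4)
    print(f"precision={p:.2%} recall={r:.2%} F-measure={f1:.2%}")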

PMID: 30337066 [PubMed - in process]

Categories: Literature Watch

The Future of Computational Chemogenomics.

Fri, 2018-10-19 19:02

Methods Mol Biol. 2018;1825:425-450

Authors: Jacoby E, Brown JB

Abstract
Following the elucidation of the human genome, chemogenomics emerged at the beginning of the twenty-first century as an interdisciplinary research field aiming to accelerate target and drug discovery by making the best use of genomic data and the data linkable to it. What started as a systematization approach within protein target families now encompasses all types of chemical compounds and gene products. A key objective of chemogenomics is the establishment, extension, analysis, and prediction of a comprehensive SAR matrix which, through its application, will enable further systematization in drug discovery. Herein we outline future perspectives of chemogenomics, including the extension to new molecular modalities, the potential extension beyond pharma to the agro and nutrition sectors, and the importance for environmental protection. The focus is on computational sciences, with potential applications in compound library design, virtual screening, hit assessment, analysis of phenotypic screens, lead finding and optimization, and systems biology-based prediction of toxicology and translational research.

PMID: 30334216 [PubMed - in process]

Categories: Literature Watch

A Rule-Based Reasoner for Underwater Robots Using OWL and SWRL.

Fri, 2018-10-19 19:02

Sensors (Basel). 2018 Oct 16;18(10):

Authors: Zhai Z, Martínez Ortega JF, Lucas Martínez N, Castillejo P

Abstract
Web Ontology Language (OWL) is designed to represent varied knowledge about things and the relationships between things. It is widely used to express complex models and to address the information heterogeneity of specific domains, such as underwater environments and robots. With the help of OWL, heterogeneous underwater robots are able to cooperate with each other by exchanging information with the same meaning, and robot operators can organize coordination more easily. However, OWL has expressivity limitations when representing general rules, especially statements of the form "If … Then … Else …". Fortunately, the Semantic Web Rule Language (SWRL) has strong rule representation capabilities. In this paper, we propose a rule-based reasoner for inference and query services based on OWL and SWRL. SWRL rules are inserted directly into the ontologies through several model transformation steps instead of using a specific editor. In the verification experiments, the SWRL rules were inserted successfully and efficiently into the OWL-based ontologies, and the queries returned completely correct results. This rule-based reasoner is a promising approach to increasing the inference capability of ontology-based models, and it makes a significant contribution when semantic queries are performed.
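
To make the OWL-versus-SWRL distinction concrete, the sketch below attaches one "If … Then …"-style SWRL rule to a toy ontology using the owlready2 Python library. The classes, property, and threshold are invented, and the paper inserts its rules through model transformations rather than this API.

    # Sketch of adding a SWRL rule to an OWL ontology with owlready2.
    # The classes, property, and threshold are invented for illustration.
    from owlready2 import get_ontology, Thing, Imp, FunctionalProperty, sync_reasoner_pellet

    onto = get_ontology("http://example.org/underwater-robots#")

    with onto:
        class Robot(Thing): pass
        class LowBatteryRobot(Robot): pass
        class has_battery_level(Robot >> float, FunctionalProperty): pass

        rule = Imp()
        # "If a robot's battery level is below 0.2, then it is a LowBatteryRobot."
        rule.set_as_rule(
            "Robot(?r), has_battery_level(?r, ?b), lessThan(?b, 0.2) -> LowBatteryRobot(?r)"
        )

    auv = Robot("auv_01", has_battery_level=0.15)

    # Applying the rule requires a SWRL-capable reasoner (Pellet ships with
    # owlready2 but needs a local Java installation).
    sync_reasoner_pellet(infer_property_values=True, infer_data_property_values=True)
    print(auv.is_a)   # expected to include LowBatteryRobot after reasoning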

PMID: 30332798 [PubMed]

Categories: Literature Watch

Thalia: Semantic search engine for biomedical abstracts.

Thu, 2018-10-18 06:27

Bioinformatics. 2018 Oct 17;:

Authors: Soto AJ, Przybyla P, Ananiadou S

Abstract
Summary: While the publication rate of the biomedical literature has been growing steadily during the last decades, the accessibility of pertinent research publications for biologists and medical practitioners remains a challenge. This paper describes Thalia, a semantic search engine that can recognize eight different types of concepts occurring in biomedical abstracts. Thalia is available via a web-based interface or a RESTful API. A key aspect of our search engine is that it is updated from PubMed on a daily basis. We describe here the main building blocks of our tool as well as an evaluation of Thalia's retrieval capabilities in the context of a precision medicine dataset.
Availability: Thalia is available at http://nactem.ac.uk/Thalia_BI/.
Supplementary information: Supplementary data are available at Bioinformatics online.

PMID: 30329013 [PubMed - as supplied by publisher]

Categories: Literature Watch

Web-Based Information Infrastructure Increases the Interrater Reliability of Medical Coders: Quasi-Experimental Study.

Wed, 2018-10-17 06:02

J Med Internet Res. 2018 Oct 15;20(10):e274

Authors: Varghese J, Sandmann S, Dugas M

Abstract
BACKGROUND: Medical coding is essential for standardized communication and integration of clinical data. The Unified Medical Language System by the National Library of Medicine is the largest clinical terminology system for medical coders and Natural Language Processing tools. However, the abundance of ambiguous codes leads to low rates of uniform coding among different coders.
OBJECTIVE: The objective of our study was to measure uniform coding among different medical experts in terms of interrater reliability and analyze the effect on interrater reliability using an expert- and Web-based code suggestion system.
METHODS: We conducted a quasi-experimental study in which 6 medical experts coded 602 medical items from structured quality assurance forms or free-text eligibility criteria of 20 different clinical trials. The medical item content was selected on the basis of diseases that are leading causes of mortality according to World Health Organization data. The intervention comprised using a semiautomatic code suggestion tool that is linked to a European information infrastructure providing a large medical text corpus of >300,000 medical form items with expert-assigned semantic codes. Krippendorff alpha (Kalpha) with bootstrap analysis was used for the interrater reliability analysis, and coding times were measured before and after the intervention.
RESULTS: The intervention improved interrater reliability in structured quality assurance form items (from Kalpha=0.50, 95% CI 0.43-0.57 to Kalpha=0.62, 95% CI 0.55-0.69) and free-text eligibility criteria (from Kalpha=0.19, 95% CI 0.14-0.24 to Kalpha=0.43, 95% CI 0.37-0.50) while preserving or slightly reducing the mean coding time per item for all 6 coders. Regardless of the intervention, precoordination and structured items were associated with significantly high interrater reliability, but the proportion of items that were precoordinated significantly increased after intervention (eligibility criteria: OR 4.92, 95% CI 2.78-8.72; quality assurance: OR 1.96, 95% CI 1.19-3.25).
CONCLUSIONS: The Web-based code suggestion mechanism improved interrater reliability toward moderate or even substantial intercoder agreement. Precoordination and the use of structured versus free-text data elements are key drivers of higher interrater reliability.
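
Krippendorff alpha (Kalpha), the agreement statistic used above, can be computed with the open-source krippendorff Python package. The sketch below uses invented ratings (coders as rows, items as columns, codes encoded as integers, missing assignments as NaN) and does not reproduce the study's bootstrap confidence intervals.

    # Sketch of computing Krippendorff's alpha for inter-coder agreement.
    # The ratings below are invented; codes are encoded as integers and
    # np.nan marks items a coder did not annotate.
    import numpy as np
    import krippendorff

    # rows = coders, columns = medical items, values = assigned semantic code ID
    reliability_data = np.array([
        [1,      2, 3, 1, np.nan, 2],
        [1,      2, 3, 2, 1,      2],
        [1, np.nan, 3, 1, 1,      2],
    ])

    alpha = krippendorff.alpha(reliability_data=reliability_data,
                               level_of_measurement="nominal")
    print(f"Krippendorff's alpha = {alpha:.2f}")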

PMID: 30322834 [PubMed - in process]

Categories: Literature Watch

Predictors of long-term care among nonagenarians: the Vitality 90 + Study with linked data of the care registers.

Wed, 2018-10-17 06:02

Aging Clin Exp Res. 2018 Aug;30(8):913-919

Authors: Kauppi M, Raitanen J, Stenholm S, Aaltonen M, Enroth L, Jylhä M

Abstract
BACKGROUND: The need for long-term care services increases with age. However, little is known about the predictors of long-term care (LTC) entry among the oldest old.
AIMS: The aim of this study was to assess predictors of LTC entry in a sample of men and women aged 90 years and older.
METHODS: This study was based on the Vitality 90 + Study, a population-based study of nonagenarians in the city of Tampere, Finland. Baseline information about health, functioning and living conditions was collected by mailed questionnaires. Information about LTC was drawn from care registers during a follow-up period extending up to 11 years. Cox regression models were used for the analyses, taking into account the competing risk of mortality.
RESULTS: During the mean follow-up period of 2.3 years, 844 (43%) subjects entered first time into LTC. Female gender (HR 1.39, 95% CI 1.14-1.69), having at least two chronic conditions (HR 1.24, 95% CI 1.07-1.44), living alone (HR 1.37, 95% CI 1.15-1.63) and help received sometimes (HR 1.23, 95% CI 1.02-1.49) or daily (HR 1.68, 95% CI 1.38-2.04) were independent predictors of LTC entry.
CONCLUSION: The risk of entering LTC was increased among women, subjects with at least two chronic conditions, those living alone, and those receiving a higher level of help. Since the number of nonagenarians, and thereby the need for care, will increase, it is essential to understand predictors of LTC entry in order to offer appropriate care for the oldest old in the future.

PMID: 29222731 [PubMed - indexed for MEDLINE]

Categories: Literature Watch
