Semantic Web
VEO-Engine: Interfacing and Reasoning with an Emotion Ontology for Device Visual Expression.
HCI Int 2018 Posters Ext Abstr (2018). 2018 Jul;851:349-355
Authors: Amith M, Lin R, Liang C, Gong Y, Tao C
Abstract
In order for machines to understand or express emotion to users, the specific emotions must be formally defined and the software must encode how those emotions are to be expressed. This is particularly important if devices or computer-based tools are utilized in clinical settings, which may be stressful for patients and where emotions can dominate their decision making. We have reported our development and feasibility results of an ontology, Visualized Emotion Ontology (VEO), that links abstract visualizations to the specific emotions they express. Here, we used VEO with the VEO-Engine, a software API package that interfaces with the VEO. The VEO-Engine was developed in Java using Apache Jena and OWL-API. The software package was tested on a Raspberry Pi machine with a small touchscreen display that linked each visualization to an emotion. The VEO-Engine stores input parameters of emotional situations and valences to reason and interpret users' emotions using the ontology-based reasoner. With this software, devices can be interfaced wirelessly, so smart devices with visual displays can interact with the ontology. By means of the VEO-Engine, we show the portability and usability of the VEO in human-computer interaction.
PMID: 30701263 [PubMed]
Complexity in disease management: A linked data analysis of multimorbidity in Aboriginal and non-Aboriginal patients hospitalised with atherothrombotic disease in Western Australia.
PLoS One. 2018;13(8):e0201496
Authors: Hussain MA, Katzenellenbogen JM, Sanfilippo FM, Murray K, Thompson SC
Abstract
BACKGROUND: Hospitalisation for atherothrombotic disease (ATD) is expected to rise in coming decades. However, increasingly, associated comorbidities impose challenges in managing patients and deciding appropriate secondary prevention. We investigated the prevalence and pattern of multimorbidity (presence of two or more chronic conditions) in Aboriginal and non-Aboriginal Western Australian residents with ATDs.
METHODS AND FINDINGS: We used population-based de-identified linked administrative health data from 1 January 2000 to 30 June 2014 to identify a cohort of patients aged 25-59 years admitted to Western Australian hospitals with a discharge diagnosis of ATD. The prevalence of common chronic diseases in these patients was estimated and the patterns of comorbidities and multimorbidities empirically explored using two different approaches: identification of the most commonly occurring pairs and triplets of comorbid diseases, and through latent class analysis (LCA). Half of the cohort had multimorbidity, although this was much higher in Aboriginal people (Aboriginal: 79.2% vs. non-Aboriginal: 39.3%). Only a quarter were without any documented comorbidities. Hypertension, diabetes, alcohol abuse disorders and acid peptic diseases were the leading comorbidities in the major comorbid combinations across both Aboriginal and non-Aboriginal cohorts. The LCA identified four and six distinct clinically meaningful classes of multimorbidity for Aboriginal and non-Aboriginal patients, respectively. Out of the six groups in non-Aboriginal patients, four were similar to the groups identified in Aboriginal patients. The largest proportion of patients (33% in Aboriginal and 66% in non-Aboriginal) was assigned to the "minimally diseased" (or relatively healthy) group, with most patients having less than two conditions. Other groups showed variability in degree and pattern of multimorbidity.
CONCLUSION: Multimorbidity is common in ATD patients and the comorbidities tend to interact and cluster together. Physicians need to consider these in their clinical practice. Different treatment and secondary prevention strategies are likely to be useful for management in these cluster groups.
PMID: 30106971 [PubMed - indexed for MEDLINE]
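The first of the two approaches named in the abstract above, counting the most commonly occurring pairs and triplets of comorbid diseases, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the patient records and the function names `top_combinations` and `multimorbidity_prevalence` are invented for the example, and the latent class analysis is not covered.

```python
from collections import Counter
from itertools import combinations


def top_combinations(patients, size, n=3):
    """Return the n most frequent sets of `size` conditions seen in one patient."""
    counts = Counter()
    for conditions in patients:
        # Sorting makes (diabetes, hypertension) and its reverse the same key.
        for combo in combinations(sorted(set(conditions)), size):
            counts[combo] += 1
    return counts.most_common(n)


def multimorbidity_prevalence(patients):
    """Share of patients with two or more distinct chronic conditions."""
    return sum(len(set(p)) >= 2 for p in patients) / len(patients)


# Hypothetical records, invented for illustration only.
patients = [
    ["hypertension", "diabetes"],
    ["hypertension", "diabetes", "alcohol abuse"],
    ["hypertension"],
    ["diabetes", "hypertension", "acid peptic disease"],
    [],
]

print(top_combinations(patients, size=2, n=1))
print(top_combinations(patients, size=3, n=2))
print(multimorbidity_prevalence(patients))
```

Calling `top_combinations` with `size=2` and `size=3` gives the pair and triplet rankings; the prevalence function applies the abstract's definition of multimorbidity (two or more chronic conditions).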
Optimising the use of linked administrative data for infectious diseases research in Australia.
Public Health Res Pract. 2018 Jun 14;28(2):
Authors: Moore HC, Blyth CC
Abstract
Infectious diseases remain a major cause of morbidity in Australia. A wealth of data exists in administrative datasets, which are linked through established data-linkage infrastructure in most Australian states and territories. These linkages can support robust studies to investigate the burden of disease, the relative contribution of various aetiological agents to disease, and the effectiveness of population-based prevention policies - research that is critical to the success of current and future vaccination programs. At a recent symposium in Perth, epidemiologists, clinicians and policy makers in the infectious diseases field discussed the various benefits of, and barriers to, data-linkage research, with a focus on respiratory infection research. A number of issues and recommendations emerged. The demand for data-linkage projects is starting to outweigh the capabilities of existing data-linkage infrastructure. There is a need to further streamline processes relating to data access, increase data sharing and conduct nationally collaborative projects. Concerns about data security and sharing across jurisdictional borders can be addressed through multiple safe data solutions. Researchers need to do more to ensure that the benefits of linking datasets to answer policy-relevant questions are being realised for the benefit of community groups, government authorities, funding bodies and policy makers. Increased collaboration and engagement across all sectors can optimise the use of linked data to help reduce the burden of infectious diseases.
PMID: 29925082 [PubMed - indexed for MEDLINE]
queryMed: Semantic Web functions for linking pharmacological and medical knowledge to data.
Bioinformatics. 2019 Jan 18;:
Authors: Rivault Y, Dameron O, Le Meur N
Abstract
Summary: In public health research and more precisely in the reuse of electronic health data, selecting patients, identifying specific events and interpreting results typically requires biomedical knowledge. The queryMed R package aims to facilitate the integration of medical and pharmacological knowledge stored in formats compliant with the Linked Data paradigm (e.g. OWL ontologies and RDF datasets) into the R statistical programming environment. We show how it allowed us to identify all the drugs prescribed for critical limb ischemia (CLI) and also to detect one contraindicated prescription for one patient by linking a medical database of 1003 CLI patients to ontologies.
Availability: queryMed is readily usable for medical data mappings and enrichment. Sources, R vignettes and test data are available on GitHub (https://github.com/yannrivault/queryMed) and are archived on Zenodo (https://doi.org/10.5281/zenodo.1323481).
PMID: 30657867 [PubMed - as supplied by publisher]
Unsupervised Low-Dimensional Vector Representations for Words, Phrases and Text that are Transparent, Scalable, and produce Similarity Metrics that are not Redundant with Neural Embeddings.
J Biomed Inform. 2019 Jan 14;:103096
Authors: Smalheiser NR, Cohen AM, Bonifield G
Abstract
Neural embeddings are a popular set of methods for representing words, phrases or text as a low dimensional vector (typically 50-500 dimensions). However, it is difficult to interpret these dimensions in a meaningful manner, and creating neural embeddings requires extensive training and tuning of multiple parameters and hyperparameters. We present here a simple unsupervised method for representing words, phrases or text as a low dimensional vector, in which the meaning and relative importance of dimensions is transparent to inspection. We have created a near-comprehensive vector representation of words, and selected bigrams, trigrams and abbreviations, using the set of titles and abstracts in PubMed as a corpus. This vector is used to create several novel implicit word-word and text-text similarity metrics. The implicit word-word similarity metrics correlate well with human judgement of word pair similarity and relatedness, and outperform or equal all other reported methods on a variety of biomedical benchmarks, including several implementations of neural embeddings trained on PubMed corpora. Our implicit word-word metrics capture different aspects of word-word relatedness than word2vec-based metrics and are only partially correlated (rho = 0.5-0.8 depending on task and corpus). The vector representations of words, bigrams, trigrams, abbreviations, and PubMed title+abstracts are all publicly available from http://arrowsmith.psych.uic.edu/arrowsmith_uic/word_similarity_metrics.html for release under CC-BY-NC license. Several public web query interfaces are also available at the same site, including one which allows the user to specify a given word and view its most closely related terms according to direct co-occurrence as well as different implicit similarity metrics.
PMID: 30654030 [PubMed - as supplied by publisher]
Prediction of venous thromboembolism using semantic and sentiment analyses of clinical narratives.
Comput Biol Med. 2018 03 01;94:1-10
Authors: Sabra S, Mahmood Malik K, Alobaidi M
Abstract
Venous thromboembolism (VTE) is the third most common cardiovascular disorder. It affects people of both genders at ages as young as 20 years. The increased number of VTE cases with a high fatality rate of 25% at first occurrence makes preventive measures essential. Clinical narratives are a rich source of knowledge and should be included in the diagnosis and treatment processes, as they may contain critical information on risk factors. It is very important to make such narrative blocks of information usable for searching, health analytics, and decision-making. This paper proposes a Semantic Extraction and Sentiment Assessment of Risk Factors (SESARF) framework. Unlike traditional machine-learning approaches, SESARF, which consists of two main algorithms, namely, ExtractRiskFactor and FindSeverity, prepares a feature vector as the input to a support vector machine (SVM) classifier to make a diagnosis. SESARF matches and maps the concepts of VTE risk factors and finds adjectives and adverbs that reflect their levels of severity. SESARF uses a semantic- and sentiment-based approach to analyze clinical narratives of electronic health records (EHR) and then predict a diagnosis of VTE. We use a dataset of 150 clinical narratives, 80% of which are used to train our prediction classifier support vector machine, with the remaining 20% used for testing. Semantic extraction and sentiment analysis results yielded precisions of 81% and 70%, respectively. Using a support vector machine, prediction of patients with VTE yielded precision and recall values of 54.5% and 85.7%, respectively.
PMID: 29353160 [PubMed - indexed for MEDLINE]
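For readers less familiar with the two metrics reported at the end of the abstract above, the arithmetic can be sketched as follows. The test-set size follows from the abstract (20% of 150 narratives), but the individual confusion-matrix counts below are an assumption: they are one set of whole numbers that happens to reproduce 54.5% and 85.7%, not figures taken from the paper.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are true positives."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of actual positives that were predicted positive."""
    return tp / (tp + fn)


test_size = 150 * 20 // 100  # 20% of the 150 narratives are held out

# Hypothetical counts, chosen only because they match the reported percentages.
tp, fp, fn = 6, 5, 1
tn = test_size - tp - fp - fn

print(f"test narratives: {test_size}, true negatives: {tn}")
print(f"precision = {precision(tp, fp):.1%}")  # 54.5%
print(f"recall    = {recall(tp, fn):.1%}")     # 85.7%
```

Under these assumed counts the classifier would miss few true VTE cases (high recall) while raising almost as many false alarms as correct alerts (modest precision).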
Fuzzy Ontology and LSTM-Based Text Mining: A Transportation Network Monitoring System for Assisting Travel.
Sensors (Basel). 2019 Jan 09;19(2):
Authors: Ali F, El-Sappagh S, Kwak D
Abstract
Intelligent Transportation Systems (ITSs) utilize a sensor network-based system to gather and interpret traffic information. In addition, mobility users utilize mobile applications to collect transport information for safe traveling. However, these types of information are not sufficient to examine all aspects of the transportation networks. Therefore, both ITSs and mobility users need a smart approach and social media data, which can help ITSs examine transport services, support traffic and control management, and help mobility users travel safely. People utilize social networks to share their thoughts and opinions regarding transportation, which are useful for ITSs and travelers. However, user-generated text on social media is short in length, unstructured, and covers a broad range of dynamic topics. The application of recent Machine Learning (ML) approaches is inefficient for extracting relevant features from unstructured data, detecting word polarity of features, and classifying the sentiment of features correctly. In addition, ML classifiers consistently miss the semantic feature of the word meaning. A novel fuzzy ontology-based semantic knowledge with Word2vec model is proposed to improve the task of transportation feature extraction and text classification using the Bi-directional Long Short-Term Memory (Bi-LSTM) approach. The proposed fuzzy ontology describes semantic knowledge about entities and features and their relation in the transportation domain. Fuzzy ontology and smart methodology are developed in Web Ontology Language and Java, respectively. By utilizing word embedding with fuzzy ontology as a representation of text, Bi-LSTM shows satisfactory improvement in both the extraction of features and the classification of the unstructured text of social media.
PMID: 30634527 [PubMed - in process]
Assessing HPV vaccination perceptions with online social media in Italy.
Int J Gynecol Cancer. 2019 Jan 10;:
Authors: Angioli R, Casciello M, Lopez S, Plotti F, Minco LD, Frati P, Fineschi V, Panici PB, Scaletta G, Capriglione S, Miranda A, Feole L, Terranova C
Abstract
OBJECTIVE: Because of the widespread availability of the internet and social media, people often collect and disseminate news online, making it important to understand the underlying mechanisms to steer promotional strategies in healthcare. The aim of this study is to analyze perceptions regarding the human papillomavirus (HPV) vaccine in Italy.
METHODS: From August 2015 to July 2016, articles, news, posts, and tweets were collected from social networks, posts on forums, blogs, and pictures about HPV. Using other keywords and specific semantic rules, we selected conversations presenting the negative or positive perceptions of HPV. We divided them into subgroups depending on the website, publication date, authors, main theme, and transmission modality.
RESULTS: Most conversations occurred on social networks. Of all the conversations regarding HPV, more than 50% were about vaccination. With regard to conversations exclusively on the HPV vaccine, 47%, 32%, and 21% were positive, negative and neutral, respectively. Only 9% of the conversations mentioned the vaccine trade name and, in these conversations, perception was almost always negative. We observed many peaks in positive conversation trends compared with negative trends. The peaks were related to the web dissemination of particular news regarding HPV vaccination.
CONCLUSIONS: In this study we have shown how mass media influences the diffusion of both negative and positive perceptions about HPV vaccines and suggest better ways to inform people about the importance of HPV vaccination.
PMID: 30630890 [PubMed - as supplied by publisher]
TogoGenome/TogoStanza: modularized Semantic Web genome database.
Database (Oxford). 2019 Jan 01;2019:
Authors: Katayama T, Kawashima S, Okamoto S, Moriya Y, Chiba H, Naito Y, Fujisawa T, Mori H, Takagi T
Abstract
TogoGenome is a genome database that is purely based on the Semantic Web technology, which enables the integration of heterogeneous data and flexible semantic searches. All the information is stored as Resource Description Framework (RDF) data, and the reporting web pages are generated on the fly using SPARQL Protocol and RDF Query Language (SPARQL) queries. TogoGenome provides a semantic-faceted search system by gene functional annotation, taxonomy, phenotypes and environment based on the relevant ontologies. TogoGenome also serves as an interface to conduct semantic comparative genomics by which a user can observe pan-organism or organism-specific genes based on the functional aspect of gene annotations and the combinations of organisms from different taxa. The TogoGenome database exhibits a modularized structure, and each module in the report pages is separately served as TogoStanza, which is a generic framework for rendering an information block as IFRAME/Web Components, which can, unlike several other monolithic databases, also be reused to construct other databases. TogoGenome and TogoStanza have been under development since 2012 and are freely available along with their source codes on the GitHub repositories at https://github.com/togogenome/ and https://github.com/togostanza/, respectively, under the MIT license.
PMID: 30624651 [PubMed - in process]
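The idea in the abstract above of storing everything as RDF triples and answering faceted searches by pattern matching can be illustrated with a toy, pure-Python stand-in. This is a sketch under loud assumptions: it uses no RDF library and no SPARQL, and every identifier (`ex:gene1`, `ex:hasFunction`, `ex:inOrganism`, `match`, `genes_with`) is hypothetical rather than taken from TogoGenome.

```python
# Each fact is a (subject, predicate, object) triple, as in RDF.
# All identifiers are made up for illustration.
TRIPLES = {
    ("ex:gene1", "ex:inOrganism", "ex:orgA"),
    ("ex:gene1", "ex:hasFunction", "ex:transport"),
    ("ex:gene2", "ex:inOrganism", "ex:orgB"),
    ("ex:gene2", "ex:hasFunction", "ex:transport"),
    ("ex:gene3", "ex:inOrganism", "ex:orgA"),
    ("ex:gene3", "ex:hasFunction", "ex:kinase"),
}


def match(s=None, p=None, o=None):
    """Return triples matching a pattern; None acts like a query variable."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def genes_with(function, organism):
    """Faceted search: genes that satisfy both facets at once."""
    by_function = {s for s, _, _ in match(p="ex:hasFunction", o=function)}
    by_organism = {s for s, _, _ in match(p="ex:inOrganism", o=organism)}
    return sorted(by_function & by_organism)


print(genes_with("ex:transport", "ex:orgA"))
```

A real deployment would send an equivalent graph pattern to a SPARQL endpoint; the intersection of the two facets here plays the role of joining two triple patterns on a shared variable.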
Investigating the role of interleukin-1 beta and glutamate in inflammatory bowel disease and epilepsy using discovery browsing.
J Biomed Semantics. 2018 Dec 27;9(1):25
Authors: Rindflesch TC, Blake CL, Cairelli MJ, Fiszman M, Zeiss CJ, Kilicoglu H
Abstract
BACKGROUND: Structured electronic health records are a rich resource for identifying novel correlations, such as co-morbidities and adverse drug reactions. For drug development and better understanding of biomedical phenomena, such correlations need to be supported by viable hypotheses about the mechanisms involved, which can then form the basis of experimental investigations.
METHODS: In this study, we demonstrate the use of discovery browsing, a literature-based discovery method, to generate plausible hypotheses elucidating correlations identified from structured clinical data. The method is supported by the Semantic MEDLINE web application, which pinpoints interesting concepts and relevant MEDLINE citations, which are used to build a coherent hypothesis.
RESULTS: Discovery browsing revealed a plausible explanation for the correlation between epilepsy and inflammatory bowel disease that was found in an earlier population study. The generated hypothesis involves interleukin-1 beta (IL-1 beta) and glutamate, and suggests that IL-1 beta influence on glutamate levels is involved in the etiology of both epilepsy and inflammatory bowel disease.
CONCLUSIONS: The approach presented in this paper can supplement population-based correlation studies by enabling the scientist to identify literature that may justify the novel patterns identified in such studies and can underpin basic biomedical research that can lead to improved treatments and better healthcare outcomes.
PMID: 30587224 [PubMed - in process]
Big Data analysis to improve care for people living with serious illness: The potential to use new emerging technology in palliative care.
Palliat Med. 2018 01;32(1):164-166
Authors: Nwosu AC, Collins B, Mason S
PMID: 28805118 [PubMed - indexed for MEDLINE]
Identifying Principles for the Construction of an Ontology-Based Knowledge Base: A Case Study Approach.
JMIR Med Inform. 2018 Dec 21;6(4):e52
Authors: Jing X, Hardiker NR, Kay S, Gao Y
Abstract
BACKGROUND: Ontologies are key enabling technologies for the Semantic Web. The Web Ontology Language (OWL) is a semantic markup language for publishing and sharing ontologies.
OBJECTIVE: The supply of customizable, computable, and formally represented molecular genetics information and health information, via electronic health record (EHR) interfaces, can play a critical role in achieving precision medicine. In this study, we used cystic fibrosis as an example to build an Ontology-based Knowledge Base prototype on Cystic Fibrosis (OntoKBCF) to supply such information via an EHR prototype. In addition, we elaborate on the construction and representation principles, approaches, applications, and representation challenges that we faced in the construction of OntoKBCF. The principles and approaches can be referenced and applied in constructing other ontology-based domain knowledge bases.
METHODS: First, we defined the scope of OntoKBCF according to possible clinical information needs about cystic fibrosis on both a molecular level and a clinical phenotype level. We then selected the knowledge sources to be represented in OntoKBCF. We utilized top-to-bottom content analysis and bottom-up construction to build OntoKBCF. Protégé-OWL was used to construct OntoKBCF. The construction principles included (1) to use existing basic terms as much as possible; (2) to use intersection and combination in representations; (3) to represent as many different types of facts as possible; and (4) to provide 2-5 examples for each type. HermiT 1.3.8.413 within Protégé-5.1.0 was used to check the consistency of OntoKBCF.
RESULTS: OntoKBCF was constructed successfully, with the inclusion of 408 classes, 35 properties, and 113 equivalent classes. OntoKBCF includes both atomic concepts (such as amino acid) and complex concepts (such as "adolescent female cystic fibrosis patient") and their descriptions. We demonstrated that OntoKBCF could make customizable molecular and health information available automatically and usable via an EHR prototype. The main challenges include the provision of a more comprehensive account of different patient groups as well as the representation of uncertain knowledge, ambiguous concepts, and negative statements and more complicated and detailed molecular mechanisms or pathway information about cystic fibrosis.
CONCLUSIONS: Although cystic fibrosis is just one example, based on the current structure of OntoKBCF, it should be relatively straightforward to extend the prototype to cover different topics. Moreover, the principles underpinning its development could be reused for building alternative human monogenetic diseases knowledge bases.
PMID: 30578220 [PubMed]
Modifier Ontologies for frequency, certainty, degree, and coverage phenotype modifier.
Biodivers Data J. 2018;(6):e29232
Authors: Endara L, Thessen AE, Cole HA, Walls R, Gkoutos G, Cao Y, Chong SS, Cui H
Abstract
Background: When phenotypic characters are described in the literature, they may be constrained or clarified with additional information such as the location or degree of expression; these terms are called "modifiers". With efforts underway to convert narrative character descriptions to computable data, ontologies for such modifiers are needed. Such ontologies can also be used to guide term usage in future publications. Spatial and method modifiers are the subjects of ontologies that already have been developed or are under development. In this work, frequency (e.g., rarely, usually), certainty (e.g., probably, definitely), degree (e.g., slightly, extremely), and coverage modifiers (e.g., sparsely, entirely) are collected, reviewed, and used to create two modifier ontologies with different design considerations. The basic goal is to express the sequential relationships within a type of modifier, for example, usually is more frequent than rarely, in order to allow data annotated with ontology terms to be classified accordingly.
Method: Two designs are proposed for the ontology, both using the list pattern: a closed ordered list (i.e., five-bin design) and an open ordered list design. The five-bin design puts the modifier terms into a set of 5 fixed bins with interval object properties, for example, one_level_more/less_frequently_than, where new terms can only be added as synonyms to existing classes. The open list approach starts with 5 bins, but supports the extensibility of the list via ordinal properties, for example, more/less_frequently_than, allowing new terms to be inserted as a new class anywhere in the list. The consequences of the different design decisions are discussed in the paper. CharaParser was used to extract modifiers from plant, ant, and other taxonomic descriptions. After a manual screening, 130 modifier words were selected as the candidate terms for the modifier ontologies. Four curators/experts (three biologists and one information scientist specialized in biosemantics) reviewed and categorized the terms into 20 bins using the Ontology Term Organizer (OTO) (http://biosemantics.arizona.edu/OTO). Inter-curator variations were reviewed and expressed in the final ontologies.
Results: Frequency, certainty, degree, and coverage terms with complete agreement among all curators were used as class labels or exact synonyms. Terms with different interpretations were either excluded or included using "broader synonym" or "not recommended" annotation properties. These annotations explicitly allow for the user to be aware of the semantic ambiguity associated with the terms and whether they should be used with caution or avoided. Expert categorization results showed that 16 out of 20 bins contained terms with full agreement, suggesting that differentiating the modifiers into 5 levels/bins balances the need to differentiate modifiers and the need for the ontology to reflect user consensus. Two ontologies, developed using the Protégé ontology editor, are made available as OWL files and can be downloaded from https://github.com/biosemantics/ontologies.
Contribution: We built the first two modifier ontologies following a consensus-based approach with terms commonly used in taxonomic literature. The five-bin ontology has been used in the Explorer of Taxon Concepts web toolkit to compute the similarity between characters extracted from literature to facilitate taxon concepts alignments. The two ontologies will also be used in an ontology-informed authoring tool for taxonomists to facilitate consistency in modifier term usage.
PMID: 30532623 [PubMed]
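The difference between the closed five-bin design and the open ordered list described in the abstract above can be made concrete with a small data-structure sketch. It is not the published OWL model: the class `OrderedModifiers`, its methods, and most of the frequency labels are hypothetical (only "rarely" and "usually" appear in the abstract), and a plain Python list stands in for the ontology's ordinal properties.

```python
class OrderedModifiers:
    """Modifier terms kept in order from weakest to strongest."""

    def __init__(self, labels, closed=False):
        self._order = list(labels)  # class labels, weakest first
        self._synonyms = {}         # synonym -> class label
        self._closed = closed       # True mimics the closed five-bin design

    def add_synonym(self, synonym, label):
        """Attach a new term to an existing class (allowed in both designs)."""
        if label not in self._order:
            raise KeyError(label)
        self._synonyms[synonym] = label

    def insert_after(self, new_label, existing_label):
        """Insert a new class into the list (open design only)."""
        if self._closed:
            raise ValueError("closed list: new terms may only be added as synonyms")
        self._order.insert(self._order.index(existing_label) + 1, new_label)

    def _rank(self, term):
        return self._order.index(self._synonyms.get(term, term))

    def more_than(self, a, b):
        """True if term a sits above term b in the ordering."""
        return self._rank(a) > self._rank(b)


# Hypothetical frequency bins; only "rarely" and "usually" come from the abstract.
FREQUENCY = ["never", "rarely", "sometimes", "usually", "always"]

five_bin = OrderedModifiers(FREQUENCY, closed=True)
five_bin.add_synonym("seldom", "rarely")

open_list = OrderedModifiers(FREQUENCY)
open_list.insert_after("often", "sometimes")

print(five_bin.more_than("usually", "seldom"))  # True
print(open_list.more_than("usually", "often"))  # True
```

The closed variant keeps the number of levels fixed, so distances between bins stay meaningful; the open variant trades that for the ability to slot a new class between two existing ones.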
Analysis for the design of a novel integrated framework for the return to work of wheelchair users.
Work. 2018 Nov 27;:
Authors: Arlati S, Spoladore D, Mottura S, Zangiacomi A, Ferrigno G, Sacchetti R, Sacco M
Abstract
BACKGROUND: Return to work represents an important milestone for workers who were injured during a workplace accident, especially if the injury results in needing a wheelchair for locomotion.
OBJECTIVE: The aim of the study was to design a framework for training novice wheelchair users in regaining autonomy in activities of daily living and in the workplace and for providing medical personnel with objective data on users' health and work-related capabilities.
METHODS: The framework design was accomplished following the "Usability Engineering Life Cycle" model. According to it, three subsequent steps defined as "Know your User", "Competitive Analysis" and "Participatory Design" have been carried out to devise the described framework.
RESULTS: The needs of the end-users of the framework were identified during the first phase; the Competitive Analysis phase addressed standard care solutions, Virtual Reality-based wheelchair simulators, the current methodologies for the assessment of the health condition of people with disability and the use of semantic technologies in human resources. The Participatory Design phase led to the definition of an integrated user-centred framework supporting the return to work of wheelchair users.
CONCLUSION: The results of this work consist in the design of an innovative training process based on virtual reality scenarios and supported by semantic web technologies. In the near future, the design process will proceed in collaboration with the Italian National Institute for Insurance against Accidents at Work (INAIL). The whole framework will then be implemented to support the current vocational rehabilitation process within INAIL premises.
PMID: 30507601 [PubMed - as supplied by publisher]
Discovery of Emerging Design Patterns in Ontologies Using Tree Mining.
Semant Web. 2018;9(4):517-544
Authors: Ławrynowicz A, Potoniec J, Robaczyk M, Tudorache T
Abstract
The research goal of this work is to investigate modeling patterns that recur in ontologies. Such patterns may originate from certain design solutions, and they may possibly indicate emerging ontology design patterns. We describe our tree-mining method for identifying the emerging design patterns. The method works in two steps: (1) we transform the ontology axioms into a tree shape in order to find axiom patterns; and then, (2) we use association analysis to mine co-occurring axiom patterns in order to extract emerging design patterns. We conduct an experimental study on a set of 331 ontologies from the BioPortal repository. We show that recurring axiom patterns appear across all individual ontologies, as well as across the whole set. In individual ontologies, we find frequent and non-trivial patterns with and without variables. Some of the former patterns have more than 300,000 occurrences. The longest pattern without a variable discovered from the whole ontology set has size 12, and it appears in 14 ontologies. To the best of our knowledge, this is the first method for automatic discovery of emerging design patterns in ontologies. Finally, we demonstrate that we are able to automatically detect patterns, for which we have manually confirmed that they are fragments of ontology design patterns described in the literature. Since our method is not specific to particular ontologies, we conclude that we should be able to discover new, emerging design patterns for arbitrary ontology sets.
PMID: 30505251 [PubMed]
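Step (2) of the method summarised above, association analysis over co-occurring axiom patterns, can be illustrated with textbook support and confidence over pairs of patterns. This is a generic sketch and an assumption about what "association analysis" involves, not the authors' algorithm; the pattern names `P1` to `P3`, the toy ontology contents, and the function `association_rules` are hypothetical, and the tree-mining step (1) is not shown.

```python
from itertools import combinations


def association_rules(transactions, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) rules over pairs of items."""
    n = len(transactions)

    def support(items):
        # Share of transactions that contain every item in `items`.
        return sum(items <= t for t in transactions) / n

    all_items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(all_items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


# Hypothetical input: the set of axiom patterns found in each of four ontologies.
ontologies = [
    {"P1", "P2"},
    {"P1", "P2", "P3"},
    {"P1"},
    {"P2", "P3"},
]

for lhs, rhs, sup, conf in association_rules(ontologies, 0.5, 0.9):
    print(f"{lhs} -> {rhs}  support={sup:.2f}  confidence={conf:.2f}")
```

Here each ontology plays the role of a transaction; a rule such as `P3 -> P2` with high confidence says that wherever `P3` occurs, `P2` tends to occur too, which is the kind of co-occurrence a candidate design pattern would show.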
ISO-FOOD ontology: A formal representation of the knowledge within the domain of isotopes for food science.
Food Chem. 2019 Mar 30;277:382-390
Authors: Eftimov T, Ispirova G, Potočnik D, Ogrinc N, Koroušić Seljak B
Abstract
To link and harmonize different knowledge repositories with respect to isotopic data, we propose an ISO-FOOD ontology as a domain ontology for describing isotopic data within Food Science. The ISO-FOOD ontology consists of metadata and provenance data that needs to be stored together with data elements in order to describe isotopic measurements with all necessary information required for future analysis. The new domain has been linked with existing ontologies, such as Units of Measurements Ontology, Food, Nutrient and the Bibliographic Ontology. To show how such an ontology can be used in practise, it was populated with 20 isotopic measurements of Slovenian food samples. Describing data in this way offers a powerful technique for organizing and sharing stable isotope data across Food Science.
PMID: 30502161 [PubMed - in process]
Design Methodology of Microservices to Support Predictive Analytics for IoT Applications.
Sensors (Basel). 2018 Dec 02;18(12):
Authors: Ali S, Jarwar MA, Chong I
Abstract
In the era of digital transformation, the Internet of Things (IoT) is emerging with improved data collection methods, advanced data processing mechanisms, enhanced analytic techniques, and modern service platforms. However, one of the major challenges is to provide an integrated design that can provide analytic capability for heterogeneous types of data and support the IoT applications with modular and robust services in an environment where the requirements keep changing. An enhanced analytic functionality not only provides insights from IoT data, but also fosters productivity of processes. Developing an efficient and easily maintainable IoT analytic system is a challenging endeavor due to many reasons such as heterogeneous data sources, growing data volumes, and monolithic service development approaches. In this view, the article proposes a design methodology that presents analytic capabilities embedded in modular microservices to realize efficient and scalable services in order to support adaptive IoT applications. Algorithms for analytic procedures are developed to underpin the model. We implement the Web Objects to virtualize IoT resources. The semantic data modeling is used to promote interoperability across the heterogeneous systems. We demonstrate the use case scenario and validate the proposed design with a prototype implementation.
PMID: 30513822 [PubMed - in process]
Does pre-activating domain knowledge foster elaborated online information search strategies? Comparisons between young and old web user adults.
Appl Ergon. 2019 Feb;75:201-213
Authors: Sanchiz M, Amadieu F, Fu WT, Chevalier A
Abstract
The present study investigated how pre-activating prior topic knowledge before browsing the web can support the information search performance and strategies of young and older users. The experiment focused on analyzing to what extent prior knowledge pre-activation might help older users cope with their difficulties when interacting with a search engine. 26 older (age 60 to 77) and 22 young (age 18 to 32) adults performed 6 information search problems related to health and fantastic movies. Overall, results showed that pre-activating prior topic knowledge increased the time spent evaluating the search engine results pages, fostered deeper processing of the navigational paths elaborated (and thus reduced the exploration of different navigational paths) and improved the semantic specificity of queries. Pre-activating prior knowledge helped older adults produce semantically more specific queries when they had lower prior knowledge than young adults. Moderation analyses indicated that the pre-activation supported older adults' search performance under the condition that participants generated semantically relevant keywords during this pre-activation task. These results imply that prior topic knowledge pre-activation may be a good lead to support the beneficial role of prior knowledge in older users' search behavior and performance. Recommendations for designing pre-activation support tools are provided.
PMID: 30509528 [PubMed - in process]
Agronomic Linked Data (AgroLD): A knowledge-based system to enable integrative biology in agronomy.
PLoS One. 2018;13(11):e0198270
Authors: Venkatesan A, Tagny Ngompe G, Hassouni NE, Chentli I, Guignon V, Jonquet C, Ruiz M, Larmande P
Abstract
Recent advances in high-throughput technologies have resulted in a tremendous increase in the amount of omics data produced in plant science. This increase, in conjunction with the heterogeneity and variability of the data, presents a major challenge to adopting an integrative research approach. We are facing an urgent need to effectively integrate and assimilate complementary datasets to understand the biological system as a whole. The Semantic Web offers technologies for the integration of heterogeneous data and their transformation into explicit knowledge thanks to ontologies. We have developed Agronomic Linked Data (AgroLD; www.agrold.org), a knowledge-based system relying on Semantic Web technologies and exploiting standard domain ontologies, to integrate data about plant species of high interest for the plant science community, e.g., rice, wheat, and arabidopsis. We present some integration results of the project, which initially focused on genomics, proteomics and phenomics. AgroLD is now an RDF (Resource Description Framework) knowledge base of 100M triples created by annotating and integrating more than 50 datasets coming from 10 data sources (such as Gramene.org and TropGeneDB) with 10 ontologies (such as the Gene Ontology and Plant Trait Ontology). Our evaluation results show that users appreciate the multiple query modes, which support different use cases. AgroLD's objective is to offer a domain-specific knowledge platform to solve complex biological and agronomical questions related to the implication of genes/proteins in, for instance, plant disease resistance or high yield traits. We expect the resolution of these questions to facilitate the formulation of new scientific hypotheses to be validated with a knowledge-oriented approach.
PMID: 30500839 [PubMed - in process]
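As a rough illustration of the integration idea described in the AgroLD abstract above, the following minimal Python sketch merges two small sets of subject-predicate-object triples and answers a question that neither set can answer alone. It is an assumption-laden toy, not AgroLD's schema or query interface: every identifier in it (gene:G1, protein:P1, onto:TERM_1, the ex: predicates, and the helper functions) is hypothetical, and a real deployment would use an RDF store queried with SPARQL.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def match(triples: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Hypothetical source A: genomic data linking a gene to its protein.
source_a: Set[Triple] = {
    ("gene:G1", "ex:encodes", "protein:P1"),
    ("gene:G2", "ex:encodes", "protein:P2"),
}

# Hypothetical source B: protein annotations using an ontology term.
source_b: Set[Triple] = {
    ("protein:P1", "ex:annotated_with", "onto:TERM_1"),
}


def genes_for_term(triples: Set[Triple], term: str) -> List[str]:
    """Find genes whose encoded protein is annotated with the given term."""
    proteins = {s for s, _, _ in match(triples, p="ex:annotated_with", o=term)}
    return sorted(s for s, _, o in match(triples, p="ex:encodes") if o in proteins)


# Integration here is just set union: shared identifiers join the sources.
merged = source_a | source_b
result = genes_for_term(merged, "onto:TERM_1")
```

The join works only because both sources use the same identifier for the protein; agreeing on such shared identifiers and ontology terms is the part that annotation against standard ontologies is meant to provide.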
Developing a healthcare dataset information resource (DIR) based on Semantic Web.
BMC Med Genomics. 2018 Nov 20;11(Suppl 5):102
Authors: Shi J, Zheng M, Yao L, Ge Y
Abstract
BACKGROUND: The right dataset is essential to obtain the right insights in data science; therefore, it is important for data scientists to have a good understanding of the availability of relevant datasets as well as the content, structure, and existing analyses of these datasets. While a number of efforts are underway to integrate the large amount and variety of datasets, the lack of an information resource that focuses on the specific needs of target users of datasets has been a problem for years. To address this gap, we have developed a Dataset Information Resource (DIR), using a user-oriented approach, which gathers relevant dataset knowledge for specific user types. In the present version, we specifically address the challenges of entry-level data scientists in learning to identify, understand, and analyze major datasets in healthcare. We emphasize that the DIR does not contain actual data from the datasets but aims to provide comprehensive knowledge about the datasets and their analyses.
METHODS: The DIR leverages Semantic Web technologies and the W3C Dataset Description Profile as the standard for knowledge integration and representation. To extract tailored knowledge for target users, we have developed methods for manual extraction from dataset documentation as well as semi-automatic extraction from related publications, using natural language processing (NLP)-based approaches. A semantic query component is available for knowledge retrieval, and a parameterized question-answering functionality is provided to make searching easier.
RESULTS: The DIR prototype is composed of four major components: dataset metadata and related knowledge, search modules, question answering for frequently asked questions, and blogs. The current implementation includes information on 12 commonly used large and complex healthcare datasets. The initial usage evaluation based on health informatics novices indicates that the DIR is helpful and beginner-friendly.
CONCLUSIONS: We have developed a novel user-oriented DIR that provides dataset knowledge specialized for target user groups. Knowledge about datasets is effectively represented in the Semantic Web. At this initial stage, the DIR has already been able to provide sophisticated and relevant knowledge of 12 datasets to help entry-level health informaticians learn healthcare data analysis using suitable datasets. Further development of both content and function levels is underway.
PMID: 30453940 [PubMed - in process]