Semantic Web

MetSigDis: a manually curated resource for the metabolic signatures of diseases.

Tue, 2017-10-03 06:27

Brief Bioinform. 2017 Aug 22;:

Authors: Cheng L, Yang H, Zhao H, Pei X, Shi H, Sun J, Zhang Y, Wang Z, Zhou M

Abstract
Complex diseases cannot be understood on the basis of a single gene, a single mRNA transcript or a single protein alone, but rather through the effects of their interactions. The combined consequences at the molecular level can be captured by alterations in metabolites. With the rapid development of biomedical instruments and analytical platforms, a large number of metabolite signatures of complex diseases have been identified and documented in the literature. The difficulty biologists face in working through the large number of papers that record metabolic signatures from experimental results calls for an automated data repository. Therefore, we developed MetSigDis, which aims to provide a comprehensive resource of metabolite alterations in various diseases. MetSigDis is freely available at http://www.bio-annotation.cn/MetSigDis/. By reviewing hundreds of publications, we collected 6849 curated relationships between 2420 metabolites and 129 diseases across eight species, including Homo sapiens and model organisms. All of these relationships were used to construct a metabolite disease network (MDN). This network displayed scale-free characteristics according to its degree distribution (power-law distribution with R² = 0.909), and the subnetwork of the MDN for diseases of interest and their related metabolites can be visualized on the Web. Shared metabolite alterations reflect the metabolic similarity of diseases, which is measured using the Jaccard index. We observed that diseases that are similar on the basis of metabolites tend to share semantic associations in the Disease Ontology. A human disease network was then built, in which a node represents a disease and an edge indicates similarity between a pair of diseases. The network validated the observation that diseases linked on the basis of metabolites tend to share more genes.
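
As a rough illustration of the Jaccard-based disease similarity mentioned in the abstract, the sketch below compares diseases by the overlap of their altered-metabolite sets. The disease names, metabolite names, and function names are hypothetical toy values, not MetSigDis data or code.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty sets: define the similarity as 0.0 to avoid dividing by zero.
        return 0.0
    return len(a & b) / len(union)


def disease_similarities(disease_metabolites):
    """Return the Jaccard index for every unordered pair of diseases."""
    return {
        (d1, d2): jaccard(disease_metabolites[d1], disease_metabolites[d2])
        for d1, d2 in combinations(sorted(disease_metabolites), 2)
    }


# Hypothetical toy input: disease -> set of altered metabolites.
toy = {
    "disease_A": {"glucose", "lactate", "alanine"},
    "disease_B": {"glucose", "lactate", "citrate"},
    "disease_C": {"urea"},
}
sims = disease_similarities(toy)
```

Pairs with a non-zero score could then be treated as edges of a disease network. Any threshold for keeping an edge would be a separate choice that the abstract does not specify.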

PMID: 28968812 [PubMed - as supplied by publisher]

Categories: Literature Watch

An automated tool for obtaining QSAR-ready series of compounds using semantic web technologies.

Tue, 2017-10-03 06:27

Bioinformatics. 2017 Sep 07;:

Authors: López-Massaguer O, Sanz F, Pastor M

Abstract
Summary: We describe an application (Collector) for obtaining series of compounds annotated with bioactivity data, ready to be used for the development of quantitative structure-activity relationships (QSAR) models. The tool extracts data from the 'Open Pharmacological Space' (OPS) developed by the Open PHACTS project, using as input a valid name of the biological target. Collector uses the OPS ontologies for expanding the query using all known target synonyms and extracts compounds with bioactivity data against the target from multiple sources. The extracted data can be filtered to retain only drug-like compounds and the bioactivities can be automatically summarised to assign a single value per compound, yielding data ready to be used for QSAR modeling. The data obtained is locally stored facilitating the traceability and auditability of the process. Collector was used successfully for the development of models for toxicity endpoints within the eTOX project.
Availability and implementation: The software is available at http://phi.upf.edu/collector . The source code is located at https://github.com/phi-grib/Collector and is free for use under the GPL3 license. The web version is hosted at http://collector.upf.edu/ .
Contact: manuel.pastor@upf.edu.
Supplementary information: Supplementary data are available at Bioinformatics online.
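
The summarisation step described above (one bioactivity value per compound) might look like the following minimal sketch. The abstract does not say which aggregate Collector uses, so the median here is an assumption, and the compound identifiers and values are made up.

```python
from collections import defaultdict
from statistics import median


def summarise_bioactivities(records):
    """Collapse repeated measurements into a single value per compound.

    `records` is an iterable of (compound_id, value) pairs. The median is an
    assumed aggregate, chosen only because it is robust to outliers.
    """
    grouped = defaultdict(list)
    for compound_id, value in records:
        grouped[compound_id].append(value)
    return {cid: median(values) for cid, values in grouped.items()}


# Hypothetical measurements (for example pIC50 values) for two compounds.
records = [("cmpd1", 6.0), ("cmpd1", 7.0), ("cmpd1", 8.0), ("cmpd2", 5.5)]
summary = summarise_bioactivities(records)
```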

PMID: 28968713 [PubMed - as supplied by publisher]

Categories: Literature Watch

Analysis and visualization of disease courses in a semantically-enabled cancer registry.

Sun, 2017-10-01 08:47

J Biomed Semantics. 2017 Sep 29;8(1):46

Authors: Esteban-Gil A, Fernández-Breis JT, Boeker M

Abstract
BACKGROUND: Regional and epidemiological cancer registries are important for cancer research and the quality management of cancer treatment. Many technological solutions are available to collect and analyse data for cancer registries nowadays. However, the lack of a well-defined common semantic model is a problem when user-defined analyses and data linking to external resources are required. The objectives of this study are: (1) design of a semantic model for local cancer registries; (2) development of a semantically-enabled cancer registry based on this model; and (3) semantic exploitation of the cancer registry for analysing and visualising disease courses.
RESULTS: Our proposal is based on our previous results and experience working with semantic technologies. Data stored in a cancer registry database were transformed into RDF employing a process driven by OWL ontologies. The semantic representation of the data was then processed to extract semantic patient profiles, which were exploited by means of SPARQL queries to identify groups of similar patients and to analyse the disease timelines of patients. Based on the requirements analysis, we have produced a draft of an ontology that models the semantics of a local cancer registry in a pragmatic extensible way. We have implemented a Semantic Web platform that allows transforming and storing data from cancer registries in RDF. This platform also permits users to formulate incremental user-defined queries through a graphical user interface. The query results can be displayed in several customisable ways. The complex disease timelines of individual patients can be clearly represented. Different events, e.g. different therapies and disease courses, are presented according to their temporal and causal relations.
CONCLUSION: The presented platform is an example of the parallel development of ontologies and applications that take advantage of semantic web technologies in the medical field. The semantic structure of the representation renders it easy to analyse key figures of the patients and their evolution at different granularity levels.

PMID: 28962670 [PubMed - in process]

Categories: Literature Watch

SciLite: a platform for displaying text-mined annotations as a means to link research articles with biological data.

Thu, 2017-09-28 10:27

Wellcome Open Res. 2016;1:25

Authors: Venkatesan A, Kim JH, Talo F, Ide-Smith M, Gobeill J, Carter J, Batista-Navarro R, Ananiadou S, Ruch P, McEntyre J

Abstract
The tremendous growth in biological data has resulted in an increase in the number of research papers being published. This presents a great challenge for scientists in searching and assimilating facts described in those papers. In particular, biological databases depend on curators to add highly precise and useful information that is usually extracted by reading research articles. Therefore, there is an urgent need to find ways to improve linking literature to the underlying data, thereby minimising the effort in browsing content and identifying key biological concepts. As part of the development of Europe PMC, we have developed a new platform, SciLite, which integrates text-mined annotations from different sources and overlays those outputs on research articles. The aim is to aid researchers and curators using Europe PMC in finding key concepts more easily and provide links to related resources or tools, bridging the gap between literature and biological data.

PMID: 28948232 [PubMed]

Categories: Literature Watch

ANALYTiC: An Active Learning System for Trajectory Classification.

Tue, 2017-09-26 06:22

IEEE Comput Graph Appl. 2017;37(5):28-39

Authors: Soares Junior A, Renso C, Matwin S

Abstract
The increasing availability and use of positioning devices has resulted in large volumes of trajectory data. However, semantic annotations for such data are typically added by domain experts, which is a time-consuming task. Machine-learning algorithms can help infer semantic annotations from trajectory data by learning from sets of labeled data. Specifically, active learning approaches can minimize the set of trajectories to be annotated while preserving good performance measures. The ANALYTiC web-based interactive tool visually guides users through this annotation process.
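
One common way to choose which items to send to an annotator is least-confidence sampling, sketched below. The abstract does not name the query strategy ANALYTiC uses, so this strategy, the function name, and the probabilities are all assumptions for illustration.

```python
def least_confident(probabilities, k=1):
    """Return the indices of the k samples the classifier is least sure about.

    `probabilities[i]` holds the predicted class probabilities for sample i.
    A sample's confidence is its highest class probability, so the lowest
    values mark the samples most worth labelling next.
    """
    ranked = sorted(range(len(probabilities)), key=lambda i: max(probabilities[i]))
    return ranked[:k]


# Hypothetical class probabilities for three unlabelled trajectories.
probs = [[0.90, 0.10], [0.55, 0.45], [0.70, 0.30]]
to_label = least_confident(probs, k=2)
```

In an active learning loop, the selected items would be labelled by the user, added to the training set, and the classifier retrained before the next selection round.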

PMID: 28945577 [PubMed - in process]

Categories: Literature Watch

Logistic Model to Support Service Modularity for the Promotion of Reusability in a Web Objects-Enabled IoT Environment.

Sat, 2017-09-23 08:05

Sensors (Basel). 2017 Sep 22;17(10):

Authors: Kibria MG, Ali S, Jarwar MA, Kumar S, Chong I

Abstract
Due to a very large number of connected virtual objects in the surrounding environment, intelligent service features in the Internet of Things require the reuse of existing virtual objects and composite virtual objects. If a new virtual object is created for each new service request, then the number of virtual objects would increase exponentially. The Web of Objects applies the principle of service modularity in terms of virtual objects and composite virtual objects. Service modularity is a key concept in the Web Objects-Enabled Internet of Things (IoT) environment which allows for the reuse of existing virtual objects and composite virtual objects in heterogeneous ontologies. In the case of similar service requests occurring at the same or different locations, the already-instantiated virtual objects and their composites that exist in the same or different ontologies can be reused. In this case, similar types of virtual objects and composite virtual objects are searched and matched. Their reuse avoids duplication under similar circumstances, and reduces the time it takes to search and instantiate them from their repositories, where similar functionalities are provided by similar types of virtual objects and their composites. Controlling and maintaining a virtual object means controlling and maintaining a real-world object in the real world. Even though the functional costs of virtual objects are just a fraction of those for deploying and maintaining real-world objects, this article focuses on reusing virtual objects and composite virtual objects, as well as discusses similarity matching of virtual objects and composite virtual objects. This article proposes a logistic model that supports service modularity for the promotion of reusability in the Web Objects-enabled IoT environment. Necessary functional components and a flowchart of an algorithm for reusing composite virtual objects are discussed. Also, to realize the service modularity, a use case scenario is studied and implemented.

PMID: 28937590 [PubMed - in process]

Categories: Literature Watch

The Adverse Drug Reactions from Patient Reports in Social Media Project: Five Major Challenges to Overcome to Operationalize Analysis and Efficiently Support Pharmacovigilance Process.

Sat, 2017-09-23 08:05

JMIR Res Protoc. 2017 Sep 21;6(9):e179

Authors: Bousquet C, Dahamna B, Guillemin-Lanne S, Darmoni SJ, Faviez C, Huot C, Katsahian S, Leroux V, Pereira S, Richard C, Schück S, Souvignet J, Lillo-Le Louët A, Texier N

Abstract
BACKGROUND: Adverse drug reactions (ADRs) are an important cause of morbidity and mortality. Classical Pharmacovigilance process is limited by underreporting which justifies the current interest in new knowledge sources such as social media. The Adverse Drug Reactions from Patient Reports in Social Media (ADR-PRISM) project aims to extract ADRs reported by patients in these media. We identified 5 major challenges to overcome to operationalize the analysis of patient posts: (1) variable quality of information on social media, (2) guarantee of data privacy, (3) response to pharmacovigilance expert expectations, (4) identification of relevant information within Web pages, and (5) robust and evolutive architecture.
OBJECTIVE: This article aims to describe the current state of advancement of the ADR-PRISM project by focusing on the solutions we have chosen to address these 5 major challenges.
METHODS: In this article, we propose methods and describe the advancement of this project on several aspects: (1) a quality driven approach for selecting relevant social media for the extraction of knowledge on potential ADRs, (2) an assessment of ethical issues and French regulation for the analysis of data on social media, (3) an analysis of pharmacovigilance expert requirements when reviewing patient posts on the Internet, (4) an extraction method based on natural language processing, pattern based matching, and selection of relevant medical concepts in reference terminologies, and (5) specifications of a component-based architecture for the monitoring system.
RESULTS: Considering the 5 major challenges, we (1) selected a set of 21 validated criteria for selecting social media to support the extraction of potential ADRs, (2) proposed solutions to guarantee data privacy of patients posting on the Internet, (3) took into account pharmacovigilance expert requirements with use case diagrams and scenarios, (4) built domain-specific knowledge resources embedding a lexicon, morphological rules, context rules, semantic rules, syntactic rules, and post-analysis processing, and (5) proposed a component-based architecture that allows storage of big data and accessibility to third-party applications through Web services.
CONCLUSIONS: We demonstrated the feasibility of implementing a component-based architecture that allows collection of patient posts on the Internet, near real-time processing of those posts including annotation, and storage in big data structures. In the next steps, we will evaluate the posts identified by the system in social media to clarify the interest and relevance of such approach to improve conventional pharmacovigilance processes based on spontaneous reporting.

PMID: 28935617 [PubMed]

Categories: Literature Watch

PIBAS FedSPARQL: a web-based platform for integration and exploration of bioinformatics datasets.

Fri, 2017-09-22 07:37

J Biomed Semantics. 2017 Sep 20;8(1):42

Authors: Djokic-Petrovic M, Cvjetkovic V, Yang J, Zivanovic M, Wild DJ

Abstract
BACKGROUND: There are a huge variety of data sources relevant to chemical, biological and pharmacological research, but these data sources are highly siloed and cannot be queried together in a straightforward way. Semantic technologies offer the ability to create links and mappings across datasets and manage them as a single, linked network so that searching can be carried out across datasets, independently of the source. We have developed an application called PIBAS FedSPARQL that uses semantic technologies to allow researchers to carry out such searching across a vast array of data sources.
RESULTS: PIBAS FedSPARQL is a web-based query builder and result set visualizer of bioinformatics data. As an advanced feature, our system can detect similar data items identified by different Uniform Resource Identifiers (URIs), using a text-mining algorithm based on the processing of named entities to be used in Vector Space Model and Cosine Similarity Measures. To our knowledge, PIBAS FedSPARQL was unique among the systems that we found in that it allows detection of similar data items. As a query builder, our system allows researchers to intuitively construct and run Federated SPARQL queries across multiple data sources, including global initiatives, such as Bio2RDF, Chem2Bio2RDF, EMBL-EBI, and one local initiative called CPCTAS, as well as additional user-specified data sources. From the input topic, subtopic, template and keyword, a corresponding initial Federated SPARQL query is created and executed. Based on the data obtained, end users have the ability to choose the most appropriate data sources in their area of interest and exploit their Resource Description Framework (RDF) structure, which allows users to select certain properties of data to enhance query results.
CONCLUSIONS: The developed system is flexible and allows intuitive creation and execution of queries for an extensive range of bioinformatics topics. Also, the novel "similar data items detection" algorithm can be particularly useful for suggesting new data sources and cost optimization for new experiments. PIBAS FedSPARQL can be expanded with new topics, subtopics and templates on demand, rendering information retrieval more robust.
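
The vector space model and cosine similarity named in the abstract can be sketched as follows. This is not the PIBAS FedSPARQL algorithm: it uses plain lowercase tokens instead of named entities, and the labels compared are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Build a bag-of-words term-frequency vector from lowercase tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical labels attached to two different URIs.
label_1 = term_vector("Aspirin (acetylsalicylic acid)")
label_2 = term_vector("acetylsalicylic acid, aspirin")
score = cosine_similarity(label_1, label_2)
```

Two items whose score exceeds some chosen threshold could be flagged as candidates for describing the same entity. The threshold itself is a tuning decision not covered here.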

PMID: 28931422 [PubMed - in process]

Categories: Literature Watch

Fish Ontology framework for taxonomy-based fish recognition.

Thu, 2017-09-21 10:07

PeerJ. 2017;5:e3811

Authors: Ali NM, Khan HA, Then AY, Ving Ching C, Gaur M, Dhillon SK

Abstract
Life science ontologies play an important role in the Semantic Web. Given the diversity in fish species and the associated wealth of information, it is imperative to develop an ontology capable of linking and integrating this information in an automated fashion. As such, we introduce the Fish Ontology (FO), an automated classification architecture of existing fish taxa which provides taxonomic information on unknown fish based on metadata restrictions. It is designed to support knowledge discovery, provide semantic annotation of fish and fisheries resources, data integration, and information retrieval. Automated classification for unknown specimens is a unique feature that currently does not appear to exist in other known ontologies. Examples of automated classification for major groups of fish are demonstrated, showing the inferred information by introducing several restrictions at the species or specimen level. The current version of FO has 1,830 classes, includes widely used fisheries terminology, and models major aspects of fish taxonomy, grouping, and character. With more than 30,000 known fish species globally, the FO will be an indispensable tool for fish scientists and other interested users.

PMID: 28929028 [PubMed]

Categories: Literature Watch

Learning From Short Text Streams With Topic Drifts.

Tue, 2017-09-19 06:00

IEEE Trans Cybern. 2017 Sep 18;:

Authors: Li P, He L, Wang H, Hu X, Zhang Y, Li L, Wu X

Abstract
Short text streams such as search snippets and micro blogs have been popular on the Web with the emergence of social media. Unlike traditional normal text streams, these data present the characteristics of short length, weak signal, high volume, high velocity, topic drift, etc. Short text stream classification is hence a very challenging and significant task. However, this challenge has received little attention from the research community. Therefore, a new feature extension approach is proposed for short text stream classification with the help of a large-scale semantic network obtained from a Web corpus. It is built on an incremental ensemble classification model for efficiency. First, more semantic contexts based on the senses of terms in short texts are introduced to make up for the data sparsity using the open semantic network, in which all terms are disambiguated by their semantics to reduce the noise impact. Second, a concept cluster-based topic drifting detection method is proposed to effectively track hidden topic drifts. Finally, extensive studies demonstrate that as compared to several well-known concept drifting detection methods in data streams, our approach can detect topic drifts effectively, and it enables handling short text streams effectively while maintaining the efficiency as compared to several state-of-the-art short text classification approaches.

PMID: 28922135 [PubMed - as supplied by publisher]

Categories: Literature Watch

Extending XNAT Platform with an Incremental Semantic Framework.

Sat, 2017-09-16 07:37

Front Neuroinform. 2017;11:57

Authors: Timón S, Rincón M, Martínez-Tomás R

Abstract
Informatics increases the yield from neuroscience due to improved data. Data sharing and accessibility enable joint efforts between different research groups, as well as replication studies, pivotal for progress in the field. Research data archiving solutions are evolving rapidly to address these necessities; however, distributed data integration is still difficult because of the need for explicit agreements for disparate data models. To address these problems, ontologies are widely used in biomedical research to obtain common vocabularies and logical descriptions, but their application may suffer from scalability issues, domain bias, and loss of low-level data access. With the aim of improving the application of semantic models in biobanking systems, an incremental semantic framework that takes advantage of the latest advances in biomedical ontologies and the XNAT platform is designed and implemented. We follow a layered architecture that allows the alignment of multi-domain biomedical ontologies to manage data at different levels of abstraction. To illustrate this approach, the development is integrated in the JPND (EU Joint Program for Neurodegenerative Disease) APGeM project, focused on finding early biomarkers for Alzheimer's and other dementia related diseases.

PMID: 28912709 [PubMed]

Categories: Literature Watch

Optimizing a Query by Transformation and Expansion.

Sat, 2017-09-09 06:52

Stud Health Technol Inform. 2017;243:197-201

Authors: Glocker K, Knurr A, Dieter J, Dominick F, Forche M, Koch C, Pascoe Pérez A, Roth B, Ückert F

Abstract
In the biomedical sector not only the amount of information produced and uploaded into the web is enormous, but also the number of sources where these data can be found. Clinicians and researchers spend huge amounts of time on trying to access this information and to filter the most important answers to a given question. As the formulation of these queries is crucial, automated query expansion is an effective tool to optimize a query and receive the best possible results. In this paper we introduce the concept of a workflow for an optimization of queries in the medical and biological sector by using a series of tools for expansion and transformation of the query. After the definition of attributes by the user, the query string is compared to previous queries in order to add semantic co-occurring terms to the query. Additionally, the query is enlarged by an inclusion of synonyms. The translation into database specific ontologies ensures the optimal query formulation for the chosen database(s). As this process can be performed in various databases at once, the results are ranked and normalized in order to achieve a comparable list of answers for a question.
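
The synonym-inclusion step of such a workflow can be illustrated with a small sketch. The synonym table, the function name, and the boolean AND/OR output syntax are assumptions made for the example; the paper's actual tools and target query languages are not described in the abstract.

```python
def expand_query(query, synonyms):
    """Expand each query term with its synonyms into a boolean query string.

    Terms are joined with AND; a term and its synonyms are joined with OR.
    Multi-word synonyms are quoted so they are treated as phrases.
    """
    groups = []
    for term in query.lower().split():
        alternatives = [term] + [s for s in synonyms.get(term, []) if s != term]
        quoted = ['"{}"'.format(a) if " " in a else a for a in alternatives]
        if len(quoted) > 1:
            groups.append("(" + " OR ".join(quoted) + ")")
        else:
            groups.append(quoted[0])
    return " AND ".join(groups)


# Hypothetical synonym table; a real workflow might draw on a thesaurus or ontology.
synonyms = {"cancer": ["neoplasm", "malignant tumor"]}
expanded = expand_query("lung cancer", synonyms)
# expanded == 'lung AND (cancer OR neoplasm OR "malignant tumor")'
```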

PMID: 28883200 [PubMed - in process]

Categories: Literature Watch

Expert2OWL: A Methodology for Pattern-Based Ontology Development.

Sat, 2017-09-09 06:52

Stud Health Technol Inform. 2017;243:165-169

Authors: Tahar K, Xu J, Herre H

Abstract
The formalization of expert knowledge enables a broad spectrum of applications employing ontologies as underlying technology. These include eLearning, Semantic Web and expert systems. However, the manual construction of such ontologies is time-consuming and thus expensive. Moreover, experts are often unfamiliar with the syntax and semantics of formal ontology languages such as OWL and usually have no experience in developing formal ontologies. To overcome these barriers, we developed a new method and tool, called Expert2OWL that provides efficient features to support the construction of OWL ontologies using GFO (General Formal Ontology) as a top-level ontology. This method allows a close and effective collaboration between ontologists and domain experts. Essentially, this tool integrates Excel spreadsheets as part of a pattern-based ontology development and refinement process. Expert2OWL enables us to expedite the development process and modularize the resulting ontologies. We applied this method in the field of Chinese Herbal Medicine (CHM) and used Expert2OWL to automatically generate an accurate Chinese Herbology ontology (CHO). The expressivity of CHO was tested and evaluated using ontology query languages SPARQL and DL. CHO shows promising results and can generate answers to important scientific questions such as which Chinese herbal formulas contain which substances, which substances treat which diseases, and which ones are the most frequently used in CHM.

PMID: 28883193 [PubMed - in process]

Categories: Literature Watch

BIOSSES: a semantic sentence similarity estimation system for the biomedical domain.

Sat, 2017-09-09 06:52

Bioinformatics. 2017 Jul 15;33(14):i49-i58

Authors: Sogancioglu G, Öztürk H, Özgür A

Abstract
Motivation: The amount of information available in textual format is rapidly increasing in the biomedical domain. Therefore, natural language processing (NLP) applications are becoming increasingly important to facilitate the retrieval and analysis of these data. Computing the semantic similarity between sentences is an important component in many NLP tasks including text retrieval and summarization. A number of approaches have been proposed for semantic sentence similarity estimation for generic English. However, our experiments showed that such approaches do not effectively cover biomedical knowledge and produce poor results for biomedical text.
Methods: We propose several approaches for sentence-level semantic similarity computation in the biomedical domain, including string similarity measures and measures based on the distributed vector representations of sentences learned in an unsupervised manner from a large biomedical corpus. In addition, ontology-based approaches are presented that utilize general and domain-specific ontologies. Finally, a supervised regression based model is developed that effectively combines the different similarity computation metrics. A benchmark data set consisting of 100 sentence pairs from the biomedical literature is manually annotated by five human experts and used for evaluating the proposed methods.
Results: The experiments showed that the supervised semantic sentence similarity computation approach obtained the best performance (0.836 correlation with gold standard human annotations) and improved over the state-of-the-art domain-independent systems up to 42.6% in terms of the Pearson correlation metric.
Availability and implementation: A web-based system for biomedical semantic sentence similarity computation, the source code, and the annotated benchmark data set are available at: http://tabilab.cmpe.boun.edu.tr/BIOSSES/ .
Contact: gizemsogancioglu@gmail.com or arzucan.ozgur@boun.edu.tr.
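
The evaluation metric reported above, the Pearson correlation between system scores and human annotations, can be computed as in this minimal sketch. The score lists are invented for illustration and are not taken from the BIOSSES benchmark.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical similarity scores: human annotations versus system predictions.
human = [0.0, 1.0, 2.0, 3.0, 4.0]
system = [0.5, 1.0, 2.5, 2.0, 4.0]
r = pearson(human, system)
```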

PMID: 28881973 [PubMed - in process]

Categories: Literature Watch

Accent modulates access to word meaning: Evidence for a speaker-model account of spoken word recognition.

Fri, 2017-09-08 06:22

Cogn Psychol. 2017 Sep 04;98:73-101

Authors: Cai ZG, Gilbert RA, Davis MH, Gaskell MG, Farrar L, Adler S, Rodd JM

Abstract
Speech carries accent information relevant to determining the speaker's linguistic and social background. A series of web-based experiments demonstrate that accent cues can modulate access to word meaning. In Experiments 1-3, British participants were more likely to retrieve the American dominant meaning (e.g., hat meaning of "bonnet") in a word association task if they heard the words in an American than a British accent. In addition, results from a speeded semantic decision task (Experiment 4) and sentence comprehension task (Experiment 5) confirm that accent modulates on-line meaning retrieval such that comprehension of ambiguous words is easier when the relevant word meaning is dominant in the speaker's dialect. Critically, neutral-accent speech items, created by morphing British- and American-accented recordings, were interpreted in a similar way to accented words when embedded in a context of accented words (Experiment 2). This finding indicates that listeners do not use accent to guide meaning retrieval on a word-by-word basis; instead they use accent information to determine the dialectic identity of a speaker and then use their experience of that dialect to guide meaning access for all words spoken by that person. These results motivate a speaker-model account of spoken word recognition in which comprehenders determine key characteristics of their interlocutor and use this knowledge to guide word meaning access.

PMID: 28881224 [PubMed - as supplied by publisher]

Categories: Literature Watch

RDFIO: extending Semantic MediaWiki for interoperable biomedical data management.

Wed, 2017-09-06 08:40

J Biomed Semantics. 2017 Sep 04;8(1):35

Authors: Lampa S, Willighagen E, Kohonen P, King A, Vrandečić D, Grafström R, Spjuth O

Abstract
BACKGROUND: Biological sciences are characterised not only by an increasing amount but also the extreme complexity of its data. This stresses the need for efficient ways of integrating these data in a coherent description of biological systems. In many cases, biological data needs organization before integration. This is not seldom a collaborative effort, and it is thus important that tools for data integration support a collaborative way of working. Wiki systems with support for structured semantic data authoring, such as Semantic MediaWiki, provide a powerful solution for collaborative editing of data combined with machine-readability, so that data can be handled in an automated fashion in any downstream analyses. Semantic MediaWiki lacks a built-in data import function though, which hinders efficient round-tripping of data between interoperable Semantic Web formats such as RDF and the internal wiki format.
RESULTS: To solve this deficiency, the RDFIO suite of tools is presented, which supports importing of RDF data into Semantic MediaWiki, with metadata needed to export it again in the same RDF format, or ontology. Additionally, the new functionality enables mash-ups of automated data imports combined with manually created data presentations. The application of the suite of tools is demonstrated by importing drug discovery related data about rare diseases from Orphanet and acid dissociation constants from Wikidata. The RDFIO suite of tools is freely available for download via pharmb.io/project/rdfio .
CONCLUSIONS: Through a set of biomedical demonstrators, it is demonstrated how the new functionality enables a number of usage scenarios where the interoperability of SMW and the wider Semantic Web is leveraged for biomedical data sets, to create an easy to use and flexible platform for exploring and working with biomedical data.

PMID: 28870259 [PubMed - in process]

Categories: Literature Watch

Mouse Genome Informatics (MGI) Resource: Genetic, Genomic, and Biological Knowledgebase for the Laboratory Mouse.

Sat, 2017-08-26 06:22

ILAR J. 2017 Jul 01;58(1):17-41

Authors: Eppig JT

Abstract
The Mouse Genome Informatics (MGI) Resource supports basic, translational, and computational research by providing high-quality, integrated data on the genetics, genomics, and biology of the laboratory mouse. MGI serves a strategic role for the scientific community in facilitating biomedical, experimental, and computational studies investigating the genetics and processes of diseases and enabling the development and testing of new disease models and therapeutic interventions. This review describes how the growing body of genetic and biological data and the advances in computer technology in the late 1980s, including the World Wide Web, together launched the beginnings of MGI. MGI develops and maintains a gold-standard resource that reflects the current state of knowledge, provides semantic and contextual data integration that fosters hypothesis testing, continually develops new and improved tools for searching and analysis, and partners with the scientific community to ensure that research data needs are met. Here we describe one slice of MGI relating to the development of community-wide large-scale mutagenesis and phenotyping projects and introduce ways to access and use these MGI data. References and links to additional MGI aspects are provided.

PMID: 28838066 [PubMed - in process]

Categories: Literature Watch

PathEdEx - Uncovering High-explanatory Visual Diagnostics Heuristics Using Digital Pathology and Multiscale Gaze Data.

Wed, 2017-08-23 07:54

J Pathol Inform. 2017;8:29

Authors: Shin D, Kovalenko M, Ersoy I, Li Y, Doll D, Shyu CR, Hammer R

Abstract
BACKGROUND: The visual heuristics of pathology diagnosis are a largely unexplored area in which reported studies have provided only qualitative insight. Uncovering and quantifying visual and nonvisual diagnostic patterns in pathology has great potential to improve clinical outcomes and avoid diagnostic pitfalls.
METHODS: Here, we present PathEdEx, a computational informatics framework that combines whole-slide digital pathology imaging with multiscale gaze-tracking technology to create web-based interactive pathology educational atlases and to mine visual and nonvisual diagnostic heuristics from the resulting data.
RESULTS: We demonstrate the capabilities of PathEdEx for mining visual and nonvisual diagnostic heuristics using the first PathEdEx volume of a hematopathology atlas. We conducted a quantitative study on the time dynamics of zooming and panning operations utilized by experts and novices to come to the correct diagnosis. We then performed association rule mining to determine sets of diagnostic factors that consistently result in a correct diagnosis, and studied differences in diagnostic strategies across different levels of pathology expertise using Markov chain (MC) modeling and MC Monte Carlo simulations. To perform these studies, we translated raw gaze points to high-explanatory semantic labels that represent pathology diagnostic clues. Therefore, the outcome of these studies is readily transformed into narrative descriptors for direct use in pathology education and practice.
CONCLUSION: The PathEdEx framework can be used to capture best practices in visual and nonvisual pathology diagnostic heuristics that can be passed on to the next generation of pathologists, and it has the potential to streamline the implementation of precision diagnostics in precision medicine settings.
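
As a rough illustration of the Markov chain modeling mentioned in the RESULTS, the sketch below estimates first-order transition probabilities from sequences of semantic labels. It is not the authors' code: the label names and the viewing sessions are invented for the example, and a first-order chain is assumed.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate first-order Markov transition probabilities from label sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count each observed transition current -> next.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    # Normalise each row of counts so the outgoing probabilities sum to 1.
    return {
        state: {nxt: n / sum(following.values()) for nxt, n in following.items()}
        for state, following in counts.items()
    }


# Hypothetical viewing sessions, already translated from raw gaze points
# into semantic labels (the labels are assumptions, not from the paper).
sessions = [
    ["overview", "zoom_in", "nucleus", "diagnosis"],
    ["overview", "zoom_in", "cytoplasm", "nucleus", "diagnosis"],
    ["overview", "pan", "zoom_in", "nucleus", "diagnosis"],
]
probs = transition_probabilities(sessions)
```

Fitting one such matrix per expertise level and comparing the rows is one simple way to see where the strategies of experts and novices diverge.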

PMID: 28828200 [PubMed]

Categories: Literature Watch

Owlready: Ontology-oriented programming in Python with automatic classification and high level constructs for biomedical ontologies.

Sat, 2017-08-19 06:07

Artif Intell Med. 2017 Aug 14;:

Authors: Lamy JB

Abstract
OBJECTIVE: Ontologies are widely used in the biomedical domain. While many tools exist for editing, aligning or evaluating ontologies, few solutions have been proposed for ontology programming interfaces, i.e. for accessing and modifying an ontology within a programming language. Existing query languages (such as SPARQL) and APIs (such as OWLAPI) are not as easy to use as object-oriented programming languages. Moreover, they provide few solutions to the difficulties encountered with biomedical ontologies. Our objective was to design a tool for easily accessing the entities of an OWL ontology, with high-level constructs that help with biomedical ontologies.
METHODS: From our experience with medical ontologies, we identified two difficulties: (1) many entities are represented by classes (rather than individuals), but the existing tools do not permit manipulating classes as easily as individuals; (2) ontologies rely on the open-world assumption, whereas medical reasoning must consider only evidence-based medical knowledge as true. We designed a Python module for ontology-oriented programming. It allows access to the entities of an OWL ontology as if they were objects in the programming language. We propose a simple high-level syntax for managing classes and the associated "role-filler" constraints. We also propose an algorithm for performing local closed world reasoning in simple situations.
RESULTS: We developed Owlready, a Python module for high-level access to OWL ontologies. The paper describes the architecture and the syntax of version 2 of the module. It details how we integrated the OWL ontology model with the Python object model. The paper provides examples based on Gene Ontology (GO). We also demonstrate the value of Owlready in a use case focused on the automatic comparison of the contraindications of several drugs. This use case illustrates the use of the specific syntax proposed for manipulating classes and for performing local closed world reasoning.
CONCLUSION: Owlready has been successfully used in a medical research project. It has been published as open-source software and has since been used by many other researchers. Future developments will focus on support for vagueness, additional non-monotonic reasoning features, and automatic dialog box generation.
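
The difference between the open-world assumption and local closed world reasoning can be shown with a small plain-Python sketch. It does not use Owlready and does not reproduce the paper's algorithm: the drug names, conditions and function names are hypothetical, and "closing" a drug simply means treating its recorded contraindications as complete.

```python
# Hypothetical knowledge base: conditions in which each drug is known
# to be contraindicated (names are placeholders, not real drug data).
known_contraindications = {
    "drug_a": {"renal_failure", "pregnancy"},
    "drug_b": {"pregnancy"},
}


def is_contraindicated(drug, condition, closed_world=False):
    """Return True, False, or None (unknown) for a drug/condition pair."""
    if condition in known_contraindications.get(drug, set()):
        return True
    # Open world: a missing statement is not evidence of absence, so the
    # answer stays unknown. Local closed world: for a drug that has a
    # record, that record is treated as complete, so the answer is False.
    if closed_world and drug in known_contraindications:
        return False
    return None


def only_in_first(drug_x, drug_y):
    """Conditions contraindicated for drug_x but, under closed world, not for drug_y."""
    return {
        condition
        for condition in known_contraindications[drug_x]
        if is_contraindicated(drug_y, condition, closed_world=True) is False
    }


difference = only_in_first("drug_a", "drug_b")
```

Under the open-world reading the comparison would return nothing definite, because every missing statement stays unknown; closing the two drug records is what makes a difference between them reportable.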

PMID: 28818520 [PubMed - as supplied by publisher]

Categories: Literature Watch

New tools and functions in Data-out activities at Protein Data Bank Japan (PDBj).

Fri, 2017-08-18 08:32

Protein Sci. 2017 Aug 17;:

Authors: Kinjo AR, Bekker GJ, Wako H, Endo S, Tsuchiya Y, Sato H, Nishi H, Kinoshita K, Suzuki H, Kawabata T, Yokochi M, Iwata T, Kobayashi N, Fujiwara T, Kurisu G, Nakamura H

Abstract
The Protein Data Bank Japan (PDBj), a member of the worldwide Protein Data Bank (wwPDB), accepts and processes the deposited data of experimentally determined biological macromolecular structures. In addition to archiving the PDB data in collaboration with the other wwPDB partners, PDBj also provides a wide range of original and unique services and tools, which are continuously improved and updated. Here, we report the new RDB PDBj Mine 2, the WebGL molecular viewer Molmil, the ProMode-Elastic server for normal mode analysis, a virtual reality system for the eF-site protein electrostatic molecular surfaces, the extensions of the Omokage search for molecular shape similarity, and the integration of PDBj and BMRB searches.

PMID: 28815765 [PubMed - as supplied by publisher]

Categories: Literature Watch
