Semantic Web
NeuroRDF: semantic integration of highly curated data to prioritize biomarker candidates in Alzheimer's disease.
J Biomed Semantics. 2016;7(1):45
Authors: Iyappan A, Kawalia SB, Raschka T, Hofmann-Apitius M, Senger P
Abstract
BACKGROUND: Neurodegenerative diseases are incurable and debilitating indications with huge social and economic impact, and much is still to be learnt about the underlying molecular events. Mechanistic disease models could offer a knowledge framework to help decipher the complex interactions that occur at the molecular and cellular levels. This motivates the development of an approach that integrates highly curated, heterogeneous data into a disease model spanning different regulatory data layers. Although several disease models exist, they often do not consider the quality of the underlying data. Moreover, even with current advancements in semantic web technology, we still do not have a cure for complex diseases like Alzheimer's disease. One key reason could be the growing gap between generated data and the knowledge derived from it.
RESULTS: In this paper, we describe an approach, called NeuroRDF, to develop an integrative framework for modeling curated knowledge in the area of complex neurodegenerative diseases. The core of this strategy lies in the use of well-curated, context-specific data, integrated into a single semantic web-based framework, RDF. This increases the probability that the derived knowledge is novel and reliable in a specific disease context. The infrastructure integrates highly curated data from databases (BIND, IntAct, etc.), literature (PubMed), and gene expression resources (such as GEO and ArrayExpress). We illustrate the effectiveness of our approach by asking real-world biomedical questions that link these resources to prioritize plausible biomarker candidates. Among the 13 prioritized candidate genes, we identified MIF as a potential emerging candidate due to its role as a pro-inflammatory cytokine. We additionally report on the effort and challenges involved in generating such an indication-specific knowledge base of curated, quality-controlled data.
CONCLUSION: Although many alternative approaches have been proposed and practiced for modeling diseases, semantic web technology is a flexible and well-established solution for harmonized aggregation. The benefit of this work, the use of high-quality, context-specific data, becomes apparent in surfacing previously overlooked biomarker candidates around a well-known mechanism, which can then be taken forward for experimental investigation.
PMID: 27392431 [PubMed - as supplied by publisher]
FunTree: advances in a resource for exploring and contextualising protein function evolution.
Nucleic Acids Res. 2016 Jan 4;44(D1):D317-23
Authors: Sillitoe I, Furnham N
Abstract
FunTree is a resource that brings together protein sequence, structure and functional information, including overall chemical reaction and mechanistic data, for structurally defined domain superfamilies. Developed in tandem with the CATH database, the original FunTree contained just 276 superfamilies, focused on enzymes. Here, we present an update of FunTree that has expanded to include 2340 superfamilies, covering both enzymes and proteins with non-enzymatic functions annotated by Gene Ontology (GO) terms. This allows the investigation of how novel functions have evolved within a structurally defined superfamily and provides a means to analyse trends across many superfamilies, not only in the context of a protein's sequence and structure but also in terms of the relationships among their functions. New measures of functional similarity have been integrated: for enzymes, comparisons of overall reactions based on overall bond changes, reaction centres (the local environment of atoms involved in the reaction) and the sub-structure similarities of the metabolites involved in the reaction; for non-enzymes, semantic similarities based on the GO. To identify and highlight changes in function through evolution, ancestral character estimations are made and presented. All of this is accessible through a newly re-designed web interface at http://www.funtree.info.
PMID: 26590404 [PubMed - indexed for MEDLINE]
Semantics based approach for analyzing disease-target associations.
J Biomed Inform. 2016 Jun 24;
Authors: Kaalia R, Ghosh I
Abstract
BACKGROUND: A complex disease is caused by heterogeneous biological interactions between genes and their products, along with the influence of environmental factors. There have been many attempts to understand the causes of these diseases using experimental, statistical and computational methods. In the present work, the objective is to address the challenge of representing and integrating information from heterogeneous biomedical aspects of a complex disease using a semantics-based approach.
METHODS: Semantic web technology is used to design the Disease Association Ontology (DAO-db) for the representation and integration of disease-associated information, with diabetes as the case study. The functional associations of disease genes are integrated using RDF graphs of DAO-db. Three semantic-web-based scoring algorithms (PageRank, HITS (Hyperlink-Induced Topic Search) and HITS with semantic weights) are used to score the gene nodes on the basis of their functional interactions in the graph.
RESULTS: The Disease Association Ontology for Diabetes (DAO-db) provides a standard ontology-driven platform for describing the genes, proteins and pathways involved in diabetes and for integrating functional associations from various interaction levels (gene-disease, gene-pathway, gene-function, gene-cellular component and protein-protein interactions). An automatic instance loader module was also developed in the present work to help add instances to DAO-db on a large scale.
CONCLUSIONS: Our ontology provides a framework for querying and analyzing disease-associated information in the form of RDF graphs. The methodology developed here is used to predict novel potential targets in diabetes from a long list of loose (statistically associated) gene-disease associations.
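The graph-scoring step described in the methods (PageRank over an RDF gene-association graph) can be illustrated with a minimal, dependency-free sketch. The genes, edges, and parameter values below are hypothetical illustrations, not drawn from DAO-db.

```python
# Minimal PageRank over a toy gene-association graph (hypothetical edges,
# not taken from DAO-db). Nodes are genes; directed edges are functional
# associations. Dangling nodes redistribute their rank uniformly.
def pagerank(graph, damping=0.85, iterations=50):
    """graph: dict mapping node -> list of outgoing neighbours."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v, neighbours in graph.items():
            if neighbours:
                share = damping * rank[v] / len(neighbours)
                for w in neighbours:
                    new_rank[w] += share
            else:  # dangling node: spread its rank over all nodes
                for w in nodes:
                    new_rank[w] += damping * rank[v] / n
        rank = new_rank
    return rank

toy_graph = {
    "INS":   ["IRS1", "AKT1"],
    "IRS1":  ["AKT1"],
    "AKT1":  ["GLUT4"],
    "GLUT4": [],
}
scores = pagerank(toy_graph)
# AKT1 receives rank from both INS and IRS1, so it scores higher than
# either of them; highly connected nodes are prioritized.
```

HITS with semantic weights would follow the same pattern, but maintain separate hub and authority scores and multiply each edge's contribution by its semantic weight.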
PMID: 27349858 [PubMed - as supplied by publisher]
The Detection of Emerging Trends Using Wikipedia Traffic Data and Context Networks.
PLoS One. 2015;10(12):e0141892
Authors: Kämpf M, Tessenow E, Kenett DY, Kantelhardt JW
Abstract
Can online media predict new and emerging trends, given the relationship between trends in society and their representation in online systems? While several recent studies have used Google Trends as the leading online information source to answer such research questions, we focus on the online encyclopedia Wikipedia, often used for deeper topical reading. Wikipedia grants open access to all of its traffic data and provides a wealth of additional (semantic) information in a context network, beyond single keywords. Specifically, we suggest and study context-normalized and time-dependent measures of a topic's importance based on page-view time series of Wikipedia articles in different languages and of articles related to them by internal links. As an example, we present a study of the recently emerged Big Data market with a focus on the Hadoop ecosystem, and compare the capabilities of Wikipedia versus Google in predicting its popularity and life cycles. To support further applications, we have developed an open web platform for sharing results of Wikipedia analytics, providing context-rich and language-independent relevance measures for emerging trends.
PMID: 26720074 [PubMed - indexed for MEDLINE]
Knowledge Author: facilitating user-driven, domain content development to support clinical information extraction.
J Biomed Semantics. 2016;7(1):42
Authors: Scuba W, Tharp M, Mowery D, Tseytlin E, Liu Y, Drews FA, Chapman WW
Abstract
BACKGROUND: Clinical Natural Language Processing (NLP) systems require a semantic schema comprised of domain-specific concepts, their lexical variants, and associated modifiers to accurately extract information from clinical texts. An NLP system leverages this schema to structure concepts and extract meaning from free texts. In the clinical domain, creating a semantic schema typically requires input from both a domain expert, such as a clinician, and an NLP expert who translates the clinical concepts drawn from the clinician's domain expertise into a computable format usable by an NLP system. The goal of this work is to develop a web-based tool, Knowledge Author, that bridges the gap between clinical domain experts and NLP system developers by facilitating the development of domain content, represented in a semantic schema, for extracting information from clinical free text.
RESULTS: Knowledge Author is a web-based recommendation system that supports users in developing the domain content necessary for clinical NLP applications. Knowledge Author's schematic model leverages a set of semantic types derived from the Secondary Use Clinical Element Models and the Common Type System to allow the user to quickly create and modify domain-related concepts. Features such as collaborative development and domain content suggestions, provided through the mapping of concepts to the Unified Medical Language System Metathesaurus database, further support the domain content creation process. Two proof-of-concept studies were performed to evaluate the system's performance. The first study evaluated Knowledge Author's flexibility to create a broad range of concepts. A dataset of 115 concepts was assembled, of which 87 (76 %) could be created using Knowledge Author. The second study evaluated the effectiveness of Knowledge Author's output in an NLP system by extracting concepts and associated modifiers representing a clinical element, carotid stenosis, from 34 clinical free-text radiology reports using Knowledge Author and an NLP system, pyConText. Knowledge Author's domain content produced high recall for concepts (targeted findings: 86 %) and varied recall for modifiers (certainty: 91 %, sidedness: 80 %, neurovascular anatomy: 46 %).
CONCLUSION: Knowledge Author supports clinical domain content development for information extraction by enabling domain experts to create semantic schemas themselves.
PMID: 27338146 [PubMed - in process]
Using the Semantic Web for Rapid Integration of WikiPathways with Other Biological Online Data Resources.
PLoS Comput Biol. 2016 Jun;12(6):e1004989
Authors: Waagmeester A, Kutmon M, Riutta A, Miller R, Willighagen EL, Evelo CT, Pico AR
Abstract
The diversity of online resources storing biological data in different formats poses a challenge for bioinformaticians seeking to integrate and analyse their biological data. The semantic web provides a standard to facilitate knowledge integration using statements built as triples describing a relation between two objects. WikiPathways, an online collaborative pathway resource, is now available on the semantic web through a SPARQL endpoint at http://sparql.wikipathways.org. Having biological pathways on the semantic web allows rapid integration, via SPARQL queries, with data from other resources that contain information about the elements present in pathways. To convert WikiPathways content into meaningful triples, we developed two new vocabularies that capture the graphical representation and the pathway logic, respectively. Each gene, protein, and metabolite in a given pathway is defined with a standard set of identifiers to support linking to several other biological resources on the semantic web. WikiPathways triples were loaded into the Open PHACTS discovery platform and are available through its Web API (https://dev.openphacts.org/docs) for use in various tools for drug development. We combined various semantic web resources with the newly converted WikiPathways content using a variety of SPARQL query types and third-party resources, such as the Open PHACTS API. The ability to use pathway information to form new links across diverse biological data highlights the utility of integrating WikiPathways into the semantic web.
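The kind of query such an endpoint supports can be sketched as follows; the prefixes and term names follow common WikiPathways RDF usage, but the exact vocabulary terms are assumptions that should be checked against the endpoint's documentation.

```sparql
# Hypothetical example: list gene products and the pathways they belong to.
PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?geneProduct ?label ?pathway
WHERE {
  ?geneProduct a wp:GeneProduct ;
               rdfs:label ?label ;
               dcterms:isPartOf ?pathway .
  ?pathway a wp:Pathway .
}
LIMIT 10
```

Because the gene products carry standard identifiers, a federated variant of this query could join the results against another SPARQL endpoint, such as UniProt, in a single request.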
PMID: 27336457 [PubMed - as supplied by publisher]
stringgaussnet: from differentially expressed genes to semantic and Gaussian networks generation.
Bioinformatics. 2015 Dec 1;31(23):3865-7
Authors: Chaplais E, Garchon HJ
Abstract
MOTIVATION: Knowledge-based and co-expression networks are two kinds of gene networks that can be currently implemented by sophisticated but distinct tools. We developed stringgaussnet, an R package that integrates both approaches, starting from a list of differentially expressed genes.
AVAILABILITY AND IMPLEMENTATION: Freely available on the web at http://cran.r-project.org/web/packages/stringgaussnet.
CONTACT: henri-jean.garchon@inserm.fr.
PMID: 26231430 [PubMed - indexed for MEDLINE]
DNA data bank of Japan (DDBJ) progress report.
Nucleic Acids Res. 2016 Jan 4;44(D1):D51-7
Authors: Mashima J, Kodama Y, Kosuge T, Fujisawa T, Katayama T, Nagasaki H, Okuda Y, Kaminuma E, Ogasawara O, Okubo K, Nakamura Y, Takagi T
Abstract
The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. The contents of the DDBJ databases are shared with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). Since 2013, the DDBJ Center has been operating the Japanese Genotype-phenotype Archive (JGA) in collaboration with the National Bioscience Database Center (NBDC) in Japan. In addition, the DDBJ Center develops semantic web technologies for data integration and sharing in collaboration with the Database Center for Life Science (DBCLS) in Japan. This paper briefly reports on the activities of the DDBJ Center over the past year including submissions to databases and improvements in our services for data retrieval, analysis, and integration.
PMID: 26578571 [PubMed - indexed for MEDLINE]
FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation.
J Biomed Semantics. 2016;7:39
Authors: Bolleman JT, Mungall CJ, Strozzi F, Baran J, Dumontier M, Bonnal RJ, Buels R, Hoehndorf R, Fujisawa T, Katayama T, Cock PJ
Abstract
BACKGROUND: Nucleotide and protein sequence feature annotations are essential to understand biology at the genomic, transcriptomic, and proteomic levels. Although Semantic Web technologies can be used to query biological annotations, there was no standard for describing this potentially complex location information as subject-predicate-object triples.
DESCRIPTION: We have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned "omics" areas. Using the same data format to represent sequence positions that are independent of file formats allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations.
CONCLUSIONS: Our ontology allows users to uniformly describe - and potentially merge - sequence annotations from multiple sources. Data sources using FALDO can prospectively be retrieved using federated SPARQL queries against public SPARQL endpoints and/or local private triple stores.
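A position description in this style might look like the Turtle snippet below; the feature and reference sequence are hypothetical, and while the term names follow the published FALDO namespace, they should be checked against the ontology itself.

```turtle
# Illustrative FALDO description of a feature spanning bases 100-200 on the
# forward strand of a hypothetical chromosome.
@prefix faldo: <http://biohackathon.org/resource/faldo#> .
@prefix ex:    <http://example.org/> .

ex:feature1 faldo:location [
    a faldo:Region ;
    faldo:begin [
        a faldo:ExactPosition, faldo:ForwardStrandPosition ;
        faldo:position 100 ;
        faldo:reference ex:chr1
    ] ;
    faldo:end [
        a faldo:ExactPosition, faldo:ForwardStrandPosition ;
        faldo:position 200 ;
        faldo:reference ex:chr1
    ]
] .
```

Typed begin and end positions, rather than bare integers, are what let the same model express fuzzy, strand-aware, and circular-sequence coordinates without changing the data format.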
PMID: 27296299 [PubMed - in process]
PDNAsite: Identification of DNA-binding Site from Protein Sequence by Incorporating Spatial and Sequence Context.
Sci Rep. 2016;6:27653
Authors: Zhou J, Xu R, He Y, Lu Q, Wang H, Kong B
Abstract
Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most existing computational approaches employ only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove redundancies in the feature space. Finally, a predictor (PDNAsite) was developed through the integration of a support vector machine (SVM) classifier and ensemble learning. Results on the PDNA-62 and PDNA-224 datasets demonstrate that features extracted from the spatial context provide more information than those from the sequence context, and that combining them yields a further performance gain. An analysis of the number of binding sites in the spatial context of the target site indicates that interactions between neighbouring binding sites are important for protein-DNA recognition and binding ability. A comparison between the proposed PDNAsite method and existing methods indicates that PDNAsite outperforms most of them and is a useful tool for DNA-binding site identification. A web server for our predictor (http://hlt.hitsz.edu.cn:8080/PDNAsite/) is freely available to the biological research community.
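The two context notions the abstract distinguishes can be sketched in a few lines: the sequence context is a window of flanking residues along the chain, while the spatial context is the set of residues within a distance cutoff in 3-D space. The sequence, coordinates, and the 8 Å cutoff below are illustrative values, not PDNAsite's actual parameters.

```python
import math

# Toy sketch of a residue's two contexts. sequence_context looks at chain
# neighbours; spatial_context looks at 3-D neighbours, which can include
# residues far apart in sequence.
def sequence_context(seq, i, k=2):
    """Residues within k positions of residue i along the chain."""
    return seq[max(0, i - k): i + k + 1]

def spatial_context(coords, i, cutoff=8.0):
    """Indices of residues within `cutoff` angstroms of residue i."""
    return [j for j, c in enumerate(coords)
            if j != i and math.dist(coords[i], c) <= cutoff]

seq = "MKDNAQ"
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
          (11.4, 0.0, 0.0), (3.8, 3.8, 0.0), (15.2, 0.0, 0.0)]

# Residue 1 sees residues 0, 2, 3 and 4 spatially; residue 4 is close in
# space despite being distant in sequence, which is exactly the extra
# information the spatial context contributes.
features = (sequence_context(seq, 1), spatial_context(coords, 1))
```

In the full method, both contexts are encoded as feature vectors, concatenated, and passed through LSA before the SVM sees them.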
PMID: 27282833 [PubMed - in process]
The SBOL Stack: A Platform for Storing, Publishing, and Sharing Synthetic Biology Designs.
ACS Synth Biol. 2016 Jun 7;
Authors: Madsen C, McLaughlin JA, Misirli G, Pocock M, Flanagan K, Hallinan J, Wipat A
Abstract
Recently, synthetic biologists have developed the Synthetic Biology Open Language (SBOL), a data exchange standard for descriptions of genetic parts, devices, modules, and systems. The goals of this standard are to allow scientists to exchange designs of biological parts and systems, to facilitate the storage of genetic designs in repositories, and to facilitate the description of genetic designs in publications. In order to achieve these goals, the development of an infrastructure to store, retrieve, and exchange SBOL data is necessary. To address this problem, we have developed the SBOL Stack, a Resource Description Framework (RDF) database specifically designed for the storage, integration, and publication of SBOL data. This database allows users to define a library of synthetic parts and designs as a service, to share SBOL data with collaborators, and to store designs of biological systems locally. The database also allows external data sources to be integrated by mapping them to the SBOL data model. The SBOL Stack includes two Web interfaces: the SBOL Stack API and SynBioHub. While the former is designed for developers, the latter allows users to upload new SBOL biological designs, download SBOL documents, search by keyword, and visualize SBOL data. Since the SBOL Stack is based on semantic Web technology, the inherent distributed querying functionality of RDF databases can be used to allow different SBOL stack databases to be queried simultaneously, and therefore, data can be shared between different institutes, centers, or other users.
PMID: 27268205 [PubMed - as supplied by publisher]
The Orthology Ontology: development and applications.
J Biomed Semantics. 2016;7(1):34
Authors: Fernández-Breis JT, Chiba H, Legaz-García MD, Uchiyama I
Abstract
BACKGROUND: Computational comparative analysis of multiple genomes provides valuable opportunities for biomedical research. In particular, orthology analysis can play a central role in comparative genomics: it guides the establishment of evolutionary relations among genes of different organisms and allows functional inference of gene products. However, the wide variation among current orthology databases necessitates research into the shareability of content that is generated by different tools and stored in different structures. Exchanging this content with other research communities requires making its meaning explicit.
DESCRIPTION: The need for a common ontology has led to the creation of the Orthology Ontology (ORTH), following best practices in ontology construction. Here, we describe our model and the major entities of the ontology, which is implemented in the Web Ontology Language (OWL), followed by an assessment of the quality of the ontology and the application of ORTH to existing orthology datasets. This shareable ontology enables the development of Linked Orthology Datasets and of a meta-predictor of orthology through standardization of the representation of orthology databases. ORTH is freely available in OWL format to all users at http://purl.org/net/orth .
CONCLUSIONS: The Orthology Ontology can serve as a framework for the semantic standardization of orthology content and it will contribute to a better exploitation of orthology resources in biomedical research. The results demonstrate the feasibility of developing shareable datasets using this ontology. Further applications will maximize the usefulness of this ontology.
PMID: 27259657 [PubMed - as supplied by publisher]
Generation of open biomedical datasets through ontology-driven transformation and integration processes.
J Biomed Semantics. 2016;7:32
Authors: Carmen Legaz-García MD, Miñarro-Giménez JA, Menárguez-Tortosa M, Fernández-Breis JT
Abstract
BACKGROUND: Biomedical research usually requires combining large volumes of data from multiple heterogeneous sources, which makes the integrated exploitation of such data difficult. The Semantic Web paradigm offers a natural technological space for data integration and exploitation by generating content readable by machines. Linked Open Data is a Semantic Web initiative that promotes the publication and sharing of data in machine-readable semantic formats.
METHODS: We present an approach for the transformation and integration of heterogeneous biomedical data with the objective of generating open biomedical datasets in Semantic Web formats. The transformation of the data is based on the mappings between the entities of the data schema and the ontological infrastructure that provides the meaning to the content. Our approach permits different types of mappings and includes the possibility of defining complex transformation patterns. Once the mappings are defined, they can be automatically applied to datasets to generate logically consistent content and the mappings can be reused in further transformation processes.
RESULTS: The results of our research are (1) a common transformation and integration process for heterogeneous biomedical data; (2) the application of Linked Open Data principles to generate interoperable, open, biomedical datasets; (3) a software tool, called SWIT, that implements the approach. In this paper we also describe how we have applied SWIT in different biomedical scenarios and some lessons learned.
CONCLUSIONS: We have presented an approach that is able to generate open biomedical repositories in Semantic Web formats. SWIT applies the Linked Open Data principles in the generation of the datasets, allowing their content to be linked to external repositories and thus creating linked open datasets. SWIT datasets may contain data from multiple sources and schemas, thus becoming integrated datasets.
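The core transformation idea, a mapping table linking source-schema fields to ontology properties that is then applied record by record, can be sketched as below. The field names, property IRIs, and the record are illustrative, not SWIT's actual configuration format.

```python
# Minimal sketch of an ontology-driven mapping step in the spirit of SWIT:
# each source record is turned into RDF-style triples according to a
# reusable mapping from schema fields to ontology properties.
MAPPING = {
    "gene_symbol": "http://example.org/onto#symbol",   # hypothetical IRIs
    "organism":    "http://example.org/onto#taxon",
}

def to_triples(record, subject_field="id", base="http://example.org/data/"):
    """Transform one source record into a list of (s, p, o) triples."""
    subject = base + record[subject_field]
    return [(subject, prop, record[field])
            for field, prop in MAPPING.items() if field in record]

triples = to_triples({"id": "r1", "gene_symbol": "INS", "organism": "9606"})
```

Because the mapping is data, not code, the same table can be reapplied to further datasets with the same schema, which is the reuse property the abstract emphasizes.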
PMID: 27255189 [PubMed - in process]
P-Finder: Reconstruction of Signaling Networks from Protein-Protein Interactions and GO Annotations.
IEEE/ACM Trans Comput Biol Bioinform. 2015 Mar-Apr;12(2):309-21
Authors: Cho YR, Xin Y, Speegle G
Abstract
Because most complex genetic diseases are caused by defects of cell signaling, illuminating a signaling cascade is essential for understanding their mechanisms. We present three novel computational algorithms to reconstruct signaling networks between a starting protein and an ending protein using genome-wide protein-protein interaction (PPI) networks and gene ontology (GO) annotation data. A signaling network is represented as a directed acyclic graph formed by merging multiple linear pathways. As a preprocessing step for all three methods, an advanced semantic similarity metric is applied to weight the PPIs. The first algorithm repeatedly extends the list of nodes based on path frequency towards an ending protein. The second algorithm repeatedly appends edges based on the occurrence of network motifs, i.e., link patterns that appear more frequently in a PPI network than in a random graph. The last algorithm uses an information propagation technique that iteratively updates edge orientations based on path strength and merges the selected directed edges. Our experimental results demonstrate that the proposed algorithms achieve higher accuracy than previous methods when tested on well-studied pathways of S. cerevisiae. Furthermore, we introduce an interactive web application tool, called P-Finder, to visualize reconstructed signaling networks.
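The representation the abstract describes, a network built by merging similarity-weighted linear paths between a start and an end protein, can be sketched minimally as follows. The proteins, interactions, and weights are hypothetical, and this is a simple path-scoring illustration, not any of the three P-Finder algorithms.

```python
# Sketch: enumerate simple paths from a start to an end protein in a
# semantic-similarity-weighted PPI graph, score each path by the product
# of its edge weights, and merge the best paths into one directed network.
def find_paths(weights, start, end, path=None):
    """Yield simple paths start -> end; weights maps (u, v) -> weight."""
    path = path or [start]
    if start == end:
        yield path
        return
    for (u, v), _ in weights.items():
        if u == start and v not in path:
            yield from find_paths(weights, v, end, path + [v])

def score(path, weights):
    """Score a path by the product of its edge weights."""
    s = 1.0
    for u, v in zip(path, path[1:]):
        s *= weights[(u, v)]
    return s

# Hypothetical semantic-similarity weights on directed PPIs.
ppi = {("RecA", "KinB"): 0.9, ("KinB", "TF1"): 0.8,
       ("RecA", "AdpC"): 0.4, ("AdpC", "TF1"): 0.5}

paths = sorted(find_paths(ppi, "RecA", "TF1"),
               key=lambda p: score(p, ppi), reverse=True)
# Merging the edges of the top-scoring linear paths yields the
# directed acyclic network.
network = {edge for p in paths[:2] for edge in zip(p, p[1:])}
```

On a genome-wide network, exhaustive path enumeration is infeasible, which is precisely why the paper's three heuristics (path frequency, motif occurrence, information propagation) exist.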
PMID: 26357219 [PubMed - indexed for MEDLINE]
The BioHub Knowledge Base: Ontology and Repository for Sustainable Biosourcing.
J Biomed Semantics. 2016;7(1):30
Authors: Read WJ, Demetriou G, Nenadic G, Ruddock N, Stevens R, Winter J
Abstract
BACKGROUND: The motivation for the BioHub project is to create an Integrated Knowledge Management System (IKMS) that will enable chemists to source ingredients from bio-renewables, rather than from non-sustainable sources such as fossil oil and its derivatives.
METHOD: The BioHubKB is the data repository of the IKMS; it employs Semantic Web technologies, especially OWL, to host data about chemical transformations, bio-renewable feedstocks, co-product streams and their chemical components. Access to this knowledge base is provided to other modules within the IKMS through a set of RESTful web services, driven by SPARQL queries to a Sesame back-end. The BioHubKB re-uses several bio-ontologies and bespoke extensions, primarily for chemical feedstocks and products, to form its knowledge organisation schema.
RESULTS: Parts of plants form feedstocks, while various processes generate co-product streams that contain certain chemicals. Both chemicals and transformations are associated with certain qualities, which the BioHubKB also attempts to capture. Of immediate commercial and industrial importance is estimating the cost of particular sets of chemical transformations (leading to candidate surfactants) performed in sequence, and these costs too are captured. Data are sourced from companies' internal knowledge and document stores, and from the publicly available literature. Both text analytics and manual curation play their part in populating the ontology. We describe the prototype IKMS, the BioHubKB and the services that it supports for the IKMS.
AVAILABILITY: The BioHubKB can be found via http://biohub.cs.manchester.ac.uk/ontology/biohub-kb.owl .
PMID: 27246819 [PubMed - in process]
ODMedit: uniform semantic annotation for data integration in medicine based on a public metadata repository.
BMC Med Res Methodol. 2016;16(1):65
Authors: Dugas M, Meidt A, Neuhaus P, Storck M, Varghese J
Abstract
BACKGROUND: The volume and complexity of patient data - especially in personalised medicine - is steadily increasing, for both clinical data and genomic profiles: typically, more than 1,000 items (e.g., laboratory values, vital signs, diagnostic tests) are collected per patient in clinical trials, and in oncology hundreds of mutations can potentially be detected for each patient by genomic profiling. Therefore, data integration from multiple sources constitutes a key challenge for medical research and healthcare.
METHODS: Semantic annotation of data elements can help identify matching data elements in different sources and thereby supports data integration. Millions of different annotations are required due to the semantic richness of patient data. These annotations should be uniform, i.e., two matching data elements shall receive the same annotations. However, large terminologies like SNOMED CT or UMLS do not provide uniform coding. We propose developing semantic annotations of medical data elements based on a large-scale public metadata repository. To achieve uniform codes, semantic annotations are re-used whenever a matching data element is already available in the metadata repository.
RESULTS: A web-based tool called ODMedit ( https://odmeditor.uni-muenster.de/ ) was developed to create data models with uniform semantic annotations. It contains ~800,000 terms with semantic annotations which were derived from ~5,800 models from the portal of medical data models (MDM). The tool was successfully applied to manually annotate 22 forms with 292 data items from CDISC and to update 1,495 data models of the MDM portal.
CONCLUSION: Uniform manual semantic annotation of data models is feasible in principle, but requires a large-scale collaborative effort due to the semantic richness of patient data. A web-based tool for these annotations is available, which is linked to a public metadata repository.
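The re-use principle behind uniform annotation, check the repository for a matching element before coding a new one, can be sketched in a few lines. The item names and concept codes below are illustrative stand-ins, not actual content of the MDM portal or a specific terminology.

```python
# Sketch of annotation re-use: matching data elements get identical codes
# because the first annotation is copied rather than re-derived.
repository = {
    "systolic blood pressure": ["C-0001"],   # hypothetical codes
    "heart rate":              ["C-0002"],
}

def annotate(item, repo):
    """Return existing codes for a matching element, or flag it for curation."""
    key = item.strip().lower()
    if key in repo:
        return repo[key]            # re-use: uniform coding guaranteed
    repo[key] = ["UNCODED"]         # new element: queued for manual annotation
    return repo[key]

codes = annotate("Heart Rate", repository)   # re-uses the existing code
```

Real matching is of course harder than lowercasing a label, which is why the paper frames uniform annotation as a large-scale collaborative curation effort rather than a purely automatic one.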
PMID: 27245222 [PubMed - in process]
Acceptability of a mobile health exercise-based cardiac rehabilitation intervention: a randomized trial.
J Cardiopulm Rehabil Prev. 2015 Sep-Oct;35(5):312-9
Authors: Pfaeffli Dale L, Whittaker R, Dixon R, Stewart R, Jiang Y, Carter K, Maddison R
Abstract
BACKGROUND: Mobile technologies (mHealth) have recently been used to deliver behavior change interventions; however, few have investigated the application of mHealth for treatment of ischemic heart disease (IHD). The Heart Exercise And Remote Technologies trial examined the effectiveness of an mHealth intervention to increase exercise behavior in adults with IHD. As a part of this trial, a process evaluation was conducted.
METHODS: One hundred seventy-one adults with IHD were randomized to receive a 6-month mHealth intervention (n = 85) plus usual care or usual care alone (n = 86). The intervention delivered a theory-based, automated package of exercise prescription and behavior change text messages and a supporting Web site. Three sources of data were triangulated to assess intervention participants' perceptions: (1) Web site usage statistics; (2) feedback surveys; and (3) semistructured exit interviews. Descriptive information from the survey and Web data was merged with the qualitative data and analyzed using a semantic thematic approach.
RESULTS: At 24 weeks, all intervention participants provided Web usage statistics, 75 completed the feedback survey, and 17 were interviewed. Participants reported reading the text messages (70/75; 93%) and liked the content (55/75; 73%). The program motivated participants to exercise. Several suggestions to improve the program included further tailoring of the content (7/75; 7%) and increased personal contact (10/75; 13%).
CONCLUSIONS: Adults with IHD were able to use an mHealth program and reported that text messaging is a good way to deliver exercise information. While mHealth is designed to be automated, programs might be improved if content and delivery were tailored to individual needs.
PMID: 26181037 [PubMed - indexed for MEDLINE]
An ontology for Autism Spectrum Disorder (ASD) to infer ASD phenotypes from Autism Diagnostic Interview-Revised data.
J Biomed Inform. 2015 Aug;56:333-47
Authors: Mugzach O, Peleg M, Bagley SC, Guter SJ, Cook EH, Altman RB
Abstract
OBJECTIVE: Our goal is to create an ontology that will allow data integration and reasoning with subject data to classify subjects and, based on this classification, to infer new knowledge on Autism Spectrum Disorder (ASD) and related neurodevelopmental disorders (NDD). We take a first step toward this goal by extending an existing autism ontology to allow automatic inference of ASD phenotypes and Diagnostic & Statistical Manual of Mental Disorders (DSM) criteria based on subjects' Autism Diagnostic Interview-Revised (ADI-R) assessment data.
MATERIALS AND METHODS: Knowledge regarding diagnostic instruments, ASD phenotypes and risk factors was added to augment an existing autism ontology via Ontology Web Language class definitions and semantic web rules. We developed a custom Protégé plugin for enumerating combinatorial OWL axioms to support the many-to-many relations of ADI-R items to diagnostic categories in the DSM. We utilized a reasoner to infer whether 2642 subjects, whose data was obtained from the Simons Foundation Autism Research Initiative, meet DSM-IV-TR (DSM-IV) and DSM-5 diagnostic criteria based on their ADI-R data.
RESULTS: We extended the ontology by adding 443 classes and 632 rules that represent phenotypes, along with their synonyms, environmental risk factors, and frequency of comorbidities. Applying the rules on the data set showed that the method produced accurate results: the true positive and true negative rates for inferring autistic disorder diagnosis according to DSM-IV criteria were 1 and 0.065, respectively; the true positive rate for inferring ASD based on DSM-5 criteria was 0.94.
DISCUSSION: The ontology allows automatic inference of subjects' disease phenotypes and diagnosis with high accuracy.
CONCLUSION: The ontology may benefit future studies by serving as a knowledge base for ASD. In addition, by adding knowledge of related NDDs, commonalities and differences in manifestations and risk factors could be automatically inferred, contributing to the understanding of ASD pathophysiology.
PMID: 26151311 [PubMed - indexed for MEDLINE]
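The combinatorial enumeration performed by the custom Protégé plugin can be illustrated with a toy sketch: for a hypothetical DSM criterion mapped to several ADI-R items, enumerate every item combination that would satisfy the criterion. The criterion name, item codes, and threshold below are invented for illustration; the real system expresses these many-to-many relations as OWL class axioms.

```python
from itertools import combinations

# Hypothetical mapping of one DSM criterion to ADI-R item codes.
CRITERION_ITEMS = {"A1a": ["adi50", "adi51", "adi57"]}
MIN_ITEMS = 2  # assume the criterion is met if >= 2 mapped items are abnormal

def enumerate_axiom_combinations(criterion):
    """Enumerate the item combinations that would each satisfy the
    criterion -- mirroring the combinatorial axioms a plugin would
    generate for many-to-many item/criterion relations."""
    items = CRITERION_ITEMS[criterion]
    combos = []
    for k in range(MIN_ITEMS, len(items) + 1):
        combos.extend(combinations(items, k))
    return combos

def criterion_met(criterion, abnormal_items):
    """Check a subject's abnormal items against the criterion threshold."""
    items = set(CRITERION_ITEMS[criterion])
    return len(items & set(abnormal_items)) >= MIN_ITEMS
```

With three items and a threshold of two, the enumeration yields four qualifying combinations; a reasoner applying the generated axioms to subject data performs the equivalent of `criterion_met`.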
Comparing image search behaviour in the ARRS GoldMiner search engine and a clinical PACS/RIS.
J Biomed Inform. 2015 Aug;56:57-64
Authors: De-Arteaga M, Eggel I, Do B, Rubin D, Kahn CE, Müller H
Abstract
Information search has changed the way we manage knowledge, and the ubiquity of information access has made search a frequent activity, whether via Internet search engines or, increasingly, via mobile devices. Medical information search is in this respect no different, and much research has been devoted to analyzing the ways in which physicians access information. Medical image search is a much smaller domain but has gained attention because it has different characteristics from search for text documents. While web search log files have been analysed many times to better understand user behaviour, the log files of hospital-internal systems for search in a PACS/RIS (Picture Archival and Communication System, Radiology Information System) have rarely been analysed. Such a comparison between a hospital PACS/RIS search and a web system for searching images of the biomedical literature is the goal of this paper. The objectives are to identify similarities and differences in the search behaviour of the two systems, which could then be used to optimize existing systems and build new search engines. Log files of the ARRS GoldMiner medical image search engine (freely accessible on the Internet) containing 222,005 queries, and log files of Stanford's internal PACS/RIS search tool, called radTF, containing 18,068 queries, were analysed. Each query was preprocessed and all query terms were mapped to the RadLex (Radiology Lexicon) terminology, a comprehensive lexicon of radiology terms created and maintained by the Radiological Society of North America, so that the semantic content of the queries and the links between terms could be analysed, and synonyms for the same concept could be detected. RadLex was created mainly for use in radiology reports, to aid structured reporting and the preparation of educational material (Langlotz, 2006) [1].
In standard medical vocabularies such as MeSH (Medical Subject Headings) and UMLS (Unified Medical Language System), specific radiology terms are often underrepresented; RadLex was therefore considered the best option for this task. The results show a surprising similarity between the usage behaviour in the two systems, but several subtle differences can also be noted. The average number of terms per query is 2.21 for GoldMiner and 2.07 for radTF; the RadLex axes used (anatomy, pathology, findings, …) have almost the same distribution, with clinical findings the most frequent and anatomical entities second, and combinations of RadLex axes are extremely similar between the two systems. Differences include longer sessions in radTF than in GoldMiner (3.4 vs. 1.9 queries per session on average). Several frequent search terms overlap, but some strong differences exist in the details. In radTF the term "normal" is frequent, whereas in GoldMiner it is not. This makes intuitive sense, as normal cases are rarely described in the literature, whereas in clinical work comparison with normal cases is often a first step. The general similarity in many points is likely due to the fact that users of the two systems are influenced by their daily use of standard web search engines and follow this behaviour in their professional search. This means that many results and insights gained from standard web search can likely be transferred to more specialized search systems. Still, specialized log files can be used to learn more about users' reformulations and detailed strategies for finding the right content.
PMID: 26002820 [PubMed - indexed for MEDLINE]
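The core of the log analysis described above can be sketched briefly: tokenize each query, map tokens to lexicon concepts so that synonyms collapse to one concept, and compute the average number of terms per query. The toy lexicon below stands in for RadLex; its concept IDs and axis labels are invented for illustration.

```python
# Toy stand-in for a RadLex-like lexicon: token -> (concept id, axis).
# Note "ptx" and "pneumothorax" share a concept id, so synonym use
# in different queries can be detected after mapping.
LEXICON = {
    "pneumothorax": ("RID-1", "pathology"),
    "chest": ("RID-2", "anatomy"),
    "normal": ("RID-3", "finding"),
    "ptx": ("RID-1", "pathology"),  # synonym of pneumothorax
}

def analyze(queries):
    """Return (average terms per query, per-query mapped concepts)."""
    mapped, total_terms = [], 0
    for q in queries:
        tokens = q.lower().split()
        total_terms += len(tokens)
        mapped.append([LEXICON[t] for t in tokens if t in LEXICON])
    return total_terms / len(queries), mapped

avg, mapped = analyze(["chest pneumothorax", "normal chest", "ptx"])
```

Running the same pipeline over two log files and comparing the averages and axis distributions is, in miniature, the comparison the paper draws between GoldMiner and radTF.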
Building the Ferretome.
Front Neuroinform. 2016;10:16
Authors: Sukhinin DI, Engel AK, Manger P, Hilgetag CC
Abstract
Databases of structural connections of the mammalian brain, such as CoCoMac (cocomac.g-node.org) or BAMS (https://bams1.org), are valuable resources for the analysis of brain connectivity and the modeling of brain dynamics in species such as the non-human primate or the rodent, and have also contributed to the computational modeling of the human brain. Another animal model that is widely used in electrophysiological or developmental studies is the ferret; however, no systematic compilation of brain connectivity is currently available for this species. Thus, we have started developing a database of anatomical connections and architectonic features of the ferret brain, the Ferret(connect)ome, www.Ferretome.org. The Ferretome database has adapted essential features of the CoCoMac methodology and legacy, such as the CoCoMac data model. This data model was simplified and extended in order to accommodate new data modalities that were not represented previously, such as the cytoarchitecture of brain areas. The Ferretome uses a semantic parcellation of brain regions as well as a logical brain map transformation algorithm (objective relational transformation, ORT). The ORT algorithm was also adopted for the transformation of architecture data. The database is being developed in MySQL and has been populated with literature reports on tract-tracing observations in the ferret brain using a custom-designed web interface that allows efficient and validated simultaneous input and proofreading by multiple curators. The database is equipped with a non-specialist web interface. This interface can be extended to produce connectivity matrices in several formats, including a graphical representation superimposed on established ferret brain maps. An important feature of the Ferretome database is the possibility to trace back entries in connectivity matrices to the original studies archived in the system. 
Currently, the Ferretome contains 50 reports on connections, comprising 20 injection reports with more than 150 labeled source and target areas (the majority reflecting connectivity of subcortical nuclei), as well as 15 descriptions of regional brain architecture. We hope that the Ferretome database will become a useful resource for neuroinformatics and neural modeling, and will support studies of the ferret brain as well as facilitate advances in comparative studies of mesoscopic brain connectivity.
PMID: 27242503 [PubMed]
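An ORT-style (objective relational transformation) translation of connectivity statements between two brain maps can be sketched as follows. Area names, the relation table, and its use here are simplified illustrations, not the Ferretome's actual implementation; the relation codes follow the CoCoMac-style convention of recording how an area in one map relates to an area in another.

```python
# Relation codes (CoCoMac-style convention): "I" identical areas,
# "S" the source-map area is a sub-area of the target-map area.
# All map/area names below are invented for illustration.
RELATIONS = {
    ("mapA:17", "mapB:V1"): "I",
    ("mapA:18", "mapB:V2"): "S",
}

def translate_area(area, target_prefix):
    """Return target-map areas related to `area`, with relation codes."""
    return [(tgt, rel) for (src, tgt), rel in RELATIONS.items()
            if src == area and tgt.startswith(target_prefix)]

def translate_connection(src, tgt, target_prefix="mapB:"):
    """Translate a (source, target) connection stated in one map into
    equivalent edges in the target map, keeping the relation codes so
    the strength of the logical inference remains traceable."""
    return [(s, t, (rs, rt))
            for s, rs in translate_area(src, target_prefix)
            for t, rt in translate_area(tgt, target_prefix)]

edges = translate_connection("mapA:17", "mapA:18")
```

Carrying the relation codes along with each translated edge mirrors the database feature highlighted above: every entry in a derived connectivity matrix can be traced back to the original statements it was inferred from.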