Bio-Ontologies

What is an Ontology?

"An ontology is an explicit specification of some topic. For our purposes, it is a formal and declarative representation which includes the vocabulary (or names) for referring to the terms in that subject area and the logical statements that describe what the terms are, how they are related to each other, and how they can or cannot be related to each other. Ontologies therefore provide a vocabulary for representing and communicating knowledge about some topic and a set of relationships that hold among the terms in that vocabulary." (From the Stanford Knowledge Systems Lab.)

Given below are some resources for known ontologies. If you encounter a broken link or come across a useful resource not listed here, please e-mail me at anil.jegga@cchmc.org.

 

Gene Ontology (GO): an ontology describing the molecular function, biological process, and cellular location of gene products from eukaryotes.

OBO: Open Biological Ontologies is an umbrella web address for well-structured controlled vocabularies for shared use across different biological domains. This site contains ontologies and points to some other efforts within the community. Ideally we see a range of ontologies being designed for biological domains. Some of these will be generic and apply across all organisms and others will be more restricted in scope, for example to specific taxonomic groups.

Sequence Ontology Project: The Sequence Ontology is a set of terms used to describe features on a nucleotide or protein sequence. It encompasses both "raw" features, such as nucleotide similarity hits, and interpretations, such as gene models. The Sequence Ontologies are provided as a resource for the bioinformatics community. They have the following obvious uses:

  • To provide for a structured controlled vocabulary for the description of primary annotations of nucleic acid sequence, e.g. the annotations shared by a DAS server.
  • To provide for a structured representation of these annotations within genomic databases. Were genes within model organism databases to be annotated with these terms, it would be possible to query all these databases for, say, all genes whose transcripts are edited, trans-spliced, or bound by a particular protein.
  • To provide a structured controlled vocabulary for the description of mutations at both the sequence level and a more gross level in the context of genomic databases. We have also defined attributes for many of the terms. Pro tem these are held in the "comment:" field of the definitions file.
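As a rough illustration of the second use above, the sketch below annotates a few made-up genes with transcript-level terms and then asks for every gene whose transcripts carry a given term or any more specific term beneath it. The term names, the is_a links, and the gene identifiers are hypothetical placeholders chosen for the example; they are not actual Sequence Ontology content, and no real database is queried.

```python
from collections import defaultdict

# Hypothetical is_a links (child -> parents); not real Sequence Ontology content.
IS_A = {
    "edited_transcript": ["transcript"],
    "trans_spliced_transcript": ["transcript"],
    "edited_mRNA": ["edited_transcript"],
}

# Hypothetical annotations: gene identifier -> terms describing its transcripts.
ANNOTATIONS = {
    "geneA": ["edited_mRNA"],
    "geneB": ["trans_spliced_transcript"],
    "geneC": ["transcript"],
}


def descendants(term, is_a):
    """Return the term itself plus every term that reaches it via is_a links."""
    children = defaultdict(list)
    for child, parents in is_a.items():
        for parent in parents:
            children[parent].append(child)
    found, stack = {term}, [term]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def genes_annotated_with(term, annotations, is_a):
    """Return sorted genes annotated with the term or a more specific term."""
    wanted = descendants(term, is_a)
    return sorted(g for g, terms in annotations.items() if wanted & set(terms))


print(genes_annotated_with("edited_transcript", ANNOTATIONS, IS_A))
```

Because the query expands a term to its descendants before matching, a gene annotated only with the more specific "edited_mRNA" placeholder is still found when asking for "edited_transcript". This is the practical benefit of a structured vocabulary over free-text labels.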

Edinburgh Mouse Atlas Project: The UK MRC Human Genetics Unit in Edinburgh is developing a digital atlas of mouse development and a database to serve as a resource for spatially mapped data such as in situ gene expression and cell lineage. The project is a collaboration with the Section of Biomedical Sciences within the Division of Biomedical and Clinical Sciences at the University of Edinburgh. This research programme is the Edinburgh Mouse Atlas Project (emap).

Ontology Lookup Service (OLS)

List of Ontologies

Semantic Mining - semantic interoperability and data mining in medicine

The MGED ontology working group aim to develop ontologies for describing gene expression experiments and data.

PharmGKB: Pharmacogenetics Knowledge Base.

Human Phenotype Ontology

TAMBIS Ontology: The TAMBIS Ontology (TaO) is an ontology of molecular biological and bioinformatics concepts. The Terminology Server provides a number of services for terminological or concept models, answering questions such as "What can I say about concept X?", "Is concept Y coherent?", or "What are the immediate parents and/or children of concept Z?" Together they form a Biological Terminology Concept Server used by TAMBIS.
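The parent/child question mentioned above can be sketched with a small lookup table. This is only a toy under stated assumptions: the concept names and hierarchy are invented for illustration and are not taken from TaO, and the functions are hypothetical helpers rather than any part of the TAMBIS Terminology Server, which works over a much richer concept model.

```python
# Hypothetical concept hierarchy (concept -> immediate parents); not taken from TaO.
PARENTS = {
    "macromolecule": [],
    "protein": ["macromolecule"],
    "nucleic_acid": ["macromolecule"],
    "enzyme": ["protein"],
}


def immediate_parents(concept, parents):
    """Return the direct parents of a concept, sorted by name."""
    return sorted(parents.get(concept, []))


def immediate_children(concept, parents):
    """Return every concept that lists the given concept as a direct parent."""
    return sorted(c for c, ps in parents.items() if concept in ps)


print(immediate_parents("enzyme", PARENTS))
print(immediate_children("macromolecule", PARENTS))
```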

RiboWeb: an ontology describing ribosomal components, associated data and computations for processing those data.

Cell Cycle Ontology: The Cell Cycle Ontology project aims to extend the existing ontologies for cell cycle knowledge and to build a resource that integrates and manages knowledge about cell cycle components and regulatory aspects in OBO and OWL. This knowledge is assembled from a diverse set of existing resources (GO, UniProt, IntAct, BIND, NCBI taxonomy, and so forth); combining it will give an overall picture of the cell division process (Download).

NCI Enterprise Vocabulary Services (EVS) - a set of services and resources that address NCI's needs for controlled vocabulary

BioCyc: The BioCyc Knowledge Library is a collection of Pathway/Genome Databases. Each database in the BioCyc collection describes the genome and metabolic pathways of a single organism, with the exception of the MetaCyc database, which is a reference source on metabolic pathways from many organisms.

BioModels database and Systems Biology Ontologies (SBO) project

Ontologies for molecular biology and bioinformatics: an account of what concept ontologies in the domain of biology and bioinformatics are; what they are not; how they can be constructed; how they can be used; and some fallacies and pitfalls creators and users should be aware of.

ImMunoGeneTics (IMGT) Ontology

Mouse Anatomical Dictionary

EpoDB Controlled Vocabulary: function, cell and tissue type, developmental stage and experimental type.

CBIL Controlled Vocabulary: terms for human anatomy.

STAR/mmCIF: macromolecule structure ontology.

STAR/mmCIF Signal Transduction Knowledge Environment (STKE).

GeneX: ontologies for comparing gene expression across species.

FlyBase: controlled vocabulary for fly anatomy, used for describing phenotypes.

BioNLP Resources: A good compilation of many of the freely available resources either used directly and cited or of potential use by researchers applying NLP/text mining to biomedical literature.

OpenNLP Resources: OpenNLP provides the organizational structure for coordinating several different projects which approach some aspect of Natural Language Processing. OpenNLP also defines a set of Java interfaces and implements some basic infrastructure for NLP components.

Barry Smith's Ontological Resources: No list of ontology resources can be complete without this! Also check the Buffalo Ontology site.

Systems biology-funded centers:

Semantic Web & Related Resources

Ontology Development Tools

  1. Protégé 2000: a tool which allows the user to construct a domain ontology, customize knowledge-acquisition forms and enter domain knowledge. It is a platform which can be extended with graphical widgets for tables, diagrams, and animation components to access applications embedded in other knowledge-based systems.
  2. GKB Editor: the Generic Knowledge Base Editor.
  3. Wonder Tools: support for choosing an ontology-building tool.
  4. OilEd: an ontology editor that allows the user to build ontologies using DAML+OIL. For further details and information about DAML+OIL, consult the DAML Pages.

Bio-Ontology and Related Publications

  • Feigenbaum L, Martin S, Roy MN, Szekely B, Yung WC. Boca: an open-source RDF store for building Semantic Web applications. Brief Bioinform. 2007 May 8; [Epub ahead of print]
  • Gordon PM, Trinh Q, Sensen CW. 2007. Semantic Web Service Provision: a Realistic Framework for Bioinformatics Programmers. Bioinformatics. 2007 Mar 24.
  • Wang X, Gorlitsky R, Almeida JS. 2005. From XML to RDF: how semantic web technologies will change the design of 'omic' standards. Nat Biotechnol. 23(9): 1099-1103.
  • Vyas H, Summers R. 2005. An information-driven approach to pharmacogenomics. Pharmacogenomics. 6(5):473-80.
  • Bodenreider O, Stevens R. 2006. Bio-ontologies: current trends and future directions. Brief Bioinform. 7(3): 256-274.
  • Blake JA, Bult CJ. 2006. Beyond the data deluge: data integration and bio-ontologies. J Biomed Inform. 39(3): 314-320.
  • Cheung KH, Yip KY, Smith A, Deknikker R, Masiar A, Gerstein M. 2005. YeastHub: a semantic web use case for integrating data in the life sciences domain. Bioinformatics. 21 Suppl 1: i85-i96.
  • Blake J. 2004. Bio-ontologies-fast and furious. Nat Biotechnol. 22: 773-774.
  • Bard J. 2003. Ontologies: Formalising biological knowledge for bioinformatics. Bioessays 25(5): 501-506.
  • Ashburner M, Mungall CJ, Lewis SE. 2003. Ontologies for biologists: a community model for the annotation of genomic data. Cold Spring Harb Symp Quant Biol. 68: 227-235.
  • Bard JB, Rhee SY. 2004. Ontologies in biology: design, applications and future challenges. Nat Rev Genet. 5(3): 213-222.
  • Lan N, Montelione GT, Gerstein M. 2003. Ontologies for proteomics: towards a systematic definition of structure and function that scales to the genome level. Curr Opin Chem Biol. 7(1): 44-54.
  • Andrey Rzhetsky, Tomohiro Koike, Sergey Kalachikov, Shawn M. Gomez, Michael Krauthammer, Sabina H. Kaplan, Pauline Kra, James J. Russo, and Carol Friedman. A Knowledge Model for Analysis and Simulation of Regulatory Networks. Bioinformatics, 16:1120-1128, 2000.
  • Toni Kazic. Semiotes: A Semantics for Sharing. Bioinformatics, 16:1129-1144, 2000.
  • P.D. Karp, M. Riley, M. Saier, I.T. Paulsen, S.M. Paley, and A. Pellegrini-Toole. The EcoCyc and MetaCyc Databases. Nucleic Acids Research, 28:56-59, 2000.
  • R. Stevens, P. Baker, S. Bechhofer, G. Ng, A. Jacoby, N.W. Paton, C.A. Goble, and A. Brass. TAMBIS: Transparent Access to Multiple Bioinformatics Information Sources. Bioinformatics, 16(2):184-186, 2000.
  • P.G. Baker, C.A. Goble, S. Bechhofer, N.W. Paton, R. Stevens, and A. Brass. An Ontology for Bioinformatics Applications. Bioinformatics, 15(6):510-520, 1999.
  • J.D. Westbrook and P.E. Bourne. STAR/mmCIF: An Ontology for Macromolecular Structure. Bioinformatics, 16(2):159-168, 2000.
  • P.D. Karp. An Ontology for Biological Function Based on Molecular Interactions. Bioinformatics, 16:269-285, 2000.
  • The Gene Ontology Consortium. Gene Ontology: Tool for the Unification of Biology. Nature Genetics, 25:25-29, 2000.
  • Veronique Giudicelli and Marie-Paule Lefranc. Ontology for Immunogenetics: The IMGT-ONTOLOGY. Bioinformatics, 15(12):1047-1054, 1999.
  • P. Karp and S. Paley. Integrated Access to Metabolic and Genomic Data. Journal of Computational Biology, 3(1):191-212, 1996.

 

This page was last updated on December 9, 2008