Bioinformatics Resources

Nucleotide Sequence Databases (the principal ones)

  • NCBI - National Center for Biotechnology Information
  • EBI - European Bioinformatics Institute
  • DDBJ - DNA Data Bank of Japan

Protein Sequence Databases

  • SWISS-PROT & TrEMBL - Protein sequence database and computer annotated supplement
  • UniProt - the Universal Protein Resource, a comprehensive catalog of information on proteins. It is a central repository of protein sequence and function created by joining the information contained in Swiss-Prot, TrEMBL, and PIR.
  • PIR - Protein Information Resource
  • MIPS - Munich Information centre for Protein Sequences
  • HUPO - HUman Proteome Organization

Database Searching by Sequence Similarity


Sequence Alignment

  • USC Sequence Alignment Server - align 2 sequences with all possible varieties of dynamic programming
  • T-COFFEE - multiple sequence alignment
  • ClustalW @ EBI - multiple sequence alignment
  • MSA 2.1 - optimal multiple sequence alignment using the Carrillo-Lipman method
  • BOXSHADE - pretty printing and shading of multiple alignments
  • Splign - a utility for computing cDNA-to-genomic (spliced) sequence alignments. At the heart of the program is a global alignment algorithm that specifically accounts for introns and splice signals.
  • Spidey - an mRNA-to-genomic alignment program
  • Wise2 - align a protein or profile HMM against genomic sequence to predict a gene structure, and related tools
  • PipMaker - computes alignments of similar regions in two (long) DNA sequences
  • VISTA - align + detect conserved regions in long genomic sequences
  • myGodzilla - align a sequence to its ortholog in the human genome
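
As a rough illustration of the dynamic programming that underlies pairwise alignment tools such as those listed above, the following is a minimal sketch of global (Needleman-Wunsch style) alignment with a linear gap penalty. The function name global_alignment and the scoring values (match=1, mismatch=-1, gap=-2) are assumptions chosen for the example, not the parameters of any listed server.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return (score, aligned_a, aligned_b).

    Scoring values are illustrative assumptions, not taken from any tool.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(global_alignment("ACGT", "AGT"))
```

Production aligners typically use substitution matrices and affine gap penalties rather than the flat scores assumed here; the sketch only shows the recurrence and traceback.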


Human Genome Databases

Databases of other Organisms


Genome-wide Analysis

  • MBGD - comparative analysis of completely sequenced microbial genomes
  • COGs - phylogenetic classification of orthologous proteins from complete genomes
  • STRING - detect whether a given query gene occurs repeatedly with certain other genes in potential operons
  • Pedant - automatic whole genome annotation
  • GeneCensus - various whole genome comparisons

Protein Domains: Databases and Search Tools

  • InterPro - integration of Pfam, PRINTS, PROSITE, SWISS-PROT + TrEMBL
  • PROSITE - database of protein families and domains
  • Pfam - alignments and hidden Markov models covering many common protein domains
  • SMART - analysis of domains in proteins
  • ProDom - protein domain database
  • PRINTS Database - groups of conserved motifs used to characterise protein families
  • Blocks - multiply aligned ungapped segments corresponding to the most highly conserved regions of proteins
  • Protein Domain Profile Analysis @ BMERC - search a library of profiles with a protein sequence
  • TIGRFAMs - yet more protein families based on Hidden Markov Models


Motif and Pattern Search in Sequences

  • Gibbs Motif Sampler - identification of conserved motifs in DNA or protein sequences
  • AlignACE Homepage - gene regulatory motif finding
  • MEME - motif discovery and search in protein and DNA sequences
  • SAM - tools for creating and using Hidden Markov Models
  • Pratt - discover patterns in unaligned protein sequences
  • Motivated Proteins - a web facility for exploring small hydrogen-bonded motifs

Protein 3D Structure


Phylogeny & Taxonomy

Gene Prediction

Gene Expression Databases (including RNA-seq and single cell)


Gene Regulation

  • TRANSFAC - database of eukaryotic cis-acting regulatory DNA elements and trans-acting factors
  • EPD - eukaryotic promoter database
  • DBTSS - DataBase of Transcriptional Start Sites (human)
  • SCPD - Saccharomyces cerevisiae promoter database
  • DCPD - Drosophila Core Promoter Database
  • RegulonDB - a database on transcriptional regulation in E. coli
  • DPInteract - protein binding sites on E. coli DNA
  • PromoterInspector - prediction of promoter regions in mammalian genomic sequences
  • MatInspector - search for transcription factor binding sites
  • Cister - cis-element cluster finder
  • Gene regulatory Tools
  • microRNA Targets & Expression Profiles
  • miRBase
  • TarBase - provides a means of searching through a comprehensive set of experimentally supported microRNA targets in at least 8 organisms
  • microRNA resource - a gateway to all types of information about microRNAs, including articles, products, news, events, and other websites


Metabolic, Gene Regulatory & Signal Transduction Network Databases

  • KEGG - Kyoto Encyclopedia of Genes and Genomes
  • BioCarta
  • DAVID - Database for Annotation, Visualization and Integrated Discovery; a useful server for annotating microarray and other genetic data
  • STKE - Signal Transduction Knowledge Environment
  • BIND - Biomolecular Interaction Network Database
  • EcoCyc
  • WIT
  • PathGuide - a very useful collection of resources dealing primarily with pathways
  • SPAD - Signaling Pathway Database
  • CSNDB - Cell Signalling Networks Database
  • PathDB
  • Transpath
  • DIP - Database of Interacting Proteins
  • PFBP - Protein Function and Biochemical Networks
  • Alliance for Cellular Signalling


Systems Biology


Other Databases (Annotations, Ontologies, Consortia, etc.)


Miscellaneous Tools


Computational Resources


Bioinformatics on-line course materials and tutorials (not an exhaustive collection)

Intro to bioinformatics and computational biology:




Web Sites for Background Information & News


Other Collections of Bioinformatics Resources


This page was last updated on February 3, 2022 (new resources were added; the URLs of previously listed resources were not checked for broken links).