Anil Jegga


Departmental Affiliation(s):

Division of Biomedical Informatics (Cincinnati Children's Hospital Medical Center - CCHMC)

Department of Pediatrics (University of Cincinnati - College of Medicine)


Click here for details of current and past students and to find out where our graduates are now.

Research Description:

We focus our research on biological networks, such as (1) gene regulatory networks, (2) biomolecular interactions (disease and therapeutic networks), and (3) biomedical ontological networks.

Gene Regulatory Networks (GRNs): Completion of genome sequencing has shown that all animals share, more or less, the same repertoire of genes. To understand differences between cells within a species, between species, or between healthy and disease states, we need to understand when, where, and how genes are expressed. Much of this information is contained in non-coding DNA. We are using and developing a variety of computational methods to understand how genes are wired together to form functional gene regulatory networks.

Transcriptional Regulatory Networks (TRNs): Although gene expression can be controlled at multiple steps, the vast majority of regulatory events occur at the level of transcription. The molecular basis for transcriptional regulation is the binding of trans-acting proteins (transcription factors - TFs) to cis-acting sequences (transcription factor binding sites - TFBS). To identify these cis-acting sequences, we have developed a web-accessible tool called Trafac. Trafac identifies the cis-elements in phylogenetic footprints, which are non-coding DNA regions of 6 or more bp that show almost 100% similarity and are conserved across several species separated by several million years of evolution. We have also built a cis-regulatory database (GenomeTrafac) for all the human and mouse mRNAs listed in RefSeq. Identification of the cis-regulatory regions will lead to the construction of GRNs, which provide clues to unravel the evolution of body plans. Evolutionary differences in the GRNs of developmental processes must be responsible for morphological change. However, demonstrating this fundamental principle explicitly requires the synthesis of many more high-quality GRNs governing diverse developmental processes across a variety of species. Understanding cis-regulatory keys and the GRNs also facilitates studying the effect of variations within these regions. Many human diseases are known to result from genetic defects in transcription factors. In most cases, polymorphisms or mutations in transcription factors lead to pleiotropic effects. In addition, many events that lead to tumorigenesis involve either overexpression or mutation of TFs.
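
The footprint definition above (non-coding blocks of 6 or more bp that are nearly identical across species) can be illustrated with a minimal Python sketch. It is not Trafac's actual algorithm, which is not described here. It assumes a pre-computed pairwise alignment with "-" as the gap character, and it simplifies "almost 100% similarity" to exact identity; the function name and the two short sequences are made up for illustration.

```python
def conserved_blocks(aligned_a, aligned_b, min_len=6):
    """Return (start, end, sequence) for every ungapped run of identical
    bases spanning at least min_len alignment columns (end is exclusive)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    a_up, b_up = aligned_a.upper(), aligned_b.upper()
    blocks = []
    start = None
    for i, (a, b) in enumerate(zip(a_up, b_up)):
        match = a == b and a != "-"
        if match and start is None:
            start = i                      # a new identical run begins
        elif not match and start is not None:
            if i - start >= min_len:       # keep only runs long enough
                blocks.append((start, i, a_up[start:i]))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(a_up) - start >= min_len:
        blocks.append((start, len(a_up), a_up[start:]))
    return blocks


# Hypothetical aligned non-coding sequences (illustration only).
human = "ACGTTGACGTCA-GGCTATAAAGGC"
mouse = "ACCTTGACGTCATGGATATAAAGGC"
footprints = conserved_blocks(human, mouse)
for start, end, seq in footprints:
    print(start, end, seq)
```

A real analysis would work on multi-species alignments and tolerate a small number of mismatches; the blocks found here would then be scanned for known TFBS motifs.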

Post-transcriptional Microregulatory Networks (MRNs): An increasing number of reports acknowledge that regulatory RNAs are key players in GRNs. Among these regulatory RNAs are the microRNAs (miRs or miRNAs), small RNAs that regulate gene expression at the post-transcriptional level. miRNAs mediate post-transcriptional gene silencing through inhibition of protein production or degradation of mRNAs. So far, little is known about the extent of regulation by miRNAs, especially their potential cooperation with other regulatory layers (transcription factors, for instance) in the network. We are currently investigating the potential crosstalk between miRNAs and transcription factors, that is, the interaction between post-transcriptional and transcriptional networks.
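
One simple form that crosstalk between the two layers could take is a feed-forward loop, in which a TF regulates a miRNA and both regulate the same target gene. The Python sketch below only illustrates that pattern and is not a description of the group's method; the function name, the input dictionaries, and all TF, miRNA, and gene names are hypothetical.

```python
def feed_forward_loops(tf_targets, mirna_targets):
    """Find (tf, mirna, gene) triples where the TF regulates the miRNA
    and both the TF and the miRNA regulate the gene.

    tf_targets:    dict mapping TF name -> set of regulated genes/miRNAs
    mirna_targets: dict mapping miRNA name -> set of target genes
    """
    loops = []
    for tf, regulated in sorted(tf_targets.items()):
        # miRNAs that this TF regulates
        for mirna in sorted(regulated & set(mirna_targets)):
            # genes targeted by both the TF and the miRNA
            for gene in sorted((regulated & mirna_targets[mirna]) - {mirna}):
                loops.append((tf, mirna, gene))
    return loops


# Hypothetical regulatory relations (illustration only).
tf_targets = {"TF1": {"miR-a", "GENE1", "GENE2"}, "TF2": {"GENE3"}}
mirna_targets = {"miR-a": {"GENE1", "GENE3"}, "miR-b": {"GENE2"}}
loops = feed_forward_loops(tf_targets, mirna_targets)
print(loops)
```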

From Interactome to Diseasome and Pharmacome - Mining biomolecular interactions for disease-gene and drug-target networks: Based on the hypothesis that novel disease genes reside in the same pathways as known disease genes, and that disruption of genes of similar function will lead to the same phenotype, we are currently mining the biomolecular interactome to discover, analyze, and prioritize novel disease genes (see our published efforts in this direction) and to identify drug repositioning candidates.
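
The hypothesis above suggests a simple "guilt by association" score: a candidate gene ranks higher when a larger share of its interaction partners are known disease genes. The following Python sketch assumes that scoring rule for illustration; it is not the published method, and the function name and all gene identifiers are hypothetical.

```python
from collections import defaultdict


def rank_candidates(interactions, known_disease_genes, candidates):
    """Rank candidates by the fraction of their interaction partners
    that are known disease genes (ties broken alphabetically)."""
    neighbors = defaultdict(set)
    for a, b in interactions:          # interactions are undirected pairs
        neighbors[a].add(b)
        neighbors[b].add(a)
    known = set(known_disease_genes)
    scores = {}
    for gene in candidates:
        partners = neighbors.get(gene, set())
        scores[gene] = len(partners & known) / len(partners) if partners else 0.0
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical interactome: D* are known disease genes, C* are candidates.
interactions = [
    ("D1", "C1"), ("D2", "C1"), ("C1", "X1"),
    ("D1", "C2"), ("C2", "X1"), ("C2", "X2"),
    ("C3", "X2"),
]
ranking = rank_candidates(interactions, {"D1", "D2"}, ["C1", "C2", "C3"])
for gene, score in ranking:
    print(gene, round(score, 2))
```

Normalizing by the number of partners keeps highly connected hub genes from dominating the ranking, although more elaborate network measures are possible.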

Orphan Diseasome: By mining existing knowledge about orphan diseases and their causal genes, we built the human orphan diseasome and conducted a global analysis of it. Starting with a bipartite network of gene and orphan disease relations, and using the human protein interactome, we constructed an orphan disease network, an orphan disease gene network, and an orphan disease gene interactome, and analyzed their topological features. The associated Orphan Diseasome web site (prototype) allows investigators to explore orphan disease (OD) or rare disease relationships based on shared genes and shared enriched features (e.g., Gene Ontology Biological Process, Cellular Component, Pathways, Mammalian Phenotype). Users can also explore the networks of orphan disease causal genes, in which the nodes are orphan disease genes and an edge represents a shared OD or a protein-protein interaction.
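
The step from a bipartite gene-disease network to a disease network and a gene network can be sketched as two projections: diseases are linked when they share a causal gene, and genes are linked when they share a disease. The Python sketch below covers only these shared-gene and shared-disease edges (not the protein-protein interaction edges), and the function name and identifiers are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def project_bipartite(disease_gene_pairs):
    """Project (disease, gene) pairs onto a disease network and a gene network.

    Returns two dicts keyed by a sorted node pair; each value is the set of
    shared genes (disease network) or shared diseases (gene network).
    """
    genes_of = defaultdict(set)
    diseases_of = defaultdict(set)
    for disease, gene in disease_gene_pairs:
        genes_of[disease].add(gene)
        diseases_of[gene].add(disease)

    disease_net = {}
    for d1, d2 in combinations(sorted(genes_of), 2):
        shared = genes_of[d1] & genes_of[d2]
        if shared:
            disease_net[(d1, d2)] = shared

    gene_net = {}
    for g1, g2 in combinations(sorted(diseases_of), 2):
        shared = diseases_of[g1] & diseases_of[g2]
        if shared:
            gene_net[(g1, g2)] = shared

    return disease_net, gene_net


# Hypothetical orphan disease (OD) to causal gene (G) relations.
pairs = [("OD1", "G1"), ("OD1", "G2"), ("OD2", "G2"), ("OD2", "G3"), ("OD3", "G4")]
disease_net, gene_net = project_bipartite(pairs)
print(disease_net)
print(gene_net)
```

In this toy input, OD3 and G4 stay isolated in both projections because they share nothing with the other nodes.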

Biomedical Ontological Networks (BONs): The primary goal of this project is to integrate various biomedical ontologies and capture disease-gene associations from the biomedical literature. We are developing a common platform that integrates heterogeneous ontologies and annotations, such as the Gene, Disease, Anatomy, and Mammalian Phenotype ontologies, as well as pathways. The anatomy-disease associations are mined from the UMLSKS, while the disease-gene associations are mined from OMIM, GAD, and parsing of the biomedical literature (currently limited to GeneRIFs from NCBI).

Other related areas of research:

  1. Prioritization of Disease-Causal SNPs: This work integrates coding SNP (Single Nucleotide Polymorphism) information with protein structure and protein functional domains. Incorporating more than 25,000 genes and proteins, our PolyDoms resource integrates several data sources and connects processes that have traditionally been studied in isolation. We aim to predict the implications of non-synonymous SNPs for biological systems and processes from their occurrence in conserved functional domains and structures and from prediction algorithms (like SIFT and PolyPhen). After all, one cannot understand the behavior of the cell without knowing this biological circuitry and the interplay of the individual players. We are also working on prioritization of non-coding SNPs.
  2. Prioritization of Disease-Causal Genes: The majority of common diseases are multi-factorial and modified by genetically and mechanistically complex polygenic interactions and environmental factors. High-throughput genome-wide studies, like linkage analysis and gene expression profiling, tend to be most useful for classification and characterization but do not provide sufficient information to identify or prioritize specific disease-causal genes. Extending an earlier hypothesis that the majority of genes that impact or cause disease share membership in any of several functional relationships, we are studying the effect of different data integration methods on disease candidate gene prioritization. We strongly believe that our approach, ToppGene, which for the first time uses mouse phenotype data as one of the features for gene prioritization, greatly improves human disease candidate gene analysis and prioritization.
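
For the first item above, the idea of flagging non-synonymous SNPs by their occurrence in functional domains reduces, at its simplest, to an interval check on protein coordinates. The Python sketch below assumes 1-based, inclusive residue positions; it is an illustration rather than the PolyDoms implementation, and the function name, SNP identifiers, domain names, and coordinates are hypothetical.

```python
def snps_in_domains(snps, domains):
    """Map each SNP to the protein domains that contain its residue.

    snps:    list of (snp_id, residue_position)
    domains: list of (domain_name, start, end), 1-based inclusive
    """
    hits = {}
    for snp_id, position in snps:
        hits[snp_id] = [
            name for name, start, end in domains if start <= position <= end
        ]
    return hits


# Hypothetical domain annotations and SNP positions on one protein.
domains = [("kinase_domain", 50, 300), ("SH2_domain", 320, 410)]
snps = [("snp_1", 75), ("snp_2", 310), ("snp_3", 400)]
hits = snps_in_domains(snps, domains)
for snp_id, names in hits.items():
    print(snp_id, names or "outside annotated domains")
```

A SNP that falls inside a conserved domain would then be a natural candidate for closer inspection with tools such as SIFT or PolyPhen.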


MEDLINE database search:

Search for this researcher's PubMed references

Search for this researcher on Google Scholar 

This page was last updated on December 16, 2013