GenomeTraFaC is a database of conserved regulatory elements obtained by systematically analyzing the orthologous set of human and mouse genes. It focuses mainly on the high-quality mRNA entries of mouse and human genes in the NCBI Reference Sequence (RefSeq) database.

Conserved potential cis-regulatory regions were identified with a computational pipeline built on an advanced version of our earlier TraFaC server. Having putative regulatory information for most well-annotated genes can also greatly facilitate the analysis of groups of co-expressed or functionally related genes for shared, ortholog-conserved transcriptional machinery.
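
As an illustration of the kind of group analysis this enables, the following is a minimal sketch that reports which transcription factors have conserved sites in every gene (or a chosen fraction of genes) of a group. The function name `shared_factors`, the `min_fraction` parameter, and the placeholder gene and factor labels are assumptions made for this example; they do not describe the database's actual interface or data.

```python
from collections import Counter

def shared_factors(sites_by_gene, min_fraction=1.0):
    """Return factors with conserved sites in at least min_fraction of genes.

    sites_by_gene maps a gene label to the factors that have conserved
    binding sites for that gene (hypothetical input format).
    """
    n_genes = len(sites_by_gene)
    if n_genes == 0:
        return []
    counts = Counter()
    for factors in sites_by_gene.values():
        # Count each factor once per gene, however many sites it has.
        counts.update(set(factors))
    return sorted(f for f, c in counts.items() if c / n_genes >= min_fraction)

# Placeholder labels only; not real results from the database.
group = {
    "geneA": ["TF1", "TF2"],
    "geneB": ["TF2", "TF3"],
    "geneC": ["TF2", "TF1"],
}
print(shared_factors(group))        # factors present in every gene
print(shared_factors(group, 0.6))   # factors present in at least 60% of genes
```

Counting each factor once per gene keeps a gene with many sites for the same factor from dominating the result.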

Using the TraFaC (Jegga et al., 2002), PipMaker (Schwartz et al., 2000) and MatInspector (Quandt et al., 1995) suite of programs, we have aligned and analyzed a) more than 12000 human and mouse orthologous gene pairs with a validated RefSeq ID from the Reference Sequence database of NCBI (Pruitt et al., 2003), and b) more than 260 human microRNAs (miRNAs) from miRBase. The genomic sequences, with flanking regions of 40 kb for genes and 10 kb for microRNAs, were downloaded from the UCSC genome browser (Human May 2004 and March 2006 assemblies; Mouse August 2005 and February 2006 assemblies). Sequences were aligned using the BlastZ algorithm of PipMaker, and transcription factor binding sites were identified using MatInspector, which relies on a library of position weight matrices (PWMs) for the binding sites. The TraFaC server was then used to identify the cis-elements common to both species within the evolutionarily conserved regions of the human-mouse sequence alignment.
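
The core idea of the last two steps, scoring candidate sites with a PWM and keeping only those found at the same position in both aligned sequences, can be sketched as follows. This is a simplified illustration and not the MatInspector or TraFaC algorithm: it uses a plain log-odds score, assumes a gap-free aligned block of equal length, and the count matrix, threshold, sequences, and function names are all hypothetical.

```python
import math

# Hypothetical count matrix for a 4-bp motif (illustrative, not a real library matrix).
COUNTS = {
    "A": [8, 0, 0, 7],
    "C": [0, 9, 0, 1],
    "G": [1, 0, 10, 1],
    "T": [1, 1, 0, 1],
}

def log_odds(counts, pseudocount=1.0, background=0.25):
    """Convert base counts into a log2-odds PWM against a uniform background."""
    width = len(counts["A"])
    pwm = []
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((counts[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm

def scan(seq, pwm, threshold):
    """Return start positions whose window scores at or above the threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing gaps or ambiguous bases
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append(start)
    return hits

def conserved_hits(human_block, mouse_block, pwm, threshold):
    """Positions where both aligned sequences carry a site for the same matrix."""
    if len(human_block) != len(mouse_block):
        raise ValueError("aligned blocks must have equal length")
    human = set(scan(human_block, pwm, threshold))
    mouse = set(scan(mouse_block, pwm, threshold))
    return sorted(human & mouse)

pwm = log_odds(COUNTS)
print(conserved_hits("TTACGATT", "GGACGAGG", pwm, threshold=4.0))
print(conserved_hits("TTACGATT", "GGACTAGG", pwm, threshold=4.0))
```

Requiring a hit in both species at the same alignment position is what filters out sites that match the matrix by chance in only one genome.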

The current version of the GenomeTraFaC database contains cis-regulatory analysis results for more than 12000 RefSeq-annotated human and mouse gene pairs and more than 260 human miRNAs. We are updating the database as new RefSeq orthologous gene pairs and miRNAs become available.

