Circular Genome Map represents the genome in circular form, a visualization approach typical for circular bacterial genomes and plasmids. From the outer ring inward, the map displays genes on the direct strand (pink), genes on the complementary strand (yellow), tRNAs (green arrows), rRNAs (pink or orange stripes depending on the strand), GC content (brown lines), and GC skew (yellow lines). The replication origin and terminus predicted from the GC skew shift points are also labeled.
This view is useful for examining the chromosomal organization of genes, especially in relation to replication. See the sections on GC content and GC skew below, which give illustrative examples of highly skewed genomes and of identifying possible horizontally transferred genes.
The copy numbers of tRNAs and rRNAs have been suggested to correlate with the growth rate of bacteria.
- Sharp, P.M., Bailes, E., Grocock, R.J., Peden, J.F., Sockett, R.E., 2005. Variation in the strength of selected codon usage bias among bacteria. Nucleic Acids Res.
The two outermost rings, red and yellow, represent the positions of genes. The outer red ring corresponds to the direct strand of the genome flatfile annotation, and the inner yellow ring corresponds to the complement strand. Each stripe represents a single gene, and the thickness of the stripe corresponds to the length of the gene. Note that typical bacterial genes are about 1 kbp in length.
Coordinates of the gene positions are labeled both inside and outside of these two rings.
A circular bacterial genome has a single replication origin and a single terminus. They are marked by a long yellow line running all the way across the rings, dividing the genome into two segments. Moving clockwise (to the right) from the replication origin to the terminus, the outer red ring is the leading strand and the inner yellow ring is the lagging strand. Moving anti-clockwise (to the left) from the replication origin to the terminus, the inner yellow ring is the leading strand and the outer red ring is the lagging strand.
In E. coli, 55% of genes are located on the leading strand. Most bacterial genomes have more genes on the leading strand than on the lagging strand, which can be clearly seen with Genome Projector.
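The leading/lagging rule above can be turned into a small calculation. The following is a minimal sketch, not part of Genome Projector: the function names, the `"direct"`/`"complement"` strand labels, and the input format are hypothetical, and it assumes that clockwise on the map corresponds to increasing genome coordinates.

```python
def on_leading_strand(position, strand, origin, terminus, genome_length):
    """Return True if a gene lies on the leading strand.

    Assumes clockwise on the map means increasing coordinates.
    strand is "direct" or "complement" (hypothetical labels).
    """
    # Clockwise distance from the origin to the gene and to the terminus.
    gene_distance = (position - origin) % genome_length
    terminus_distance = (terminus - origin) % genome_length
    in_clockwise_half = gene_distance < terminus_distance
    # Clockwise half: the direct strand leads.
    # Anti-clockwise half: the complement strand leads.
    return in_clockwise_half == (strand == "direct")


def leading_fraction(genes, origin, terminus, genome_length):
    """Fraction of genes on the leading strand.

    genes is a list of (position, strand) tuples.
    """
    if not genes:
        return 0.0
    leading = sum(
        on_leading_strand(pos, strand, origin, terminus, genome_length)
        for pos, strand in genes
    )
    return leading / len(genes)
```

Applied to the gene coordinates and the predicted origin and terminus of a genome, `leading_fraction` gives the kind of figure quoted above for E. coli.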
tRNAs are represented by arrows pointing in the orientation stated in the genome flatfile: clockwise when direct, anti-clockwise when complement. tRNAs are short compared with coding genes (only about 75 bp versus about 1 kbp), so the arrows are drawn much longer than the actual length of the tRNAs. For this reason, the exact location of each tRNA is also marked with a stripe, similar to the representation of genes in the outer rings. The stripe is blue for the direct strand and green for the complement strand.
Note that tRNAs are often located close together within bacterial genomes as operons. In Genome Projector, such closely located tRNAs may look like a single arrow, but they can be distinguished by the thickness of the stripes. For example, in the following screen capture, the right-most arrow actually represents two closely located tRNAs, which show a thicker, or brighter, green stripe. Pins showing the search results may also overlap for tRNAs, but the text results shown in the collapsible window on the right correctly list all overlapping search results.
rRNAs are represented by pink (for the direct strand) and orange (for the complement strand) stripes, located one ring inward from the tRNAs. Since rRNAs are typically long, they are represented only by stripes and not by arrows.
rRNAs tend to strongly prefer the leading strand, and in some genomes, many rRNAs are located close to the replication origin.
GC content is the percentage of G and C bases among the four nucleotides. Bacterial genomes exhibit remarkable diversity in their genomic GC content, ranging from below 20% to about 75%. In Genome Projector, GC content is shown as a brown graph, calculated in 2000 bp windows that slide 1000 bp at a time.
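The windowed calculation described above can be sketched as follows. This is an illustrative sketch rather than Genome Projector's own code: the function name is hypothetical, only the 2000 bp window and 1000 bp step come from the text, and for simplicity the sequence is treated as linear (no wrap-around at the end of a circular genome).

```python
def gc_content_windows(seq, window=2000, step=1000):
    """Return a list of (start, gc_percent) for each full window.

    start is a 0-based offset into seq. Ambiguous bases such as N
    count toward the window length but not toward G+C.
    """
    seq = seq.upper()
    results = []
    # Only full windows are reported; a trailing partial window is skipped.
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc_count = chunk.count("G") + chunk.count("C")
        results.append((start, 100.0 * gc_count / window))
    return results
```

Because the step is half the window size, each base contributes to two consecutive windows, which smooths the plotted curve.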
GC content usually does not vary much within a genome, but a local region of atypical GC content is sometimes indicative of horizontally transferred genes or insertions. For example, the lower left region with low GC content in Corynebacterium glutamicum is a known large insertion region.
GC skew is the excess of C over G in a given region, formulated as (C-G)/(C+G). In bacterial genomes, replicational selection favors guanine over cytosine in leading strands, so with this formula a negative GC skew value is typically observed in leading strands and a positive value in lagging strands. (Many publications use the opposite convention, (G-C)/(G+C), under which the signs are reversed.) Because the sign changes where leading and lagging strands meet, GC skew is often used to define the positions of the replication origin and terminus in bacterial genomes. In many bacterial genome projects, position 1 in the genome flatfile corresponds to the putative replication origin.
- Lobry, J.R., 1996. Asymmetric substitution patterns in the two DNA strands of bacteria. Mol. Biol. Evol. 13, 660-665
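As a small worked illustration, the sketch below computes GC skew using the (C-G)/(C+G) form given above. The function names are hypothetical, and the window and step sizes are assumptions borrowed from the GC content graph, since the text does not state the values used for the GC skew graph.

```python
def gc_skew(seq):
    """GC skew of a sequence, using the (C - G) / (C + G) form."""
    seq = seq.upper()
    c = seq.count("C")
    g = seq.count("G")
    if c + g == 0:
        # No G or C in this region: report zero skew rather than dividing by zero.
        return 0.0
    return (c - g) / (c + g)


def gc_skew_windows(seq, window=2000, step=1000):
    """Return (start, skew) for each full window (sizes are assumed defaults)."""
    return [
        (start, gc_skew(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

With this sign convention, a G-rich window gives a negative value and a C-rich window a positive one; swapping `c - g` for `g - c` yields the other commonly used convention.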
The following is a deliberately chosen example, the genome of Clostridium perfringens, which shows an extremely biased GC skew and gene orientation.
In Genome Projector, the replication origin and terminus are predicted using cumulative GC skew at single-base resolution. For the detailed algorithm, please see the documentation of find_ori_ter in G-language GAE.
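The general idea of cumulative GC skew at single-base resolution can be sketched as below. This is not the find_ori_ter implementation, whose details are in the G-language GAE documentation; the function name is hypothetical. The sketch assumes a per-base C minus G count and a G-rich leading strand, so the running sum falls from the origin to the terminus along the direct strand and rises again afterwards. It also scans the sequence linearly from position 1.

```python
from itertools import accumulate


def cumulative_skew_extremes(seq):
    """Return (candidate_origin, candidate_terminus) as 1-based positions.

    Each C adds +1 and each G adds -1 to a running sum. Under the
    assumptions stated above, the maximum of the running sum marks the
    candidate origin and the minimum marks the candidate terminus.
    """
    steps = [1 if base == "C" else -1 if base == "G" else 0
             for base in seq.upper()]
    if not steps:
        raise ValueError("sequence is empty")
    cumulative = list(accumulate(steps))
    indices = range(len(cumulative))
    max_index = max(indices, key=cumulative.__getitem__)
    min_index = min(indices, key=cumulative.__getitem__)
    return max_index + 1, min_index + 1
```

If the opposite convention (G minus C) is used, the roles of the maximum and minimum are exchanged. Real genomes are noisy, so a production method would typically add smoothing or filtering on top of this basic scan.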
After a search, the results are shown as pins on the map and as text in the collapsible window on the right-most side. Clicking a pin or a text result entry brings up a dialogue balloon, which shows the following information:
- gene name
- product description
- Gene Ontology terms if available
- 3D structure if PDB entry was found
- Links to UniProt, KEGG, NCBI RefSeq, and PDB (if link was found)