Ph.D. course in Biological Sequence Analysis and Protein Modelling
Visualisation of DNA Structures in Complete Genomes
-or-
Three SEVEN Views of the
Escherichia coli Genome.
Part 1
- What does it all mean? Who cares?
- DNA Periodicities and Chromatin structures
- DNA Atlases for other organisms
view # 1 - "mechanical properties" of the DNA helix.
This includes the following:
Note: DNase I sensitivity (the wheel inside the annotation circle) and Position Preference (the wheel just outside the annotation circle) are models based on trinucleotide values, whilst the other 4 measures use dinucleotide models.
view # 2 - "base composition" of the DNA sequence.
This includes the following:
1. Individual base contributions - the 4 separate circles make it possible to spot regions enriched for one particular base (e.g., A near the yagG gene, and G near the phnM gene). (Maybe I need to work on a better colour scheme - this is still under development, and I'm open to any ideas!)
2. Trinucleotide distribution - this is a measure of the deviation of a particular region from the average for the entire chromosome. It is also possible to compare the region against a different genome (e.g., to compare Archaea with Bacteria).
3. AT skew and GC skew - the GC skew is calculated by the formula (G-C)/(G+C), over a window of 5000 G's (which for E. coli is roughly 10,000 bp); the AT skew is the analogous quantity, (A-T)/(A+T). In E. coli there is a clear GC skew, but not an AT skew. Different organisms have different skews, and although this could in part be explained by codon usage preference, that is probably not the entire explanation. At any rate, this is a useful measure for distinguishing the replichores in many bacteria.
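As a rough illustration of the skew formula in item 3, here is a minimal Python sketch. It is not the code used to draw the atlases. The function names `skew` and `windowed_skew` are made up for this example, the AT skew is assumed to follow the same pattern, (A-T)/(A+T), and the window is a fixed 10,000 bp rather than a fixed count of G's, which is a simplification. The sequence is treated as linear, so windows do not wrap around the origin of a circular chromosome.

```python
def skew(seq, plus, minus):
    """Return (plus - minus) / (plus + minus) for one stretch of sequence.

    For GC skew use plus="G", minus="C"; for AT skew use plus="A", minus="T".
    Returns 0.0 when neither base occurs, to avoid dividing by zero.
    """
    seq = seq.upper()
    p = seq.count(plus)
    m = seq.count(minus)
    total = p + m
    return (p - m) / total if total else 0.0


def windowed_skew(seq, plus="G", minus="C", window=10_000, step=10_000):
    """Yield (start, skew) pairs for successive windows along the sequence.

    Assumption: the window is a fixed number of base pairs, not a fixed
    number of G's as in the text above, and the sequence is linear.
    """
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        yield start, skew(seq[start:start + window], plus, minus)


if __name__ == "__main__":
    # Toy sequence: a G-rich half followed by a C-rich half.
    toy = "GGGGGGCC" * 50 + "CCCCCCGG" * 50
    for start, value in windowed_skew(toy, window=200, step=200):
        print(start, round(value, 2))
```

On a real chromosome, the position where the sign of the windowed GC skew flips is what makes the two replichores stand out in the plot.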
view # 3 - "DNA repeats"
Combined view
For more E. coli atlases (as well as other genomes), visit the "DNA Structural Atlas of E.coli" web page on our CBS server!
Escherichia coli is probably the best characterised organism.
Some numbers:
There are 4085 predicted genes in Escherichia coli strain K-12 isolate W3110.
There are 4289 predicted genes in Escherichia coli strain K-12 isolate MG1655.
There are about 5100 predicted genes in Escherichia coli strain O157:H7 isolate EDL933 (enterohemorrhagic pathogen).
Roughly 2600 genes have been found to be expressed in Escherichia coli strain K-12 cells, under standard laboratory growth conditions.
About 2100 spots can be seen on 2-D protein gels.
Very roughly 1000 different genes (only about 600 mRNA transcripts) are expressed at "detectable levels" in E. coli cells grown in LB media.
Only about 350 proteins exist at concentrations of > 100 copies per cell. (These make up 90% of the total protein in E.coli!)

So what's this good for? Who cares?
- Architectural Motifs - e.g. Telomeres - see the DNA Structural Atlas for P. falciparum, chromosome 2.
- Gene regulation
- Curved DNA - can strongly enhance transcription
- Cruciforms
- Z-DNA
- Triplex-DNA
- Parallel-stranded DNA
- Recombination
- Mutagenesis
- Evolution
- Replication
- Apoptosis
link to Cookbook for this afternoon's lecture
For those who might be interested in learning more about DNA, visit my "DNA is like Coke" web page!

Last modified on: 12 April, 2000 by Dave Ussery