A List of Completely Sequenced Genomes
The Genome of Escherichia coli
Assigning Loci to Specific Chromosomes
Chromosome Mapping
Physical Mapping of Genomes
Genetic Analysis using Physical Maps
Genome Sequencing
| Organism | Type | Size (Mbp) | Number of genes | Date sequenced |
| Haemophilus influenzae | Bacteria (Gm-) | | | |
| Mycoplasma genitalium | Bacteria (Gm-) | | | |
| Synechocystis sp. | Bacteria ("blue-green algae") | | | |
| Methanococcus jannaschii | Archaebacteria | | | |
| Mycoplasma pneumoniae | Bacteria (Gm-) | | | |
| Saccharomyces cerevisiae | Eukaryote ("baker's yeast") | | | |
| Helicobacter pylori | Bacteria (Gm-) | | | |
| Escherichia coli | Bacteria (Gm-) | | | |
| Bacillus subtilis | Bacteria (Gm+) | | | |
| Archaeoglobus fulgidus | Archaebacteria | | | |
| Borrelia burgdorferi | Bacteria (Gm-) | | | |
Organisms whose genomes were expected to be completed in 1997:
Deinococcus radiodurans (bacteria)
Plasmodium falciparum (chromosome 2, eukaryote)
Streptococcus pneumoniae (bacteria)
Treponema pallidum (bacteria)
Ureaplasma urealyticum (bacteria)
There are presently (at least) 35 other organisms (including humans) whose genomes are being sequenced.
Escherichia coli genome
some trivial numbers:
4,639,221 bp total
4288 protein coding genes:
30% "well characterised"
~30% "no known function"
Average distance between genes: 118 bp (only 70 regions >600 bp)
Protein-coding genes account for ~88% of total.
~1% stable RNAs
~1% repeats
~10% "regulatory"
The leading and lagging strands of DNA are subject to different mutation rates.
Average protein size: 317 aa
1500-1700 aa: 4 proteins
1000-1500 aa: 51 proteins
<100 aa: 381 proteins
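The figures above can be cross-checked against each other: if each of the 4288 protein-coding genes encodes 317 amino acids on average, at three bases per codon, the coding sequence should cover roughly 88% of the 4,639,221 bp genome. The following is a minimal sketch of that arithmetic. The variable and function names are illustrative, and stop codons and overlapping genes are ignored as a simplifying assumption.

```python
# Back-of-the-envelope check of the coding fraction, using only the
# numbers quoted in these notes. Names are illustrative, not from any library.

GENOME_BP = 4_639_221        # total genome length in base pairs
PROTEIN_GENES = 4288         # number of protein-coding genes
AVG_PROTEIN_AA = 317         # average protein length in amino acids
BASES_PER_CODON = 3          # one amino acid is encoded by three bases


def coding_fraction(genome_bp, n_genes, avg_aa):
    """Fraction of the genome covered by protein-coding sequence.

    Assumption: stop codons and overlapping genes are ignored.
    """
    coding_bp = n_genes * avg_aa * BASES_PER_CODON
    return coding_bp / genome_bp


fraction = coding_fraction(GENOME_BP, PROTEIN_GENES, AVG_PROTEIN_AA)
print(f"coding fraction: {fraction:.1%}")
```

The result comes out at just under 88%, which agrees with the ~88% quoted for protein-coding genes.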
~40% of ORFs are "uncharacterised"
cf.: 43% of ORFs "uncharacterised" in Haemophilus
45% of ORFs "uncharacterised" in Synechocystis
32% of ORFs "uncharacterised" in Mycoplasma
cf. transporter proteins vs. translation proteins:
| Organism | Transporter proteins | Translation proteins |
| E. coli | 427 total (281 "well defined" + 146 "putative") | 182 |
| Haemophilus | 123 | 141 |
| Mycoplasma | 34 | 101 |
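One way to read the comparison above is as a ratio of transporter to translation proteins in each genome. The sketch below computes that ratio from the counts quoted in these notes. The dictionary layout and names are illustrative assumptions, and the E. coli transporter total is taken as the sum of the 281 "well defined" and 146 "putative" entries.

```python
# Counts copied from the notes: (transporter proteins, translation proteins).
# The data structure and names are illustrative only.
counts = {
    "E. coli": (281 + 146, 182),   # 281 well defined + 146 putative = 427
    "Haemophilus": (123, 141),
    "Mycoplasma": (34, 101),
}


def transporter_to_translation(table):
    """Return {organism: transporter/translation ratio} for each entry."""
    return {name: transp / transl for name, (transp, transl) in table.items()}


ratios = transporter_to_translation(counts)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

The ratio falls from about 2.3 in E. coli to about 0.3 in Mycoplasma. In these counts the translation set shrinks by less than a factor of two, while the transporter set shrinks by more than a factor of ten.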
Very roughly (remember, ~40% unknown):
~20% energy goes to small molecule metabolism
~12% energy goes to large molecule metabolism
~20% energy goes to cell structure & processes
E. coli is most similar to Haemophilus (out of 5 genomes compared):
| Organism | Size of genome | Number of proteins | E. coli hits |
| Haemophilus influenzae | | | |
| Synechocystis sp. | | | |
| Mycoplasma genitalium | | | |
| Methanococcus jannaschii | | | |
| Saccharomyces cerevisiae | | | |
Nearly 60% of the proteins from E. coli have no match in the other organisms (or GenBank).
The largest family of proteins in E. coli is the ABC transporters: 54 characterised by Monica Riley, plus 26 new members, now 80 proteins in total.