Chapter 14
Molecular Genetics
& Biotechnology


Outline:
 Introduction to Bioinformatics

A List of Completely Sequenced Genomes

The Genome of Escherichia coli
 



 Chapter 14 from Griffiths et al. Genetics Text:

Assigning Loci to Specific Chromosomes

Chromosome Mapping

Physical Mapping of Genomes
 

Genetic Analysis using physical Maps
 

Genome sequencing
 
 


A List of Completely Sequenced Genomes...
(so far, as of 30 September, 1997) 
 
 
Organism                  Type                            Size (Mbp)  Number of genes  Date sequenced
Haemophilus influenzae    Bacteria (Gm-)                  1.83        1703             July, 1995
Mycoplasma genitalium     Bacteria (Gm-)                  0.58        470              October, 1995
Synechocystis sp.         Bacteria ("blue-green algae")   3.57        3168             May, 1996
Methanococcus jannaschii  Archaebacteria                  1.66        1738             August, 1996
Mycoplasma pneumoniae     Bacteria (Gm-)                  0.81        677              November, 1996
Saccharomyces cerevisiae  Eukaryotic ("baker's yeast")    13          5885             January, 1997
Helicobacter pylori       Bacteria (Gm-)                  1.66        1590             August, 1997
Escherichia coli          Bacteria (Gm-)                  4.60        4288             September, 1997
Bacillus subtilis         Bacteria (Gm+)                  4.20        -                "submitted"
Archaeoglobus fulgidus    Archaebacteria                  2.20        -                "submitted"
Borrelia burgdorferi      Bacteria (Gm-)                  1.30        -                "submitted"
 
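One way to compare the genomes listed above is to divide genome size by gene count, giving the average stretch of DNA per annotated gene. The following is a minimal sketch using only the sizes and gene counts from the table; the helper name bp_per_gene is made up for illustration, and the three "submitted" genomes are left out because no gene counts are given for them.

```python
# Genome size (Mbp) and number of genes, copied from the table of
# completely sequenced genomes. Entries without a gene count are omitted.
genomes = {
    "Haemophilus influenzae": (1.83, 1703),
    "Mycoplasma genitalium": (0.58, 470),
    "Synechocystis sp.": (3.57, 3168),
    "Methanococcus jannaschii": (1.66, 1738),
    "Mycoplasma pneumoniae": (0.81, 677),
    "Saccharomyces cerevisiae": (13.0, 5885),
    "Helicobacter pylori": (1.66, 1590),
    "Escherichia coli": (4.60, 4288),
}


def bp_per_gene(size_mbp, n_genes):
    """Average number of base pairs of genome per annotated gene
    (hypothetical helper, not part of any library)."""
    return size_mbp * 1_000_000 / n_genes


density = {name: bp_per_gene(*values) for name, values in genomes.items()}

# Print from the most gene-dense genome to the least gene-dense one.
for name, value in sorted(density.items(), key=lambda kv: kv[1]):
    print(f"{name:26s} {value:7.0f} bp per gene")
```

With these figures the bacteria and the archaebacterium all come out at roughly one gene per 1,000 to 1,200 bp, while yeast, the only eukaryote in the list, comes out at roughly one gene per 2,200 bp.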
 
 

Organisms whose genomes are expected to be completed in 1997:

Deinococcus radiodurans (bacteria)
Plasmodium falciparum    (chromosome # 2, eukaryote)
Streptococcus pneumoniae (bacteria)
Treponema pallidum   (bacteria)
Ureaplasma urealyticum (bacteria)
 

There are presently at least 35 other organisms (including humans) whose genomes are being sequenced.
 


 Escherichia coli   genome

some trivial numbers:

        4,639,221 bp total

               4288 protein coding genes:
                  30% "well characterised"
                 ~30% "no known function"

Average distance between genes: 118 bp;
  (only 70 regions >600 bp)

Protein coding genes account for ~88% of total.
                                       ~1% stable RNAs
                                       ~1% repeats
                                      ~10% "regulatory"
 


The leading and lagging strands of DNA are subject to different mutation rates.



Open Reading Frames  (ORFs):

        Average size - 317 aa

                        (1500-1700 aa) -  4 proteins
                        (1000-1500 aa) - 51 proteins
                             (<100 aa)  - 381 proteins

        ~40% of ORFs are "uncharacterised"

        cp. : 43% ORFs "unchar." in Haemophilus
               45% ORFs "unchar." in Synechocystis
               32% ORFs "unchar." in Mycoplasma
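The figures quoted in these notes can be cross-checked against each other: multiplying the number of protein coding genes by the average ORF size should give roughly the share of the genome that codes for protein (~88%). The sketch below does that arithmetic; it assumes 3 bp per amino acid and ignores stop codons, so it is only an approximation.

```python
# All three figures are taken from the E. coli numbers in these notes.
GENOME_BP = 4_639_221   # total genome length in bp
N_GENES = 4288          # protein coding genes
AVG_ORF_AA = 317        # average ORF size in amino acids

BP_PER_CODON = 3        # assumption: one codon per amino acid, stop codons ignored

# Estimated total length of protein coding DNA.
coding_bp = N_GENES * AVG_ORF_AA * BP_PER_CODON
coding_fraction = coding_bp / GENOME_BP

print(f"estimated coding DNA: {coding_bp:,} bp")
print(f"share of genome:      {coding_fraction:.1%}")
```

The estimate lands very close to the ~88% quoted for protein coding genes, so the gene count, the average ORF size, and the coding fraction are mutually consistent.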

cp. transporter proteins vs. translation proteins:

Organism       Transporter proteins                                  Translation proteins
E. coli        427 total (281 "well defined" + 146 "putative")       182
Haemophilus    123                                                   141
Mycoplasma     34                                                    101
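The comparison above is easier to read as a ratio of transporter proteins to translation proteins in each organism. This is a small sketch using only the counts given in these notes; the variable names are illustrative.

```python
# E. coli transporter proteins are given as two classes in the notes.
ecoli_well_defined = 281
ecoli_putative = 146
ecoli_transporters = ecoli_well_defined + ecoli_putative  # 427 in the notes

# (transporter proteins, translation proteins) per organism, from the notes.
counts = {
    "E. coli": (ecoli_transporters, 182),
    "Haemophilus": (123, 141),
    "Mycoplasma": (34, 101),
}

# Ratio above 1 means more transporter proteins than translation proteins.
ratios = {org: transport / translation
          for org, (transport, translation) in counts.items()}

for org, ratio in ratios.items():
    print(f"{org:12s} transporter : translation = {ratio:.2f}")
```

On these numbers the translation machinery shrinks far less than the transporter set as the genome gets smaller: E. coli has more than twice as many transporter proteins as translation proteins, while Mycoplasma has about a third as many.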
 

very roughly (remember, ~40% unknown):
            ~20% energy goes to small molecule metabolism
            ~12% energy goes to LARGE molecule metab.
            ~20% energy goes to cell structure & processes

E.coli is most similar to Haemophilus (out of 5 genomes compared):
 
 

Organism                  Genome size (Mbp)  Number of proteins  E. coli hits
Haemophilus influenzae    1.83               1703                1130
Synechocystis sp.         3.57               3168                675
Mycoplasma genitalium     0.58               470                 111
Methanococcus jannaschii  1.66               1738                231
Saccharomyces cerevisiae  13                 5885                254
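The hit counts in the comparison above are easier to judge as fractions. The notes do not say which denominator the "E. coli hits" column refers to, so this sketch computes both candidates: the hits as a share of the 4288 E. coli proteins, and the hits as a share of the other organism's own protein count. Both readings are assumptions made for illustration.

```python
E_COLI_PROTEINS = 4288  # protein coding genes in E. coli, from the notes

# (number of proteins, E. coli hits) per organism, from the comparison table.
table = {
    "Haemophilus influenzae": (1703, 1130),
    "Synechocystis sp.": (3168, 675),
    "Mycoplasma genitalium": (470, 111),
    "Methanococcus jannaschii": (1738, 231),
    "Saccharomyces cerevisiae": (5885, 254),
}

# Reading 1 (assumed): hits as a fraction of all E. coli proteins.
frac_of_ecoli = {org: hits / E_COLI_PROTEINS
                 for org, (_, hits) in table.items()}

# Reading 2 (assumed): hits as a fraction of the organism's own proteins.
frac_of_own = {org: hits / n_proteins
               for org, (n_proteins, hits) in table.items()}

for org in table:
    print(f"{org:26s} {frac_of_ecoli[org]:6.1%} of E. coli proteins, "
          f"{frac_of_own[org]:6.1%} of own proteins")
```

Haemophilus comes out on top under either reading, which matches the statement that E. coli is most similar to Haemophilus among the genomes compared.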
Only 16 proteins are conserved in all six organisms (!!?); these are mainly translation-associated proteins, including 7 ribosomal proteins.

Nearly 60% of the proteins from E.coli have no match in the other organisms (or GenBank).

Largest family of proteins in E.coli is ABC transporters... 54 characterised by Monica Riley, + 26 new members - now 80 total proteins.



Operons - 2584 predicted & known operons, most (68%) have one promoter, and roughly 90% are thought to be regulated by only one protein.
 



 Chapter 17 from Griffiths et al. Genetics Text:

Assigning Loci to Specific Chromosomes

Chromosome Mapping

Physical Mapping of Genomes
 

Genetic Analysis using physical Maps
 

Genome sequencing

 

