I. Schrödinger and Morse Code
In 1943, Erwin Schrödinger gave a famous series of lectures at Trinity College in Dublin, Ireland, where he speculated about the physics of biology. He proposed two main ideas, which will be discussed briefly:
1. order from order
2. order from disorder
In the former, Schrödinger postulated that the genetic material (then unknown, but thought to be protein) might be an "aperiodic solid" containing coded information, perhaps somewhat like Morse code. This idea is the basis for the "Central Dogma" of molecular biology. Here a reductionistic view has been quite successful in explaining much of biology in terms of genes.
[Figure: the flow of genetic information]
However, the "Central Dogma" has had to be revised a bit. It turns out that one CAN go back from RNA to DNA, and that RNA can also make copies of itself. It is still not possible to go from proteins back to RNA or DNA, and no mechanism has yet been demonstrated for proteins making copies of themselves.
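
The information flow described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the DNA string is treated as the coding strand, the function names (`transcribe`, `reverse_transcribe`, `translate`) are hypothetical, and the example gene is made up. The codon table is the standard genetic code.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(dna):
    """DNA -> RNA, treating `dna` as the coding strand (T becomes U)."""
    return dna.upper().replace("T", "U")


def reverse_transcribe(rna):
    """RNA -> DNA, the 'backwards' step (U becomes T)."""
    return rna.upper().replace("U", "T")


def translate(rna):
    """RNA -> protein: read codons until a stop codon or the end."""
    dna = reverse_transcribe(rna)  # the table above is keyed on DNA letters
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


gene = "ATGGCCTAA"  # made-up example: Met, Ala, stop
mrna = transcribe(gene)
print(mrna, translate(mrna), reverse_transcribe(mrna))
```

Note that there is deliberately no function going from protein back to RNA or DNA: many codons map to the same amino acid, so the sequence cannot be recovered uniquely.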

There are two ways in which Schrödinger's order from disorder also plays an important role in biology:
1. "negative entropy": an organism uses the energy obtained from burning food to offset the cost of storing information in the form of DNA, RNA, and protein sequences.
2. self-organisation: the process by which complex systems can appear spontaneously. It arises in non-linear systems far from equilibrium, and is hence difficult (if not impossible) to predict, although some aspects can be modelled.
"What is Life?", by Erwin Schrödinger (Cambridge University Press, 1944).
"What is Life? The Next Fifty Years - Speculations on the Future of Biology", edited by Michael P. Murphy and Luke A.J. O'Neill (Cambridge University Press, 1995).
"At Home in the Universe - The Search for the Laws of Self-organization and Complexity", by Stuart Kauffman (Oxford University Press, 1995).
DNA sequence as information
The DNA sequence contains several different types of information:
1. The DNA sequence can code for an amino acid sequence for proteins
2. The DNA sequence can code for an RNA sequence
   tRNA, rRNA, snRNA, telomerase RNA, other RNAs
3. The DNA sequence can code for protein binding sites
4. The DNA can code for architectural information
   intrinsic DNA curvature, nucleosome positioning
5. The DNA can code for structural / stability information
   transcription initiation, origins of replication, mutational "hot spots"
RNA sequence as information
The RNA sequence also contains several different types of information:
1. The mRNAs can contain several different levels of information:
   the amino acid sequence for proteins
   localisation signals for WHERE the protein will be made
   stability signals to determine HOW MUCH protein is made
   splice sites
   editing sites
2. The tRNAs code for the genetic code - nearly the same in all living organisms (n.b. slightly different in mitochondria)
4. The rRNAs code for the structures of ribosomes
5. Other RNA/protein complexes have important biological functions
   RNA template for the telomerase enzyme, which maintains chromosome ends (telomeres) and is implicated in cancer
   snRNAs, necessary for mRNA splicing
   snoRNAs, the small nucleolar RNAs
The PROTEIN sequence contains several different types of information:
1. The protein sequence can code for an "active site" for enzymes
2. The protein sequence can code for structural roles:
   microtubules, myosin, collagen, etc.
3. The protein sequence can code for ion channels/pumps
4. The protein sequence can code for localisation information
   within the cell, extra-cytoplasmic
5. The protein sequence can code for modification sites
III. A Few Words on the speed of DNA sequencing
In 1977, Fred Sanger and colleagues in Cambridge, England, sequenced the first complete DNA genome, that of the bacteriophage phi-X174 (5386 bp long); Sanger later won the Nobel Prize for his sequencing method. Although this was a dramatic improvement over the conventional methods, it was still very slow compared to the amount of information in a single human cell.
About a decade later, the human genome project was launched; this was an international effort, and the U.S. alone planned to pay about $200,000,000 per year for 15 years! Most of this investment was in technology to speed up sequencing, and that speed-up has in fact been realised. Within a year, it is likely that it will be possible to read the entire DNA sequence of a human cell in a few hours.
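
To get a feel for the scale involved, here is a rough back-of-the-envelope sketch in Python. It assumes a human genome of about 3,000,000,000 bp and, purely for illustration, reads "a few hours" as 3 hours; the function names are hypothetical.

```python
HUMAN_GENOME_BP = 3_000_000_000  # rough size of the human genome
PHIX174_BP = 5386                # phi-X174, sequenced in 1977
ASSUMED_HOURS = 3                # assumption: one reading of "a few hours"


def phix_equivalents(genome_bp, phix_bp=PHIX174_BP):
    """How many phi-X174-sized genomes fit into `genome_bp`."""
    return genome_bp / phix_bp


def required_rate_bp_per_second(genome_bp, hours):
    """Average read rate needed to cover `genome_bp` in `hours`."""
    return genome_bp / (hours * 3600)


print(round(phix_equivalents(HUMAN_GENOME_BP)))
print(round(required_rate_bp_per_second(HUMAN_GENOME_BP, ASSUMED_HOURS)))
```

Under these assumptions the human genome is equivalent to more than half a million phi-X174 genomes, and reading it in 3 hours would require an average rate of several hundred thousand bases per second.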


The human genome project has also had a major influence on the rest of biology, as other organisms are being sequenced as intermediate goals on the way to the ambitious end of the 3,000,000,000 bp (or so) nucleotide sequence of the human genome. In particular, the sequencing of complete bacterial genomes is revolutionizing the field of microbiology. Presently, bacterial genomes are being sequenced at a rate slightly faster than one new genome every month! As technology improves, this rate will increase. It is estimated that within the next two years, we will know the complete genomic sequence of most major pathogenic bacteria.

What is "genomics"?
genome (ˈdʒiːnəʊm). Biol. Formerly also genom (-nom). [a. G. genom (H. Winkler, Verbreitung u. Ursache d. Parthenogenesis (1920) iv. 165), irreg. f. gen gene + chromosom chromosome.]
A haploid set of chromosomes; the sum-total of the genes in such a set.
1930 Cytologia I. 14 Chromosomes from different sets (or genoms) of Triticum vulgare show affinity toward each other.
1930 [see allopolyploidy].
1932 Proc. 6th Int. Congr. Genetics I. 275 The inviability of deficient genomes in the haploid generation serves to some extent as an alternative distinction between mutation and deficiency.
1932 Proc. 6th Int. Congr. Genetics II. 5 There are two species having genoms resembling C. neglecta.
1952 C. P. Blacker Eugenics x. 243 The appearance of such terms as gene-complex and genome (denoting a set of chromosomes as a working unity) testify to the movement towards holism in genetics.
1965 A. M. Srb et al. Gen. Genetics (ed. 2) vii. 190 Among organisms with chromosomes, each species has a characteristic set of genes, or genome. In diploids a genome is found in each normal gamete. It consists of a full set of the different kinds of chromosomes.
1970 Sci. Amer. Oct. 19/1 The human genome..consists of perhaps as many as 10 million genes.
Bacteriophage lambda has a genome of about 50,000 bp. If you were to print the entire sequence out, with roughly 25,000 bp per page, it would take about 2 pages. (The sequence would be in a very small font, and you could barely read it!)
The common bacterium Escherichia coli is perhaps the best-studied organism in all of biology. However, when the complete genomic sequence of E. coli was published about a year ago, many were surprised that only about a third of the proteins had been well-characterised. Another third was perhaps known about, based on DNA sequence analysis, but the remaining third of potential proteins was not expected.
The yeast Saccharomyces cerevisiae was the first (and so far the only) eukaryote to be sequenced. The yeast genome would occupy a thin volume of about 500 pages, or roughly twice the thickness of the E. coli volume. I should mention that recent genetic analysis of the complete yeast genome has found that it has likely arisen from a duplication event - that is, yeast came from a more primitive organism which contained only half the number of chromosomes.
The first "animal" to be sequenced is likely to be the nematode C. elegans, whose genome is about 100,000,000 bp long. The plant Arabidopsis thaliana is also being sequenced, and its genome is about the same size. Both of these genomes are likely to be completed within the next year or so.
FINALLY, the human genome, by comparison, is quite large. Using the same analogy as above, the human genome would fill 80 volumes!
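
The printed-page analogy is simple arithmetic, and a short Python sketch makes it easy to repeat for any genome. It uses the roughly 25,000 bp per page quoted above and assumes a human genome of about 3,000,000,000 bp; the function names are hypothetical.

```python
import math

BP_PER_PAGE = 25_000  # "roughly 25,000 bp per page"


def pages_needed(genome_bp, bp_per_page=BP_PER_PAGE):
    """Number of printed pages needed for a genome of `genome_bp` base pairs."""
    return math.ceil(genome_bp / bp_per_page)


def implied_genome_bp(pages, bp_per_page=BP_PER_PAGE):
    """Genome size implied by a given page count."""
    return pages * bp_per_page


lambda_pages = pages_needed(50_000)          # bacteriophage lambda
human_pages = pages_needed(3_000_000_000)    # human genome (assumed size)
yeast_bp = implied_genome_bp(500)            # yeast: "about 500 pages"
pages_per_volume = human_pages / 80          # "80 volumes"

print(lambda_pages, human_pages, yeast_bp, pages_per_volume)
```

With these figures, 80 volumes for the human genome works out to about 1,500 pages per volume, so the human "volumes" in the analogy are considerably thicker than the 500-page yeast volume.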
A List of Genomes that have been Completely Sequenced:
(so far, as of 1 Sept., 1998)
| Organism | # | Type | Size (Mbp) | Number of genes |
| Haemophilus influenzae | 1 | Bacteria (Gm-) | | |
| Mycoplasma genitalium | 2 | Bacteria (Gm-) | | |
| Saccharomyces cerevisiae | 3 | Eukaryotic ("baker's yeast") | | |
| Methanococcus jannaschii | 4 | Archaebacteria | | |
| Synechocystis sp. | 5 | Bacteria ("blue-green algae") | | |
| Mycoplasma pneumoniae | 6 | Bacteria (Gm-) | | |
| Escherichia coli | 7 | Bacteria (Gm-) | | |
| Methanobacterium thermoautotrophicum | 8 | Archaebacteria | | |
| Archaeoglobus fulgidus | 9 | Archaebacteria | | |
| Helicobacter pylori | 10 | Bacteria (Gm-) | | |
| Borrelia burgdorferi | 11 | Bacteria (Gm-) | | |
| Treponema pallidum | 12 | Bacteria (Gm-) | | |
| Bacillus subtilis | 13 | Bacteria (Gm+) | | |
| Pyrococcus horikoshii | 14 | Archaebacteria | | |
| Aquifex aeolicus | 15 | Eubacteria | | |
| Mycobacterium tuberculosis | 16 | Bacteria (Gm+) | | |
| Treponema pallidum | 17 | Bacteria (Gm-) | | |

| Bacillus sp. C-125 | | | | |
| Pseudomonas aeruginosa | | | | |
| Pyrobaculum aerophilum | | | | |
| Pyrococcus abyssi | | | | |
| Rickettsia prowazekii | | | | |
| Ureaplasma urealyticum | | | | |
| Deinococcus radiodurans | | | | |

| Rhodobacter capsulatus | | | | |
| Streptococcus | | | | |
| Thermotoga maritima | | | | |
| Ureaplasma urealyticum | 12 | | | |
| Vibrio cholerae | 13 | | | |
A list of microbial genomes which are being sequenced and are presently searchable through TIGR:
This is based in part on data from The Institute for Genomic Research (TIGR).
Information also came from the Magpie Genome Sequencing Project list.
There are presently more than 100 organisms (including humans) whose genomes are in the process of being sequenced.

980830 du