Chapter 10a
Gene Expression:
Transcription
 

 

Brief Outline
 
1. The flow of Genetic Information
2. Synthesizing Proteins from the Instructions of DNA
3. The Genetic Code
4. RNA: Intermediary in Protein Synthesis
 

 

1. The flow of Genetic Information:
 

DNA -> RNA -> protein

How does the sequence of a strand of DNA correspond to the amino acid sequence of a protein? This concept is explained by
The Central Dogma of Molecular Biology:

 

The Relationship between Genes and Proteins
 

Sense Strand figure
 
 

Shown below is an illustration of the flow of information from DNA to RNA to protein, which forms the backbone of molecular biology.

Central Dogma of Molecular Biology


Or in the words of Francis Crick:
Once information has passed into protein, it cannot get out again.
 
This was taken from Genentech's homepage:
Link to Genentech's Access Excellence site
 
 

However, the "Central Dogma" has had to be revised a bit. It turns out that information CAN go back from RNA to DNA (reverse transcription), and that RNA can also be copied into RNA. It is still not possible to go from proteins back to RNA or DNA, and no mechanism has yet been demonstrated by which proteins make copies of themselves.
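
The revised scheme can be written down as a small set of allowed information transfers. The sketch below is only an illustration of that idea; the names ALLOWED_TRANSFERS and is_allowed are made up here and do not come from any library.

```python
# Information transfers in the revised Central Dogma described above.
# All names here are illustrative.
ALLOWED_TRANSFERS = {
    ("DNA", "DNA"),      # replication
    ("DNA", "RNA"),      # transcription
    ("RNA", "protein"),  # translation
    ("RNA", "DNA"),      # reverse transcription
    ("RNA", "RNA"),      # RNA copied into RNA
}


def is_allowed(source: str, target: str) -> bool:
    """Return True if information is known to flow from source to target."""
    return (source, target) in ALLOWED_TRANSFERS


print(is_allowed("RNA", "DNA"))      # True
print(is_allowed("protein", "RNA"))  # False
```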

 
 
New (revised a bit) Central Dogma
 
 
 


 

Try it for yourself on the "DNA Workshop" (from PBS).
 

Click HERE for a link to a nice historical review of The Central Dogma.
 
Link to MIT Hypertext on Central Dogma
 


 

2. Synthesizing Proteins from the Instructions of DNA

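DNA -> RNA -> protein
In a prokaryotic cell, transcription and translation happen at the same time and in the same compartment:
Prokaryotic transcription
However, in a eukaryotic cell, transcription and translation occur in different places: transcription in the nucleus, and translation in the cytoplasm.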
 
 
 
 
3. The Genetic Code
Figure 10_36 from Hartl & Jones, 1998

Figure 10_37

The Genetic Code
 
4. RNA: Intermediary in Protein Synthesis

Why would the cell want to have an intermediate between DNA and the proteins it encodes?
 

  • The DNA can then stay pristine and protected, away from the caustic chemistry of the cytoplasm.

  • Gene information can be amplified by having many copies of an RNA made from one copy of DNA.

  • Regulation of gene expression can be effected by having specific controls at each element of the pathway between DNA and proteins. The more elements there are in the pathway, the more opportunities there are to control it in different circumstances.

What is RNA?

    RNA has the same primary structure as DNA: a sugar-phosphate backbone, with a base attached to the 1' carbon of each sugar. The differences between DNA and RNA are that:
     
      1. RNA has a hydroxyl group on the 2' carbon of the sugar (thus, the difference between deoxyribonucleic acid and ribonucleic acid).

       
      2. Instead of the base thymine, RNA uses another base called uracil:

       
      Because of the extra hydroxyl group on the sugar, RNA does not adopt the B-form double helix typical of DNA, and most RNA exists as a single-stranded molecule. However, regions of double helix can form where there is some base-pair complementarity (U with A, G with C), resulting in hairpin loops. An RNA molecule with hairpin loops is said to have a secondary structure.
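
A hairpin forms when the two ends of a stretch of RNA are complementary and can fold back on each other. The following is a minimal sketch of that check, assuming only A-U and G-C pairs and a hypothetical minimum loop size of 3 bases; the function name stem_length and the example sequences are made up for illustration.

```python
# Watson-Crick pairs only (A-U and G-C), as in the text above.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def stem_length(rna: str, min_loop: int = 3) -> int:
    """Count base pairs formed by folding the two ends of rna onto each
    other, leaving at least min_loop unpaired bases in the loop."""
    i, j = 0, len(rna) - 1
    pairs = 0
    while j - i - 1 >= min_loop and (rna[i], rna[j]) in PAIRS:
        pairs += 1
        i += 1
        j -= 1
    return pairs


print(stem_length("GGACUUCGGUCC"))  # 4: GGAC pairs with GUCC around a UUCG loop
print(stem_length("AAAAAAAAAAAA"))  # 0: the ends are not complementary
```

Real structure prediction also weighs pairing energies and allows bulges and internal loops, which this sketch ignores.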

       

       

        In addition, because the RNA molecule is not restricted to a rigid double helix, it can form many different tertiary structures. Each RNA molecule, depending on the sequence of its bases, can fold into a stable three-dimensional structure.

       

       From http://motif.stanford.edu/thesis/tRNA.html.

       

     
     
    Figure 13-11 from Griffiths et al. Table 13_2 from Griffiths et al., 1996
     

     
    Figure 1_07 from Hartl & Jones, 1998
     

     The Genetic Code

    How does an mRNA specify amino acid sequence? The answer lies in the genetic code. It would be impossible for each amino acid to be specified by one nucleotide, because there are only 4 nucleotides and 20 amino acids. Similarly, two-nucleotide combinations could only specify 16 amino acids. The conclusion is that each amino acid is specified by a particular combination of three nucleotides (4 x 4 x 4 = 64 possible combinations), called a codon:

     

     Note the degeneracy of the genetic code: an amino acid may be specified by as many as six codons. It is also interesting to note that different organisms have different frequencies of codon usage. A giraffe might use CGC for arginine much more often than CGA, and the reverse might be true for a sperm whale. Another interesting point is that some species deviate from the code shown above and assign certain codons to different amino acids. In general, however, the code depicted can be relied upon.
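
The counting argument and the degeneracy of the code can both be checked with a few lines of Python. This sketch builds the standard genetic code from its usual compact string form; the variable names are chosen here for illustration.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in the order
# UUU, UUC, UUA, UUG, UCU, ... GGG. "*" marks a stop codon.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of distinct "words" of length 1, 2 and 3 over a 4-letter alphabet.
print([4 ** n for n in (1, 2, 3)])  # [4, 16, 64]

# Degeneracy: how many codons specify each amino acid (or stop).
degeneracy = Counter(CODON_TABLE.values())
print(degeneracy["R"], degeneracy["M"], degeneracy["*"])  # 6 1 3
```

Arginine (R) has six codons, including the CGC and CGA mentioned above, while methionine (M) has only one.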

     How does a tRNA recognize the codon to which it should bring its amino acid? The tRNA has an anticodon on its mRNA-binding end that is complementary to the codon on the mRNA. Each tRNA carries only the amino acid appropriate for its anticodon.
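
Because codon and anticodon pair in antiparallel orientation, the anticodon is the reverse complement of the codon. A minimal sketch, assuming perfect Watson-Crick pairing (it ignores wobble pairing and modified bases); the function name anticodon is illustrative.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon: str) -> str:
    """Return the perfectly complementary anticodon, written 5' -> 3'."""
    # Pairing is antiparallel, so complement each base and reverse the order.
    return "".join(COMPLEMENT[base] for base in reversed(codon))


print(anticodon("AUG"))  # CAU
```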

     

     From http://motif.stanford.edu/thesis/tRNA.html.

     

    hyperbio@mit.edu

      Central Dogma, Part 1: Transcription

    link to Kimball biology page.
     

     

    How does the sequence information from DNA get transferred to mRNA so that it can be carried to the ribosomes in the cytoplasm? This process, called transcription, is highly analogous to DNA replication. Of course, different effectors, or proteins, direct transcription. Primary among these is the RNA polymerase holoenzyme, a complex of several subunits that together direct the synthesis of mRNA on a DNA template.

      

     

    Transcription (like any polymerisation process) is divided into three parts:

    1. Initiation of Transcription

    RNA polymerase must be able to recognize the beginning of a gene so that it knows where to start synthesizing an mRNA. It is directed to the start site of transcription by the affinity of one of its subunits (the sigma factor, in bacteria) for a particular DNA sequence that appears at the beginning of genes. This sequence is called a promoter. It is a directional sequence that tells the RNA polymerase both where to start and in which direction (that is, on which strand) to continue synthesis. The bacterial promoter almost always contains some version of the following elements:

     

    The two sequences shown in red are known as the "-35" (TTGACA) and "-10" (TATAAT) sites, based on their positions relative to the start of transcription. These two sequences represent the CONSENSUS, based on comparison of several different sequences aligned at the transcription start site. Another way of representing this consensus is by applying information theory to sequence analysis. One currently used method is "sequence logos", which is based on "Shannon information" (for those of you who are interested, see Schneider, T.D., Stephens, R.M., "Sequence logos: a new way to display consensus sequences", Nucleic Acids Research, 18:6097-6100, (1990)). The sequence logo, based on the promoter region of 167 different genes (aligned by their transcriptional start site), is shown below:
    Sigma 70 consensus
     

    The sequence logo for the "TATA" box of 60 human promoters (in eukaryotes this box lies roughly 25 to 30 bp upstream of the start site, rather than at -10), aligned on the TATA box, is shown below:
    Anders TATA
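
The idea behind a consensus and a sequence logo can be sketched in a few lines. In a logo, the height of each column is its information content, which for DNA is 2 bits minus the Shannon entropy of the base frequencies in that column. The example below is a simplified sketch: it applies no small-sample correction, the function names are illustrative, and the four aligned sites are made up rather than taken from the 167-gene data set.

```python
import math
from collections import Counter


def consensus(sites: list[str]) -> str:
    """Most frequent base in each column of equal-length aligned sites."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*sites))


def information_bits(sites: list[str]) -> list[float]:
    """Information per column: 2 - H bits, with H the Shannon entropy.
    No small-sample correction is applied."""
    bits = []
    for col in zip(*sites):
        n = len(col)
        entropy = -sum((c / n) * math.log2(c / n) for c in Counter(col).values())
        bits.append(2.0 - entropy)
    return bits


# Made-up aligned -10 sites, for illustration only.
toy_sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
print(consensus(toy_sites))  # TATAAT
print([round(b, 2) for b in information_bits(toy_sites)])
```

A fully conserved column carries the maximum of 2 bits; columns where the sites disagree carry less, and are drawn shorter in the logo.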
     

     
    2. Elongation of Transcription

    The RNA polymerase then stretches open the double helix at that point in the DNA and begins synthesis of an RNA strand complementary to one of the strands of DNA. We call the strand from which it copies the antisense or template strand, and the other strand, to which it is identical, the sense or coding strand.

     Transcription - start

     The RNA polymerase recruits rNTPs (ribonucleoside triphosphates) in the same way that DNA polymerase recruits dNTPs. However, since only one strand is copied, and it is synthesized continuously in the 5' to 3' direction, there is no need for Okazaki fragments.

     It is important to note that synthesis once again proceeds in a unidirectional fashion, because of the reasons outlined in the previous section.
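
The relationship between the template strand, the coding strand and the RNA can be sketched as follows. The template is written 3' -> 5' here so that it lines up base by base with the RNA, which grows 5' -> 3'; the function name and the six-base example are made up for illustration.

```python
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """RNA (5' -> 3') made from a template strand written 3' -> 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)


coding = "ATGGCC"    # sense strand, written 5' -> 3'
template = "TACCGG"  # antisense strand, written 3' -> 5'

rna = transcribe(template)
print(rna)                              # AUGGCC
print(rna == coding.replace("T", "U"))  # True
```

The RNA is complementary to the template strand and identical to the coding strand, except that U replaces T.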

     

    3. Termination of Transcription

    How does RNA polymerase know when to stop transcribing a gene? This system is best understood in prokaryotes. In the most common mechanism (intrinsic, or Rho-independent, termination), the sequence at the end of a gene allows the newly made RNA to fold into a GC-rich hairpin loop, which is followed by a run of U residues. The hairpin causes the RNA polymerase to pause, and the weak U-A base pairs holding the RNA to the template let the transcript come free; the RNA polymerase then falls off the DNA and transcription ceases. A second mechanism relies on a protein called Rho, which moves along the RNA and releases it from a paused polymerase. It is also important to know that since there is no nucleus in prokaryotes, ribosomes can begin making protein from an mRNA while it is still being synthesized.





    Gene Expression: Transcription

    The majority of genes are expressed as the proteins they encode. The process occurs in two steps: transcription (DNA -> RNA) and translation (RNA -> protein). Taken together, they make up the "central dogma" of biology: DNA -> RNA -> protein.

    This page examines the first step:

    Gene Transcription: DNA -> RNA

    DNA serves as the template for the synthesis of RNA much as it does for its own replication.

    The Steps

    Note that at any place in a DNA molecule, either strand may be serving as the template; that is, some genes "run" one way, some the other (and in a few remarkable cases, the same segment of double helix contains genetic information on both strands!). In all cases, however, RNA polymerase reads the template strand in its 3' -> 5' direction, so the RNA itself grows in the 5' -> 3' direction.
     
     

    Types of RNA

    Several types of RNA are synthesized:

    Ribosomal RNA (rRNA)

    There are 4 kinds. In eukaryotes, these are 18S rRNA, 28S rRNA, 5.8S rRNA, and 5S rRNA. The name given each type of rRNA reflects the rate at which the molecules sediment in the ultracentrifuge. The larger the number, the larger the molecule (but not proportionally).

    The 28S, 18S, and 5.8S molecules are produced by the processing of a single primary transcript from a cluster of identical copies of a single gene. The 5S molecules are produced from a different cluster of identical genes.

    Transfer RNA (tRNA)

    There are some 32 different kinds of tRNA in a typical eukaryotic cell.

    Messenger RNA (mRNA)

    Messenger RNA comes in a wide range of sizes reflecting the size of the polypeptide it encodes. Most cells produce small amounts of thousands of different mRNA molecules, each to be translated into a peptide needed by the cell. Many mRNAs are common to most cells, encoding "housekeeping" proteins needed by all cells (e.g. the enzymes of glycolysis). Other mRNAs are specific for only certain types of cells. These encode proteins needed for the function of that particular cell (e.g., the mRNA for hemoglobin in the precursors of red blood cells).

    Small Nuclear RNA (snRNA)

    Approximately a dozen different genes for snRNAs, each present in multiple copies, have been identified. The snRNAs have various roles in the processing of the other classes of RNA. For example, several snRNAs are part of the spliceosome that participates in converting pre-mRNA into mRNA by excising the introns and splicing the exons.

    The RNA polymerases

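    The RNA polymerases are huge multi-subunit protein complexes. Three kinds are found in eukaryotes: RNA polymerase I, which transcribes the genes for the precursor of the 28S, 18S, and 5.8S rRNAs; RNA polymerase II, which transcribes the protein-coding genes into pre-mRNA, as well as most snRNA genes; and RNA polymerase III, which transcribes the 5S rRNA genes and the tRNA genes.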

    RNA Processing: pre-mRNA -> mRNA

    All the primary transcripts produced in the nucleus must undergo processing steps to produce functional RNA molecules for export to the cytosol. We shall confine ourselves to a view of the steps as they occur in the processing of pre-mRNA to mRNA.

    Split Genes

    Most eukaryotic genes are split into segments. In decoding the open reading frame of a gene for a known protein, one usually encounters periodic stretches of DNA calling for amino acids that do not occur in the actual protein product of that gene. Such stretches of DNA, which get transcribed into RNA but not translated into protein, are called introns. Those stretches of DNA that do code for amino acids in the protein are called exons.

    The cutting and splicing of mRNA must be done with great precision. If even one nucleotide is left over from an intron or one is removed from an exon, the reading frame from that point on will be shifted, producing new codons specifying a totally different sequence of amino acids from that point to the end of the molecule (which often ends prematurely anyway when the shifted reading frame generates a STOP codon).
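
The effect of a single leftover nucleotide can be seen by translating a short message before and after the error. This is a minimal sketch using the standard genetic code; the 15-base mRNA and the function name translate are made up for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UUU, UUC, UUA, UUG, UCU, ... order; "*" is stop.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


correct = "AUGGCUGAAUGGUAA"          # AUG GCU GAA UGG UAA
shifted = "AUG" + "G" + correct[3:]  # one intron nucleotide left behind

print(translate(correct))  # MAEW
print(translate(shifted))  # MG
```

With one extra G after the first codon, the frame shifts to AUG GGC UGA, and the new UGA stop codon ends the protein after only two amino acids.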

    The removal of introns and splicing of exons is done with the spliceosome. This is a complex of several snRNA molecules and several proteins. The introns in most pre-mRNAs begin with a GU and end with an AG. Presumably these short sequences are essential for guiding the spliceosome.
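
The GU...AG rule can be illustrated with a toy splicing function. This is only a sketch: the intron coordinates are supplied by hand rather than predicted, real splice-site recognition depends on more sequence than these two dinucleotides, and the function name and sequences are made up.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as (start, end) slices, with end exclusive.
    Each intron must begin with GU and end with AG."""
    exons = []
    previous_end = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} does not follow the GU...AG rule")
        exons.append(pre_mrna[previous_end:start])
        previous_end = end
    exons.append(pre_mrna[previous_end:])
    return "".join(exons)


exon1, intron, exon2 = "AUGGCU", "GUAAGUUUCAG", "GAAUGGUAA"
pre_mrna = exon1 + intron + exon2
print(splice(pre_mrna, [(6, 17)]))  # AUGGCUGAAUGGUAA
```

Shifting either boundary by one base makes the check fail, which mirrors the precision that splicing requires.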

    Alternate Splicing

    The processing of pre-mRNA for many proteins proceeds along various paths in different cells or under different conditions. For example, early in the differentiation of a B cell (a lymphocyte that synthesizes an antibody) the cell first uses an exon that encodes a transmembrane domain that causes the molecule to be retained at the cell surface. Later, the B cell switches to using a different exon whose domain enables the protein to be secreted from the cell as a circulating antibody molecule.

    So, whether a particular segment of RNA will be retained as an exon or excised as an intron can vary under different circumstances. Clearly the switching to an alternate splicing pathway must be closely regulated.
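
Alternative splicing amounts to joining different subsets of exons from the same gene. In the sketch below, the exon names and sequences are hypothetical stand-ins for the antibody example above, not real antibody sequences.

```python
# Hypothetical exon names and sequences, for illustration only.
EXONS = {
    "variable": "AUGGCU",
    "constant": "GAAUGG",
    "membrane": "CUGCUG",  # stands in for the transmembrane-domain exon
    "secreted": "AAAUAA",  # stands in for the secreted-tail exon
}


def assemble(exon_names: list[str]) -> str:
    """Join the chosen exons, in order, into one mature mRNA."""
    return "".join(EXONS[name] for name in exon_names)


membrane_form = assemble(["variable", "constant", "membrane"])
secreted_form = assemble(["variable", "constant", "secreted"])
print(membrane_form)  # AUGGCUGAAUGGCUGCUG
print(secreted_form)  # AUGGCUGAAUGGAAAUAA
```

The two messages share everything except the final exon, so the two proteins differ only in the domain that decides whether the molecule stays in the membrane or is secreted.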

    Why split genes?

    Perhaps during evolution, eukaryotic genes have been assembled from smaller, primitive genes - today's exons. Some proteins, like the antibodies mentioned in the previous section, are organized in a set of separate sections or domains each with a special function to perform in the complete molecule. Each domain is encoded by a separate exon. Having the different functional parts of the antibody molecule encoded by separate exons makes it possible to use these units in different combinations. Thus a set of exons in the genome may be the genetic equivalent of the various modular pieces in a box of "Lego" for children to assemble in whatever forms they wish.

    But the boundaries of other exons do not seem to correspond to domain boundaries of the protein. Furthermore, rRNA and tRNA genes are also split, and these do not encode proteins. So perhaps some introns are simply "junk" DNA that was inserted into the gene at some point in evolution without causing any harm.

    Summary

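    Gene expression occurs in two steps: transcription of the DNA into RNA (the subject of this page), followed by translation of the mRNA into protein.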






    Last modified on: 4 February, 2000 by Dave Ussery