
Usage instructions



Quick guide

Start by pasting in (or uploading) your DNA sequences in FASTA (sequence only), GenBank (CDS elements will be extracted) or TAB (sequence + annotation) format. Free format: if you have a single sequence you want to translate, you can simply paste it in. In this case all non-alphabetic characters (such as numbers) are ignored, making it easy to copy and paste the sequence from most other data formats.
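
The free-format clean-up described above can be sketched in a few lines of Python. This is only an illustration, not the server's actual code: the function name `clean_free_format` is hypothetical, and upper-casing the result is an assumption made here for readability.

```python
def clean_free_format(text: str) -> str:
    """Drop every non-alphabetic character (digits, spaces, line breaks).

    Upper-casing is an assumption of this sketch, not documented behaviour.
    """
    return "".join(ch for ch in text if ch.isalpha()).upper()


# A fragment copied from a GenBank-style ORIGIN block, numbers included.
raw = "  1 atggtgctgt ctgccgccga\n 21 caagggc"
print(clean_free_format(raw))  # ATGGTGCTGTCTGCCGCCGACAAGGGC
```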

Hit "Submit query" to run the translation using the Standard Genetic Code and default parameters.


Options

Translation table

This is the single most important option in the Virtual Ribosome, since this is where the translation table is selected. All translation tables defined by the NCBI taxonomy group can be selected (see details here: The Genetic Codes [NCBI]). Please notice that the alternative start codons defined in each translation table are also used. For example, in the Standard Genetic Code (see table below), the codons TTG and CTG are allowed as methionine-coding start-codons.

Standard Genetic code

    AAs  = FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = ---M---------------M---------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG
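
The five rows of the table are read column by column: each column gives one codon (Base1 + Base2 + Base3), its amino acid (AAs) and whether it may act as a start codon (Starts). A minimal Python sketch of that reading follows; the variable names are illustrative assumptions and are not taken from the Virtual Ribosome source.

```python
# The five rows of the Standard Genetic Code table, copied from above.
AAS    = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STARTS = "---M---------------M---------------M----------------------------"
BASE1  = "TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG"
BASE2  = "TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG"
BASE3  = "TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG"

codon_to_aa = {}      # codon -> one-letter amino acid ("*" = stop)
start_codons = set()  # codons flagged with "M" in the Starts row

# Walk the table one column at a time.
for aa, start, b1, b2, b3 in zip(AAS, STARTS, BASE1, BASE2, BASE3):
    codon = b1 + b2 + b3
    codon_to_aa[codon] = aa
    if start == "M":
        start_codons.add(codon)

print(codon_to_aa["ATG"], codon_to_aa["TAA"])  # M *
print(sorted(start_codons))                    # ['ATG', 'CTG', 'TTG']
```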

Start codons

This option is closely related to the alternative start-codons mentioned above. By default, the very first codon in the DNA sequence is considered to be the start-codon. This means that the special rules for alternative start-codons apply and (for example) the codon TTG will code for methionine rather than leucine, which it codes for when used internally.

By selecting the "All codons are internal" option all of the sequence is considered internal and start-codon rules are not applied (useful for working with sequence fragments).

Stop codons

This option determines whether the translation should be terminated at the first stop-codon encountered. The default is to read through the entire sequence, marking stop-codons with "*".
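
The start-codon and stop-codon options can be illustrated together. The Python sketch below is a simplified stand-in for the real translation routine: the function `translate` and its parameters `first_is_start` and `stop_at_first` are hypothetical names, codons not found in the table are rendered as "X" by assumption, and it is assumed that the terminating stop-codon is still shown as "*".

```python
from itertools import product

# Standard Genetic Code rows; codons are enumerated in the same
# T, C, A, G order as the Base1/Base2/Base3 rows of the table.
AAS    = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STARTS = "---M---------------M---------------M----------------------------"
CODONS = ["".join(c) for c in product("TCAG", repeat=3)]
AA = dict(zip(CODONS, AAS))
START = {c for c, s in zip(CODONS, STARTS) if s == "M"}


def translate(dna, first_is_start=True, stop_at_first=False):
    """Translate frame 1 of `dna` with the Standard Genetic Code."""
    dna = dna.upper()
    pep = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        aa = AA.get(codon, "X")  # assumption: unknown codons become "X"
        # Start-codon rule: an alternative start codon in first position
        # is read as methionine.
        if i == 0 and first_is_start and codon in START:
            aa = "M"
        pep.append(aa)
        if aa == "*" and stop_at_first:
            break
    return "".join(pep)


print(translate("TTGTTGTAAGGC"))                        # ML*G
print(translate("TTGTTGTAAGGC", first_is_start=False))  # LL*G
print(translate("TTGTTGTAAGGC", stop_at_first=True))    # ML*
```

Setting `first_is_start=False` corresponds to the "All codons are internal" option.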

Reading frame

This option governs the reading frame to use for translation. It's possible to select either a single reading frame (1, 2, 3 on the plus strand and -1, -2, -3 on the minus strand), or a set of multiple reading frames ("all" = all 6; "plus" = 1, 2, 3; "minus" = -1, -2, -3).
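
As a rough illustration of how the six frames relate to the input, the Python sketch below returns the stretch of DNA that would be translated for a given frame. The names `frame_sequence` and `FRAME_SETS` are hypothetical, and the minus-strand convention used here (frame -1 starts at the last base of the input, read on the reverse complement) is an assumption rather than documented behaviour.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# The multi-frame choices described above.
FRAME_SETS = {
    "all": [1, 2, 3, -1, -2, -3],
    "plus": [1, 2, 3],
    "minus": [-1, -2, -3],
}


def reverse_complement(dna):
    """Return the reverse complement of an upper-cased DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def frame_sequence(dna, frame):
    """Return the whole codons read in `frame` (1..3 or -1..-3)."""
    if frame not in FRAME_SETS["all"]:
        raise ValueError("frame must be one of 1, 2, 3, -1, -2, -3")
    seq = dna.upper() if frame > 0 else reverse_complement(dna)
    seq = seq[abs(frame) - 1:]            # skip 0, 1 or 2 leading bases
    return seq[:len(seq) - len(seq) % 3]  # drop an incomplete last codon


for frame in FRAME_SETS["all"]:
    print(frame, frame_sequence("ATGGTGCTGT", frame))
```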

When a single reading frame is selected, the output is shown for this frame only. For example:

VIRTUAL RIBOSOME
----------------
Translation table: Standard SGC0 

>Seq1
Reading frame: 1

    M  V  L  S  A  A  D  K  G  N  V  K  A  A  W  G  K  V  G  G  H  A  A  E  Y  G  A  E  A  L  
5' ATGGTGCTGTCTGCCGCCGACAAGGGCAATGTCAAGGCCGCCTGGGGCAAGGTTGGCGGCCACGCTGCAGAGTATGGCGCAGAGGCCCTG 90
   >>>...)))..............................................................................))) 

    E  R  M  F  L  S  F  P  T  T  K  T  Y  F  P  H  F  D  L  S  H  G  S  A  Q  V  K  G  H  G  
5' GAGAGGATGTTCCTGAGCTTCCCCACCACCAAGACCTACTTCCCCCACTTCGACCTGAGCCACGGCTCCGCGCAGGTCAAGGGCCACGGC 180
   ......>>>...))).......................................)))................................. 

    A  K  V  A  A  A  L  T  K  A  V  E  H  L  D  D  L  P  G  A  L  S  E  L  S  D  L  H  A  H  
5' GCGAAGGTGGCCGCCGCGCTGACCAAAGCGGTGGAACACCTGGACGACCTGCCCGGTGCCCTGTCTGAACTGAGTGACCTGCACGCTCAC 270
   ..................)))..................)))......))).........)))......)))......)))......... 

    K  L  R  V  D  P  V  N  F  K  L  L  S  H  S  L  L  V  T  L  A  S  H  L  P  S  D  F  T  P  
5' AAGCTGCGTGTGGACCCGGTCAACTTCAAGCTTCTGAGCCACTCCCTGCTGGTGACCCTGGCCTCCCACCTCCCCAGTGATTTCACCCCC 360
   ...)))...........................))).........))))))......))).............................. 

    A  V  H  A  S  L  D  K  F  L  A  N  V  S  T  V  L  T  S  K  Y  R  *  
5' GCGGTCCACGCCTCCCTGGACAAGTTCTTGGCCAACGTGAGCACCGTGCTGACCTCCAAATACCGTTAA 429
   ...............))).........)))..................)))...............*** 
Annotation key:
>>> : START codon (strict)
))) : START codon (alternative)
*** : STOP

When multiple reading frames are selected the peptides are "stacked" in the visualization, as seen in the example below. Notice how the START codon "arrows" are reversed on the minus strand to indicate the direction of translation.

VIRTUAL RIBOSOME
----------------
Translation table: Standard SGC0 

>Seq1 - reading frame(s): all

      G  A  V  C  R  R  Q  G  Q  C  Q  G  R  L  G  Q  G  W  R  P  R  C  R  V  W  R  R  G  P   
     W  C  C  L  P  P  T  R  A  M  S  R  P  P  G  A  R  L  A  A  T  L  Q  S  M  A  Q  R  P  W 
    M  V  L  S  A  A  D  K  G  N  V  K  A  A  W  G  K  V  G  G  H  A  A  E  Y  G  A  E  A  L  
5' ATGGTGCTGTCTGCCGCCGACAAGGGCAATGTCAAGGCCGCCTGGGGCAAGGTTGGCGGCCACGCTGCAGAGTATGGCGCAGAGGCCCTG 90
   >>>...))).)))...............>>>..........)))........))).........)))......>>>...........))) 
   ....................(((...(((..***(............(((.................(((.........(((........ 
3' TACCACGACAGACGGCGGCTGTTCCCGTTACAGTTCCGGCGGACCCCGTTCCAACCGCCGGTGCGACGTCTCATACCGCGTCTCCGGGAC 90
    H  H  Q  R  G  G  V  L  A  I  D  L  G  G  P  A  L  N  A  A  V  S  C  L  I  A  C  L  G  Q  
      T  S  D  A  A  S  L  P  L  T  L  A  A  Q  P  L  T  P  P  W  A  A  S  Y  P  A  S  A  R   
     P  A  T  Q  R  R  C  P  C  H  *  P  R  R  P  C  P  Q  R  G  R  Q  L  T  H  R  L  P  G  P 

Annotation key:
PLUS strand
-----------
>>> : START codon (strict)
))) : START codon (alternative)
*** : STOP
MINUS strand
------------
<<< : START codon (strict)
((( : START codon (alternative)
*** : STOP

ORF finder

The Virtual Ribosome has the option of scanning the input DNA sequence for ORFs (Open Reading Frames).

For each sequence, the specified reading frames are scanned for ORFs and the longest ORF is reported. The corresponding DNA fragment is also included for download (embedded in the comment field of the TAB file). The criteria for opening an ORF can be adjusted as follows:

  1. Start codon: Strict: Only open ORFs at strict START codons (those always coding for methionine, e.g. ATG).
  2. Start codon: Any: Open ORFs at any START codon.
  3. Start codon: None: Open ORFs at any codon except STOP.
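
The three opening criteria can be sketched as follows for a single reading frame and the Standard Genetic Code. This is an illustrative Python sketch only: `longest_orf` and its `mode` argument are hypothetical names, and closing an ORF at the first STOP (included in the result) or at the end of the sequence is an assumption about the behaviour.

```python
STOPS = {"TAA", "TAG", "TGA"}
STARTS = {
    "strict": {"ATG"},             # always codes for methionine
    "any": {"ATG", "TTG", "CTG"},  # strict + alternative start codons
}


def longest_orf(dna, mode="strict"):
    """Return the DNA of the longest ORF in frame 1 of `dna`.

    mode: "strict", "any" or "none" (open at any codon except STOP).
    """
    dna = dna.upper()
    best, current = [], None  # `current` holds the codons of the open ORF
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if current is None:
            if codon not in STOPS and (mode == "none" or codon in STARTS[mode]):
                current = [codon]
        else:
            current.append(codon)
            if codon in STOPS:  # close the ORF at the STOP codon
                if len(current) > len(best):
                    best = current
                current = None
    # An ORF still open at the end of the sequence also counts.
    if current is not None and len(current) > len(best):
        best = current
    return "".join(best)


seq = "CCCTTGAAAATGGGGTAACCC"
print(longest_orf(seq, "strict"))  # ATGGGGTAA
print(longest_orf(seq, "any"))     # TTGAAAATGGGGTAA
print(longest_orf(seq, "none"))    # CCCTTGAAAATGGGGTAA
```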

Advanced options

The advanced options relate to the behavior of the translation when TAB format sequences containing annotation of the Intron/Exon structure have been used as input (see the detailed description of TAB files below).

Derived sequence annotation

When the Intron/Exon structure is known, the Virtual Ribosome automatically extracts the exonic parts of the sequence and performs the translation only on these. Following translation, an analysis of the underlying Intron/Exon structure is performed and used to add annotation to the protein sequence, in the form of a TAB file.

Two types of analysis are available:

1) Exon numbering: Each amino-acid is annotated with the number of the exon which encoded it (or at least hosted the first nucleotide in the codon).

VIRTUAL RIBOSOME
----------------
Translation table: Yeast Mitochondrial SGC2 

>Q0045 - translation and annotation of the exonic structure

pep: MVQRWLYSTNAKDIAVLYFMLAIFSGMAGTAMSLIIRLELAAPGSQYLHGNSQLFNVLVVGHAVLMIFFLVMPALIGGFGNYLLPLMIGA 90
ann: 111111111111111111111111111111111111111111111111111111111222222222222333333333333444444444 90

pep: TDTAFPRINNIAFWVLPMGLVCLVTSTLVESGAGTGWTVYPPLSSIQAHSGPSVDLAIFALHLTSISSLLGAINFIVTTLNMRTNGMTMH 180
ann: 444444444444444444444444444444444444444444444444444444444444444444444444444444444444444444 180

pep: KLPLFVWSIFITAFLLLLSLPVLSAGITMLLLDRNFNTSFFEVSGGGDPILYEHLFWFFGHPEVYILIIPGFGIISHVVSTYSKKPVFGE 270
ann: 444444444444444444444444444444444444444444444444444444444444555555555555555555555555555555 270

pep: ISMVYAMASIGLLGFLVWSHHMYIVGLDADTRAYFTSATMIIAIPTGIKIFSWLATIHGGSIRLATPMLYAIAFLFLFTMGGLTGVALAN 360
ann: 555555555555555555555555555555555555555555555555555555666666666666666666666666666666666666 360

pep: ASLDVAFHDTYYVVGHFHYVLSMGAIFSLFAGYYYWSPQILGLNYNEKLAQIQFWLIFIGANVIFFPMHFLGINGMPRRIPDYPDAFAGW 450
ann: 666666666777777777888888888888888888888888888888888888888888888888888888888888888888888888 450

pep: NYVASIGSFIATLSLFLFIYILYDQLVNGLNNKVNNKSVIYNKAPDFVESNTIFNLNTVKSSSIEFLLTSPPAVHSFNTPAVQS* 535
ann: 8888888888888888888888888888888888888888888888888888888888888888888888888888888888888 535

TAB files containing exon-number annotation can be used directly in the FeatureMap3D server, for mapping the underlying exon-structure onto protein 3D structures.
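
The exon-numbering rule (each amino-acid takes the exon number of the first nucleotide of its codon) can be sketched in Python from a DNA annotation string alone. The function name `exon_numbers` is hypothetical, and returning a list of integers rather than a digit string is a choice made for this sketch.

```python
def exon_numbers(ann):
    """Return one exon number per codon for a DNA annotation string.

    Exons are the regions written as "(EEE...E)"; every "(" starts a
    new exon. Only exonic positions take part in the translation.
    """
    per_base = []  # exon number of every exonic nucleotide, in order
    exon = 0
    for ch in ann:
        if ch == "(":
            exon += 1
        if ch in "(E)":
            per_base.append(exon)
    n_codons = len(per_base) // 3
    # The first nucleotide of each codon decides the exon number.
    return per_base[0::3][:n_codons]


# Exon 1 has 4 bases and exon 2 has 5: the second codon starts in
# exon 1 and finishes in exon 2, so it is numbered 1.
print(exon_numbers("(EE)DIIA(EEE)"))  # [1, 1, 2]
```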

2) Intron pos vs. reading frame: The underlying reading frame is determined, and an annotation of intron positions and intron phase is generated.

Phase 0 - an intron exists right before the codon encoding the amino-acid.
Phase 1 - an intron exists in between positions 1 and 2 of the codon.
Phase 2 - an intron exists in between positions 2 and 3 of the codon.



VIRTUAL RIBOSOME
----------------
Translation table: Yeast Mitochondrial SGC2 

>Q0045 - translation and annotation of the position and phase of the introns

pep: MVQRWLYSTNAKDIAVLYFMLAIFSGMAGTAMSLIIRLELAAPGSQYLHGNSQLFNVLVVGHAVLMIFFLVMPALIGGFGNYLLPLMIGA 90
ann: ........................................................1...........1............0........ 90

pep: TDTAFPRINNIAFWVLPMGLVCLVTSTLVESGAGTGWTVYPPLSSIQAHSGPSVDLAIFALHLTSISSLLGAINFIVTTLNMRTNGMTMH 180
ann: .......................................................................................... 180

pep: KLPLFVWSIFITAFLLLLSLPVLSAGITMLLLDRNFNTSFFEVSGGGDPILYEHLFWFFGHPEVYILIIPGFGIISHVVSTYSKKPVFGE 270
ann: ............................................................0............................. 270

pep: ISMVYAMASIGLLGFLVWSHHMYIVGLDADTRAYFTSATMIIAIPTGIKIFSWLATIHGGSIRLATPMLYAIAFLFLFTMGGLTGVALAN 360
ann: ......................................................0................................... 360

pep: ASLDVAFHDTYYVVGHFHYVLSMGAIFSLFAGYYYWSPQILGLNYNEKLAQIQFWLIFIGANVIFFPMHFLGINGMPRRIPDYPDAFAGW 450
ann: .........0.......1........................................................................ 450

pep: NYVASIGSFIATLSLFLFIYILYDQLVNGLNNKVNNKSVIYNKAPDFVESNTIFNLNTVKSSSIEFLLTSPPAVHSFNTPAVQS* 535
ann: ..................................................................................... 535
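
The phase annotation can be derived from the same DNA annotation string by counting exonic nucleotides up to each intron. The Python sketch below is an illustration under that reading of the phase definitions; `intron_phase_annotation` is a hypothetical name, not a function of the Virtual Ribosome.

```python
def intron_phase_annotation(ann):
    """Return a per-amino-acid string: "." or the intron phase (0, 1, 2).

    With n exonic nucleotides before an intron, the intron falls in
    codon n // 3 (0-based) with phase n % 3.
    """
    exonic = 0  # exonic nucleotides seen so far
    marks = {}  # codon index -> phase digit
    for ch in ann:
        # A "(" after the first exon means an intron has just ended.
        if ch == "(" and exonic > 0:
            marks[exonic // 3] = str(exonic % 3)
        if ch in "(E)":
            exonic += 1
    n_codons = exonic // 3
    return "".join(marks.get(i, ".") for i in range(n_codons))


print(intron_phase_annotation("(EE)DIIA(EEE)"))  # .1.
print(intron_phase_annotation("(E)DA(EEEE)"))    # .0.
```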

Working with sequence annotation

The annotation string concept

The idea is simply to have a string, in addition to the DNA/peptide sequence, which describes the properties of the sequence. This is done using a simple one-letter code. All this is described in great detail for DNA sequences on the FeatureExtract server and in the publication describing the server [FeatureExtract - extraction of sequence annotation made easy, Wernersson, 2005]. Here it is best illustrated by an example:

Sequence:    ATGTCTACATATGAAGGTATGTAA	
Annotation:  (EEEEEEEEEEEEEE)DIIIIIII

E: Exon
I: Intron

(: Start of exon
): End of exon
D: Donor site
A: Acceptor site

The Virtual Ribosome looks for the regions annotated as (EEEEEE{many E's}EEEEE) to find the exonic parts of a sequence.
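
Locating the exonic regions amounts to a simple pattern match on the annotation string. The Python sketch below uses a regular expression for this; `extract_exons` is a hypothetical helper, and the real implementation may well work differently.

```python
import re


def extract_exons(seq, ann):
    """Return the exonic stretches of `seq` marked as "(E...E)" in `ann`."""
    if len(seq) != len(ann):
        raise ValueError("sequence and annotation must have the same length")
    return [seq[m.start():m.end()] for m in re.finditer(r"\(E*\)", ann)]


# The example sequence and annotation shown above.
seq = "ATGTCTACATATGAAGGTATGTAA"
ann = "(EEEEEEEEEEEEEE)DIIIIIII"
print(extract_exons(seq, ann))  # ['ATGTCTACATATGAAG']
```

Joining the returned pieces gives the spliced sequence that is passed on to translation.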

TAB format file

The TAB format is very simple. Each line holds the information for exactly one sequence in four fields separated by the TAB character:

Name Seq Ann Com
Name : Name of the sequence.
Seq : The DNA sequence itself
Ann : An annotation string of the same length as the sequence.
Com : A comment field. May be empty.
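
Reading such a file takes only a few lines of code. The Python sketch below parses one line into its four fields; `TabRecord` and `parse_tab_line` are hypothetical names, and treating missing trailing fields as empty (and skipping the length check when the annotation is empty) are assumptions of this sketch.

```python
from typing import NamedTuple


class TabRecord(NamedTuple):
    name: str
    seq: str
    ann: str
    com: str


def parse_tab_line(line):
    """Split one TAB-format line into Name, Seq, Ann and Com."""
    fields = line.rstrip("\n").split("\t")
    fields += [""] * (4 - len(fields))  # assumption: pad missing fields
    name, seq, ann, com = fields[:4]
    if ann and len(ann) != len(seq):
        raise ValueError(f"{name}: annotation and sequence lengths differ")
    return TabRecord(name, seq, ann, com)


record = parse_tab_line("demo\tATGTAA\t(EEEE)\tan invented example\n")
print(record.name, len(record.seq), record.ann)  # demo 6 (EEEE)
```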

TAB format files containing information about the Intron/Exon structure, can be generated from GenBank files using the FeatureExtract server.

GenBank files

The Virtual Ribosome has the option of working directly with GenBank files. When a GenBank file is supplied, all CDS elements are extracted to a TAB file (see above) containing Intron/Exon annotation prior to the translation. This is done by processing the GenBank entry with the FeatureExtract software using default parameters.

For greater control of the GenBank parsing process, please use the FeatureExtract server directly and submit the resultant TAB files to the Virtual Ribosome.


Example input files

Alpha-globins in FASTA format

>AB001981_alpha-A_Pigeon
ATGGTGCTGTCTGCCAACGACAAGAGCAACGTGAAGGCCGTCTTCGGCAAAATCGGCGGC
CAGGCCGGTGACTTGGGTGGTGAAGCCCTGGAGAGGTTGTTCATCACCTACCCCCAGACC
AAGACCTACTTCCCCCACTTCGACCTGTCACATGGCTCCGCTCAGATCAAGGGGCACGGC
AAGAAGGTGGCGGAGGCACTGGTTGAGGCTGCCAACCACATCGATGACATCGCTGGTGCC
CTCTCCAAGCTGAGCGACCTCCACGCCCAAAAGCTCCGTGTGGACCCCGTCAACTTCAAA
CTGCTGGGTCACTGCTTCCTGGTGGTCGTGGCCGTCCACTTCCCCTCTCTCCTGACCCCG
GAGGTCCATGCTTCCCTGGACAAGTTCGTGTGTGCCGTGGGCACCGTCCTTACTGCCAAG
TACCGTTAA
>J00043_Alpha-i_Goat
ATGGTGCTGTCTGCCGCCGACAAGTCCAATGTCAAGGCCGCCTGGGGCAAGGTTGGCGGC
AACGCTGGAGCTTATGGCGCAGAGGCTCTGGAGAGGATGTTCCTGAGCTTCCCCACCACC
AAGACCTACTTCCCCCACTTCGACCTGAGCCACGGCTCGGCCCAGGTCAAGGGCCACGGC
GAGAAGGTGGCCGCCGCGCTGACCAAAGCGGTGGGCCACCTGGACGACCTGCCCGGTACT
CTGTCTGATCTGAGTGACCTGCACGCCCACAAGCTGCGTGTGGACCCGGTCAACTTTAAG
CTTCTGAGCCACTCCCTGCTGGTGACCCTGGCCTGCCACCTCCCCAATGATTTCACCCCC
GCGGTCCACGCCTCCCTGGACAAGTTCTTGGCCAACGTGAGCACCGTGCTGACCTCCAAA
TACCGTTAA
>AF098919_Embryonic_Alpha-pi_Chicken
ATGGCACTGACCCAAGCTGAGAAGGCTGCCGTGACCACCATCTGGGCAAAGGTGGCTACC
CAGATTGAGTCCATTGGGCTGGAATCACTGGAGAGGCTTTTTGCCAGCTATCCTCAGACG
AAAACCTACTTCCCTCACTTTGATGTCAGCCAAGGCTCAGTTCAGCTTCGTGGTCACGGC
TCCAAGGTCCTGAATGCCATTGGGGAAGCTGTGAAGAACATCGATGACATTAGAGGTGCT
TTGGCCAAACTCAGCGAGCTGCATGCTTACATCCTCAGGGTGGACCCAGTGAACTTCAAG
CTGCTTTCCCACTGTATCCTGTGCTCTGTGGCTGCCCGCTATCCCAGTGATTTCACCCCA
GAAGTTCATGCTGCGTGGGACAAGTTCCTGTCCAGCATTTCCTCTGTTCTGACTGAGAAA
TACAGATAA

The example above can be pasted directly into the text field in the main window of the Virtual Ribosome.

Here is a file with 11 Alpha-globins in FASTA format: alpha-globins.fsa.

Alpha-globins in GenBank format

LOCUS       GOTHBAII                1691 bp    DNA     linear   MAM 27-APR-1993
DEFINITION  Goat adult alpha-ii-globin gene, complete sequence.
ACCESSION   J00044
VERSION     J00044.1  GI:164125
KEYWORDS    alpha-globin; globin.
SOURCE      Capra hircus (goat)
  ORGANISM  Capra hircus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Laurasiatheria; Cetartiodactyla; Ruminantia;
            Pecora; Bovidae; Caprinae; Capra.
REFERENCE   1  (bases 1 to 1691)
  AUTHORS   Schon,E.A., Wernke,S.M. and Lingrel,J.B.
  TITLE     Gene conversion of two functional goat alpha-globin genes preserves
            only minimal flanking sequences
  JOURNAL   J. Biol. Chem. 257 (12), 6825-6835 (1982)
   PUBMED   6282825
FEATURES             Location/Qualifiers
     source          1..1691
                     /organism="Capra hircus"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9925"
     CDS             join(745..839,941..1145,1250..1378)
                     /note="alpha-ii globin"
                     /codon_start=1
                     /protein_id="AAA30910.1"
                     /db_xref="GI:164126"
                     /translation="MVLSAADKSNVKAAWGKVGSNAGAYGAEALERMFLSFPTTKTYF
                     PHFDLSHGSAQVKGHGEKVAAALTKAVGHLDDLPGTLSDLSDLHAHKLRVDPVNFKLL
                     SHSLLVTLACHHPSDFTPAVHASLDKFLANVSTVLTSKYR"
     exon            <745..839
                     /note="alpha-ii globin"
     exon            941..1145
     exon            1250..>1378
                     /note="alpha-ii globin"
ORIGIN      
        1 ctgcaggaac cagcacctgg gagaagagac ttgaacccgg acttgaactc cttgcaaatt
       61 gctgtaaccc gctctcagta tctgttcctt ccaagactgc cactcagttg cacccaaaaa
      121 ctctctgcgg aaagaaagga agctcgaagc gccaaggctg aagaggaaca ggagggttgg
      181 acgggggtgg ggaggaattc gcgattacat gtgaacggtg agccaagtgt gttgcgtcgg
      241 gctgcctctg gcatggacta ggcgcactca gtcgcccgtt ccttcactga tactgcccaa
      301 gtttaaaatg cccagagtgt gccaagctta ggtccggggt gggtagacgg gctgacttac
      361 tcccttccgt tctcaagaca gctggggaac tcctgcagga tgcaggagcg ggcatctacc
      421 cagctccaca atcccgcccc tgccacctgg cgcgaggcta ccacgtccgg ggaaggtgga
      481 cgcagcgggc gggaagcaga cggtggaagc aagaaccccc ggtcagagtc caggtctggg
      541 tgggtgaggg aagcacccat cgcccggccg ggcgcaggtc ggactccgcg cgccccctgc
      601 ggtcctggtc cggccgcgca tgccgcgtgc cagccaatga gcgcagcgcg ggcgggcgtg
      661 cacctggagc cgggcgcata aaggctcgcg cactcgcagc cccgcactct tctggttctg
      721 acccagactc agagagaatc caccatggtg ctgtctgccg ccgacaagtc caatgtcaag
      781 gccgcctggg gcaaggttgg cagcaacgct ggagcttatg gcgcagaggc tctggagagg
      841 tgagcaccgc acccgccccg aggggaccgg gccgctcgcc gggcgcgtcc ttgtaccggg
      901 cctctcggcc tgagcccggc tttcccgcct cttcacccag gatgttcctg agcttcccca
      961 ccaccaagac ctacttcccc cacttcgacc tgagccacgg ctcggcccag gtcaagggcc
     1021 acggcgagaa ggtggccgcc gcgctgacca aagcggtggg ccacctggac gacctgcccg
     1081 gtactctgtc tgatctgagt gacctgcacg cccacaagct gcgtgtggac ccggtcaact
     1141 ttaaggtgag ctcgcgggcc gggccgggac agacctgggc tagcggggca gagaatgccg
     1201 cggcggcccc acccagcccc cgccccactg acgtcccctc tctcggcagc ttctgagcca
     1261 ctccctgctg gtgaccctgg cctgccacca ccccagtgat ttcacccccg cggtccacgc
     1321 ctccctggac aagttcttgg ccaacgtgag caccgtgctg acctccaaat accgttaagc
     1381 tggagcctcg gccaccccta ccctggcctg gagcgccctt gcgctctgcg cactctcacc
     1441 tcctgatctt tgaataaagt ctgagtgggc tgcagtgtct gtctgtagcc tcgggtctct
     1501 gtgtccgcga accggcccag gttctcattg cctcggacca aggagctctc aggcagctag
     1561 agagagaagg ggaaaactgg acggaggggt gggggtgcag cctgccccac tgccactacc
     1621 tgggattctc tgggcagccc tcaccctcag cctggagtga tttctgagta tcttggccct
     1681 tccctgaatt c
//

Here is a file with 11 Alpha-globins (9 GenBank entries) in a multi-GenBank file: alpha-globins.gbk.

Alpha-globins in TAB format

Here is a file with 11 Alpha-globins in TAB format: alpha-globins.tab. The file contains both DNA sequence and annotation of the Intron/Exon structure. It was generated by parsing the GenBank entries listed below using the FeatureExtract server. In this file the naming of each entry has been selected to indicate both the type of alpha-globin and the organism.

AB001981
X01831
J00923
J00043
J00044
X01086
X07053
AF098919

A reformatted, human-readable view of the first entry in the TAB file looks like this:

Name: 'AB001981_alpha-D_Pigeon'

ATGCTGACCGACTCTGACAAGAAGCTGGTCCTGCAGGTGTGGGAGAAGGTGATCCGCCAC 59       
(EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE 59       

CCAGACTGTGGAGCCGAGGCCCTGGAGAGGTGCGGGCTGAGCTTGGGGAAACCATGGGCA 119      
EEEEEEEEEEEEEEEEEEEEEEEEEEEE)DIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 119      

AGGGGGGCGACTGGGTGGGAGCCCTACAGGGCTGCTGGGGGTTGTTCGGCTGGGGGTCAG 179      
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 179      

CACTGACCATCCCGCTCCCGCAGCTGTTCACCACCTACCCCCAGACCAAGACCTACTTCC 239      
IIIIIIIIIIIIIIIIIIIIIA(EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE 239      

CCCACTTCGACTTGCACCATGGCTCCGACCAGGTCCGCAACCACGGCAAGAAGGTGTTGG 299      
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE 299      

CCGCCTTGGGCAACGCTGTCAAGAGCCTGGGCAACCTCAGCCAAGCCCTGTCTGACCTCA 359      
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE 359      

GCGACCTGCATGCCTACAACCTGCGTGTCGACCCTGTCAACTTCAAGGCAGGCGGGGGAC 419      
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE)DIIIIIIIIIIII 419      

GGGGGTCAGGGGCCGGGGAGTTGGGGGCCAGGGACCTGGTTGGGGATCCGGGGCCATGCC 479      
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 479      

GGCGGTACTGAGCCCTGTTTTGCCTTGCAGCTGCTGGCGCAGTGCTTCCACGTGGTGCTG 539      
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIA(EEEEEEEEEEEEEEEEEEEEEEEEEEEEE 539      

GCCACACACCTGGGCAACGACTACACCCCGGAGGCACATGCTGCCTTCGACAAGTTCCTG 599      
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE 599      

TCGGCTGTGTGCACCGTGCTGGCCGAGAAGTACAGATAA 638      
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE) 638      

//

Notice: the sequence above CANNOT be pasted into the Virtual Ribosome, since it's NOT in TAB format and only serves the purpose of illustrating the content of a TAB file.


A toolchain approach

Since the command-line programs behind both the FeatureExtract server and the Virtual Ribosome are available as Open-Source packages for download, they can be combined to form a strong tool-chain, as the following example describes.

gb2tab  : The program behind the FeatureExtract server.
dna2pep : The program behind the Virtual Ribosome server.

Example: Translating the Yeast genome

All files for the Yeast genome have been downloaded in GenBank format:

genome[raz]:/home/people/raz/projects/genomes/YeastGenomeNov2005> ll *gbf
-rw-------    1 raz      user         479K Nov 30 13:59 chr01.gbf
-rw-------    1 raz      user         1.8M Nov 30 13:59 chr02.gbf
-rw-------    1 raz      user         701K Nov 30 13:59 chr03.gbf
-rw-------    1 raz      user         3.3M Nov 30 13:59 chr04.gbf
-rw-------    1 raz      user         1.2M Nov 30 13:59 chr05.gbf
-rw-------    1 raz      user         592K Nov 30 13:59 chr06.gbf
-rw-------    1 raz      user         2.3M Nov 30 13:59 chr07.gbf
-rw-------    1 raz      user         1.2M Nov 30 13:59 chr08.gbf
-rw-------    1 raz      user         948K Nov 30 13:59 chr09.gbf
-rw-------    1 raz      user         1.6M Nov 30 13:59 chr10.gbf
-rw-------    1 raz      user         1.4M Nov 30 13:59 chr11.gbf
-rw-------    1 raz      user         2.3M Nov 30 13:59 chr12.gbf
-rw-------    1 raz      user         2.0M Nov 30 13:59 chr13.gbf
-rw-------    1 raz      user         1.7M Nov 30 13:59 chr14.gbf
-rw-------    1 raz      user         2.3M Nov 30 13:59 chr15.gbf
-rw-------    1 raz      user         2.0M Nov 30 13:59 chr16.gbf
-rw-------    1 raz      user         160K Nov 30 13:59 chrmt.gbf

Extract and translate the nuclear genes:
gb2tab chr{0,1}*gbf | dna2pep > yeast_nuc.tab

Extract and translate the mitochondrial genes:
gb2tab chrmt.gbf | dna2pep -m 3 > yeast_mit.tab

Count number of lines = number of genes:
genome[raz]:/home/people/raz/projects/genomes/YeastGenomeNov2005> wc -l yeast_*.tab
            19 yeast_mit.tab
          5854 yeast_nuc.tab
          5873 total

Find the mitochondrial proteins that originate from genes without introns:
genome[raz]:/home/people/raz/projects/genomes/YeastGenomeNov2005> egrep -v "2222+" yeast_mit.tab | cut -f 1
AI1
AAP1
ATP6
OLI1
VAR1
SCEI
COX2
COX3
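
The `egrep -v "2222+"` filter keeps only the lines that contain no run of four or more 2's, i.e. proteins with no stretch of amino-acids annotated as exon 2 (a second exon encoding fewer than four amino-acids would not be detected). A rough Python equivalent of the filter and the `cut -f 1` step is sketched below; `single_exon_names` is a hypothetical helper, and like `egrep` it searches the whole line rather than only the annotation field.

```python
import re

SECOND_EXON = re.compile(r"2222+")  # same pattern as in the egrep call


def single_exon_names(lines):
    """Return the first TAB field of every line without a run of 2's."""
    return [line.split("\t")[0] for line in lines if not SECOND_EXON.search(line)]


# Two invented TAB lines: name, peptide, exon-number annotation, comment.
demo = [
    "GENE_A\tMKV*\t1111\t\n",
    "GENE_B\tMKLVAAG*\t11122222\t\n",
]
print(single_exon_names(demo))  # ['GENE_A']
```

For the real data, the same function could be applied to the lines of yeast_mit.tab, e.g. `single_exon_names(open("yeast_mit.tab"))`.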



