Quick start
Paste in or upload DNA sequences and hit "Submit query".
The RevTrans server will then virtually translate the DNA
sequences and align the resulting peptide sequences using MAFFT with
default settings (other alignment programs can be selected).
Finally, RevTrans constructs a multiple DNA alignment
using the peptide alignment as a scaffold.
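The scaffold step can be illustrated with a minimal Python sketch. It is not the RevTrans source code: the function name thread_dna and its parameters are hypothetical, and the sketch assumes the DNA is in frame, with one codon per peptide residue. Any codons left over at the end (for example a trailing stop codon) are simply ignored here.

```python
def thread_dna(aligned_peptide, dna, gap_in="-.~", gap_out="-"):
    """Build a gapped DNA sequence by following a gapped peptide sequence.

    Hypothetical helper for illustration only.
    """
    # Split the unaligned DNA into codons (assumes the sequence is in frame).
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    out = []
    pos = 0  # index of the next unused codon
    for symbol in aligned_peptide:
        if symbol in gap_in:
            # One gap in the peptide alignment becomes three gap symbols in the DNA.
            out.append(gap_out * 3)
        else:
            if pos >= len(codons):
                raise ValueError("DNA is too short for this peptide sequence")
            out.append(codons[pos])
            pos += 1
    return "".join(out)


print(thread_dna("MA-LW", "ATGGCCCTGTGG"))  # ATGGCC---CTGTGG
```

Because gaps are only ever inserted in blocks of three, the reading frame of every sequence is preserved in the resulting DNA alignment.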
If you want more control over the alignment process, RevTrans also accepts user-provided
peptide alignments.
This will give you
the opportunity to use your preferred alignment software and to optimize the
parameters.
If you need to translate your DNA sequences prior to alignment, this can be done by
using the "Translate only" button (or by following the link to the "Virtual Ribosome"
server if you want more fine-grained control over the translation process).
The translation has full support for degenerate nucleotides and alternative translation tables
can be selected.
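As a rough illustration of translation with degenerate nucleotides, the Python sketch below expands each degenerate codon into all concrete codons it can represent. If they all encode the same amino acid, that amino acid is reported; otherwise X is used. This resolution rule, and the names translate_codon and translate, are assumptions made for the example and not necessarily how RevTrans or the Virtual Ribosome work internally. Only the standard genetic code is included.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}

# IUPAC nucleotide symbols and the concrete bases each one stands for.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "W": "AT", "S": "CG",
         "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT"}


def translate_codon(codon):
    """Translate one codon; return X if the degenerate codon is ambiguous."""
    options = [IUPAC[base] for base in codon.upper()]
    aminos = {CODE["".join(concrete)] for concrete in product(*options)}
    return aminos.pop() if len(aminos) == 1 else "X"


def translate(dna):
    """Translate in frame 1, ignoring an incomplete codon at the end."""
    usable = len(dna) - len(dna) % 3
    return "".join(translate_codon(dna[i:i + 3]) for i in range(0, usable, 3))


print(translate("ATGGCNCTGTGGTAG"))  # MALW*
```

Here GCN still translates to A because all four GC codons encode alanine, whereas a codon such as RAT (AAT or GAT) would give X.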
When providing your own peptide alignment, RevTrans will
accept arbitrarily large input files.
Detailed instructions
Supply DNA sequences
The DNA sequences must either be pasted into the webpage or uploaded via the "Choose file" button.
The input must be in FASTA, MSF, or ALN (Clustal) format.
The full IUPAC degenerate DNA alphabet (not case sensitive) is supported:
A C G T R Y M K W S B D H V N
Please note that gaps and unknown symbols, e.g. - and X,
will be discarded before processing.
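The effect of this clean-up can be sketched in a few lines of Python. The function name clean_dna is hypothetical, and the sketch assumes that every character outside the alphabet listed above (including whitespace) is dropped and that case is normalised to upper case.

```python
# The supported IUPAC DNA symbols, as listed above.
IUPAC_DNA = set("ACGTRYMKWSBDHVN")


def clean_dna(raw):
    """Upper-case the input and drop every symbol outside the IUPAC DNA alphabet."""
    return "".join(ch for ch in raw.upper() if ch in IUPAC_DNA)


print(clean_dna("atg-gcX ccN\n"))  # ATGGCCCN
```

A practical consequence is that an already aligned DNA file can be submitted: its gaps are removed and the sequences are realigned from scratch.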
Optional: Supply aligned peptide sequences
For greater control of the alignment process you have the option
of also supplying a pre-computed peptide alignment. RevTrans will then
use this as the scaffold for the DNA alignment.
The peptide alignment must be in FASTA, MSF, or ALN (Clustal) format.
By default "-", "." and "~"
will be interpreted as gap symbols.
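As a small illustration, the following Python sketch treats the three default gap symbols alike, recovers the unaligned peptide, and rewrites an alignment row with a single gap symbol. The function names ungap and normalise_gaps are hypothetical.

```python
GAP_SYMBOLS = "-.~"  # default gap symbols named above


def ungap(aligned, gap_symbols=GAP_SYMBOLS):
    """Return the peptide sequence with all gap symbols removed."""
    return "".join(ch for ch in aligned if ch not in gap_symbols)


def normalise_gaps(aligned, gap_symbols=GAP_SYMBOLS, gap_out="-"):
    """Rewrite every recognised gap symbol as one chosen output symbol."""
    return "".join(gap_out if ch in gap_symbols else ch for ch in aligned)


print(ungap("PG--GL.Q~A"))           # PGGLQA
print(normalise_gaps("PG--GL.Q~A"))  # PG--GL-Q-A
```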
If a peptide alignment is not supplied, the RevTrans web server will
automatically construct one using the selected multiple alignment program (default: MAFFT). In all cases
the alignment program will be run with default parameters.
Submit query
Click on the "Submit query" button. If the processing of the query takes more
than a few seconds, you will get the option of supplying your email address and being notified
when the job is done.
Advanced options
RevTrans has support for a number of advanced options. Typically it is not necessary to
set these manually and most users can safely skip this section and proceed to submitting the query.
-
Data format, DNA sequences:
By default the DNA file format is automatically detected. Alternatively you may specify
the format as being FASTA, MSF, or ALN (Clustal).
-
Data format, aligned peptide sequences:
By default the peptide alignment file format is automatically detected. Alternatively you may specify
the format as being FASTA, MSF, or ALN (Clustal).
-
Output format:
By default the final multiple DNA alignment will be in ALN (Clustal) format.
Alternatively you may specify FASTA or MSF.
-
Gap-In:
Here you can specify which symbol(s) denote(s) a gap in the user-provided peptide alignment.
The default should be correct for virtually all standard alignment files.
-
Gap-out:
Here you can specify which gap symbol to use in the output.
-
Match DNA and peptide sequences by:
This option gives the user control over how DNA sequences are paired with their peptide
counterparts.
Translation:
(Default) The DNA sequences are translated using the standard genetic code
(or an alternative translation table if selected below)
with full IUPAC support and compared to the peptide sequences. The DNA sequence
is paired with the first matching peptide sequence found.
Name:
DNA sequences are paired with peptide counterparts based on sequence entry names.
Entry names must be unique within files and identical across files.
If you experience trouble when using name-based matching, please make sure
that sequence names do match across files, as some alignment software
may truncate or otherwise alter sequence names.
Position:
DNA sequences are paired with peptide counterparts simply based on their order
of appearance in the files.
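The three matching modes can be sketched in Python as follows. This is an illustration under stated assumptions, not the RevTrans implementation: the function pair_sequences, its arguments, and the record layout (lists of (name, sequence) tuples) are hypothetical, the translation function is passed in by the caller, and a trailing stop symbol (*) is assumed to be ignored when comparing.

```python
def pair_sequences(dna_records, pep_records, mode="translation",
                   translate=None, gaps="-.~"):
    """Return a list of (dna_name, peptide_name) pairs.

    Both record arguments are lists of (name, sequence) tuples.
    """
    if mode == "position":
        # Pair purely by order of appearance in the two files.
        if len(dna_records) != len(pep_records):
            raise ValueError("files contain different numbers of sequences")
        return [(d[0], p[0]) for d, p in zip(dna_records, pep_records)]

    if mode == "name":
        # Names must be identical across files.
        pep_names = {name for name, _ in pep_records}
        missing = [name for name, _ in dna_records if name not in pep_names]
        if missing:
            raise ValueError("no peptide entry named: " + ", ".join(missing))
        return [(name, name) for name, _ in dna_records]

    if mode == "translation":
        if translate is None:
            raise ValueError("a translate function is required")
        # Compare against the peptide sequences with gap symbols removed.
        ungapped = [(name, "".join(c for c in seq if c not in gaps))
                    for name, seq in pep_records]
        pairs = []
        for dna_name, dna_seq in dna_records:
            protein = translate(dna_seq).rstrip("*")
            # Take the first matching peptide sequence found.
            match = next((n for n, s in ungapped if s == protein), None)
            if match is None:
                raise ValueError("no peptide matches " + dna_name)
            pairs.append((dna_name, match))
        return pairs

    raise ValueError("unknown mode: " + mode)


# Toy example with a lookup table standing in for a real translation function.
toy_translate = {"ATGGCC": "MA", "ATGTGG": "MW"}.get
dna = [("seq1", "ATGGCC"), ("seq2", "ATGTGG")]
pep = [("pepB", "M-W"), ("pepA", "MA-")]
print(pair_sequences(dna, pep, mode="translation", translate=toy_translate))
# [('seq1', 'pepA'), ('seq2', 'pepB')]
```

Matching by translation is robust to renamed or reordered entries, which is presumably why it is the default; name or position matching is useful when several DNA sequences translate to the same peptide.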
-
Translation table:
Select an alternative translation table - used for "matching-by-translation" and
with the "Translate Only" functionality.
The numbering of the translation table is the one defined by the NCBI Taxonomy Group.
For a detailed description of each genetic code, please consult the following web page
at NCBI: The Genetic Codes.
[Main site: Taxonomy]
-
Alignment method
(New in RevTrans 1.4)
RevTrans offers a selection of programs for performing the peptide alignment step:
Dialign 2.2:
Reference:
B. Morgenstern (1999).
DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment.
Bioinformatics 15, 211 - 218.
Dialign-T 0.1.3:
Reference:
Amarendran R. Subramanian, Jan Weyer-Menkhoff, Michael Kaufmann, Burkhard Morgenstern:
DIALIGN-T: An improved algorithm
for segment-based multiple sequence alignment
BMC Bioinformatics 2005, 6:66.
ClustalW 1.83:
Reference:
Thompson J. D., Higgins D. G., Gibson T. J. (1994).
CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through
sequence weighting, position-specific gap penalties and weight matrix choice.
Nucleic Acids Res. 22:4673-4680.
Example data
Sample DNA dataset
The following is a set of unaligned Alpha-globin genes from a range of organisms.
>Sheep
ATGGCCCTGTGGACACGCCTGGTGCCCCTGCTGGCCCTGCTGGCACTCTGGGCCCCCGCC
CCGGCCCACGCCTTCGTCAACCAGCACCTGTGCGGCTCCCACCTGGTGGAGGCGCTGTAC
CTGGTGTGCGGAGAGCGCGGCTTCTTCTACACGCCCAAGGCCCGCCGGGAGGTGGAGGGC
CCCCAGGTGGGGGCGCTGGAGCTGGCCGGAGGCCCCGGCGCGGGTGGCCTGGAGGGGCCC
CCGCAGAAGCGTGGCATCGTGGAGCAGTGCTGCGCCGGCGTCTGCTCTCTCTACCAGCTG
GAGAACTACTGTAACTAG
>Pig
ATGGCCCTGTGGACGCGCCTCCTGCCCCTGCTGGCCCTGCTGGCCCTCTGGGCGCCCGCC
CCGGCCCAGGCCTTCGTGAACCAGCACCTGTGCGGCTCCCACCTGGTGGAGGCGCTGTAC
CTGGTGTGCGGGGAGCGCGGCTTCTTCTACACGCCCAAGGCCCGTCGGGAGGCGGAGAAC
CCTCAGGCAGGTGCCGTGGAGCTGGGCGGAGGCCTGGGCGGCCTGCAGGCCCTGGCGCTG
GAGGGGCCCCCGCAGAAGCGTGGCATCGTGGAGCAGTGCTGCACCAGCATCTGTTCCCTC
TACCAGCTGGAGAACTACTGCAACTAG
>Dog
ATGGCCCTCTGGATGCGCCTCCTGCCCCTGCTGGCCCTGCTGGCCCTCTGGGCGCCCGCG
CCCACCCGAGCCTTCGTTAACCAGCACCTGTGTGGCTCCCACCTGGTAGAGGCTCTGTAC
CTGGTGTGCGGGGAGCGCGGCTTCTTCTACACGCCTAAGGCCCGCAGGGAGGTGGAGGAC
CTGCAGGTGAGGGACGTGGAGCTGGCCGGGGCGCCTGGCGAGGGCGGCCTGCAGCCCCTG
GCCCTGGAGGGGGCCCTGCAGAAGCGAGGCATCGTGGAGCAGTGCTGCACCAGCATCTGC
TCCCTCTACCAGCTGGAGAATTACTGCAACTAG
>OwlMonkey
ATGGCCCTGTGGATGCACCTCCTGCCCCTGCTGGCGCTGCTGGCCCTCTGGGGACCCGAG
CCAGCCCCGGCCTTTGTGAACCAGCACCTGTGCGGCCCCCACCTGGTGGAAGCCCTCTAC
CTGGTGTGCGGGGAGCGAGGTTTCTTCTACGCACCCAAGACCCGCCGGGAGGCGGAGGAC
CTGCAGGTGGGGCAGGTGGAGCTGGGTGGGGGCTCTATCACGGGCAGCCTGCCACCCTTG
GAGGGTCCCATGCAGAAGCGTGGCGTCGTGGATCAGTGCTGCACCAGCATCTGCTCCCTC
TACCAGCTGCAGAACTACTGCAACTAG
>Human
ATGGCCCTGTGGATGCGCCTCCTGCCCCTGCTGGCGCTGCTGGCCCTCTGGGGACCTGAC
CCAGCCGCAGCCTTTGTGAACCAACACCTGTGCGGCTCACACCTGGTGGAAGCTCTCTAC
CTAGTGTGCGGGGAACGAGGCTTCTTCTACACACCCAAGACCCGCCGGGAGGCAGAGGAC
CTGCAGGTGGGGCAGGTGGAGCTGGGCGGGGGCCCTGGTGCAGGCAGCCTGCAGCCCTTG
GCCCTGGAGGGGTCCCTGCAGAAGCGTGGCATTGTGGAACAATGCTGTACCAGCATCTGC
TCCCTCTACCAGCTGGAGAACTACTGCAACTAG
>GreenMonkey
ATGGCCCTGTGGATGCGCCTCCTGCCCCTGCTGGCGCTGCTGGCCCTCTGGGGACCTGAC
CCGGTCCCGGCCTTTGTGAACCAGCACCTGTGCGGCTCCCACCTGGTGGAAGCCCTCTAC
CTGGTGTGCGGGGAGCGAGGCTTCTTCTACACGCCCAAGACCCGCCGGGAGGCAGAGGAC
CCGCAGGTGGGGCAGGTAGAGCTGGGCGGGGGCCCTGGCGCAGGCAGCCTGCAGCCCTTG
GCGCTGGAGGGGTCCCTGCAGAAGCGCGGCATCGTGGAGCAGTGCTGTACCAGCATCTGC
TCCCTCTACCAGCTGGAGAACTACTGCAACTAG
>Chimp
ATGGCCCTGTGGATGCGCCTCCTGCCCCTGCTGGTGCTGCTGGCCCTCTGGGGACCTGAC
CCAGCCTCGGCCTTTGTGAACCAACACCTGTGCGGCTCCCACCTGGTGGAAGCTCTCTAC
CTAGTGTGCGGGGAACGAGGCTTCTTCTACACACCCAAGACCCGCCGGGAGGCAGAGGAC
CTGCAGGTGGGGCAGGTGGAGCTGGGCGGGGGCCCTGGTGCAGGCAGCCTGCAGCCCTTG
GCCCTGGAGGGGTCCCTGCAGAAGCGTGGTATCGTGGAACAATGCTGTACCAGCATCTGC
TCCCTCTACCAGCTGGAGAACTACTGCAACTAG
>GuineaPig
ATGGCTCTGTGGATGCATCTCCTCACCGTGCTGGCCCTGCTGGCCCTCTGGGGGCCCAAC
ACTAATCAGGCCTTTGTCAGCCGGCATCTGTGCGGCTCCAACTTAGTGGAGACATTGTAT
TCAGTGTGTCAGGATGATGGCTTCTTCTATATACCCAAGGACCGTCGGGAGCTAGAGGAC
CCACAGGTGGAGCAGACAGAACTGGGCATGGGCCTGGGGGCAGGTGGACTACAGCCCTTG
GCACTGGAGATGGCACTACAGAAGCGTGGCATTGTGGATCAGTGCTGTACTGGCACCTGC
ACACGCCACCAGCTGCAGAGCTACTGCAACTAG
>Mouse
ATGGCCCTGTTGGTGCACTTCCTACCCCTGCTGGCCCTGCTTGCCCTCTGGGAGCCCAAA
CCCACCCAGGCTTTTGTCAAACAGCATCTTTGTGGTCCCCACCTGGTAGAGGCTCTCTAC
CTGGTGTGTGGGGAGCGTGGCTTCTTCTACACACCCAAGTCCCGCCGTGAAGTGGAGGAC
CCACAAGTGGAACAACTGGAGCTGGGAGGAAGCCCCGGGGACCTTCAGACCTTGGCGTTG
GAGGTGGCCCGGCAGAAGCGTGGCATTGTGGATCAGTGCTGCACCAGCATCTGCTCCCTC
TACCAGCTGGAGAACTACTGCAACTAA
>Chicken
ATGGCTCTCTGGATCCGATCACTGCCTCTTCTGGCTCTCCTTGTCTTTTCTGGCCCTGGA
ACCAGCTATGCAGCTGCCAACCAGCACCTCTGTGGCTCCCACTTGGTGGAGGCTCTCTAC
CTGGTGTGTGGAGAGCGTGGCTTCTTCTACTCCCCCAAAGCCCGACGGGATGTCGAGCAG
CCCCTAGTGAGCAGTCCCTTGCGTGGCGAGGCAGGAGTGCTGCCTTTCCAGCAGGAGGAA
TACGAGAAAGTCAAGCGAGGGATTGTTGAGCAATGCTGCCATAACACGTGTTCCCTCTAC
CAACTGGAGAACTACTGCAACTAG
Sample peptide dataset
The following are the Alpha-globin genes from the dataset above, translated using the
standard genetic code and aligned using MAFFT.
>Sheep
MALWTRLVPLLALLALWAPAPAHAFVNQHLCGSHLVEALYLVCGERGFFYTPKARREVEG
PQVGALELAGGPGAGG-----LEGPPQKRGIVEQCCAGVCSLYQLENYCN
>Pig
MALWTRLLPLLALLALWAPAPAQAFVNQHLCGSHLVEALYLVCGERGFFYTPKARREAEN
PQAGAVELGGGLG--GLQALALEGPPQKRGIVEQCCTSICSLYQLENYCN
>Dog
MALWMRLLPLLALLALWAPAPTRAFVNQHLCGSHLVEALYLVCGERGFFYTPKARREVED
LQVRDVELAGAPGEGGLQPLALEGALQKRGIVEQCCTSICSLYQLENYCN
>OwlMonkey
MALWMHLLPLLALLALWGPEPAPAFVNQHLCGPHLVEALYLVCGERGFFYAPKTRREAED
LQVGQVELGGGSITGSLPP--LEGPMQKRGVVDQCCTSICSLYQLQNYCN
>Human
MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAED
LQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN
>GreenMonkey
MALWMRLLPLLALLALWGPDPVPAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAED
PQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN
>Chimp
MALWMRLLPLLVLLALWGPDPASAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAED
LQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN
>GuineaPig
MALWMHLLTVLALLALWGPNTNQAFVSRHLCGSNLVETLYSVCQDDGFFYIPKDRRELED
PQVEQTELGMGLGAGGLQPLALEMALQKRGIVDQCCTGTCTRHQLQSYCN
>Mouse
MALLVHFLPLLALLALWEPKPTQAFVKQHLCGPHLVEALYLVCGERGFFYTPKSRREVED
PQVEQLELGGSPG--DLQTLALEVARQKRGIVDQCCTSICSLYQLENYCN
>Chicken
MALWIRSLPLLALLVFSGPGTSYAAANQHLCGSHLVEALYLVCGERGFFYSPKARRDVEQ
PLVSS-PLRGEAG--VLPFQQEEYEKVKRGIVEQCCHNTCSLYQLENYCN