Virtual Ribosome - Software Download


Command line version of the Virtual Ribosome

The main functionality of the Virtual Ribosome server comes from the command line program "dna2pep", which can be downloaded here. dna2pep is written in Python 2.4 and should work on any system (MacOS X, Unix, Windows etc.) with Python 2.4 or newer installed. Please note that you may need to modify the very first line of the program to point to the location of your Python installation.

The program is invoked from the shell (command line) and has a built-in help page (dna2pep.py -h).

Download dna2pep v.1.1: dna2pep-1.1.tgz
Browse all releases.

If you require the Virtual Ribosome on a commercial license, please contact software@cbs.dtu.dk.

dna2pep help page

NAME
	dna2pep - full featured computational translation of DNA to peptide.
	(The program behind the "Virtual Ribosome" webserver.)

SYNOPSIS
	dna2pep [options] [input files] [-f outfile]
	
DESCRIPTION
	TRANSLATION: The translation engine of dna2pep has full support for handling
	degenerate nucleotides (IUPAC definition, e.g. W = A or T, S = G or C).
	All translation tables defined by the NCBI taxonomy group are included,
	and a number of options determining the behaviour of STOP and START
	codons are available.
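
To illustrate what translating degenerate nucleotides involves, here is a minimal Python sketch. It is not taken from the dna2pep source. It expands every IUPAC code in a codon into the bases it stands for and returns an amino acid only when all expansions agree. The names IUPAC, STANDARD_CODE and translate_codon are hypothetical, and returning "X" for an ambiguous codon is an assumption about the output convention.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# The Standard Genetic Code (NCBI table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_codon(codon, table=STANDARD_CODE):
    """Translate one codon that may contain degenerate nucleotides."""
    codon = codon.upper()
    # Every concrete codon the degenerate codon can stand for.
    expansions = ("".join(p) for p in product(*(IUPAC[b] for b in codon)))
    amino_acids = {table[c] for c in expansions}
    # Assumption: report "X" when the expansions disagree.
    return amino_acids.pop() if len(amino_acids) == 1 else "X"
```

With this sketch, "CTN" translates to "L" because all four expansions encode leucine, while "AAW" (AAA or AAT) is ambiguous.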
	
	INTRONS and EXONS: dna2pep natively understands TAB files containing 
	Intron/Exon annotation (gb2tab / FeatureExtract). When translating
	files containing Intron/Exon structure, dna2pep will annotate the
	underlying gene-structure in the annotation of the translated
	sequence.
	
	Input files can be in FASTA (no Intron/Exon annotation), RAW (single
	sequence with no header - all non-letters are discarded) or TAB
	(including annotation) format. The output format will by default be FASTA
	for files without annotation and TAB for files including annotation. 
	The file format is autodetected by investigating the first line
	of the input.
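
The help text does not spell out how the first line is inspected, so the following Python sketch shows only one plausible heuristic. The function name guess_format and the rules themselves (">" marks FASTA, a tab character marks TAB, anything else is RAW) are assumptions and not dna2pep's documented behaviour.

```python
def guess_format(first_line):
    """Guess the input format from the first line of a file (heuristic)."""
    line = first_line.rstrip("\r\n")
    if line.startswith(">"):
        # FASTA records begin with a ">" header line.
        return "FASTA"
    if "\t" in line:
        # TAB records hold tab-separated fields: name, seq, ann, comment.
        return "TAB"
    # Assumption: everything else is a headerless RAW sequence.
    return "RAW"
```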

	If no input files are specified, dna2pep will read from STDIN.
	
OPTIONS
	-F, --outfile
		Optional - specify an output file. If no output file is 
		specified the output will go to STDOUT.
		
	-O, --outformat
		Specify output format (see also the --fasta, --tab, 
		--report options below):
		
		FASTA:  Fasta format (plain DNA, no sequence annotation)
		
		TAB:    Tab format. Each line contains the following four
		        fields, separated by tabs:
			name, seq, ann, comment
			
			See gb2tab (FeatureExtract) for details.
		
		REPORT:	A nice visualization of the results.			
			
		AUTO:  [Default] Generate both a report and sequence output
			(using the same format as the one detected for
			the input files).

	--fasta filename
		Write output sequences in FASTA format to the specified file.
		Use '-' to indicate STDOUT.
		
	--tab filename
		Write output sequences in TAB format to the specified file.
		Use '-' to indicate STDOUT.
		
	--report filename
		Write report to the specified file.
		Use '-' to indicate STDOUT.

        -m, --matrix tablename/file
                Use an alternative translation matrix instead of the built-in
                Standard Genetic Code for translation.
                
                If "tablename" is 1-6,9-16 or 21-23 one of the alternative 
                translation tables defined by the NCBI taxonomy group will be 
                used.
                
                Briefly, the following tables are defined:
                -----------------------------------------
                 1: The Standard Code 
                 2: The Vertebrate Mitochondrial Code 
                 3: The Yeast Mitochondrial Code 
                 4: The Mold, Protozoan, and Coelenterate Mitochondrial Code 
                    and the Mycoplasma/Spiroplasma Code 
                 5: The Invertebrate Mitochondrial Code 
                 6: The Ciliate, Dasycladacean and Hexamita Nuclear Code 
                 9: The Echinoderm and Flatworm Mitochondrial Code 
                10: The Euplotid Nuclear Code 
                11: The Bacterial and Plant Plastid Code 
                12: The Alternative Yeast Nuclear Code 
                13: The Ascidian Mitochondrial Code 
                14: The Alternative Flatworm Mitochondrial Code 
                15: Blepharisma Nuclear Code 
                16: Chlorophycean Mitochondrial Code 
                21: Trematode Mitochondrial Code 
                22: Scenedesmus obliquus mitochondrial Code 
                23: Thraustochytrium Mitochondrial Code 
                
                See http://www.ncbi.nlm.nih.gov/Taxonomy [Genetic Codes]
                for a detailed description. Please note that the table
                of start codons is also used (see the --allinternal option
                below for details).
                
                If a filename is supplied the translation table is read from
                file instead. 
                
                The file should contain one line per codon in the format:
                
                codon[whitespace]aa-single letter code
                
                All 64 codons must be included. Stop codons are specified 
                by "*". T and U are interchangeable. Blank lines and lines
                starting with "#" are ignored.
                
                See the "gcMitVertebrate.mtx" file in the dna2pep source
                distribution for a well documented example.
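
As an illustration of the file format described above, here is a minimal Python sketch of a reader. It is not the parser shipped with dna2pep, and the function name parse_matrix is hypothetical. It assumes each data line holds exactly two whitespace-separated fields and rejects anything else.

```python
def parse_matrix(text):
    """Parse a translation table: one 'codon<whitespace>amino acid' per line."""
    table = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Blank lines and lines starting with "#" are ignored.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 2:
            raise ValueError("line %d: expected 'codon aa'" % number)
        codon, aa = fields
        # T and U are interchangeable; store everything as DNA.
        codon = codon.upper().replace("U", "T")
        if len(codon) != 3 or set(codon) - set("ACGT"):
            raise ValueError("line %d: bad codon %r" % (number, codon))
        if len(aa) != 1:
            raise ValueError("line %d: bad amino acid %r" % (number, aa))
        table[codon] = aa
    # All 64 codons must be included.
    if len(table) != 64:
        raise ValueError("expected 64 codons, found %d" % len(table))
    return table
```

The returned dictionary maps each DNA codon to its single-letter amino acid, with "*" marking stop codons.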

	-r x, --readingframe=x
		Specify the reading frame. For input files in TAB format this
		option is ignored, and the reading frame is built from the
		annotated Intron/Exon structure.
		
		 1: Reading frame 1 (e.g. ATGxxxxxx). DEFAULT.
		 2: Reading frame 2 (e.g. xATGxxxxx).
		 3: Reading frame 3 (e.g. xxATGxxxx).
		 
		-1: Reading frame 1 on the minus strand.
		-2: Reading frame 2 on the minus strand.
		-3: Reading frame 3 on the minus strand.
		
		all: 	Try all reading frames. 
		     	This option also implies the -x option.
		     
		plus: 	All positive reading frames.
		     	This option also implies the -x option.
		      
		minus: 	All negative reading frames.
			This option also implies the -x option.
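
The six reading frames can be pictured with a short Python sketch. It is an illustration only and not the dna2pep implementation. The names reverse_complement, codons_in_frame and FRAME_SETS are hypothetical, and the sketch assumes that minus-strand frames are counted from the 5' end of the reverse complement.

```python
# Complement table covering the IUPAC degenerate codes as well.
COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")

# Frames covered by the "all", "plus" and "minus" keywords.
FRAME_SETS = {
    "all": [1, 2, 3, -1, -2, -3],
    "plus": [1, 2, 3],
    "minus": [-1, -2, -3],
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def codons_in_frame(seq, frame):
    """Split seq into complete codons for frame 1, 2, 3, -1, -2 or -3."""
    if frame not in FRAME_SETS["all"]:
        raise ValueError("frame must be one of 1, 2, 3, -1, -2, -3")
    seq = seq.upper()
    if frame < 0:
        # Assumption: minus frames are read on the reverse complement.
        seq = reverse_complement(seq)
    start = abs(frame) - 1
    # Trailing bases that do not fill a codon are dropped.
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]
```

For the sequence "ATGAAATAG", frame 1 yields ATG, AAA, TAG and frame 2 yields TGA, AAT.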
	
	-o mode, --orf mode
		Report longest ORF in the reading frame(s) specified with the 
		-r option.
		
		Mode governs which criteria are used to allow the opening of 
		an ORF. "Strict start codons" => codons _always_ coding for 
		methionine (e.g. ATG in the standard code), "Minor start codons" 
		=> codons only coding for methionine at the start position 
		(e.g. TTG in the standard genetic code). 
		
		Mode can be:
		------------
		strict:		Open an ORF at "strict start codons" only.
		any:		Open an ORF at any start codon.
		none:		Do not use start codons - look for the longest 
				fragment before a STOP codon.
		
		The DNA fragment used for encoding the ORF will be added to the 
		comment field (TAB format only).
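
The three modes can be sketched in Python as follows. This is an illustration under stated assumptions and not the dna2pep algorithm. It uses the start and stop codons of the standard code only, excludes the stop codon from the reported ORF, keeps the first ORF on ties, and accepts an ORF that runs to the end of the sequence without a stop codon. The names longest_orf, STRICT_STARTS and MINOR_STARTS are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}   # stop codons of the standard code
STRICT_STARTS = {"ATG"}         # always codes for methionine
MINOR_STARTS = {"TTG", "CTG"}   # methionine at the start position only


def longest_orf(codons, mode="strict"):
    """Return the codons of the longest ORF in one reading frame."""
    if mode == "strict":
        starts = STRICT_STARTS
    elif mode == "any":
        starts = STRICT_STARTS | MINOR_STARTS
    elif mode == "none":
        starts = None           # every fragment between stops counts
    else:
        raise ValueError("mode must be 'strict', 'any' or 'none'")

    best = []
    # current is the ORF being extended, or None when no ORF is open.
    current = [] if starts is None else None
    for codon in codons:
        if codon in STOPS:
            if current is not None and len(current) > len(best):
                best = current
            current = [] if starts is None else None
        elif current is not None:
            current.append(codon)
        elif codon in starts:
            current = [codon]
    # Assumption: an ORF still open at the end of the sequence counts.
    if current is not None and len(current) > len(best):
        best = current
    return best
```

For the codons CCC TTG AAA ATG GGG TAA, mode "strict" reports ATG GGG, mode "any" opens at TTG and reports TTG AAA ATG GGG, and mode "none" reports all five codons before the stop.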
		     	
        -a, --allinternal
                By default the very first codon in each sequence is assumed
                to be the initial codon on the transcript. This means certain
                non-methionine codons actually code for methionine at this 
                position. For example "TTG" in the standard genetic code (see
                above).
                
                Selecting this option treats all codons as internal codons.     
                
        -x, --readthroughstop
                Allow the translation to continue after a stop codon is reached.
                The stop codon will be marked as "*".
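
The effect of the -a and -x options can be pictured with the following Python sketch. It is not the dna2pep translation engine. It handles the standard code and non-degenerate input only, and it assumes that translation without -x ends silently at the first stop codon without emitting "*". The function name translate and its keyword arguments are hypothetical stand-ins for the two options.

```python
# The Standard Genetic Code (NCBI table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Start codons of the standard code (ATG plus the minor starts).
START_CODONS = {"ATG", "TTG", "CTG"}


def translate(seq, allinternal=False, readthroughstop=False):
    """Translate reading frame 1 of seq with the standard code."""
    seq = seq.upper().replace("U", "T")
    peptide = []
    for pos in range(0, len(seq) - 2, 3):
        codon = seq[pos:pos + 3]
        aa = CODE[codon]
        # Default: the very first codon is read as an initial codon.
        if pos == 0 and not allinternal and codon in START_CODONS:
            aa = "M"
        # Assumption: without read-through, stop silently at the stop codon.
        if aa == "*" and not readthroughstop:
            break
        peptide.append(aa)
    return "".join(peptide)
```

For "TTGAAATAGGGG" the sketch returns "MK" by default, "LK" with allinternal=True, and "MK*G" with readthroughstop=True.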
		
	-p, --plain, --ignoreannotation
		Ignore annotation for TAB files. If this option is selected,
		TAB files will be treated in the same way as FASTA files.

	-c, --comment		
		Preserve the comment field in TAB files. Normally the comment
		field is silently dropped, since it makes no sense for FASTA
		files.
		
	-C, --processcomment
		Works as the -c option described above, except that some intelligent
		parsing is done on the comment field: If a "/spliced_product"
		sub-field is found (from TAB files created by FeatureExtract / gb2tab),
		only the part of the comment field before the DNA specific information
		is kept in the comment field.

	-e, --exonstructure
		Default for TAB files. Annotate the underlying exon structure
		of the translated sequence in the following way. Positions that
		are fully or partially encoded within the first exon get the
		annotation character "1", positions in the second exon get the
		character "2" etc.
		
		The hexadecimal system is used, which means up to 15 exons can
		be uniquely annotated before the numbering wraps around to "0".
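
The hexadecimal exon numbering can be illustrated with a small Python sketch. It is not the dna2pep implementation. It takes the lengths of the coding exons in bases and assumes that a codon split across two exons is assigned to the exon holding its first base, and that lowercase hex digits are used. The function name exon_annotation is hypothetical.

```python
def exon_annotation(exon_lengths):
    """Return one hex digit per translated position naming its exon."""
    # Exon number (1-based) of every coding base, in order.
    base_exon = []
    for number, length in enumerate(exon_lengths, start=1):
        base_exon.extend([number] * length)

    annotation = []
    for i in range(0, len(base_exon) - 2, 3):
        # Assumption: a split codon belongs to the exon of its first base.
        exon = base_exon[i]
        # Exons 1..15 map to "1".."f"; exon 16 wraps around to "0".
        annotation.append(format(exon % 16, "x"))
    return "".join(annotation)
```

With exon lengths 4 and 5, the three translated positions are annotated "112".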
		
	-i, --intronphase
		Annotate where an intron interrupted the DNA sequence, and how
		the intron cut the reading frame.
		
		0 : phase-0 intron (in between the previous and current position).
		1 : phase-1 intron.
		2 : phase-2 intron.
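
Intron phase follows from the number of coding bases that precede each intron. The Python sketch below uses the common definition (phase 0 between two codons, phase 1 after the first base of a codon, phase 2 after the second) and is not taken from the dna2pep source. It assumes the exon lengths cover the coding region only, starting at the first base of the first codon. The function name intron_phases is hypothetical.

```python
def intron_phases(exon_lengths):
    """Return the phase (0, 1 or 2) of each intron between coding exons."""
    phases = []
    coding_bases = 0
    # There is one intron after every exon except the last.
    for length in exon_lengths[:-1]:
        coding_bases += length
        # Bases of the current codon already read when the intron starts.
        phases.append(coding_bases % 3)
    return phases
```

With exon lengths 4, 5 and 3, the first intron is phase 1 (4 bases read) and the second is phase 0 (9 bases read).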
		 
                
AUTHOR
	Rasmus Wernersson, raz@cbs.dtu.dk
	Feb-Mar 2006

FILES
	dna2pep.py, mod_translate.py, ncbi_genetic_codes.py

WEB PAGE
	http://www.cbs.dtu.dk/services/VirtualRibosome/
	
REFERENCE
	Rasmus Wernersson
	Virtual Ribosome - Comprehensive DNA translation tool.
	Submitted to Nucleic Acids Research, 2006	
    



GETTING HELP

Correspondence: