Output format



DESCRIPTION

The output conforms to the GFF format. For each input sequence the server prints a list of predicted genes, one per line. The columns are:
  • seqname:  input sequence name;
  • model:  organism model code (also in plain text in the table head);
  • feature:  predicted feature, 'CDS' or 'CDSsub' (alternative translation start);
  • start and end:  positions in the sequence;
  • score:  R-value, indicating how likely the fragment is to be just a non-coding open reading frame rather than a real gene (lower values mean more confident predictions);
  • strand:  '+' or '-';
  • startc:  predicted start codon;
  • odds:  log odds score.
Only predictions with an R-value below the selected cutoff (default: 2) are reported.
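
For downstream processing, the columns above can be read into simple records. The following is a minimal sketch in Python, not part of EasyGene itself: the names Prediction, parse_easygene and max_r are hypothetical. It assumes that comment lines start with '#', that fields may be separated by tabs or by runs of spaces, and that the column labelled '?' in the table head can be carried along unchanged as text. The start codon is kept exactly as printed, including its leading '#'.

```python
from typing import Iterable, Iterator, NamedTuple


class Prediction(NamedTuple):
    """One predicted gene line (hypothetical record type for this sketch)."""
    seqname: str
    model: str
    feature: str   # 'CDS' or 'CDSsub'
    start: int
    end: int
    score: float   # R-value
    strand: str    # '+' or '-'
    col8: str      # column labelled '?' in the table head, kept as text
    startc: str    # start codon as printed, e.g. '#ATG'
    odds: float    # log odds score


def parse_easygene(lines: Iterable[str], max_r: float = 2.0) -> Iterator[Prediction]:
    """Yield predictions whose R-value is strictly below max_r."""
    for line in lines:
        line = line.strip()
        # Skip blank lines and header/comment lines ('#' or '##').
        if not line or line.startswith("#"):
            continue
        # Split on any whitespace: assumed here because separators
        # may be tabs, repeated tabs, or spaces.
        fields = line.split()
        if len(fields) != 10:
            continue  # not a prediction line in the expected layout
        pred = Prediction(
            seqname=fields[0],
            model=fields[1],
            feature=fields[2],
            start=int(fields[3]),
            end=int(fields[4]),
            score=float(fields[5]),
            strand=fields[6],
            col8=fields[7],
            startc=fields[8],
            odds=float(fields[9]),
        )
        if pred.score < max_r:
            yield pred


# Usage: a few lines in the layout described above.
example = [
    "##gff-version 2",
    "# model:  BS03 Bacillus subtilis",
    "AB010576\tBS03\tCDS\t67\t324\t0.0271875\t+\t0\t#ATG\t20.1861",
    "AB010576\tBS03\tCDSsub  55\t324\t0.031955\t+\t0\t#ATG\t20.1731",
    "AB010576\tBS03\tCDS\t300\t668\t1.43511\t\t-\t0\t#ATG\t10.6215",
]
for p in parse_easygene(example, max_r=0.05):
    print(p.feature, p.start, p.end, p.strand, p.score)
```

With the stricter cutoff of 0.05 used here, the minus-strand prediction (R-value 1.43511) is dropped, while the default cutoff of 2 would keep it. Splitting on whitespace rather than on tabs alone is a deliberate choice in this sketch, so that lines with irregular separators are still read as ten fields.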

The example below shows the EasyGene 1.2 output for the sequence from the GenBank entry AB010576, which contains the Bacillus subtilis comX, comQ and degQ genes. All three genes are predicted as annotated in the database (shown in green) with high confidence, although an alternative translation start is preferred for comQ (shown in orange). Two additional genes that are not annotated in the GenBank entry are also predicted.


EXAMPLE OUTPUT


##gff-version 2
##source-version easygene-1.2b
##date 2007-08-15
##Type DNA
# model:  BS03 Bacillus subtilis
# seqname	model	feature	start	end	  score	       +/-	?	startc	odds
# ---------------------------------------------------------------------------------------------
AB010576	BS03	CDS	67	324	0.0271875	+	0	#ATG	20.1861
AB010576	BS03	CDSsub  55	324	0.031955	+	0	#ATG	20.1731
AB010576	BS03	CDS	1129	1269	0.0190622	+	0	#ATG	15.7102
AB010576	BS03	CDS	1370	2314	2.13273e-12	+	0	#ATG	74.7815
AB010576	BS03	CDSsub  1454	2314	1.92405e-12	+	0	#ATG	74.6356
AB010576	BS03	CDS	2327	2491	0.0167943	+	0	#ATG	17.2951
AB010576	BS03	CDS	300	668	1.43511		-	0	#ATG	10.6215
# ---------------------------------------------------------------------------------------------



