
EasyGene 1.0 Output Format

DESCRIPTION

For each input sequence the server lists all coding sequences (CDSs) with R-values less than a given cutoff. The default cutoff is R=2.

The output is written in GFF format, where the columns are organized as follows:

1. Sequence identifier

2. EasyGene source identifier

3. Feature identifier, which takes one of two values: a) CDS indicates a coding sequence (i.e. a gene) with the highest-scoring start codon. b) CDSsub indicates a coding sequence with an alternative start codon.

4. If the CDS is on the direct strand, this column is the position of the first base of the start codon. If the CDS is on the reverse strand, this column is the position of the last base of the stop codon. Positions are measured in direct-strand coordinates in both cases.

5. If the CDS is on the direct strand, this column is the position of the last base of the stop codon. If the CDS is on the reverse strand, this column is the position of the first base of the start codon. Positions are measured in direct-strand coordinates in both cases.

6. R-value associated with the CDS. The R-value is the expected number of genes per megabase of random DNA sequence that would have a standard score greater than that of the CDS. Hence, the SMALLER the R-value, the MORE likely the CDS is to be a real gene. The default cutoff is R=2, but the user is free to set a higher cutoff if less confident predictions are wanted.

7. Strand of CDS.

8. Frame of CDS. Because columns 4 and 5 always indicate in-frame positions, this column is always zero and may be ignored. It is included only to comply with the GFF format.

9. The predicted start codon, written with a leading "#" (e.g. #ATG).

10. The (linearized) log-odds score of a CDS
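
The column layout above can be read with a few lines of Python. The following is a minimal sketch and not part of EasyGene itself: the names EasyGeneRecord, parse_line and start_position are hypothetical, and it assumes that columns are separated by whitespace, that the reverse strand is marked "-" as in standard GFF, and that the start codon carries a leading "#" as in the example output.

```python
from dataclasses import dataclass


@dataclass
class EasyGeneRecord:
    """One prediction line; field order follows columns 1-10 (hypothetical names)."""
    seq_id: str        # 1. sequence identifier
    source: str        # 2. EasyGene source identifier
    feature: str       # 3. "CDS" or "CDSsub"
    left: int          # 4. lower coordinate on the direct strand
    right: int         # 5. higher coordinate on the direct strand
    r_value: float     # 6. R-value (smaller means more likely a real gene)
    strand: str        # 7. "+" for direct; "-" assumed for reverse
    frame: int         # 8. always 0
    start_codon: str   # 9. start codon without the leading "#"
    log_odds: float    # 10. (linearized) log-odds score


def parse_line(line: str) -> EasyGeneRecord:
    """Split one whitespace-separated prediction line into its ten columns."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 columns, got {len(fields)}: {line!r}")
    return EasyGeneRecord(
        seq_id=fields[0],
        source=fields[1],
        feature=fields[2],
        left=int(fields[3]),
        right=int(fields[4]),
        r_value=float(fields[5]),
        strand=fields[6],
        frame=int(fields[7]),
        start_codon=fields[8].lstrip("#"),
        log_odds=float(fields[9]),
    )


def start_position(rec: EasyGeneRecord) -> int:
    """First base of the start codon, in direct-strand coordinates.

    On the direct strand this is column 4; on the reverse strand it is column 5.
    """
    return rec.left if rec.strand == "+" else rec.right
```

Keeping columns 4 and 5 as plain left/right coordinates and deriving the start position from the strand mirrors the way the two columns swap meaning between strands.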




EXAMPLE OUTPUT


############## EasyGene predictions ##############


gb|AE000783|AE000783  EasyGene  CDS    62     124    4.82e+03  +  0  #TTG   -14.4
gb|AE000783|AE000783  EasyGene  CDS    574    651    7.43e+03  +  0  #TTG   -18.9
gb|AE000783|AE000783  EasyGene  CDS    114    677    0.000608  +  0  #ATG   -7.13
gb|AE000783|AE000783  EasyGene  CDSsub 105    677    0.000547  +  0  #ATG   -7.95
gb|AE000783|AE000783  EasyGene  CDS    787    864    6.59e+03  +  0  #TTG   -17.9
gb|AE000783|AE000783  EasyGene  CDS    969    1028   1.19e+04  +  0  #ATG   -26.9
gb|AE000783|AE000783  EasyGene  CDS    1010   1084   1.02e+04  +  0  #ATG   -24
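
As an illustration of how the R-value in column 6 can be used downstream, the sketch below filters the example output by a cutoff. It is a hedged example only: the function name filter_by_r is hypothetical, and it assumes that banner lines start with "#" and that data lines are whitespace-separated.

```python
EXAMPLE = """\
############## EasyGene predictions ##############

gb|AE000783|AE000783  EasyGene  CDS    62     124    4.82e+03  +  0  #TTG   -14.4
gb|AE000783|AE000783  EasyGene  CDS    574    651    7.43e+03  +  0  #TTG   -18.9
gb|AE000783|AE000783  EasyGene  CDS    114    677    0.000608  +  0  #ATG   -7.13
gb|AE000783|AE000783  EasyGene  CDSsub 105    677    0.000547  +  0  #ATG   -7.95
gb|AE000783|AE000783  EasyGene  CDS    787    864    6.59e+03  +  0  #TTG   -17.9
gb|AE000783|AE000783  EasyGene  CDS    969    1028   1.19e+04  +  0  #ATG   -26.9
gb|AE000783|AE000783  EasyGene  CDS    1010   1084   1.02e+04  +  0  #ATG   -24
"""


def filter_by_r(text, cutoff=2.0):
    """Return (feature, col4, col5, R) for every line with R below the cutoff."""
    kept = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "#####" banner (assumed to start with "#").
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        r_value = float(fields[5])  # column 6 is the R-value
        if r_value < cutoff:
            kept.append((fields[2], int(fields[3]), int(fields[4]), r_value))
    return kept


for feature, left, right, r_value in filter_by_r(EXAMPLE):
    print(feature, left, right, r_value)
```

With a cutoff of 2, only the CDS at 114-677 and its CDSsub alternative at 105-677 remain from the lines shown.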



