EasyGene 1.0 Output Format
DESCRIPTION
For each input sequence the server lists all coding sequences (CDSs) with R-values less than a given cutoff. The default cutoff is R=2.
The output is written in GFF format, where the columns are organized as follows:
1. Sequence identifier
2. EasyGene source identifier
3. Feature identifier with one of two values:
a) CDS indicates a coding sequence (i.e. a gene) with the highest scoring start codon.
b) CDSsub indicates a coding sequence with an alternative start codon.
4. If a CDS is on the direct strand, this column is the position of the first base of the start codon. If a CDS is on the reverse strand, this column is the position of the last base of the stop codon. The positions are measured in direct strand coordinates in both cases.
5. If a CDS is on the direct strand, this column is the position of the last base of the stop codon. If a CDS is on the reverse strand, this column is the position of the first base of the start codon. The positions are measured in direct strand coordinates in both cases.
6. R-value associated with a CDS. The R-value is the expected number of genes one would find per megabase of random DNA sequence with a standard score greater than that of the CDS. Hence, the SMALLER the R-value, the MORE likely it is to be a real gene. The default cutoff is R=2, but the user is free to set a higher cutoff if less confident predictions are also wanted.
7. Strand of CDS.
8. Frame of CDS. Since columns 4 and 5 always indicate in-frame positions, this column is always zero and may be ignored. It is included in order to comply with the GFF format.
9. The predicted start codon (written with a leading # in the output, e.g. #ATG).
10. The (linearized) log-odds score of the CDS.
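The column layout above can be read with a few lines of Python. The following is only a minimal sketch, not part of EasyGene itself: the names Prediction, parse_line and codon_positions are hypothetical, and it assumes that the columns are separated by whitespace, that the sequence identifier contains no whitespace, and that the strand column uses + for the direct strand and - for the reverse strand, as in GFF.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    """One EasyGene output line (hypothetical container, one field per column)."""
    seq_id: str       # column 1: sequence identifier
    source: str       # column 2: source identifier
    feature: str      # column 3: "CDS" or "CDSsub"
    left: int         # column 4: lower coordinate on the direct strand
    right: int        # column 5: higher coordinate on the direct strand
    r_value: float    # column 6: R-value
    strand: str       # column 7: "+" or "-" (assumed GFF convention)
    frame: int        # column 8: always zero
    start_codon: str  # column 9: start codon without the leading '#'
    log_odds: float   # column 10: linearized log-odds score


def parse_line(line: str) -> Prediction:
    """Split one prediction line on whitespace and convert each column."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 columns, got {len(fields)}: {line!r}")
    return Prediction(
        fields[0], fields[1], fields[2],
        int(fields[3]), int(fields[4]), float(fields[5]),
        fields[6], int(fields[7]),
        fields[8].lstrip("#"), float(fields[9]),
    )


def codon_positions(p: Prediction) -> tuple:
    """Return (first base of start codon, last base of stop codon).

    Both values are in direct strand coordinates. On the reverse strand
    columns 4 and 5 swap roles, as described for columns 4 and 5 above.
    """
    if p.strand == "+":
        return p.left, p.right
    return p.right, p.left
```

For a direct strand CDS the two positions are simply columns 4 and 5; for a reverse strand CDS the start codon position is the larger of the two numbers.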
EXAMPLE OUTPUT
############## EasyGene predictions ##############
gb|AE000783|AE000783 EasyGene CDS 62 124 4.82e+03 + 0 #TTG -14.4
gb|AE000783|AE000783 EasyGene CDS 574 651 7.43e+03 + 0 #TTG -18.9
gb|AE000783|AE000783 EasyGene CDS 114 677 0.000608 + 0 #ATG -7.13
gb|AE000783|AE000783 EasyGene CDSsub 105 677 0.000547 + 0 #ATG -7.95
gb|AE000783|AE000783 EasyGene CDS 787 864 6.59e+03 + 0 #TTG -17.9
gb|AE000783|AE000783 EasyGene CDS 969 1028 1.19e+04 + 0 #ATG -26.9
gb|AE000783|AE000783 EasyGene CDS 1010 1084 1.02e+04 + 0 #ATG -24
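As an illustration of how the R-value column can be used, the sketch below keeps only the predictions in the example above whose R-value is below a chosen cutoff. It is not part of EasyGene: the function name filter_by_r_value is hypothetical, and the code assumes whitespace-separated columns and that lines starting with # (such as the banner line) carry no predictions.

```python
def filter_by_r_value(lines, cutoff=2.0):
    """Return the split columns of every prediction with R-value below cutoff."""
    kept = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and the banner line, which starts with '#'.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if float(fields[5]) < cutoff:  # column 6 holds the R-value
            kept.append(fields)
    return kept


example = """\
############## EasyGene predictions ##############
gb|AE000783|AE000783 EasyGene CDS 62 124 4.82e+03 + 0 #TTG -14.4
gb|AE000783|AE000783 EasyGene CDS 574 651 7.43e+03 + 0 #TTG -18.9
gb|AE000783|AE000783 EasyGene CDS 114 677 0.000608 + 0 #ATG -7.13
gb|AE000783|AE000783 EasyGene CDSsub 105 677 0.000547 + 0 #ATG -7.95
gb|AE000783|AE000783 EasyGene CDS 787 864 6.59e+03 + 0 #TTG -17.9
gb|AE000783|AE000783 EasyGene CDS 969 1028 1.19e+04 + 0 #ATG -26.9
gb|AE000783|AE000783 EasyGene CDS 1010 1084 1.02e+04 + 0 #ATG -24
"""

# Print feature, coordinates and R-value of the predictions passing R < 2.
for fields in filter_by_r_value(example.splitlines()):
    print(fields[2], fields[3], fields[4], fields[5])
# CDS 114 677 0.000608
# CDSsub 105 677 0.000547
```

With a cutoff of 2, only the CDS at 114..677 and its alternative start at 105 remain; the other lines have R-values in the thousands.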
GETTING HELP
Scientific problems:
Technical problems: