Output format

Description of the scores
Examples of standard output
Examples of short output


DESCRIPTION OF THE SCORES

The graphical output from SignalP (neural network) comprises three different scores, C, S and Y. Two additional scores are reported in the SignalP3-NN output, namely the S-mean and the D-score, but these are only reported as numerical values.

For each organism class in SignalP (eukaryote, Gram-negative and Gram-positive), two different neural networks are used: one for predicting the actual signal peptide and one for predicting the position of the signal peptidase I (SPase I) cleavage site. The S-score from the signal peptide prediction is reported for every amino acid position in the submitted sequence; high scores indicate that the corresponding amino acid is part of a signal peptide, and low scores indicate that it is part of the mature protein.

The C-score is the "cleavage site" score. A C-score is reported for each position in the submitted sequence and should only be significantly high at the cleavage site. The position numbering of the cleavage site often causes confusion. When a cleavage site position is referred to by a single number, the number indicates the first residue of the mature protein; a reported cleavage site between amino acids 26 and 27 therefore means that the mature protein starts at (and includes) position 27.
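The numbering convention can be made concrete with a small sketch. The helper name split_at_cleavage_site and the toy sequence are hypothetical and not part of SignalP; positions are 1-based, as in the SignalP output.

```python
def split_at_cleavage_site(sequence, first_mature_pos):
    """Split a protein sequence at a cleavage site.

    first_mature_pos is the 1-based position of the first residue of the
    mature protein, i.e. the single number used to name a cleavage site.
    """
    if not 1 < first_mature_pos <= len(sequence):
        raise ValueError("cleavage position must lie inside the sequence")
    # Convert the 1-based position to Python's 0-based slicing.
    signal_peptide = sequence[:first_mature_pos - 1]
    mature_protein = sequence[first_mature_pos - 1:]
    return signal_peptide, mature_protein


# Toy sequence (not a real protein): 26 'A' residues followed by 'DEFGH'.
toy = "A" * 26 + "DEFGH"
signal, mature = split_at_cleavage_site(toy, 27)
print(len(signal))  # 26: the signal peptide covers positions 1-26
print(mature)       # DEFGH: the mature protein starts at position 27
```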

The Y-score is derived from the C-score combined with the S-score and gives a better cleavage site prediction than the raw C-score alone, because several high C-score peaks can occur in one sequence while only one of them is the true cleavage site. Y-max is the maximum Y-score; the cleavage site is assigned at the position where the slope of the S-score is steep and a significant C-score is found.
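The idea of combining the C-score with the slope of the S-score can be illustrated with a minimal sketch. It assumes that the combined score is the geometric mean of the C-score and a smoothed drop in the S-score around each position; this page does not give the exact formula, so the window size d, the clamping of negative slopes to zero, the function name and the toy scores are all assumptions made for illustration.

```python
import math


def y_scores(c, s, d=3):
    """Combine C-scores with the local drop in S-score (illustrative only).

    c, s : per-position C- and S-scores (same length, index 0 = position 1)
    d    : hypothetical window size used to smooth the S-score slope
    """
    y = []
    for i in range(len(s)):
        before = s[max(0, i - d):i]  # up to d positions before position i
        after = s[i:i + d]           # position i and up to d-1 after it
        if not before:
            y.append(0.0)            # no slope can be estimated at the start
            continue
        slope = sum(before) / len(before) - sum(after) / len(after)
        # A rising S-score (negative drop) cannot support a cleavage site.
        y.append(math.sqrt(c[i] * max(slope, 0.0)))
    return y


# Toy data: S-score drops after the fifth residue; C-score has two peaks.
s = [0.9] * 5 + [0.1] * 5
c = [0.05, 0.05, 0.6, 0.05, 0.05, 0.7, 0.05, 0.05, 0.05, 0.05]
y = y_scores(c, s)
best_index = max(range(len(y)), key=y.__getitem__)
print(best_index + 1)  # 6: only the C-score peak at the S-score drop survives
```

In this toy case the C-score peak at position 3 is suppressed because the S-score is flat there, while the peak at position 6 coincides with the steep drop in the S-score.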

The S-mean is the average of the S-score over the predicted signal peptide, ranging from the N-terminal amino acid to the residue immediately before the position with the highest Y-score (positions 1-29 when Y-max is at position 30, as in Example 1 below). In SignalP version 2.0, the S-mean score was used as the criterion for discriminating between secretory and non-secretory proteins.
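A minimal sketch of the S-mean calculation follows. In the example output on this page, max. Y is at position 30 and mean S covers positions 1-29, so the sketch averages up to, but not including, the Y-max position. The function name and the toy S-scores are hypothetical.

```python
def s_mean(s_scores, y_max_pos):
    """Average S-score over positions 1 .. y_max_pos - 1 (1-based)."""
    signal_part = s_scores[:y_max_pos - 1]
    if not signal_part:
        raise ValueError("Y-max position must be at least 2")
    return sum(signal_part) / len(signal_part)


# Toy S-scores: four high values followed by two low ones.
toy_s = [0.9, 0.8, 1.0, 0.7, 0.1, 0.1]
mean_s = s_mean(toy_s, y_max_pos=5)
print(round(mean_s, 3))  # 0.85: average of the first four positions
```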

The D-score was introduced in SignalP version 3.0 and is a simple average of the S-mean and Y-max scores. It discriminates secretory from non-secretory proteins better than the S-mean score, which was used in SignalP versions 1 and 2.
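The averaging can be checked against the numbers in the examples on this page (TXN4_HUMAN: mean S 0.852, max. Y 0.690; BM2K_HUMAN: mean S 0.063, max. Y 0.034; D cutoff 0.43). The sketch below is only an illustration: the function names are hypothetical, and the use of a strict "greater than" comparison against the cutoff is an assumption.

```python
def d_score(s_mean, y_max):
    """Simple average of the S-mean and Y-max scores."""
    return (s_mean + y_max) / 2.0


def is_secretory(d, cutoff=0.43):
    """Compare a D-score with the cutoff (strict '>' is an assumption)."""
    return d > cutoff


d_txn4 = d_score(0.852, 0.690)   # values from Example 1
d_bm2k = d_score(0.063, 0.034)   # values from Example 2
print(round(d_txn4, 3))          # 0.771, as reported for TXN4_HUMAN
print(is_secretory(d_txn4))      # True
print(is_secretory(d_bm2k))      # False
```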

For non-secretory proteins, all the scores reported in the SignalP3-NN output should ideally be very low.

The hidden Markov model (HMM) calculates the probability that the submitted sequence contains a signal peptide. The eukaryotic HMM also reports the probability of a signal anchor (previously called an uncleaved signal peptide). In addition, the cleavage site is assigned a probability score, and scores are reported for the n-region, h-region and c-region of the signal peptide, if one is found.


EXAMPLES OF STANDARD OUTPUT

By default the server produces the following output for each input sequence:

Example 1: secretory protein

The example below shows the output for thioredoxin domain containing protein 4 precursor (endoplasmic reticulum protein ERp44), taken from the Swiss-Prot entry TXN4_HUMAN. The signal peptide prediction is consistent with the database annotation.

>TXN4_HUMAN

SignalP-NN result:


# data

>Sequence              length = 70
# Measure  Position  Value  Cutoff  signal peptide?
  max. C    30       0.565   0.32   YES
  max. Y    30       0.690   0.33   YES
  max. S    12       0.989   0.87   YES
  mean S     1-29    0.852   0.48   YES
       D     1-29    0.771   0.43   YES
# Most likely cleavage site between pos. 29 and 30: VTT-EI

SignalP-HMM result:

# data

>TXN4_HUMAN
Prediction: Signal peptide
Signal peptide probability: 0.984
Signal anchor probability: 0.015
Max cleavage site probability: 0.962 between pos. 29 and 30
# gnuplot script for making the plot(s)


Example 2: non-secretory protein

The example below shows the output for BMP-2 inducible protein kinase (EC 2.7.1.37), a nuclear protein taken from the Swiss-Prot entry BM2K_HUMAN. No signal peptide is predicted.

>BM2K_HUMAN

SignalP-NN result:


# data

>BM2K_HUMAN            length = 70
# Measure  Position  Value  Cutoff  signal peptide?
  max. C    20       0.035   0.32   NO
  max. Y    20       0.034   0.33   NO
  max. S    12       0.263   0.87   NO
  mean S     1-19    0.063   0.48   NO
       D     1-19    0.049   0.43   NO

SignalP-HMM result:

# data

>BM2K_HUMAN
Prediction: Non-secretory protein
Signal peptide probability: 0.157
Signal anchor probability: 0.023
Max cleavage site probability: 0.027 between pos. 28 and 29
# gnuplot script for making the plot(s)
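The numerical table in the SignalP-NN result has a regular layout, so it can be read with a short script. The following is a minimal sketch that assumes the table looks exactly like the examples above; the regular expression, the function name and the dictionary keys are hypothetical and not part of SignalP.

```python
import re

# One table row: measure, position (or range), value, cutoff, YES/NO.
ROW = re.compile(
    r"^\s*(max\. [CYS]|mean S|D)\s+(\d+(?:-\d+)?)\s+"
    r"(\d*\.\d+)\s+(\d*\.\d+)\s+(YES|NO)\s*$"
)


def parse_nn_table(text):
    """Return {measure: {position, value, cutoff, signal_peptide}}."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:  # header and comment lines start with '#' and are skipped
            measure, position, value, cutoff, flag = match.groups()
            rows[measure] = {
                "position": position,  # kept as text: "30" or "1-29"
                "value": float(value),
                "cutoff": float(cutoff),
                "signal_peptide": flag == "YES",
            }
    return rows


example = """\
# Measure  Position  Value  Cutoff  signal peptide?
  max. C    30       0.565   0.32   YES
  max. Y    30       0.690   0.33   YES
  max. S    12       0.989   0.87   YES
  mean S     1-29    0.852   0.48   YES
       D     1-29    0.771   0.43   YES
"""
table = parse_nn_table(example)
print(table["D"]["value"])          # 0.771
print(table["mean S"]["position"])  # 1-29
```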

EXAMPLE OF SHORT OUTPUT

When the short output format is selected, the prediction for each submitted sequence (in a multi-sequence FASTA file) is reported on a single line, one line per FASTA entry. A two-line header is included, describing the contents of the columns.
# SignalP-NN euk predictions                                   	      # SignalP-HMM euk predictions
# name       Cmax  pos ?  Ymax  pos ?  Smax  pos ?  Smean ?  D     ?  # name      !  Cmax  pos ?  Sprob ?
TXN4_HUMAN   0.565  30 Y  0.690  30 Y  0.989  12 Y  0.852 Y  0.771 Y  TXN4_HUMAN  S  0.962  30 Y  0.984 Y  
BM2K_HUMAN   0.035  20 N  0.034  20 N  0.263  12 N  0.063 N  0.049 N  BM2K_HUMAN  Q  0.027  29 N  0.157 N  
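Because every entry occupies one whitespace-separated line, the short format is convenient for scripting. The sketch below splits one data line into the neural network and HMM parts. It assumes the 21-column layout shown above and sequence names without whitespace; the function name and dictionary keys are hypothetical. The "!" column is stored as given (S for TXN4_HUMAN and Q for BM2K_HUMAN in the rows above, matching the "Signal peptide" and "Non-secretory protein" predictions in the standard output).

```python
def parse_short_line(line):
    """Parse one data line of the short output format into a dictionary."""
    t = line.split()
    if len(t) != 21:
        raise ValueError("expected 21 columns, got %d" % len(t))
    return {
        "name": t[0],
        "nn": {
            "Cmax": float(t[1]), "Cmax_pos": int(t[2]), "Cmax_flag": t[3],
            "Ymax": float(t[4]), "Ymax_pos": int(t[5]), "Ymax_flag": t[6],
            "Smax": float(t[7]), "Smax_pos": int(t[8]), "Smax_flag": t[9],
            "Smean": float(t[10]), "Smean_flag": t[11],
            "D": float(t[12]), "D_flag": t[13],
        },
        "hmm": {
            "name": t[14],
            "prediction": t[15],  # the '!' column, e.g. S or Q
            "Cmax": float(t[16]), "Cmax_pos": int(t[17]), "Cmax_flag": t[18],
            "Sprob": float(t[19]), "Sprob_flag": t[20],
        },
    }


line = ("TXN4_HUMAN   0.565  30 Y  0.690  30 Y  0.989  12 Y  0.852 Y  "
        "0.771 Y  TXN4_HUMAN  S  0.962  30 Y  0.984 Y  ")
record = parse_short_line(line)
print(record["nn"]["D"])            # 0.771
print(record["hmm"]["prediction"])  # S
```

Header lines start with "#" and should be skipped before calling the function.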



GETTING HELP

Scientific problems:        Technical problems: