DESCRIPTION OF THE SCORES
The graphical output from SignalP (neural network) comprises three different
scores, C, S and Y. Two additional scores are
reported in the SignalP3-NN output, namely the S-mean and the
D-score, but these are only reported as numerical values.
For each organism class in SignalP (Eukaryote, Gram-negative and
Gram-positive), two different neural networks are used, one for
predicting the actual signal peptide and one for predicting the
position of the signal peptidase I (SPase I) cleavage site. The
S-score for the signal peptide prediction is reported for
every single amino acid position in the submitted sequence, with
high scores indicating that the corresponding amino acid is part
of a signal peptide, and low scores indicating that the amino acid
is part of a mature protein.
The C-score is the "cleavage site" score. For each
position in the submitted sequence, a C-score is reported, which
should only be significantly high at the cleavage site. The position
numbering of the cleavage site is a frequent source of confusion.
When a cleavage site position is referred to by a single number,
the number indicates the first residue in the mature protein,
meaning that a reported cleavage site between amino acids 26 and 27
means that the mature protein starts at (and includes) amino acid 27.
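
To illustrate the numbering convention, the short Python sketch below
splits a sequence at a cleavage site given as a single 1-based number.
The function name and its arguments are hypothetical and are not part of
SignalP; the only thing taken from the text is that the number refers to
the first residue of the mature protein.

```python
def split_at_cleavage_site(sequence, mature_start):
    """Split a protein at a cleavage site given as a 1-based position.

    mature_start is the first residue of the mature protein, so a site
    reported "between 26 and 27" corresponds to mature_start == 27.
    (Hypothetical helper, for illustration only.)
    """
    if not 2 <= mature_start <= len(sequence):
        raise ValueError("mature_start must lie inside the sequence")
    # Residues 1 .. mature_start-1 form the signal peptide.
    signal_peptide = sequence[: mature_start - 1]
    # Residues mature_start .. end form the mature protein.
    mature_protein = sequence[mature_start - 1:]
    return signal_peptide, mature_protein
```

With mature_start set to 27, the first returned string has 26 residues
and the second one begins with residue 27 of the input.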
Y-max is a derivative of the C-score combined with the
S-score, resulting in a better cleavage site prediction than the
raw C-score alone. This is because multiple
high-peaking C-scores can be found in one sequence, where only one
is the true cleavage site. The cleavage site is assigned from the
Y-score, at the position where the slope of the S-score is steep and
a significant C-score is found.
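
The text does not give the formula behind the Y-score. One way to
combine the two signals, in the spirit of the description above, is the
geometric mean of the C-score and a smoothed drop in the S-score across
each position. The Python sketch below does this; the exact formula,
the window size, and the function name are assumptions made for
illustration and should not be read as the SignalP implementation.

```python
import math


def y_scores(c_scores, s_scores, window=3):
    """Combine C- and S-scores into a Y-like score per position.

    Assumed formula: Y[i] = sqrt(C[i] * drop[i]), where drop[i] is the
    mean S-score of the `window` residues before position i minus the
    mean S-score of the `window` residues starting at i. Negative drops
    are clipped to zero. Indices are 0-based here.
    """
    if len(c_scores) != len(s_scores):
        raise ValueError("c_scores and s_scores must have the same length")
    result = []
    for i in range(len(s_scores)):
        before = s_scores[max(0, i - window): i]
        after = s_scores[i: i + window]
        if not before:
            # No residues upstream, so no S-score slope can be measured.
            result.append(0.0)
            continue
        drop = sum(before) / len(before) - sum(after) / len(after)
        result.append(math.sqrt(c_scores[i] * max(drop, 0.0)))
    return result
```

In this sketch a high C-score inside the signal peptide, where the
S-score is flat, yields a Y value near zero, while a high C-score at the
point where the S-score falls yields a high Y value.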
The S-mean is the average of the S-score, ranging from the
N-terminal amino acid to the amino acid assigned with the highest
Y-max score. The S-mean score is thus calculated for the length of
the predicted signal peptide. In SignalP version 2.0, the S-mean
score was used as the criterion for discriminating between secretory
and non-secretory proteins.
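
As a minimal sketch of the S-mean calculation, the Python function below
averages the S-scores over the predicted signal peptide. It assumes that
the Y-max position is given as a 1-based number marking the first
residue of the mature protein, so the average covers the residues before
it; that boundary choice and the function name are assumptions.

```python
def s_mean(s_scores, y_max_position):
    """Average the S-score over the predicted signal peptide.

    y_max_position is 1-based and is assumed to be the first residue of
    the mature protein, so residues 1 .. y_max_position-1 are averaged.
    """
    if not 2 <= y_max_position <= len(s_scores):
        raise ValueError("y_max_position must lie inside the sequence")
    signal_part = s_scores[: y_max_position - 1]
    return sum(signal_part) / len(signal_part)
```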
The D-score was introduced in SignalP version 3.0 and is a
simple average of the S-mean and Y-max scores. It discriminates
secretory from non-secretory proteins better than the S-mean score,
which was used in SignalP versions 1 and 2.
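
The D-score calculation can be sketched in a few lines of Python. The
averaging follows the description above; the cutoff passed to the second
function is a hypothetical parameter supplied by the caller, since the
text does not state the thresholds SignalP applies.

```python
def d_score(s_mean_value, y_max_value):
    """Return the simple average of the S-mean and Y-max scores."""
    return (s_mean_value + y_max_value) / 2.0


def looks_secretory(s_mean_value, y_max_value, cutoff):
    """Compare the D-score with a caller-supplied cutoff.

    The cutoff is an assumption for illustration; no threshold value is
    taken from the SignalP documentation here.
    """
    return d_score(s_mean_value, y_max_value) >= cutoff
```

For example, an S-mean of 0.8 and a Y-max of 0.6 give a D-score of 0.7.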
For non-secretory proteins all the scores represented in the
SignalP3-NN output should ideally be very low.
The hidden Markov model (HMM) calculates the probability that the
submitted sequence contains a signal peptide. The
eukaryotic HMM also reports the probability of a signal
anchor, previously called an uncleaved signal peptide. Furthermore,
the cleavage site is assigned by a probability score together with
scores for the n-region, h-region, and c-region of the signal
peptide, if one is found.
Example 2: non-secretory protein
The example below shows the output for BMP-2 inducible protein kinase
(EC 188.8.131.52), a nuclear protein taken from the