Events News Research CBS CBS Publications Bioinformatics
Staff Contact About Internal CBS CBS Other

Output format

DESCRIPTION OF THE SCORES

The neural networks in SignalP produce three output scores for each position in the input sequence:

C-score (raw cleavage site score)
The output from the CS networks, which are trained to distinguish signal peptide cleavage sites from everything else.
Note the position numbering of the cleavage site: the C-score is trained to be high at the position immediately after the cleavage site (the first residue in the mature protein).
S-score (signal peptide score)
The output from the SP networks, which are trained to distinguish positions within signal peptides from positions in the mature part of the proteins and from proteins without signal peptides.
Y-score (combined cleavage site score)
A combination (geometric average) of the C-score and the slope of the S-score, resulting in a better cleavage site prediction than the raw C-score alone. This is due to the fact that multiple high-peaking C-scores can be found in one sequence, where only one is the true cleavage site. The Y-score distinguishes between C-score peaks by choosing the one where the slope of the S-score is steep.
The graphical output from SignalP (see below) shows the three different scores, C, S and Y, for each position in the sequence.

In the summary below the plot, the maximal values of the three scores are reported. In addition, the following two scores are shown:

mean S
The average S-score of the possible signal peptide (from position 1 to the position immediately before the maximal Y-score).
D-score (discrimination score)
A weighted average of the mean S and the max. Y scores. This is the score that is used to discriminate signal peptides from non-signal peptides.

For non-secretory proteins all the scores represented in the SignalP output should ideally be very low (close to the negative target value of 0.1).

EXAMPLE OUTPUT

By default the server produces the following output for each input sequence:

Example: secretory protein — standard output format

The example below shows the output for thioredoxin domain containing protein 4 precursor (endoplasmic reticulum protein ERp44), taken from the Swiss-Prot entry ERP44_HUMAN. The signal peptide prediction is consistent with the database annotation.
# SignalP-4.1 euk predictions
>sp_Q9BS26_ERP44_HUMAN  Endoplasmic reticulum resident protein 44 OS_Homo sapiens GN_ERP44 PE_1 SV_1


# Measure  Position  Value    Cutoff   signal peptide?
  max. C    30       0.427
  max. Y    30       0.586
  max. S     9       0.950
  mean S     1-29    0.821
       D     1-29    0.713   0.450   YES
Name=sp_Q9BS26_ERP44_HUMAN	SP='YES' Cleavage site between pos. 29 and 30: VTT-EI D=0.713 D-cutoff=0.450 Networks=SignalP-noTM
# data
# gnuplot script

Signal peptides: 1
# processed fasta entries
# gff file of processed entries
Below the summary for each sequence, two files are provided via links: "data" and "gnuplot script". If you have the free graphics program gnuplot on your computer, you can use these two files to customize your plot.

Below the output for all the sequences, two other files are provided via links, if at least one signal peptide has been predicted. These are "processed fasta entries", a FASTA sequence file containing the sequences of protein that had predicted signal peptides, with the signal peptide removed; and "gff file of processed entries", a file showing the siganl peptides feature of those proteins that had predicted signal peptides in GFF format.

Example: secretory protein — short output format

# SignalP-4.0 euk predictions
# name                     Cmax  pos  Ymax  pos  Smax  pos  Smean   D     ?  Dmaxcut    Networks-used
ERP44_HUMAN                0.427  30  0.586  30  0.950   9  0.821   0.713 Y  0.450      SignalP-noTM



GETTING HELP

Scientific problems:        Technical problems: