INTRODUCTION

This Web Service implements SignalP v. 3.1. It predicts the presence and
location of signal peptide cleavage sites in amino acid sequences from
different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes,
and eukaryotes. The method incorporates a prediction of cleavage sites and
a signal peptide/non-signal peptide prediction based on a combination of
several artificial neural networks and hidden Markov models. The method is
described in detail in the following article:

   Improved prediction of signal peptides: SignalP 3.0.
   J D Bendtsen, H Nielsen, G v Heijne and S Brunak.
   J. Mol. Biol., 340:783-795, 2004.

The difference between v. 3.0 and this version is only technical; the
predictions are the same.

Alongside this Web Service the SignalP method is also implemented as
a traditional click-and-paste WWW server at:

   http://www.cbs.dtu.dk/services/SignalP/

The traditional server offers more detailed output (graphics), extended
functionality and comprehensive documentation. It is suitable for close
investigation of a few proteins; this service is recommended for high
throughput projects.

SignalP is also available as a stand-alone software package to install
and run at the user's site, with the same functionality. For academic
users there is a download page at:

   http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?signalp

Other users are requested to write to software@cbs.dtu.dk for details.


WEB SERVICE OPERATION

This Web Service is fully asynchronous; the usage is split into the
following three operations:

1. runService

   Input:  The following parameters and data:

           * 'organism' - organism type of the input sequences (mandatory)
                "euk"     eukaryotes
                "gram-"   Gram-negative prokaryotes
                "gram+"   Gram-positive prokaryotes

           * 'method' - prediction method (optional)
                "nn"      neural network only
                "hmm"     hidden Markov models only
                "nn+hmm"  both methods (default)

           * 'thnn' - threshold for yes/no decision by neural nets (optional)
                The threshold setting affects the 'comment' field in the
                output (see below): if the neural net score is higher than
                the selected threshold the comment will be "Y", else "N".
                The default thresholds are 0.43 for "euk", 0.45 for "gram+"
                and 0.44 for "gram-"; the defaults have been shown to give
                the highest correlation coefficient on test data.
                Note: 'thnn' does not affect the signalp-hmm prediction
                output.

           * 'sequencedata' [containing multiple 'sequence' elements]
              * 'sequence'
                 * 'id'       Unique identifier for the sequence
                 * 'comment'  Optional comment
                 * 'seq'      Protein sequences, with unique identifiers
                              (mandatory)

           The sequences must be written using the one letter amino acid
           code: `acdefghiklmnpqrstvwy' or `ACDEFGHIKLMNPQRSTVWY'. Other
           letters will be converted to `X' and treated as unknown amino
           acids. Other symbols, such as whitespace and numbers, will be
           ignored. All the input sequences are truncated to 70 aa from
           the N-terminal. Currently, at most 2,000 sequences are allowed
           per submission.

   Output: Unique job identifier

2. pollQueue

   Input:  Unique job identifier

   Output: 'jobstatus' - the status of the job
           Possible values are QUEUED, ACTIVE, FINISHED, WAITING,
           REJECTED, UNKNOWN JOBID or QUEUE DOWN

3. fetchResult

   Input:  Unique job identifier of a FINISHED job

   Output: * 'annsource'
                'method'  : SignalP (options ...)
                'version' : 3.1 ws0

           * 'ann' (array of annotations - one element per input sequence)
                'sequence' (standard sequence object)
                   'id'      : Sequence identifier
                   'comment' : Sequence comment
                   'seq'     : Sequence
                'annrecords' (array of predicted features for this sequence)
                   'annrecord' (annotation record)
                      'feature' : either 'signal-nn' or 'signal-hmm'
                      'range'
                         'begin' : 1
                         'end'   : End position of signal
                      'score'
                         'key'   : Either nn_score or hmm_score
                         'value' : Prediction score:
                                   For "signalp-3.1-nn" the D score, for
                                   "signalp-3.1-hmm" the signal peptide
                                   branch probability (see the article)
                      'comment' : Answer:
                                  For "signalp-3.1-nn" the answer is "Y"
                                  (yes) or "N" (no) depending on the
                                  selected threshold. For "signalp-3.1-hmm"
                                  the answer is "S" for signal peptide,
                                  "A" for signal anchor ("euk" only) and
                                  "Q" for none of the above.


KNOWN BUGS

2007-01-30  Error handling: some error messages may be non-informative;
            fix in progress.

2007-01-29  The server side may time out when processing large
            submissions; temporary fix: submit no more than 2,000
            sequences at a time; permanent fix in progress.


CONTACT

Questions concerning the scientific aspects of the SignalP method should
go to Henrik Nielsen, hnielsen@cbs.dtu.dk; technical questions concerning
the Web Service should go to Peter Fischer Hallin, pfh@cbs.dtu.dk or
Kristoffer Rapacki, rapacki@cbs.dtu.dk.
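

EXAMPLE (CLIENT-SIDE SKETCH)

The input rules and the asynchronous runService / pollQueue / fetchResult
cycle described under WEB SERVICE OPERATION can be illustrated with a small
client-side sketch. It is not part of the service and makes several
assumptions: the function names (normalize_sequence, nn_comment, batches,
wait_for_result) are hypothetical, 'poll_queue' and 'fetch_result' are
placeholder callables standing in for whatever client stubs the user
generates from the service description, the 5 second polling interval is
arbitrary, "any other letter" is taken to mean anything str.isalpha()
accepts, and WAITING is assumed to be a non-final status like QUEUED and
ACTIVE.

```python
import time
from typing import Any, Callable, Iterator, Optional, Sequence

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MAX_LENGTH = 70            # residues kept from the N-terminal
MAX_PER_SUBMISSION = 2000  # documented limit per runService call

# Default neural net thresholds quoted in the 'thnn' description.
DEFAULT_THNN = {"euk": 0.43, "gram+": 0.45, "gram-": 0.44}

# Assumption: these statuses mean "not finished yet, poll again".
PENDING = {"QUEUED", "ACTIVE", "WAITING"}


def normalize_sequence(raw: str) -> str:
    """Mimic the documented handling of one input sequence."""
    kept = []
    for ch in raw:
        if ch.upper() in AMINO_ACIDS:
            kept.append(ch)    # valid one letter code, case preserved
        elif ch.isalpha():
            kept.append("X")   # other letters become unknown residues
        # whitespace, digits and other symbols are ignored
    return "".join(kept)[:MAX_LENGTH]


def nn_comment(score: float, organism: str,
               thnn: Optional[float] = None) -> str:
    """Return "Y" if the score is higher than the threshold, else "N"."""
    threshold = DEFAULT_THNN[organism] if thnn is None else thnn
    return "Y" if score > threshold else "N"


def batches(items: Sequence[Any],
            size: int = MAX_PER_SUBMISSION) -> Iterator[Sequence[Any]]:
    """Split a sequence list into chunks that respect the limit."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def wait_for_result(jobid: str,
                    poll_queue: Callable[[str], str],
                    fetch_result: Callable[[str], Any],
                    interval: float = 5.0,
                    max_polls: int = 120) -> Any:
    """Poll until the job is FINISHED, then fetch its result.

    poll_queue and fetch_result are hypothetical wrappers around the
    pollQueue and fetchResult operations.
    """
    for _ in range(max_polls):
        status = poll_queue(jobid)
        if status == "FINISHED":
            return fetch_result(jobid)
        if status not in PENDING:
            # REJECTED, UNKNOWN JOBID, QUEUE DOWN or anything unexpected
            raise RuntimeError(f"job {jobid}: status {status!r}")
        time.sleep(interval)
    raise TimeoutError(f"job {jobid} not finished after {max_polls} polls")
```

A caller would typically clean each sequence with normalize_sequence, split
the list with batches, submit every chunk through runService, and pass each
returned job identifier to wait_for_result. Cleaning on the client side is
optional, since the service applies the same rules itself; it mainly helps
to see in advance what will actually be analysed.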