Mail Prediction Server


The NetPlantGene Mail server is a service producing neural network predictions of splice sites in Arabidopsis thaliana DNA.

The service has been enhanced and is now supported by the NetGene2 splice site prediction program. Both the mailserver and this WWW service use NetGene2 to make predictions of splice sites in plant genes.

INSTRUCTIONS

To use the NetPlantGene mail-server for prediction on nucleotide sequences, prepare a message addressed to the server at NetPlantGene@cbs.dtu.dk
The submitted sequences are kept confidential and will be erased immediately after processing.

  1. A number of keywords can be put in the message to control the processing of the sequences. Each keyword should be put on a line of its own, before the sequences in the mail message.

    • help
      NetGene2 returns help on using the mailserver (similar to this information).

    • score
      Return predictions in numerical form. This is useful for detailed analysis of a sequence. For each sequence submitted to the server, the mail returned contains a line starting with the symbol `>' followed by the name and the length of the sequence. This is followed by twelve columns, with the information given by column number (see output format below).

    • nocomplement
      By default NetGene2 makes predictions on the complementary strand as well as the direct strand (the submitted strand). Include the keyword 'nocomplement' if predictions are wanted for the direct strand only.

    • nodescription
      A description of the output format is appended to the result files returned from NetGene2. If this is not wanted, include the keyword 'nodescription'.

    • nopostscript
      A figure showing details of the neural network prediction is mailed by default. If this is not wanted, include the keyword 'nopostscript' in the mail.

  2. Enter your sequences in the message. Each sequence must be preceded by a line starting with > followed by a name or identifier of the sequence. There must be at least one character on each line of each sequence.

  3. The sequences must be submitted using the one-letter abbreviations for the nucleotides: `acgtACGT', and should be more than 200 (preferably more than 250) and less than 100000 nucleotides long. The mailserver can handle up to 1500 sequences per email.

  4. Any other characters will be accepted and treated as "don't care" positions when making the prediction.

  5. Example: when entering sequences, the mail should look like this:

    >sequence1 
    GAAATGCTCAAGTTGTTTAGTCATTTGTAATAACAGTTTTTTTTTTAAAGATTGTTTCTCAAATA
    TCTTGAAATGATGTAGAAATGCTCAAGTTGTTTAGTCATTTGTAATAACAGTTTTTTTTTTAAAG
    ATTGTTTCTCAAATATCTTGAAATGATGTAGAAATGCTCAAGTTGTTTAGTCATTTGTAATTCAA
    TGCAGGCGACAGACAAGCCGGTGGCAGTCGGTTTTGGAATATCAAAGCCGGAGCATGTGAAACAG
    GTCCTAAACCTTACAAAGCTTCTCTTATGTTGTCTTCATAGTTACTTAATAACAGTTTTTTTTTT
    AAAGATTGTTTCTCAAATATCTTGAAATGATGTAGAAATGCTCAAGTTGTTTAGTCATTTGTAAT
    AACAGTTTTTTTTTTAAAGATTGTTTCTCAAATATCTTGAAATGATGTAGAAATGCTCAAGTTGT
    TTAGT
    >sequence2
    GAAATGCTCAAGTTGTTTAGTCATTTGTAATAACAGTTTTTTTTTTAAAGATTGTTTCTCAAATA
    TCTTGAAATGATGTAGAAATGCTCAAGTTGTTTAGTCATTTGTAATAACAGTTTTTTTTTTAAAG
    ATTGTTTCTCAAATATCTTGAAATGATGTAGAAATGCTCAAGTTGTTTAGTCATTTGTAATTCAA
    TGCAGGCGACAGACAAGCCGGTGGCAGTCGGTTTTGGAATATCAAAGCCGGAGCATGTGAAACAG
    GTCCTAAACCTTACAAAGCTTCTCTTATGTTGTCTTCATAGTTACTTAATAACAGTTTTTTTTTT
    AAAGATTGTTTCTCAAATATCTTGAAATGATGTAGAAATGCTCAAGTTGTTTAGTCATTTGTAAT
    AACAGTTTTTTTTTTAAAGATTGTTTCTCAAATATCTTGAAATGATGTAGAAATGCTCAAGTTGT
    TTAGT
    
  6. Putting a single '.' alone on a line (with nothing else) forces the end of the data, which makes it possible to keep such things as signatures out of the sequences.

  7. The server will email you the results, including scores if you have requested them with the keyword 'score'. The server will not give graphics output for extremely long sequences.
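
The rules above can be collected in a short script. The following is only a minimal sketch in Python, not part of the service itself: the function names build_body and build_message, the wrap width of 65 characters, and the sender address are assumptions made for illustration. It places the keywords first, one per line, writes each sequence after its > line, checks the length limits given above, and ends the data with a single '.'. No Subject header is set, since these instructions do not say whether the server requires one.

```python
from email.message import EmailMessage

SERVER_ADDRESS = "NetPlantGene@cbs.dtu.dk"  # address given in the instructions
KNOWN_KEYWORDS = {"help", "score", "nocomplement", "nodescription", "nopostscript"}
MIN_LENGTH = 200       # sequences should be longer than this
MAX_LENGTH = 100000    # and shorter than this
MAX_SEQUENCES = 1500   # upper limit per email
LINE_WIDTH = 65        # assumed wrap width; any non-empty line length would do


def build_body(sequences, keywords=()):
    """Return the mail text: keywords, then sequences, then a '.' line.

    `sequences` maps a sequence name to its nucleotide string.
    """
    unknown = set(keywords) - KNOWN_KEYWORDS
    if unknown:
        raise ValueError(f"unknown keywords: {sorted(unknown)}")
    if len(sequences) > MAX_SEQUENCES:
        raise ValueError("too many sequences for one email")

    lines = list(keywords)  # each keyword on its own line, before the sequences
    for name, seq in sequences.items():
        if not MIN_LENGTH < len(seq) < MAX_LENGTH:
            raise ValueError(f"{name}: length {len(seq)} is outside the accepted range")
        lines.append(f">{name}")
        # Wrap the sequence so that every sequence line holds at least one character.
        lines.extend(seq[i:i + LINE_WIDTH] for i in range(0, len(seq), LINE_WIDTH))
    lines.append(".")  # forced end of data, so a trailing signature is not read
    return "\n".join(lines) + "\n"


def build_message(sender, sequences, keywords=()):
    """Wrap the body in an email message addressed to the server."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = SERVER_ADDRESS
    msg.set_content(build_body(sequences, keywords))
    return msg
```

For example, build_message("user@example.org", {"sequence1": seq}, keywords=["score", "nocomplement"]) would give a message that can be handed to any mail client or SMTP library for sending; the sending step is left out here because it depends on the local mail setup.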

PAPER TO REFERENCE WHEN REPORTING RESULTS:

S.M. Hebsgaard, P.G. Korning, N. Tolstrup, J. Engelbrecht, P. Rouze, S. Brunak: Splice site prediction in Arabidopsis thaliana DNA by combining local and global sequence information, Nucleic Acids Research, 1996, Vol. 24, No. 17, 3439-3452.