
NNAlign-1.1 Server

OLD VERSION: the most recent version is 1.4


View the version history of this server. All previous versions remain available online for comparison and reference.


1. TRAINING data

Paste peptides in PEPTIDE format

or submit a file directly from your local disk:

Sample training data can be found HERE

Alternatively, upload a trained MODEL

locally saved model:

Use this option if you previously trained an NNAlign model and
wish to use it for predictions on evaluation data.

2. EVALUATION data (optional)

Paste in evaluation examples

or upload evaluation examples:


Sample evaluation data in FASTA or PEPTIDE format

3. Set advanced options (optional)

Customize the run by setting the parameters in the section below.

4. SUBMIT job

Depending on the size of your datasets and the selected parameters, the job may take up to a few hours to complete.
Please be patient.
Instructions: Paste in or upload training examples to train the artificial neural networks. If you previously trained an NNAlign model, you may upload it in the right-hand box and use it for predictions on the evaluation set.
The evaluation set sequences can be either in FASTA format (sequences are automatically broken down into peptides) or a list of peptides. For details, refer to the instructions & guidelines page.
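
To illustrate what "broken down into peptides" could mean for a FASTA evaluation set, here is a minimal Python sketch. It assumes the breakdown is a sliding window that emits every overlapping peptide of a fixed length; the server's actual peptide length and windowing rules are not stated on this page, and the function names `read_fasta` and `split_into_peptides` are hypothetical.

```python
from typing import Dict, List


def read_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into a {header: sequence} dictionary."""
    records: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line starts a new record.
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            # Sequence lines are concatenated onto the current record.
            records[name] += line.upper()
    return records


def split_into_peptides(sequence: str, length: int) -> List[str]:
    """Return every overlapping window of `length` residues (assumed scheme)."""
    if length <= 0:
        raise ValueError("length must be positive")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


if __name__ == "__main__":
    fasta_text = ">seq1\nACDEFGHIK\nLMNP\n"
    for header, seq in read_fasta(fasta_text).items():
        for peptide in split_into_peptides(seq, 9):
            print(header, peptide)
```

A sequence shorter than the chosen length yields no peptides under this scheme; whether the server pads or skips such sequences is not specified here.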


Please read the CBS access policies for information about limitations on the daily number of submissions.

Advanced options

Motif length (also as interval e.g. 7-9)

Flanking amino acids for NN training

Encode flanking region length
Encode peptide length

Neural network encoding

Processing of input data
Linear rescale
Log-transform
Keep repeated flanks for training

Cross-validation method
Extensive cross-validation
Fast evaluation

Stop training on best test-set performance
Subsets for cross-validation
Random subsets
Cluster similar sequences

Threshold for homology clustering

Folds for cross-validation

Number of training cycles

Number of hidden neurons (for an ensemble, give a comma-separated list)

Number of initial seeds per iteration

Number of networks per fold in the final ensemble

Make logos of all single networks in final ensemble
Threshold on evaluation set predictions (for large FASTA files)

Optional prefix for results files (no blank spaces)
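
Several of the options above take structured text values: the motif length accepts an interval such as 7-9, and the number of hidden neurons accepts a comma-separated list. The Python sketch below shows one way such values could be interpreted, plus a min-max version of the "Linear rescale" option. All function names are hypothetical, and the rescaling to the range 0 to 1 is an assumption; the server's own parsing and transformation rules are not given on this page.

```python
from typing import List, Sequence


def parse_motif_lengths(value: str) -> List[int]:
    """Turn '9' or an interval like '7-9' into a list of motif lengths."""
    value = value.strip()
    if "-" in value:
        low_text, high_text = value.split("-", 1)
        low, high = int(low_text), int(high_text)
    else:
        low = high = int(value)
    if low <= 0 or high < low:
        raise ValueError(f"invalid motif length: {value!r}")
    return list(range(low, high + 1))


def parse_hidden_neurons(value: str) -> List[int]:
    """Turn a comma-separated list like '2,10,20' into hidden-layer sizes."""
    sizes = [int(part) for part in value.split(",") if part.strip()]
    if not sizes or any(size <= 0 for size in sizes):
        raise ValueError(f"invalid hidden neuron list: {value!r}")
    return sizes


def linear_rescale(values: Sequence[float]) -> List[float]:
    """Min-max rescale to [0, 1] (an assumed reading of 'Linear rescale')."""
    low, high = min(values), max(values)
    if high == low:
        # All targets identical: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


if __name__ == "__main__":
    print(parse_motif_lengths("7-9"))
    print(parse_hidden_neurons("2,10,20"))
    print(linear_rescale([0, 5, 10]))
```

Under this reading, an interval asks for one run per motif length and a list of hidden-layer sizes asks for one network architecture per entry.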



Confidentiality:
The sequences are kept confidential and will be deleted after processing.


CITATIONS

For publication of results, please cite:



GETTING HELP

Scientific problems:        Technical problems: