
Usage Instructions



The user interface of Seq2Logo is split into three parts: submission, graphical layout and advanced settings.


Submission

In the submission part the user can:

  1. Upload their alignment file, either by copy/paste or by choosing a local file.
  2. Specify the logo type, either Shannon, Kullback-Leibler, Weighted Kullback-Leibler, Probability Weighted Kullback-Leibler or PSSM-Logo.
  3. Choose which kind of sequence weighting should be used to reduce sequence redundancy.
  4. If the Hobohm algorithm is chosen, the user can also specify the similarity threshold above which two sequences are deemed similar (1 equals 100% identity; the default is 63%).
  5. Assign the weight on prior value that should be used to adjust for a small alignment file (recommended for datasets with fewer than 50 sequences).
  6. Type the unit of the Y-axis. (It is important to note that MSA and rawpeptide input data will always be calculated as bit content *)
  7. Choose additional output formats for the logo file.
* Shaner et al. give a good description of the information content (the bit content) in their paper 'Sequence Logos: A Powerful, Yet Simple, Tool', which can be accessed from their site.
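
As an illustration of the bit content mentioned in item 6, the following is a minimal Python sketch of the plain Shannon information content of an alignment column. It is not Seq2Logo's implementation: it assumes the 20 standard amino acids, ignores gaps, and applies neither sequence weighting nor the prior correction from items 3 to 5. The function names are hypothetical.

```python
import math
from collections import Counter

# Assumption: the 20 standard amino acids form the alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def shannon_bits(column):
    """Information content (in bits) of a single alignment column.

    Gap characters and any other symbol outside AMINO_ACIDS are ignored,
    which is a simplifying assumption of this sketch.
    """
    counts = Counter(c for c in column.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Shannon entropy of the observed amino acid frequencies.
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Maximum possible entropy minus observed entropy.
    return math.log2(len(AMINO_ACIDS)) - entropy


def column_bits(sequences):
    """Bits per position for a list of equal-length aligned sequences."""
    return [shannon_bits("".join(col)) for col in zip(*sequences)]
```

A fully conserved column reaches the maximum of log2(20), roughly 4.32 bits, while a column split evenly between two amino acids scores one bit less.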

NOTE: As stated above the paste field, Seq2Logo supports the following alignment formats: Peptides, Fasta, Clustal, weight matrices and frequency tables.

FORMAT DESCRIPTIONS:

The Peptide format is a file where each line is a new peptide sequence. Only the amino acid and gap symbols are accepted.

The Fasta format is a file where '>' marks the header line, and all following lines compose the sequence belonging to that header. Only the amino acid and gap symbols are accepted in the sequence.
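
The Fasta layout described above can be read with a few lines of Python. This is only a sketch for preparing or checking input before submission, not the parser used by Seq2Logo. The accepted alphabet (20 standard amino acids plus '-' as the gap symbol) and the function name are assumptions.

```python
# Assumption: 20 standard amino acids plus '-' as the only gap symbol.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY-")


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from Fasta-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        elif not set(line.upper()) <= ALLOWED:
            raise ValueError("unexpected symbol in sequence line: " + line)
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

Sequence lines that follow the same header are concatenated, so a sequence may be wrapped over several lines.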

The CLUSTAL format is a file where the data is separated into two or three columns: the first column contains the sequence name, the second column contains the sequence, and the optional third column contains the position number of the last amino acid.
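
A minimal Python sketch of reading this column layout is shown below. It assumes the usual Clustal conventions that are not spelled out above: a first line starting with "CLUSTAL", sequences split over several blocks, and conservation lines that start with whitespace. The function name is hypothetical and this is not Seq2Logo's own parser.

```python
def parse_clustal(text):
    """Return a dict mapping sequence name to its full aligned sequence."""
    sequences = {}
    for line in text.splitlines():
        # Skip blank lines, the "CLUSTAL ..." title line and conservation
        # lines (assumed to start with whitespace).
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) not in (2, 3):
            raise ValueError("expected 2 or 3 columns, got: " + line)
        name, chunk = fields[0], fields[1]
        # The optional third column (position of the last residue) is ignored.
        sequences[name] = sequences.get(name, "") + chunk
    return sequences
```

Chunks belonging to the same sequence name are joined in the order in which they appear.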

The PSSM format is a file where the data is stored in a weight matrix. There are a few different formats accepted by Seq2Logo:

Common to all PSSM formats are the optional header line (starting with: 'Last position-specific scoring matrix...') and the required amino acid header line (this can now contain other characters if the PSSM-logo is chosen).
With regard to the weights in the PSSM, only numbers (integers, floats and scientific notation) are allowed.

The Blast Matrix: Special format.

Simple Weight Matrix: This is the simplest of the weight matrices, with only the weights provided. (Note: These weights cannot be integers!)

Weight Matrix w/ position: This is the same as the simple matrix, but with the first column specifying the position (Note: Integers allowed!)

Weight Matrix w/ position and consensus sequence: This is the same as the position matrix, but with an additional column specifying the consensus sequence (Note: This extra column is not used by Seq2Logo, and is only allowed for convenience.)

Special Weight Matrix: This is a scrapped version of the simple matrix, and it allows the user to specify symbols other than amino acids, e.g. gaps. (Note: This matrix can only be used with the PSSM-logo option, and it is limited to a minimum of 3 characters and a maximum of 20 characters!)

The Frequency format is identical to a PSSM matrix, except that the weights/frequencies sum to 1.00 per position (up to 2% inaccuracy allowed) and, of course, no weight/frequency is negative.
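
The two conditions for a frequency table can be checked before submission with a short Python sketch like the one below. It reads "up to 2% inaccuracy" as an absolute deviation of at most 0.02 from 1.00, which is an assumption, and the function names are hypothetical.

```python
def is_frequency_row(values, tolerance=0.02):
    """True if no value is negative and the row sums to 1.00 within tolerance.

    The tolerance is interpreted as an absolute deviation (assumption).
    """
    if any(v < 0 for v in values):
        return False
    return abs(sum(values) - 1.0) <= tolerance


def is_frequency_matrix(rows, tolerance=0.02):
    """True if the matrix is non-empty and every position passes the row check."""
    return bool(rows) and all(is_frequency_row(r, tolerance) for r in rows)
```

A matrix that fails this check would, by the description above, have to be submitted as a weight matrix rather than as a frequency table.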





 

Graphical Layout

In the graphical layout part the user can:

  1. Assign the number of stacks per line.
  2. Assign the number of lines per page.
  3. Set the resolution of the image. For convenience, a dropdown menu has been provided with some standard formats to choose from. *
  4. Assign a logo title. (This is optional.)
  5. Specify the layout of the graph. **
  6. Choose a coloring scheme from the list, or assign the colors of the individual amino acids manually.
*Feel free to send an email request if you want additional formats added.
**This field allows you to really customize your logo.


 

Advanced Settings

In the advanced settings part the user can:

  1. Set the minimum width for stacks with gaps. * **
  2. Set the position number of the first amino acid in the alignment.**
  3. Set the frequency of which the position numbers are shown on the X-axis. ***
  4. Set a segment range, if only a part of the full alignment is wanted. ****
  5. Set the Y-axis range. This option allows the user to manually set the Y-axis maximum and minimum values, which makes it easier to compare several logos with each other. *****
  6. Upload a separate substitution frequency matrix.
  7. Upload a separate background frequency file (distribution of amino acids).
*If set to 1 there is no width adjustment of the stacks to show positions where gaps occur.
**This feature is meant for MSA and rawpeptide formats only.
***If the value is set to 1, all position numbers are shown. If the value is left out or set to 0, the interval is determined automatically.
****Use the following format: "start-end", e.g. 5-56
*****Use the following format: "Ymin:Ymax", e.g. -4.32:4.32
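
The two range formats can be illustrated with a small Python sketch. It only shows how strings such as "5-56" and "-4.32:4.32" break down into numbers. The function names are hypothetical, the ordering checks are assumptions, and negative segment positions are not handled.

```python
def parse_segment(text):
    """Parse a segment range written as "start-end", e.g. "5-56"."""
    parts = text.strip().split("-")
    if len(parts) != 2:
        raise ValueError('expected "start-end", got: ' + text)
    start, end = int(parts[0]), int(parts[1])
    if start > end:
        raise ValueError("start must not exceed end")
    return start, end


def parse_y_range(text):
    """Parse a Y-axis range written as "Ymin:Ymax", e.g. "-4.32:4.32"."""
    parts = text.strip().split(":")
    if len(parts) != 2:
        raise ValueError('expected "Ymin:Ymax", got: ' + text)
    y_min, y_max = float(parts[0]), float(parts[1])
    if y_min >= y_max:
        raise ValueError("Ymin must be smaller than Ymax")
    return y_min, y_max
```

The colon separator in the Y-axis format is what allows a negative minimum such as -4.32 to be written without ambiguity.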


Implementation of easy access to Seq2Logo from other servers

To learn how to transfer alignment files easily from your own program or webpage to Seq2Logo, click here.
 

