
Usage instructions



1. Specify the input sequences

All the input sequences must be in one-letter amino acid code. The allowed alphabet (not case sensitive) is as follows:

A C D E F G H I K L M N P Q R S T V W Y and X (unknown)

All the other symbols will be converted to X before processing. The sequences can be input in the following ways:

  • Paste one or more sequences in FASTA or TAB (see below) format into the upper window of the main server page.

  • Select a FASTA or TAB file on your local disk, either by typing the file name into the lower window or by browsing the disk.

You can press "Submit" at this point to run the query using default parameters.
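The alphabet rule described in this section can be sketched in a few lines of Python. This is only an illustration of the stated behavior, not the server's own code: the function name `normalize_sequence` is hypothetical, and upper-casing the input and stripping whitespace before conversion are assumptions.

```python
# Allowed one-letter amino acid codes, plus X for "unknown".
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYX")


def normalize_sequence(seq: str) -> str:
    """Return seq in upper case with every disallowed symbol replaced by X.

    Assumption: whitespace (e.g. line breaks inside a FASTA record) is
    removed first rather than converted to X.
    """
    compact = "".join(seq.split())
    return "".join(c if c in ALLOWED else "X" for c in compact.upper())


print(normalize_sequence("mkta*B"))
```

Under these assumptions, a trailing "*" (as often used for a stop codon) and the ambiguity code "B" both end up as X.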


2. Working with sequence annotation

The FeatureMap3D server has the option of accepting Sequence feature annotation along with the protein sequence itself. Such feature annotation could be location of active site, phosphorylation sites, annotation of the underlying exon structure, parts of the protein affected by alternative splicing etc.

The annotation is not added automatically; the user must supply it either through the sequence annotation window or by submitting an annotated sequence file in TAB format (see details below). A number of codes are built into the program, each of which shows the annotated amino acids in a predefined way. These are listed in the table below. For example, if the user adds the annotation "RHA1_ASPAC 26, 209, 212: A", the three residues at positions 26, 209 and 212 will be shown as yellow sticks. In this case, the three residues make up the catalytic triad of the enzyme. To show the location of a metal binding site, a critical cleavage point, disulfide bridges or any other feature of interest, the letter "X" can be used. Any position annotated with "X" will be shown as white sticks. For example, to show a disulfide bridge at positions 231 and 249 in white stick representation, the user would enter "RHA1_ASPAC 231, 249: X". The user is, however, free to enter "RHA1_ASPAC 231, 249: A" to show the disulfide bridge as yellow sticks instead: the codes only specify a color and representation, and the program does not use any information about the type of annotation being shown. Finally, the images can be customized freely in PyMol using the supplied PyMol script file.

The location of any such annotated feature will be displayed at the corresponding site in the structure of the hit, by highlighting the amino acid residue of the hit structure at that position. The hit structure therefore does not need to have N-glycosylation, or even an asparagine, at an annotated N-glycosylation site: the image simply shows where in the structure of the hit the glycosylated residue would be, based on the sequence alignment shown below the figure. It is then up to the user to decide whether the sequence identity between the query and the hit is high enough that the local structure of the annotated site is likely to be conserved. The sequence conservation coloring and the sequence alignment between the query and the hit give a quick overview of where the conserved and the more variable parts are in the structure. If the BLAST alignment is poor in the region of the annotated site, the region around the highlighted amino acid is less likely to represent the local structure of the query sequence.
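The idea of transferring an annotated query position to the hit through the alignment can be sketched as follows. This is a minimal illustration under stated assumptions, not the server's implementation: the function name, the "-" gap character and the toy alignment are all hypothetical, and residues are numbered sequentially from the alignment start (real PDB residue numbering may differ).

```python
from typing import Optional


def map_query_to_hit(query_aln: str, hit_aln: str,
                     query_start: int, hit_start: int,
                     query_pos: int) -> Optional[int]:
    """Return the hit position aligned to query_pos.

    Returns None if the query position is aligned to a gap in the hit
    or lies outside the aligned region.
    """
    q, h = query_start, hit_start  # numbers of the next residues to be read
    for qc, hc in zip(query_aln, hit_aln):
        if qc != "-" and q == query_pos:
            return h if hc != "-" else None
        if qc != "-":
            q += 1
        if hc != "-":
            h += 1
    return None


# Toy alignment (hypothetical): query starts at 1, hit starts at 10.
query_aln = "MKT-AAL"
hit_aln = "MRTGA-L"
print(map_query_to_hit(query_aln, hit_aln, 1, 10, 4))
```

A return value of None corresponds to the case where an annotated site has no counterpart in the hit structure and so cannot be highlighted.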

The color scheme used for highlighting sequence annotations is as follows:

Letter  Description             Color              Graphics
.       Null annotation         .                  .
A       Active site             yellow             stick
N       N-glycosylation         red                spheres
O       O-glycosylation         purple             spheres
S       S-phosphorylation       cyan               spheres
T       T-phosphorylation       slate blue         spheres
Y       Y-phosphorylation       blue               spheres
U       Tyr-sulfation           orange             spheres
X       Generic PTM             white              stick
0       Custom backbone color   black              .
1       Custom backbone color   white/slate blue   .
2       Custom backbone color   red                .
3       Custom backbone color   cyan               .
4       Custom backbone color   purple             .
5       Custom backbone color   green              .
6       Custom backbone color   blue               .
7       Custom backbone color   yellow             .
8       Custom backbone color   orange             .
9       Custom backbone color   brown              .

In the Color and Graphics columns "." means "No effect".

The description of the type of annotation is only meant as a guideline, and the annotation letters can be freely used as a means of highlighting any kind of feature (e.g. disulfide bridges can be annotated with "X" to mark the positions in white stick representation, or with "A" to use yellow stick representation).
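For scripted use, the color scheme above can be written down as a simple lookup table. The sketch below only restates the table; the names `Style` and `SCHEME` are hypothetical, and `None` is used here to stand for the "." (no effect) entries.

```python
from typing import NamedTuple, Optional


class Style(NamedTuple):
    description: str
    color: Optional[str]     # None means "no effect"
    graphics: Optional[str]  # None means "no effect"


SCHEME = {
    ".": Style("Null annotation", None, None),
    "A": Style("Active site", "yellow", "stick"),
    "N": Style("N-glycosylation", "red", "spheres"),
    "O": Style("O-glycosylation", "purple", "spheres"),
    "S": Style("S-phosphorylation", "cyan", "spheres"),
    "T": Style("T-phosphorylation", "slate blue", "spheres"),
    "Y": Style("Y-phosphorylation", "blue", "spheres"),
    "U": Style("Tyr-sulfation", "orange", "spheres"),
    "X": Style("Generic PTM", "white", "stick"),
}

# Digits 0-9 only set a custom backbone color and have no graphics effect.
BACKBONE_COLORS = ["black", "white/slate blue", "red", "cyan", "purple",
                   "green", "blue", "yellow", "orange", "brown"]
for digit, color in enumerate(BACKBONE_COLORS):
    SCHEME[str(digit)] = Style("Custom backbone color", color, None)

print(SCHEME["A"])
```

Because the codes only select a color and representation, a lookup like this is all that is needed to predict how a given annotation letter will be drawn.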

2.1. Submitting files in TAB format

The FeatureMap3D server has support for directly working with TAB files containing both sequence and sequence annotation. This is especially useful for working with annotation generated computationally.

The general structure of a TAB file is very simple. It consists of four fields, separated by the TAB character:

NAME SEQ ANN COM
Name: Name of the sequence.
Seq: The peptide sequence itself.
Ann: An annotation string of the same length as the sequence.
Com: A comment field. May be empty.
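As an illustration of the four-field layout, the following sketch parses one line of a TAB file and checks that the annotation string matches the sequence length. The names `TabRecord` and `parse_tab_line` are hypothetical, and accepting a line whose trailing comment field is missing altogether is an assumption.

```python
from typing import NamedTuple


class TabRecord(NamedTuple):
    name: str
    seq: str
    ann: str
    com: str


def parse_tab_line(line: str) -> TabRecord:
    """Split one TAB-format line into its NAME, SEQ, ANN and COM fields."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) == 3:
        # Assumption: a missing comment field is treated as an empty comment.
        fields.append("")
    if len(fields) != 4:
        raise ValueError(f"expected 4 TAB-separated fields, got {len(fields)}")
    name, seq, ann, com = fields
    if len(seq) != len(ann):
        raise ValueError("annotation string and sequence differ in length")
    return TabRecord(name, seq, ann, com)


# Hypothetical four-residue record with one annotated active-site position.
print(parse_tab_line("demo\tMKTA\t.A..\tan example comment\n"))
```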

Usually TAB files are generated by computational methods such as the Virtual Ribosome or custom scripting. The FeatureExtract server has many details on the use of TAB files in general.

2.2. Add descriptive sequence annotation

It is possible to manually add annotation to the sequences in a FASTA file by following the procedure shown in the example below. This way of adding annotation is useful when working by hand with one or a few sequences.

Example - adding annotation to RHA1

The sequence is first entered in FASTA format:

>RHA1_ASPAC
MKTAALAPLFFLPSALATTVYLAGDSTMAKNGGGSGTNGWGEYLASYLSATVVNDAVAGR
SARSYTREGRFENIADVVTAGDYVIVEFGHNDGGSLSTDNGRTDCSGTGAEVCYSVYDGV
NETILTFPAYLENAAKLFTAKGAKVILSSQTPNNPWETGTFVNSPTRFVEYAELAAEVAG
VEYVDHWSYVDSIYETLGNATVNSYFPIDHTHTSPAGAEVVAEAFLKAVVCTGTSLKSVL
TTTSFEGTCL

Avoid too complex names: The original header looked like this:

>sp|Q00017|RHA1_ASPAC Rhamnogalacturonan acetylesterase precursor (EC 3.1.1.-) (RGAE) - Aspergillus aculeatus.
When adding descriptive annotation manually, the name must be alphanumeric (the letters [a..z, A..Z], the numbers [0..9] and underscore "_"). A shorter name also makes the descriptive annotation easier to write.
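The naming rule can be checked before submission with a short script. This is a sketch only: the function names are hypothetical, and the sequence id is taken to be the first "word" of the header line after ">".

```python
import re


def sequence_id(header: str) -> str:
    """Return the first word of a FASTA header line, without the leading '>'."""
    text = header[1:] if header.startswith(">") else header
    parts = text.split()
    return parts[0] if parts else ""


def is_valid_name(name: str) -> bool:
    """True if name consists only of letters, digits and underscore."""
    return re.fullmatch(r"[A-Za-z0-9_]+", name) is not None


original = ">sp|Q00017|RHA1_ASPAC Rhamnogalacturonan acetylesterase precursor"
print(is_valid_name(sequence_id(original)))       # the "|" characters are not allowed
print(is_valid_name(sequence_id(">RHA1_ASPAC")))  # the simplified header passes
```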

The features of interest are in this case found in the Swiss-Prot annotation. Below is shown part of the Swiss-Prot entry Q00017.

.
.
.
FT   SIGNAL        1     17
FT   CHAIN        18    250       RHAMNOGALACTURONAN ACETYLESTERASE.
FT   ACT_SITE     26     26
FT   ACT_SITE    209    209
FT   ACT_SITE    212    212
FT   DISULFID    105    113
FT   DISULFID    231    249
FT   CARBOHYD    121    121       N-LINKED (GLCNAC...).
FT   CARBOHYD    199    199       N-LINKED (GLCNAC...) (HIGH MANNOSE).
SQ   SEQUENCE   250 AA;  26351 MW;  C30676A3A876A38B CRC64;
     MKTAALAPLF FLPSALATTV YLAGDSTMAK NGGGSGTNGW GEYLASYLSA TVVNDAVAGR
     SARSYTREGR FENIADVVTA GDYVIVEFGH NDGGSLSTDN GRTDCSGTGA EVCYSVYDGV
     NETILTFPAY LENAAKLFTA KGAKVILSSQ TPNNPWETGT FVNSPTRFVE YAELAAEVAG
     VEYVDHWSYV DSIYETLGNA TVNSYFPIDH THTSPAGAEV VAEAFLKAVV CTGTSLKSVL
     TTTSFEGTCL
The entry shows a signal peptide at positions 1-17 (which we will annotate with an "S"), three active site residues at 26, 209 and 212 (we will label them "A") and two sites of N-glycosylation at 121 and 199 (they will get an "N" in the annotation field).

The annotation is then entered like this (note that the sequence id is the first "word" in the header line of the FASTA file: everything after ">" up to the first space):

RHA1_ASPAC 1-17: S
RHA1_ASPAC 26, 209, 212: A
RHA1_ASPAC 121,199 : N
Note that there is no ":" between the sequence id and the sequence numbers.

Lines with syntax errors will be ignored.
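To make the syntax concrete, the sketch below turns descriptive annotation lines into an annotation string of the same length as the sequence, as used in the Ann field of a TAB file. It is an illustration under stated assumptions, not the server's parser: the function names and the regular expression are hypothetical, unannotated positions are filled with "." (the null annotation), and lines that do not parse or that point outside the sequence are skipped.

```python
import re

# "<id> <positions>: <code>", e.g. "RHA1_ASPAC 26, 209, 212: A"
LINE_RE = re.compile(r"^\s*([A-Za-z0-9_]+)\s+([0-9,\s-]+?)\s*:\s*(\S)\s*$")


def parse_positions(spec: str) -> list:
    """Expand '1-17' or '26, 209, 212' into a list of 1-based positions."""
    positions = []
    for part in spec.split(","):
        m = re.fullmatch(r"(\d+)(?:\s*-\s*(\d+))?", part.strip())
        if m is None:
            raise ValueError(f"bad position: {part!r}")
        start = int(m.group(1))
        end = int(m.group(2) or start)
        if start < 1 or end < start:
            raise ValueError(f"bad range: {part!r}")
        positions.extend(range(start, end + 1))
    return positions


def build_annotation(seq_id: str, seq_len: int, lines: list) -> str:
    """Return an annotation string for seq_id; unparsable lines are ignored."""
    ann = ["."] * seq_len
    for line in lines:
        m = LINE_RE.match(line)
        if m is None or m.group(1) != seq_id:
            continue
        try:
            positions = parse_positions(m.group(2))
        except ValueError:
            continue
        if any(p > seq_len for p in positions):
            continue  # assumption: out-of-range lines are skipped as a whole
        for p in positions:
            ann[p - 1] = m.group(3)
    return "".join(ann)


lines = [
    "RHA1_ASPAC 1-17: S",
    "RHA1_ASPAC 26, 209, 212: A",
    "RHA1_ASPAC 121,199 : N",
    "RHA1_ASPAC: 5: X",  # syntax error (":" after the id), so it is ignored
]
print(build_annotation("RHA1_ASPAC", 250, lines))
```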

3. Customize your run

Decide how many BLAST hits in the PDB you want FeatureMap3D to consider. FeatureMap3D will select the best among those hits, based on the selection criteria defined below.

It is possible to mask low complexity regions in the query sequence if desired, and non-X-ray structures may be skipped. By default, low complexity regions are not masked and non-X-ray structures are included. CA-only structures and theoretical models are always skipped.

You may choose to retrieve all hits to the query sequence instead of just the best hit by selecting "Show all hits". If you choose to do so, the selection criteria described below will not apply.

The selection process may be customized by setting sequence identity, chain length and X-ray resolution thresholds. All structures which do not fulfill these criteria are discarded before the selection process. If the query sequence is annotated, it is possible to use the annotation as a selection criterion, in which case any PDB hits which contain no annotated sites are discarded. The remaining structures are then sorted by sequence identity, number of annotated sites and resolution, in that order. Selecting the summary option shows a short summary of the selection process, with a sorted listing of all the structures found.
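The filter-then-sort logic described above can be sketched as follows. This is a reading of the text, not the server's code: the `Hit` fields, the function name and the example values are hypothetical, and the sort directions (higher identity first, more annotated sites first, lower resolution value first) are assumptions, as is treating every threshold as inclusive.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    name: str             # placeholder label, not a real PDB id
    identity: float       # sequence identity to the query, in percent
    length: int           # chain length
    resolution: float     # X-ray resolution in Angstrom
    annotated_sites: int  # number of annotated query sites covered


def select_hits(hits: List[Hit], min_identity: float, min_length: int,
                max_resolution: float,
                require_annotation: bool = False) -> List[Hit]:
    """Discard hits failing the thresholds, then sort the remainder."""
    kept = [
        h for h in hits
        if h.identity >= min_identity
        and h.length >= min_length
        and h.resolution <= max_resolution
        and (not require_annotation or h.annotated_sites > 0)
    ]
    # Sort keys in the stated order: identity, annotated sites, resolution.
    return sorted(kept, key=lambda h: (-h.identity, -h.annotated_sites,
                                       h.resolution))


hits = [
    Hit("hitA", 90.0, 200, 2.5, 1),
    Hit("hitB", 90.0, 200, 1.8, 3),
    Hit("hitC", 95.0, 50, 2.0, 2),   # too short for the threshold below
    Hit("hitD", 90.0, 200, 1.5, 0),  # contains no annotated sites
    Hit("hitE", 30.0, 200, 1.0, 5),  # identity too low
]
best = select_hits(hits, min_identity=40, min_length=100,
                   max_resolution=3.0, require_annotation=True)
print([h.name for h in best])
```

With these made-up values, the first element of the returned list plays the role of the "best hit"; the full list corresponds to the sorted summary listing.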

Note that if the query sequence has multiple domains, each of which is represented by its own structure, FeatureMap3D will output several structures, one per structurally determined domain.

4. Submit the job

Click on the "Submit" button. The status of your job (either 'queued' or 'running') will be displayed and constantly updated until it terminates and the server output appears in the browser window.

At any time during the wait you may enter your e-mail address and simply leave the window. Your job will continue; you will be notified by e-mail when it has terminated. The e-mail message will contain the URL under which the results are stored; they will remain on the server for 24 hours for you to collect them.


Examples

Alpha-globin example (FASTA)

>AB001981_alpha-A_Pigeon
MVLSANDKSNVKAVFGKIGGQAGDLGGEALERLFITYPQTKTYFPHFDLSHGSAQIKGHG
KKVAEALVEAANHIDDIAGALSKLSDLHAQKLRVDPVNFKLLGHCFLVVVAVHFPSLLTP
EVHASLDKFVCAVGTVLTAKYR*

Alpha-globin example (TAB)

TAB file containing the same Alpha-A globin as above, but in TAB format with the underlying exon structure annotated. This example was generated using the Virtual Ribosome server.

Download: alpha-a.tab.

RHA1 (FASTA) + descriptive annotation

(Same sequence as described in detail in section 2.2 on this page)

FASTA:

>RHA1_ASPAC
MKTAALAPLFFLPSALATTVYLAGDSTMAKNGGGSGTNGWGEYLASYLSATVVNDAVAGR
SARSYTREGRFENIADVVTAGDYVIVEFGHNDGGSLSTDNGRTDCSGTGAEVCYSVYDGV
NETILTFPAYLENAAKLFTAKGAKVILSSQTPNNPWETGTFVNSPTRFVEYAELAAEVAG
VEYVDHWSYVDSIYETLGNATVNSYFPIDHTHTSPAGAEVVAEAFLKAVVCTGTSLKSVL
TTTSFEGTCL

Descriptive annotation:

RHA1_ASPAC 1-17: S
RHA1_ASPAC 26, 209, 212: A
RHA1_ASPAC 121,199 : N

RHA1 (TAB)

The exact same example as above, but this time the annotation has been put into a TAB file.

Download: rha1+annotation.tab.



