1. Specify the input sequences
All the input sequences must be in one-letter amino acid
code. The allowed alphabet (not case sensitive) is as follows:
A C D E F G H I K L M N P Q R S T V W Y and X (unknown)
All the other symbols will be converted to X before processing. The
sequences can be input in the following ways:
Paste one or more sequences in
FASTA or TAB (see below)
format into the upper window of the main server page.
Select a FASTA or TAB
file on your local disk, either by typing the file name into the lower window
or by browsing the disk.
You can press "Submit" at this point to run the query using default parameters.
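The alphabet rule above can be tried locally before submitting. The following is a minimal Python sketch, not the server's own code: the function name `sanitize_sequence` is hypothetical, and skipping whitespace (rather than converting it to X) is an assumption.

```python
# Allowed one-letter codes as listed above, plus X for "unknown".
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYX")


def sanitize_sequence(seq: str) -> str:
    """Replace every symbol outside the allowed alphabet with X.

    The check is case-insensitive and the original case is preserved.
    Assumption: whitespace is layout only and is dropped, not converted.
    """
    cleaned = []
    for ch in seq:
        if ch.isspace():
            continue  # assumed to be line breaks or spacing, not residues
        cleaned.append(ch if ch.upper() in ALLOWED else "X")
    return "".join(cleaned)


if __name__ == "__main__":
    # "B" and "*" are not in the allowed alphabet, so both become X.
    print(sanitize_sequence("MKTB*a"))
```

Running such a check first makes it easy to see which residues the server would treat as unknown.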
2. Working with sequence annotation
The FeatureMap3D server can accept
sequence feature annotation along with the protein sequence itself.
Such feature annotation could be the location of an active site, phosphorylation sites,
annotation of the underlying exon structure, parts of the protein affected by
alternative splicing, etc.
The annotation is not added automatically; the user must supply it either through
the sequence annotation window or by submitting an annotated sequence file in TAB
format (see details below). A number of codes have been built into the program, each of which shows the
annotated amino acids in a predefined way. These are listed in the table below. For example,
if the user adds the annotation "RHA1_ASPAC 26, 209, 212: A",
the three residues at positions 26, 209 and 212 will be shown as yellow sticks. In
this case, the three residues make up the catalytic triad of the enzyme. If the
user wishes to show the localization of a metal binding site, a critical cleavage
point, disulfide bridges or any other feature of interest, the letter "X" can be
used. Any position annotated with "X" will be shown as white sticks. If, for
example, a disulfide bridge at positions 231 and 249 were to be shown in white
stick representation, the user would enter "RHA1_ASPAC 231, 249: X". The user is,
however, free to enter "RHA1_ASPAC 231, 249: A" to show the disulfide
bridge as yellow sticks instead: the codes only specify a
color and representation, and the program does not use any information about which
type of annotation is shown. Finally, the images can be customized freely
in PyMol using the supplied PyMol script file.
The location of any such annotated feature will be displayed at the corresponding
site in the structure of the hit, by highlighting the amino acid residue of the hit structure
at that position. The hit structure therefore does not need to have N-glycosylation, or even an
asparagine, at an annotated N-glycosylation site: the image simply shows where in the structure
of the hit the glycosylated residue would be, based on the sequence alignment shown below the figure.
It is then up to the user to decide whether the sequence identity between the query and the target is
high enough that the local structure of the annotated site is likely to be conserved. The sequence
conservation coloring and the sequence alignment between the query and the hit give a quick overview
of where the conserved and the more variable parts are in the structure. If the BLAST alignment is poor
in the region of the annotated site, the region around the highlighted amino acid is less likely to
represent the local structure of the query sequence.
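The idea of transferring an annotated query position onto the hit through the alignment can be illustrated with a short Python sketch. This is not the server's implementation: the function name `map_query_to_hit`, the use of "-" as the gap character and the start offsets are all assumptions made for illustration, and real PDB residue numbering may differ from simple counting along the aligned hit sequence.

```python
def map_query_to_hit(query_aln, hit_aln, query_start=1, hit_start=1):
    """Map query residue numbers to hit residue numbers.

    query_aln and hit_aln are the two rows of a gapped pairwise alignment
    (equal length, "-" for gaps). Query residues aligned to a gap in the
    hit have no counterpart and are left out of the mapping.
    """
    if len(query_aln) != len(hit_aln):
        raise ValueError("aligned sequences must have the same length")
    mapping = {}
    q, h = query_start, hit_start
    for q_char, h_char in zip(query_aln, hit_aln):
        q_is_residue = q_char != "-"
        h_is_residue = h_char != "-"
        if q_is_residue and h_is_residue:
            mapping[q] = h
        if q_is_residue:
            q += 1
        if h_is_residue:
            h += 1
    return mapping


if __name__ == "__main__":
    # Toy alignment (hypothetical): an annotated N at query position 5
    # is highlighted at hit position 5; query position 4 has no counterpart.
    positions = map_query_to_hit("MK-TAN", "MKLT-N")
    print(positions.get(5), positions.get(4))
```

A query position that falls opposite a gap simply has nothing to highlight, which is one concrete way a poor local alignment shows up.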
The color scheme used for highlighting sequence annotations is as follows:
|Code || Annotation type || Color || Graphics|
|. || Null annotation || . || . |
|A || Active site || yellow || stick|
|N || N-glycosylation || red || spheres|
|O || O-glycosylation || purple || spheres|
|S || S-phosphorylation || cyan || spheres|
|T || T-phosphorylation || slate blue|| spheres|
|Y || Y-phosphorylation || blue || spheres|
|U || Tyr-sulfation || orange || spheres|
|X || Generic PTM || white || stick|
|0 || Custom backbone color || black || . |
|1 || Custom backbone color || white/slate blue|| . |
|2 || Custom backbone color || red || . |
|3 || Custom backbone color || cyan || . |
|4 || Custom backbone color || purple || . |
|5 || Custom backbone color || green || . |
|6 || Custom backbone color || blue || . |
|7 || Custom backbone color || yellow || . |
|8 || Custom backbone color || orange || . |
|9 || Custom backbone color || brown || . |
In the Color and Graphics columns, "." means "No effect".
The description of the type of annotation is only meant as a guideline, and the annotation letters
can be freely used as a means of highlighting any kind of feature (e.g. disulfide bridges can be
annotated with "X" to mark the positions in white stick representation, or with "A" to use yellow stick representation).
2.1. Submitting files in TAB format
The FeatureMap3D server has support for directly working with TAB files containing both
sequence and sequence annotation. This is especially useful for working with
annotation generated computationally.
The general structure of a TAB file is very simple. It consists of four fields,
separated by the TAB character:
NAME SEQ ANN COM
Name: Name of the sequence.
Seq: The peptide sequence itself
Ann: An annotation string of the same length as the sequence.
Com: A comment field. May be empty.
Usually TAB files are generated by computational methods such as the
Virtual Ribosome or custom scripting.
The FeatureExtract server has many details on the use of TAB files in general.
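For custom scripting, a TAB record with the four fields described above could be written and read roughly as follows. This is a sketch under stated assumptions, not a reference implementation of the format: the function names are hypothetical, and accepting a line with only three fields (no trailing comment) is an assumption.

```python
def format_tab_record(name, seq, ann, com=""):
    """Join the four fields NAME, SEQ, ANN, COM with TAB characters."""
    if len(seq) != len(ann):
        raise ValueError("annotation must have the same length as the sequence")
    return "\t".join([name, seq, ann, com])


def parse_tab_record(line):
    """Split one TAB line back into (name, seq, ann, com)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) == 3:
        fields.append("")  # assumption: a missing comment field is tolerated
    if len(fields) != 4:
        raise ValueError("expected 4 TAB-separated fields")
    name, seq, ann, com = fields
    if len(seq) != len(ann):
        raise ValueError("annotation must have the same length as the sequence")
    return name, seq, ann, com


if __name__ == "__main__":
    # Toy peptide and annotation string, made up for illustration only.
    record = format_tab_record("toy_peptide", "MKTAAL", "SSS..A", "example")
    print(parse_tab_record(record))
```

Checking the length of the annotation string against the sequence early catches the most common mistake in hand-built TAB files.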
2.2. Add descriptive sequence annotation
It is possible to manually add annotation to the sequences in a FASTA file
by following the procedure shown in the following example.
This way of adding annotation is useful when working manually with one or a few
sequences.
Example - adding annotation to RHA1
The sequence is first entered in FASTA format:
Avoid too complex names: The original header looked like this:
>sp|Q00017|RHA1_ASPAC Rhamnogalacturonan acetylesterase precursor (EC 3.1.1.-) (RGAE) - Aspergillus aculeatus.
When using the method of adding manual descriptive annotation, the name must be alphanumeric
(the letters [a..z], the numbers [0..9] and underscore "_"). Also, making the name shorter
makes it easier to write the manual descriptive annotation.
The features of interest are in this case found in the Swiss-Prot entry Q00017,
part of which is shown below.
FT SIGNAL 1 17
FT CHAIN 18 250 RHAMNOGALACTURONAN ACETYLESTERASE.
FT ACT_SITE 26 26
FT ACT_SITE 209 209
FT ACT_SITE 212 212
FT DISULFID 105 113
FT DISULFID 231 249
FT CARBOHYD 121 121 N-LINKED (GLCNAC...).
FT CARBOHYD 199 199 N-LINKED (GLCNAC...) (HIGH MANNOSE).
SQ SEQUENCE 250 AA; 26351 MW; C30676A3A876A38B CRC64;
MKTAALAPLF FLPSALATTV YLAGDSTMAK NGGGSGTNGW GEYLASYLSA TVVNDAVAGR
SARSYTREGR FENIADVVTA GDYVIVEFGH NDGGSLSTDN GRTDCSGTGA EVCYSVYDGV
NETILTFPAY LENAAKLFTA KGAKVILSSQ TPNNPWETGT FVNSPTRFVE YAELAAEVAG
VEYVDHWSYV DSIYETLGNA TVNSYFPIDH THTSPAGAEV VAEAFLKAVV CTGTSLKSVL
It can be seen that
there is a signal peptide from 1-17 (which we will annotate with an "S"),
three active site residues at 26, 209 and 212 (we will label them "A")
and two sites of N-glycosylation: 121 and 199 (they will get an "N" in the annotation field).
The annotation is then entered like this (note that the sequence id is the first
"word" in the header line of the FASTA file, i.e. everything after ">" up to the first space):
RHA1_ASPAC 1-17: S
RHA1_ASPAC 26, 209, 212: A
RHA1_ASPAC 121,199 : N
Note that there is no ":" between the sequence id and the sequence numbers.
Lines with syntax errors will be ignored.
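To see how such descriptive lines relate to the per-residue annotation string used in TAB files, the following Python sketch expands them into a string of the same length as the sequence, using "." for unannotated positions. The accepted line syntax is inferred from the examples above and is an assumption, as are the function names; the server's own parser may be stricter or more lenient.

```python
import re

# Assumed line shape, inferred from the examples:
#   <sequence id> <positions and ranges>: <one-character code>
LINE_RE = re.compile(r"^\s*(\w+)\s+([\d\s,\-]+?)\s*:\s*(\S)\s*$")


def expand_positions(spec):
    """Turn '1-17' or '26, 209, 212' into a list of 1-based positions."""
    positions = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-"))
            positions.extend(range(start, end + 1))
        else:
            positions.append(int(part))
    return positions


def apply_annotation(lines, seq_id, seq_length):
    """Build an annotation string; lines that do not parse are ignored."""
    ann = ["."] * seq_length
    for line in lines:
        match = LINE_RE.match(line)
        if match is None or match.group(1) != seq_id:
            continue
        try:
            positions = expand_positions(match.group(2))
        except ValueError:
            continue  # malformed number or range: skip the whole line
        if any(p < 1 or p > seq_length for p in positions):
            continue  # assumption: out-of-range positions invalidate the line
        for p in positions:
            ann[p - 1] = match.group(3)
    return "".join(ann)


if __name__ == "__main__":
    lines = ["RHA1_ASPAC 1-17: S", "RHA1_ASPAC 26, 209, 212: A", "RHA1_ASPAC 121,199 : N"]
    print(apply_annotation(lines, "RHA1_ASPAC", 250))
```

The resulting string could then be placed in the Ann field of a TAB file for the same sequence.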
3. Customize your run
Decide how many BLAST hits
in the PDB you want FeatureMap3D to consider. FeatureMap3D
will select the best among those hits, based on the selection criteria defined below.
It is possible to mask low complexity regions in the query sequence if desired, and
non-X-ray structures may be skipped. The default for these options is not to mask
low complexity regions and to include non-X-ray structures. (CA-only structures
and theoretical models are always skipped).
You may choose to retrieve all hits to the query sequence instead of just
the best hit by selecting "Show all hits". If you choose to do so, the selection
criteria described below will not apply.
The selection process may be customized by the user by selecting sequence
identity, chain length and X-ray resolution thresholds. All sequences that
do not fulfill these criteria are discarded before the selection process.
If the query sequence is
annotated, it is possible to use the annotation as a selection criterion, in which case any
PDB hits which contain no annotated sites are discarded. The remaining structures
are then sorted by sequence identity, number of annotated sites and resolution,
in that order. It is possible to show a short summary of the selection process
to get a sorted listing of all the found structures by selecting this option.
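The filtering and sorting described in this section can be pictured with the Python sketch below. It is an illustration only: the `Hit` record, its field names and the hit names are hypothetical, and it assumes that higher identity, more annotated sites and lower (better) resolution rank first, and that thresholds are minimum identity, minimum chain length and maximum resolution.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str             # hypothetical label for a PDB hit
    identity: float       # sequence identity to the query, in percent
    length: int           # chain length
    resolution: float     # X-ray resolution in Angstrom (lower is better)
    annotated_sites: int  # annotated query positions covered by the hit


def select_hits(hits, min_identity, min_length, max_resolution,
                require_annotation=False):
    """Discard hits failing the thresholds, then sort the remainder."""
    kept = [h for h in hits
            if h.identity >= min_identity
            and h.length >= min_length
            and h.resolution <= max_resolution]
    if require_annotation:
        kept = [h for h in kept if h.annotated_sites > 0]
    # Sort by identity, then number of annotated sites, then resolution.
    return sorted(kept, key=lambda h: (-h.identity, -h.annotated_sites, h.resolution))


if __name__ == "__main__":
    hits = [
        Hit("hitA", 95.0, 240, 2.5, 0),
        Hit("hitB", 95.0, 240, 1.9, 3),
        Hit("hitC", 40.0, 240, 1.5, 3),
    ]
    for hit in select_hits(hits, min_identity=50, min_length=100, max_resolution=3.0):
        print(hit.name)
```

With these toy values, the hit below the identity threshold is dropped first, and the tie in identity between the remaining two is broken by the number of annotated sites.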
Note that if the query sequence has multiple domains,
each of which is represented by its own structure, FeatureMap3D will output
several structures, one per structurally determined domain.
4. Submit the job
Click on the "Submit" button. The status of your job (either 'queued' or 'running')
will be displayed and constantly updated until it terminates and
the server output appears in the browser window.
At any time during the wait you may enter your e-mail address and simply leave
the window. Your job will continue; you will be notified by e-mail when it has
terminated. The e-mail message will contain the URL under which the results are
stored; they will remain on the server for 24 hours for you to collect them.
Alpha-globin example (FASTA)
Alpha-globin example (TAB)
TAB file containing the same Alpha-A globin as above, but in TAB format
with the underlying exon structure annotated. This example was
generated using the Virtual Ribosome server.
RHA1 (FASTA) + descriptive annotation
(Same sequence as described in detail in section 2.2 on this page)
RHA1_ASPAC 1-17: S
RHA1_ASPAC 26, 209, 212: A
RHA1_ASPAC 121,199 : N
The exact same example as above, but this time the annotation has been put into a TAB file.