
Output format


VISUALIZATION AND ALIGNMENT

In the first two examples Columba livia (domestic pigeon) Alpha-A Globin is used as the query (GenBank id: AB001981).

Visualizing homology

Query : AB001981_alpha-A_Pigeon.7.1 (142 aa)
Hit   : 1A4F (molecule: HEMOGLOBIN;) (chain: a) (resolution: 2.0  ANGSTROMS.)
        Organism: ANSER INDICUS;

                                                                                        
      2 VLSANDKSNVKAVFGKIGGQAGDLGGEALERLFITYPQTKTYFPHFDLSHGSAQIKGHGKKVAEALVEAANHIDDIAGAL 81
        |||| ||:|||.||.||.|.| : |.|.|||:| .|||||||||||||.|||||||.|||||. |||||.||||||||||
      1 VLSAADKTNVKGVFSKISGHAEEYGAETLERMFTAYPQTKTYFPHFDLQHGSAQIKAHGKKVVAALVEAVNHIDDIAGAL 80
           HHHHHHHHHHHHHHTT HHHHHHHHHHHHHHH GGGGGG TTS  STT HHHHHHHHHHHHHHHHHHTTTT HHHHT
                                                                                        

                                                                     
     82 SKLSDLHAQKLRVDPVNFKLLGHCFLVVVAVHFPSLLTPEVHASLDKFVCAVGTVLTAKYR 142
        |||||||||||||||||||.||||||||||:| || || |||||||||:||||||||||||
     81 SKLSDLHAQKLRVDPVNFKFLGHCFLVVVAIHHPSALTAEVHASLDKFLCAVGTVLTAKYR 141
        HHHHHHHHTTS   THHHHHHHHHHHHHHHHH TTT  HHHHHHHHHHHHHHHHHTT TT 
                                                                     

This image shows how the default coloring scheme indicates the quality of the underlying pairwise alignment.

Color key:

  1. Green: Perfect match
  2. Brown: Mismatch (low significance)
  3. Violet: Mismatch (high significance)
  4. Dark gray: Unmatched chain(s) - in this case the Beta-globin chain.
  5. Blue: Sequence gap in query sequence

Custom backbone coloring

Query : AB001981_alpha-A_Pigeon.7.1 (142 aa)
Hit   : 1A4F (molecule: HEMOGLOBIN;) (chain: a) (resolution: 2.0  ANGSTROMS.)
        Organism: ANSER INDICUS;

        11111111111111111111111111111112222222222222222222222222222222222222222222222222
      2 VLSANDKSNVKAVFGKIGGQAGDLGGEALERLFITYPQTKTYFPHFDLSHGSAQIKGHGKKVAEALVEAANHIDDIAGAL 81
        |||| ||:|||.||.||.|.| : |.|.|||:| .|||||||||||||.|||||||.|||||. |||||.||||||||||
      1 VLSAADKTNVKGVFSKISGHAEEYGAETLERMFTAYPQTKTYFPHFDLQHGSAQIKAHGKKVVAALVEAVNHIDDIAGAL 80
           HHHHHHHHHHHHHHTT HHHHHHHHHHHHHHH GGGGGG TTS  STT HHHHHHHHHHHHHHHHHHTTTT HHHHT
                                                                                        

        2222222222222222222333333333333333333333333333333333333333333
     82 SKLSDLHAQKLRVDPVNFKLLGHCFLVVVAVHFPSLLTPEVHASLDKFVCAVGTVLTAKYR 142
        |||||||||||||||||||.||||||||||:| || || |||||||||:||||||||||||
     81 SKLSDLHAQKLRVDPVNFKFLGHCFLVVVAIHHPSALTAEVHASLDKFLCAVGTVLTAKYR 141
        HHHHHHHHTTS   THHHHHHHHHHHHHHHHH TTT  HHHHHHHHHHHHHHHHHTT TT 
                                                                     

This image shows the use of custom coloring of the protein backbone. Here the Virtual Ribosome software has been used to generate a TAB file containing both the protein sequence and an annotation of the underlying exon structure. The annotation string is also shown in the alignment.

Color key:

  1. White: Exon 1
  2. Red: Exon 2
  3. Cyan: Exon 3
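
The annotation string carries one digit per residue, so each run of identical digits corresponds to one exon. The following is a minimal Python sketch of how such a string could be collapsed into residue ranges. The function name `annotation_runs` and the short example string are illustrative assumptions, not part of FeatureMap3D or Virtual Ribosome.

```python
from itertools import groupby


def annotation_runs(annotation, first_residue=1):
    """Collapse a per-residue annotation string into (symbol, start, end) runs."""
    runs = []
    position = first_residue
    for symbol, group in groupby(annotation):
        length = len(list(group))
        runs.append((symbol, position, position + length - 1))
        position += length
    return runs


# Hypothetical 10-residue annotation string covering three exons.
example_runs = annotation_runs("1112222233")
for symbol, start, end in example_runs:
    print(f"exon {symbol}: residues {start}-{end}")
```

Passing the residue number that opens an alignment row as `first_residue` would give ranges in query numbering for that row.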

Working with sidechain annotation

In this example the protein sequence of the Aspergillus aculeatus rhamnogalacturonan acetylesterase precursor was annotated with information about the active site and N-glycosylation by adding descriptive annotation (see the example on the Instructions page).

After running the FeatureMap3D query, the generated ZIP archive with all the files was downloaded and unpacked on a local PC (running Windows). The PyMOL script was opened in PyMOL by double-clicking on it, and the 1K7C structure was automatically loaded and colored. Following rotation to a better angle, a new ray-traced image was created by clicking on the "ray" button.

NOTICE: This example illustrates how the numbering of the original protein file (derived from SwissProt) is mapped onto the numbering of the structure, which starts at a different position. This also means that the annotation of the active site and N-glycosylation is automatically mapped to the corresponding positions in the PDB structure.

Query : RHA1_ASPAC.3.1 (250 aa)
Hit   : 1K7C (molecule: RHAMNOGALACTURONAN ACETYLESTERASE;) (chain: a) (resolution: 1.12 ANGSTROMS.)
        Organism: ASPERGILLUS ACULEATUS;

                A                                                                       
     18 TTVYLAGDSTMAKNGGGSGTNGWGEYLASYLSATVVNDAVAGRSARSYTREGRFENIADVVTAGDYVIVEFGHNDGGSLS 97
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
      1 TTVYLAGDSTMAKNGGGSGTNGWGEYLASYLSATVVNDAVAGRSARSYTREGRFENIADVVTAGDYVIVEFGHNDGGSLS 80
         EEEEE  TTTSTTTTSTT   GGGGSGGGBSSEEEE   TT  HHHHHHTTHHHHHHHH  TT EEEE   TTS S GG
                                                                                        

                               N                                                        
     98 TDNGRTDCSGTGAEVCYSVYDGVNETILTFPAYLENAAKLFTAKGAKVILSSQTPNNPWETGTFVNSPTRFVEYAELAAE 177
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
     81 TDNGRTDCSGTGAEVCYSVYDGVNETILTFPAYLENAAKLFTAKGAKVILSSQTPNNPWETGTFVNSPTRFVEYAELAAE 160
        G  S   BSSSSS EEEEEETTEEEEEEBHHHHHHHHHHHHHHTT EEEEE      TTTTSS      HHHHHHHHHHH
                                                                                        

                             N         A  A                                      
    178 VAGVEYVDHWSYVDSIYETLGNATVNSYFPIDHTHTSPAGAEVVAEAFLKAVVCTGTSLKSVLTTTSFEGTCL 250
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    161 VAGVEYVDHWSYVDSIYETLGNATVNSYFPIDHTHTSPAGAEVVAEAFLKAVVCTGTSLKSVLTTTSFEGTCL 233
        HHT EEE HHHHHHHHHHHH HHHHHHT SSSSS   HHHHHHHHHHHHHHHHHHT GGGGGBS    SS   
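
In this alignment query residues 18-250 pair with hit residues 1-233 without any gaps, so the renumbering reduces to a constant offset. A minimal Python sketch of that mapping follows. The function name `query_to_hit` is hypothetical, and the constant offset only holds for a gap-free alignment such as this one.

```python
def query_to_hit(query_pos, query_start=18, query_end=250, hit_start=1):
    """Map a query residue number to the hit numbering of a gap-free alignment."""
    if not query_start <= query_pos <= query_end:
        return None  # residue lies outside the aligned interval
    return query_pos - query_start + hit_start


print(query_to_hit(18))   # first aligned residue
print(query_to_hit(250))  # last aligned residue
print(query_to_hit(5))    # not covered by the alignment
```

An alignment that contains gaps would need a column-by-column walk instead of a single offset.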
                                                                                 

Pairwise alignment explained

Using the last segment of the alignment from the RHA1 example above, this is how the alignment should be read:

                             N         A  A                                       <-- annotation 
    178 VAGVEYVDHWSYVDSIYETLGNATVNSYFPIDHTHTSPAGAEVVAEAFLKAVVCTGTSLKSVLTTTSFEGTCL <-- query sequence
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| <-- match/mismatch
    161 VAGVEYVDHWSYVDSIYETLGNATVNSYFPIDHTHTSPAGAEVVAEAFLKAVVCTGTSLKSVLTTTSFEGTCL <-- hit (PDB) sequence
        HHT EEE HHHHHHHHHHHH HHHHHHT SSSSS   HHHHHHHHHHHHHHHHHHT GGGGGBS    SS    <-- DSSP secondary structure
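
As a rough illustration of how the rows line up, the Python sketch below walks the columns of this segment, counts identical residues and reports the query and hit numbers of every annotated position. The variable names are illustrative, the annotation row is rebuilt from the column positions shown above, and the code assumes a gap-free segment (gaps would require separate counters for query and hit).

```python
query_start, hit_start = 178, 161
query = "VAGVEYVDHWSYVDSIYETLGNATVNSYFPIDHTHTSPAGAEVVAEAFLKAVVCTGTSLKSVLTTTSFEGTCL"
hit = "VAGVEYVDHWSYVDSIYETLGNATVNSYFPIDHTHTSPAGAEVVAEAFLKAVVCTGTSLKSVLTTTSFEGTCL"
# Annotation row, padded with blanks to the length of the sequence rows.
annotation = (" " * 21 + "N" + " " * 9 + "A" + " " * 2 + "A").ljust(len(query))

# Count the columns in which query and hit carry the same residue.
identical = sum(q == h for q, h in zip(query, hit))

# Collect (symbol, query number, hit number, residue) for annotated columns.
annotated = [
    (symbol, query_start + i, hit_start + i, query[i])
    for i, symbol in enumerate(annotation)
    if symbol != " "
]

print(f"{identical}/{len(query)} identical")
for symbol, q_no, h_no, residue in annotated:
    print(f"{symbol}: {residue}{q_no} in the query, residue {h_no} in the hit")
```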

The DSSP code for secondary structure:

    * H = alpha helix
    * B = residue in isolated beta-bridge
    * E = extended strand, participates in beta ladder
    * G = 3-helix (3/10 helix)
    * I = 5 helix (pi helix)
    * T = hydrogen bonded turn
    * S = bend 
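
For downstream processing, the code list above can be kept as a lookup table. The Python sketch below tallies the states in a DSSP string. The names `DSSP_CODES` and `summarize_dssp` are illustrative, and treating any other symbol (such as the blank used in the alignment view) as "no assignment" is an assumption.

```python
from collections import Counter

DSSP_CODES = {
    "H": "alpha helix",
    "B": "residue in isolated beta-bridge",
    "E": "extended strand, participates in beta ladder",
    "G": "3-helix (3/10 helix)",
    "I": "5 helix (pi helix)",
    "T": "hydrogen bonded turn",
    "S": "bend",
}


def summarize_dssp(dssp):
    """Count residues per secondary structure state in a DSSP string."""
    summary = Counter()
    for code in dssp:
        # Symbols outside the table are grouped as "no assignment" (assumption).
        summary[DSSP_CODES.get(code, "no assignment")] += 1
    return dict(summary)


example_summary = summarize_dssp("HHT EEE")
for state, count in example_summary.items():
    print(f"{state}: {count}")
```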

Annotation color scheme

The color scheme used for highlighting sequence annotations is as follows:

Letter  Description            Color        Graphics
.       Null annotation        .            .
A       Active site            yellow       stick
N       N-glycosylation        red          spheres
O       O-glycosylation        purple       spheres
S       S-phosphorylation      cyan         spheres
T       T-phosphorylation      slate        spheres
Y       Y-phosphorylation      blue         spheres
U       Tyr-sulfation          orange       spheres
X       Generic PTM            white        stick
0       Custom backbone color  black        .
1       Custom backbone color  white/slate  .
2       Custom backbone color  red          .
3       Custom backbone color  cyan         .
4       Custom backbone color  purple       .
5       Custom backbone color  green        .
6       Custom backbone color  blue         .
7       Custom backbone color  yellow       .
8       Custom backbone color  orange       .
9       Custom backbone color  brown        .
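
The scheme can also be written down as a lookup table, which is convenient when checking an annotation string before submitting it. The Python sketch below is illustrative only: the names `Style`, `ANNOTATION_STYLES` and `styled_residues` are hypothetical, and the code does not reproduce the PyMOL script that FeatureMap3D generates.

```python
from typing import NamedTuple, Optional


class Style(NamedTuple):
    description: str
    color: Optional[str]     # None where the table shows "."
    graphics: Optional[str]  # None where the table shows "."


ANNOTATION_STYLES = {
    ".": Style("Null annotation", None, None),
    "A": Style("Active site", "yellow", "stick"),
    "N": Style("N-glycosylation", "red", "spheres"),
    "O": Style("O-glycosylation", "purple", "spheres"),
    "S": Style("S-phosphorylation", "cyan", "spheres"),
    "T": Style("T-phosphorylation", "slate", "spheres"),
    "Y": Style("Y-phosphorylation", "blue", "spheres"),
    "U": Style("Tyr-sulfation", "orange", "spheres"),
    "X": Style("Generic PTM", "white", "stick"),
}

# Digits 0-9 select a custom backbone color and have no sidechain graphics.
BACKBONE_COLORS = ["black", "white/slate", "red", "cyan", "purple",
                   "green", "blue", "yellow", "orange", "brown"]
for digit, color in enumerate(BACKBONE_COLORS):
    ANNOTATION_STYLES[str(digit)] = Style("Custom backbone color", color, None)


def styled_residues(annotation, first_residue=1):
    """Return (residue number, Style) for every non-null annotation symbol."""
    result = []
    for offset, symbol in enumerate(annotation):
        if symbol not in ANNOTATION_STYLES:
            raise ValueError(f"unknown annotation symbol: {symbol!r}")
        if symbol != ".":
            result.append((first_residue + offset, ANNOTATION_STYLES[symbol]))
    return result


# Hypothetical annotation fragment starting at residue 24.
for number, style in styled_residues("..A.N", first_residue=24):
    print(number, style.description, style.color, style.graphics)
```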



GETSTRUCT REPORT

The output from GetStruct is in COL (column) format, a modified, protein-specific version of the column format. The output is mostly self-explanatory. The κ, t and τ parameters are defined in D.M. Soumpasis and M.C. Strahm (2000) J Biomol Struct Dyn. 17(6):965-79.

If the option "Include the selection details for each query sequence" was selected, a line with the selection details is shown for each of the BLAST hits found, in the order determined by the selection process. This can be used to check whether GetStruct has selected the best structure (marked by "*"). If the user prefers one of the other hits, GetStruct can be rerun with the option "Show all hits (no selection, the criteria below will not apply)" to get the desired hit.

The DSSP code for secondary structure:

    * H = alpha helix
    * B = residue in isolated beta-bridge
    * E = extended strand, participates in beta ladder
    * G = 3-helix (3/10 helix)
    * I = 5 helix (pi helix)
    * T = hydrogen bonded turn
    * S = bend 

EXAMPLE GETSTRUCT REPORT

>sp_Q00017_RHA1_ASPAC.1.1 (250 aa), hit 1 alignment 1
# Hit sequence name .............. 1K7C.A
# Hit sequence comment ........... HYDROLASE                              
# Hit sequence length ............ 233
# Hit entry resolution ........... 1.12
# Alignment length ............... 233
# Alignment interval in query .... 18-250
# Alignment interval in hit ...... 1-233
# BLAST score .................... 473
# BLAST exp value ................ e-134
# BLAST ident regime ............. 233/233=100%
# 
# Column  1 ...................... Q#,    query residue #
# Column  2 ...................... QA,    query annotation
# Column  3 ...................... Q,     query residue
# Column  4 ...................... I,     identity
# Column  5 ...................... H,     hit residue
# Column  6 ...................... S,     secondary structure
# Column  7 ...................... ACC,   solvent accessibility
# Column  8 ...................... H#,    hit residue #
# Column  9 ...................... H#C,   hit residue # native
# Column 10 ...................... X,     atomic coordinate x
# Column 11 ...................... Y,     atomic coordinate y
# Column 12 ...................... Z,     atomic coordinate z
# Column 13 ...................... K,     KAPPA
# Column 14 ...................... t,     t
# Column 15 ...................... T,     TAV
# Column 16 ...................... Phi,   torsion angle
# Column 17 ...................... Psi,   torsion angle
# Column 18 ...................... Bfact, B factor
# Column 19 ...................... Occ,   occupancy
# 
# /sp_Q00017_RHA1_ASPAC/1/1/100/1.12/e-134/233/233/18/250/5/1K7C.A/
# 
# Q#  A  Q I H  S    ACC    H#   H#C     X        Y       Z          K       t       T      Phi       Psi       Bfact  Occ
# ------------------------------------------------------------------------------------------------------------------------
  18  .  T = T  .     95     1     1    26.443   26.547   37.135  (undef) (undef) (undef)    (undef)  139.8373  16.70 1.00
  19  .  T = T  E     29     2     2    29.608   24.483   36.686  (undef)    0.64   -0.31  -119.6066  141.2528  11.04 1.00
  20  .  V = V  E      0     3     3    33.202   25.539   37.225     0.29    0.69   -0.25  -110.2525  115.3524   9.01 1.00
  21  .  Y = Y  E     23     4     4    35.711   23.553   35.174     0.27    0.61   -0.31   -96.1051  135.6781   8.87 1.00
  22  .  L = L  E      0     5     5    39.363   23.532   36.251     0.25    0.78   -0.07  -113.7708  130.8918   7.99 1.00
  23  .  A = A  E      0     6     6    42.269   22.997   33.850     0.27    0.03    0.39  -124.0696  126.6376   8.27 1.00
  24  .  G = G  .      0     7     7    45.800   22.724   35.242     0.15    0.04   -0.36   177.0294 -164.0539   8.27 1.00
  25  .  D = D  .      1     8     8    48.591   20.446   36.423     0.18    0.24   -0.33  -115.9846 -159.8036   7.87 1.00
  26  A  S = S  T      7     9     9    49.644   18.576   39.587     0.37    0.24    0.53   -61.2798  -19.8555   9.63 1.00
  27  .  T = T  T      6    10    10    49.488   21.783   41.611     0.38    0.56    0.26   -73.9595  -18.0628   8.33 1.00
  28  .  M = M  T      0    11    11    45.715   21.845   40.974     0.32    0.08    0.45  -108.9317  -30.2153   8.30 1.00
  29  .  A = A  S      4    12    12    44.845   18.190   40.380     0.22    0.02   -0.43   -71.7497  153.0791   9.33 1.00
  30  .  K = K  T    137    13    13    43.346   15.565   42.643     0.31    0.54    0.17   -54.3985  132.1347  11.50 1.00
  31  .  N = N  T    104    14    14    46.190   13.655   44.328     0.36    0.52   -0.40    73.0668    7.6670  13.04 1.00
  32  .  G = G  T      6    15    15    48.736   16.336   43.504     0.29    0.48   -0.24    55.0216 -132.5436  12.22 1.00
  33  .  G = G  T     33    16    16    52.128   14.888   42.590     0.36    0.13   -0.47  -101.6921   23.9040  17.91 1.00
  34  .  G = G  S     31    17    17    51.531   11.744   44.636     0.23    0.03    0.45    83.5634 -175.6733  22.69 1.00
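
Because the report is whitespace-delimited, it is straightforward to load into a script. The Python sketch below reads the "# key .... value" header lines into a dictionary and the data rows into one dictionary per residue. The function name `parse_report` is hypothetical, and the parsing rules (19 whitespace-separated fields per row, "(undef)" read as a missing value, H#C kept as text) are assumptions drawn from the example above rather than a formal specification of the COL format.

```python
import re

COLUMNS = ["Q#", "QA", "Q", "I", "H", "S", "ACC", "H#", "H#C", "X", "Y", "Z",
           "K", "t", "T", "Phi", "Psi", "Bfact", "Occ"]
INT_COLUMNS = {"Q#", "ACC", "H#"}
# H#C (native hit numbering) is kept as text in case it is not a plain integer.
TEXT_COLUMNS = {"QA", "Q", "I", "H", "S", "H#C"}
HEADER_LINE = re.compile(r"^#\s+(.+?)\s+\.{3,}\s+(.*?)\s*$")


def parse_report(text):
    """Return (header dict, list of row dicts) for one GetStruct report."""
    header, rows = {}, []
    for line in text.splitlines():
        if line.startswith("#"):
            match = HEADER_LINE.match(line)
            # Skip the "Column N" legend; keep the other key/value lines.
            if match and not match.group(1).startswith("Column"):
                header[match.group(1)] = match.group(2)
            continue
        fields = line.split()
        if line.startswith(">") or len(fields) != len(COLUMNS):
            continue  # title line, blank line or anything unexpected
        row = {}
        for name, value in zip(COLUMNS, fields):
            if name in TEXT_COLUMNS:
                row[name] = value
            elif value == "(undef)":
                row[name] = None
            elif name in INT_COLUMNS:
                row[name] = int(value)
            else:
                row[name] = float(value)
        rows.append(row)
    return header, rows


sample = """>sp_Q00017_RHA1_ASPAC.1.1 (250 aa), hit 1 alignment 1
# Hit sequence name .............. 1K7C.A
# Alignment interval in query .... 18-250
# Column  1 ...................... Q#,    query residue #
#
  18  .  T = T  .     95     1     1    26.443   26.547   37.135  (undef) (undef) (undef)    (undef)  139.8373  16.70 1.00
  19  .  T = T  E     29     2     2    29.608   24.483   36.686  (undef)    0.64   -0.31  -119.6066  141.2528  11.04 1.00
"""

header, rows = parse_report(sample)
print(header["Hit sequence name"], len(rows), rows[1]["Phi"])
```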



