On this page, we guide you through an example output and explain its meaning.
For this example, we used the purine nucleoside phosphorylase from Yersinia pseudotuberculosis.
This protein was submitted as a single sequence in FASTA format, as shown on the "Instructions" page. Hovering the mouse cursor over each part of the output displays an explanation, and following the links opens an example. In between the output you will sometimes find explanation boxes (like this one). At the very end of this page you will find more details on particular output files.
At the top of the output page, you will find something like this:
Result for Q66EV7_DEOD_YERPS_Purine_nucleoside_phosphorylase_deoD_type
The color scheme used for highlighting the co-evolving residues ordered by score is as follows:
Letter  Description          Color        Graphics
C       Conserved residue    yellow       .
1       co-evolving residue  white/slate  stick
2       co-evolving residue  red          stick
3       co-evolving residue  cyan         stick
4       co-evolving residue  purple       stick
5       co-evolving residue  orange       stick
6       co-evolving residue  blue         stick
7       co-evolving residue  yellow       stick
8       co-evolving residue  brown        stick
9       co-evolving residue  black        stick
If a group of more than two residues is found to co-evolve, all the residues in the group are marked in the same color.
Additional colors are available and are marked in the alignment with the letters [a-z], giving a total of
35 different colors (although many of them are similar). The stick shape is maintained. Visualizing more than 9 interacting pairs or groups therefore demands a little more attention.
A list with all the colors, and the associated symbols, can be found here.
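The symbol scheme described above (digits 1 to 9 followed by the letters a to z, 35 symbols in total) can be sketched in Python. This is only an illustration: the function name `annotation_symbol` is hypothetical, and the assumption that rank 10 maps to "a", rank 11 to "b", and so on is ours, not something stated by InterMap3D.

```python
import string

# Assumed ordering: ranks 1-9 use the digits, ranks 10-35 use the letters a-z.
SYMBOLS = "123456789" + string.ascii_lowercase


def annotation_symbol(rank):
    """Return the annotation symbol for a 1-based pair/group rank (hypothetical helper)."""
    if not 1 <= rank <= len(SYMBOLS):
        raise ValueError(f"rank must be between 1 and {len(SYMBOLS)}")
    return SYMBOLS[rank - 1]
```

Under this assumption, `annotation_symbol(10)` would give "a", which is consistent with the "a" markers that appear in the example alignment below.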
Query : Q66EV7_DEOD_YERPS_Purine_nucleoside_phosphorylase_deoD_type___Y.1.1 (239 aa)
Hit : 1PW7 (molecule: PURINE NUCLEOSIDE PHOSPHORYLASE;) (chain: c) (resolution: 2.00 ANGSTROMS.)
Organism: ESCHERICHIA COLI, AND ESCHERICHIA
C 8 a2 CCC C C 8 C 6 C 6 C C C C 4CC 4
2 ATPHINAEMGDFADVVLMPGDPLRAKFIAETFLQDVREVNNVRGMLGFTGTYKGRKISVMGHGMGIPSCSIYAKELITDF 81
||||||||||||||||||||||||||:||||||:|.||||||||||||||||||||||||||||||||||||.|||||||
1 ATPHINAEMGDFADVVLMPGDPLRAKYIAETFLEDAREVNNVRGMLGFTGTYKGRKISVMGHGMGIPSCSIYTKELITDF 80
BTTB TTSS SEEEE S HHHHHHHHHHH EEEEEEE GGG EEEEEETTEEEEEE SSHHHHHHHHHHHHHHS
2 C C 1C 3 3 8 5 5 aC
82 GVKKIIRVGSCGAVRTDVKLRDVVIGMGACTDSKVNRMRFKDHDYAAIADFEMTRNAVDAAKAKGVNVRVGNLFSADLFY 161
|||||||||||||| ||||||||||||||||||||:||||||:||||||:|.||||||||| |::.||||||||||||
81 GVKKIIRVGSCGAVLPHVKLRDVVIGMGACTDSKVNRIRFKDHDFAAIADFDMVRNAVDAAKALGIDARVGNLFSADLFY 160
EEEEEEEEEE STT TT EEEEEEEEES SHHHHHTTTS B HHHHHHHHHHHHHTT EEEEEEE S SS
CC 17 C 7 C C
162 TPDPQMFDVMEKYGILGVEMEAAGIYGVAAEFGAKALTICTVSDHIRTGEQTTAAERQTTFNDMIEIALESVLLGD 237
:|| :||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||:||||||||||
161 SPDGEMFDVMEKYGILGVEMEAAGIYGVAAEFGAKALTICTVSDHIRTHEQTTAAERQTTFNDMIKIALESVLLGD 236
S TTHHHHHHHTT EEESSHHHHHHHHHHHT EEEEEEEEEEETTT B TTHHHHHHHHHHHHHHHHHHHHH
Explanation box:
The above is an annotated alignment between the query sequence and the sequence of the protein whose structure is graphically displayed.
The key to the annotation can be seen below:
N A A <-- annotation
178 VAGVEYVDHWSYVDSIYETLGNATVNSYFPIDHTHTSPAGAEVVAEAFLKAVVCTGTSLKSVLTTTSFEGTCL <-- query sequence
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| <-- match/mismatch
161 VAGVEYVDHWSYVDSIYETLGNATVNSYFPIDHTHTSPAGAEVVAEAFLKAVVCTGTSLKSVLTTTSFEGTCL <-- hit (PDB) sequence
HHT EEE HHHHHHHHHHHH HHHHHHT SSSSS HHHHHHHHHHHHHHHHHHT GGGGGBS SS <-- DSSP secondary structure
The annotation codes used in InterMap3D correspond to the color codes explained before.
The DSSP code for secondary structure:
* H = alpha helix
* B = residue in isolated beta-bridge
* E = extended strand, participates in beta ladder
* G = 3-helix (3/10 helix)
* I = 5 helix (pi helix)
* T = hydrogen bonded turn
* S = bend
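As a small illustration of how the DSSP line of the alignment can be read, the following Python sketch counts how many residues fall into each secondary-structure class. The helper name `summarize_dssp` is hypothetical; characters that are not in the list above (such as blanks) are simply ignored.

```python
from collections import Counter

# DSSP one-letter codes, as listed above.
DSSP_CODES = {
    "H": "alpha helix",
    "B": "residue in isolated beta-bridge",
    "E": "extended strand, participates in beta ladder",
    "G": "3-helix (3/10 helix)",
    "I": "5 helix (pi helix)",
    "T": "hydrogen bonded turn",
    "S": "bend",
}


def summarize_dssp(dssp_line):
    """Count residues per secondary-structure class in a DSSP string."""
    counts = Counter(dssp_line)
    # Keep only characters that are known DSSP codes.
    return {DSSP_CODES[c]: n for c, n in counts.items() if c in DSSP_CODES}
```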
Predicted co-evolving residues:
Position 1  Position 2  Prediction    Score  Distance
92          203         co-evolution  4.38   4.983 Å
19          86          co-evolution  3.70   7.332 Å
115         124         co-evolution  3.63   10.501 Å
75          79          co-evolution  3.58   6.196 Å
136         140         co-evolution  3.52   6.212 Å
48          55          co-evolution  3.43   18.880 Å
204         222         co-evolution  3.41   9.505 Å
13          40          co-evolution  3.37   7.644 Å
13          128         co-evolution  3.34   25.554 Å
18          160         co-evolution  3.34   22.860 Å
5           5           conserved     .
21          21          conserved     .
22          22          conserved     .
23          23          conserved     .
25          25          conserved     .
30          30          conserved     .
44          44          conserved     .
51          51          conserved     .
64          64          conserved     .
66          66          conserved     .
69          69          conserved     .
73          73          conserved     .
76          76          conserved     .
77          77          conserved     .
88          88          conserved     .
90          90          conserved     .
93          93          conserved     .
161         161         conserved     .
181         181         conserved     .
182         182         conserved     .
218         218         conserved     .
225         225         conserved     .
230         230         conserved     .
Explanation box:
How to interpret the results in the ranked list of coevolving pairs:
After the alignment comes the list with the results. This list shows the pairs predicted to be co-evolving, ordered by the strength of our belief in their co-evolution, that is, by the score (which is not a significance measure). The meaning of the score depends on the method used.
Meaning of the scores by method:
RCW MI: The score is the Mutual Information between the residues in position 1 and 2 divided by their background Mutual Information to all other sites.
MI / Entropy: The score is the Mutual Information between the residues in position 1 and 2 divided by the entropy of positions 1 and 2.
Dependency: The score is the entropy-weighted dependency ratio. In essence, it is the Mutual Information between the residues in position 1 and 2 divided by their background Mutual Information and then by a factor dependent on the entropy of positions 1 and 2.
Intersection of different methods: In case of intersection, the score is the average rank of the pair in the results lists of the different methods. If a pair, for instance, was the top hit according to RCW MI and the 2nd hit according to the other method used in the intersection (for instance, Dependency), then its score will be 1.5.
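The quantities behind these scores can be illustrated with a minimal Python sketch. It is not the InterMap3D implementation: the function names are hypothetical, entropies are computed in bits, and the "MI / Entropy" normalization is assumed here to use the joint entropy of the two positions, which the text above does not specify. The RCW MI and Dependency normalizations are not reproduced.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    total = len(symbols)
    counts = Counter(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def mutual_information(col_a, col_b):
    """MI(A;B) = H(A) + H(B) - H(A,B) for two equally long alignment columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same length")
    joint = list(zip(col_a, col_b))
    return entropy(col_a) + entropy(col_b) - entropy(joint)


def mi_over_entropy(col_a, col_b):
    """MI divided by the joint entropy (assumed normalization)."""
    joint_h = entropy(list(zip(col_a, col_b)))
    return mutual_information(col_a, col_b) / joint_h if joint_h > 0 else 0.0


def average_rank(pair, *ranked_lists):
    """Average 1-based rank of `pair` across several ranked result lists."""
    return sum(r.index(pair) + 1 for r in ranked_lists) / len(ranked_lists)
```

For instance, a pair ranked first by one method and second by another gets `average_rank(...) == 1.5`, matching the example given for the intersection score.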
The generated phylogenetic trees:
The trees are generated from the full alignment. Clicking the link (with a depiction of a phylogenetic tree, in pink) opens the software ATV (Zmasek et al., 2001) in a new window and loads the tree file.
For each sequence depicted on the tree, the amino acids present at the positions predicted as co-evolving are shown as node labels. This feature can be turned on or off in the ATV applet.
Tips for better visualization of the phylogenetic trees:
Unchecking the "show taxonomy" box will remove the residue-pair labels.
Checking the "show seq names" box will display the sequence names.
Unchecking the "color species" box will remove coloring for the labels.
If the tree contains many leaves, it is convenient to "Zoom it on Y" for a better visualization.
Frequencies of occurring pairs:
Clicking on the right-most link (with a depiction of a spreadsheet) will open a page showing the frequency at which each amino acid pair occurs (for the pair of positions predicted to be co-evolving).
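The kind of table shown on that page can be approximated with a short Python sketch that counts residue pairs at two alignment columns. The function name `pair_frequencies` and the input format (a list of equally long aligned sequences, 1-based positions) are assumptions made for illustration only.

```python
from collections import Counter


def pair_frequencies(sequences, pos1, pos2):
    """Relative frequency of each residue pair at 1-based columns pos1 and pos2."""
    pairs = Counter((seq[pos1 - 1], seq[pos2 - 1]) for seq in sequences)
    total = sum(pairs.values())
    return {pair: n / total for pair, n in pairs.items()}
```

A pair of positions that co-evolves would typically show a few dominant residue pairs rather than all combinations at similar frequencies.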
>CDK2_HUMAN.1.1 (298 aa), hit 1 alignment 1
# Hit sequence name .............. 2EXM.A
# Hit sequence comment ........... TRANSFERASE
# Hit sequence length ............ 298
# Hit entry resolution ........... 1.80
# Alignment length ............... 298
# Alignment interval in query .... 1-298
# Alignment interval in hit ...... 1-298
# BLAST score .................... 605
# BLAST exp value ................ e-173
# BLAST ident regime ............. 298/298=100%
#
# Column 1 ...................... Q#, query residue #
# Column 2 ...................... QA, query annotation
# Column 3 ...................... Q, query residue
# Column 4 ...................... I, identity
# Column 5 ...................... H, hit residue
# Column 6 ...................... S, secondary structure
# Column 7 ...................... ACC, solvent accessibility
# Column 8 ...................... H#, hit residue #
# Column 9 ...................... H#C, hit residue # native
# Column 10 ...................... X, atomic coordinate x
# Column 11 ...................... Y, atomic coordinate y
# Column 12 ...................... Z, atomic coordinate z
# Column 13 ...................... K, KAPPA
# Column 14 ...................... t, t
# Column 15 ...................... T, TAV
# Column 16 ...................... Phi, torsion angle
# Column 17 ...................... Psi, torsion angle
# Column 18 ...................... Bfact, B factor
# Column 19 ...................... Occ, occupancy
#
# /CDK2_HUMAN/1/1/100/1.80/e-173/298/298/1/298/48/2EXM.A/
#
# Q# A Q I H S ACC H# H#C X Y Z K t T Phi Psi Bfact Occ
# ------------------------------------------------------------------------------------------------------------------------
1 . M = M . 27 1 1 103.736 111.276 93.725 (undef) (undef) (undef) (undef) 170.1962 69.35 1.00
2 . E = E . 126 2 2 105.169 114.841 93.811 (undef) 0.24 0.47 -104.6923 -11.6199 58.56 1.00
3 . N = N S 77 3 3 108.467 114.674 91.927 0.38 0.21 0.38 -92.2273 4.6679 47.15 1.00
4 . F = F E 12 4 4 106.535 113.718 88.835 0.21 0.51 -0.38 -131.5151 149.8276 42.90 1.00
5 . Q = Q E 95 5 5 103.834 115.410 86.809 0.30 0.41 -0.44 -112.7757 112.2643 49.77 1.00
6 . K = K E 77 6 6 101.359 112.839 85.567 0.30 0.82 0.01 -62.4825 143.0278 42.78 1.00
7 . V = V E 61 7 7 100.654 113.362 81.907 0.35 0.32 -0.10 -124.5373 -1.7162 46.03 1.00
8 . E = E E 120 8 8 98.229 110.463 81.416 0.14 0.10 -0.39 174.0266 168.2736 45.87 1.00
9 . K = K E 103 9 9 96.743 107.183 82.619 0.31 0.80 -0.15 -113.2600 108.3539 43.82 1.00
10 . I = I E 74 10 10 98.271 104.443 80.470 0.38 0.04 -0.40 -73.5653 -17.6334 42.89 1.00
11 C G = G E 26 11 11 95.773 102.022 81.929 0.16 0.16 -0.32 146.5818 -167.5758 41.43 1.00
12 C E = E E 72 12 12 95.074 99.768 84.855 0.27 0.86 -0.13 -125.7959 106.4560 48.42 1.00
13 C G = G . 26 13 13 97.405 96.860 85.441 0.20 0.06 -0.43 -83.2946 -164.8109 42.19 1.00
14 4 T = T S 16 14 14 97.420 93.848 87.749 0.37 0.48 0.30 -67.4280 -35.3884 45.14 1.00
15 4 Y = Y S 22 15 15 98.811 95.909 90.640 0.32 0.20 -0.31 -102.6491 -14.5752 38.77 1.00
(...)
295 . H = H . 88 295 295 104.767 99.631 61.462 0.31 0.59 0.35 -101.6250 89.4394 62.12 1.00
296 . L = L . 23 296 296 103.557 102.037 64.123 0.19 0.94 -0.16 -137.2310 135.2956 60.00 1.00
297 . R = R . 246 297 297 101.894 105.419 64.247 0.29 (undef) (undef) -117.0606 111.2114 66.66 1.00
298 . L = L . 150 298 298 101.772 106.937 67.724 (undef) (undef) (undef) -86.8988 (undef) 56.47 1.00
//
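A file like the one above can be read with a few lines of Python. The sketch below assumes that data rows consist of exactly 19 whitespace-separated fields in the column order listed in the header, and that missing values are written as "(undef)"; rows that do not fit this pattern (comments, the ">" title line, "//") are skipped. The function name `parse_residue_table` is hypothetical.

```python
COLUMNS = ["Q#", "QA", "Q", "I", "H", "S", "ACC", "H#", "H#C",
           "X", "Y", "Z", "K", "t", "T", "Phi", "Psi", "Bfact", "Occ"]
INT_COLS = {"Q#", "ACC", "H#"}
FLOAT_COLS = {"X", "Y", "Z", "K", "t", "T", "Phi", "Psi", "Bfact", "Occ"}


def _convert(name, value):
    """Convert one field; '(undef)' becomes None, unparsable values stay strings."""
    if value == "(undef)":
        return None
    try:
        if name in INT_COLS:
            return int(value)
        if name in FLOAT_COLS:
            return float(value)
    except ValueError:
        pass
    return value


def parse_residue_table(text):
    """Return one dict per data row of the per-residue output file."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ">", "//")):
            continue
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue  # not a data row under the assumed layout
        rows.append({name: _convert(name, value)
                     for name, value in zip(COLUMNS, fields)})
    return rows
```

The native hit residue number (H#C) is kept as a string because PDB residue numbers may carry insertion codes.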
How to use the compressed archive containing all files "All files (TAR archive)"
Use a decompression program to recover all the files included in the .tgz compressed archive. On Windows, suitable programs are PowerArchiver 6.1, 7-Zip (freeware), WinRAR or WinZip. In some of these, you might have to decompress the archive twice in a row. On Macintosh, suitable programs are BOMArchiveHelper or StuffIt Expander. On Unix, use gunzip followed by tar (or tar -xzf in a single step). This will extract all the output files contained in the archive to a folder. Depending on your machine's configuration, a simple double-click on the file may be enough to launch the appropriate program, decompress the archive and create the folder.
If you have PyMOL installed, the PyMOL script (".pml" file) can be opened in PyMOL by double-clicking on it. This will load the PDB structure and color it. The structure can then be rotated to a better angle, and a new ray-traced image created by clicking on the "ray" button (be warned: ray tracing may take a long time, and your computer may appear to have frozen).
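If Python is available, the archive can also be unpacked with the standard-library `tarfile` module. This is a minimal sketch; the function name `extract_archive` and the file names in the usage line are placeholders, not names produced by the server.

```python
import os
import tarfile


def extract_archive(archive_path, dest_dir):
    """Extract a gzip-compressed tar archive (.tgz) and return its member names."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        archive.extractall(dest_dir)
    return names
```

For example, `extract_archive("all_files.tgz", "results")` would unpack the archive into a folder named "results". As with any archive tool, only extract archives from a source you trust.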
GETTING HELP
Scientific problems:
Technical problems:
This file was last modified Thursday 24th July 2008 10:47:07 GMT