Immunological Bioinformatics Exercise 1: Prediction of B-cell epitopes
Overview
In this exercise you will use bioinformatics tools to predict the
location of B-cell epitopes. Because most B-cell epitopes are
structural epitopes formed by amino acids that are not adjacent to
each other in the sequence, predicting B-cell epitopes is a much more
complicated task than predicting T-cell epitopes. For many proteins
the 3D structure is unknown, so only sequence-based prediction methods
can be used, which means that only linear epitopes can be predicted.
In this exercise, however, we will work on epitopes in proteins of
known structure, to give you a better understanding of the predicted
epitopes.
The exercise consists of two parts, each based on predictions of epitopes using methods developed at CBS:
- Prediction of linear B-cell epitopes using BepiPred
- Prediction of discontinuous B-cell epitopes using DiscoTope
For this exercise you will need PyMol installed and an internet browser running on your computer.
The questions below are meant to guide you through the exercise
and point to the most interesting topics.
Find at least one partner to discuss the exercise with.
Prediction of epitopes in the Cholera Enterotoxin
We are now going to have a look at epitopes in Cholera Enterotoxin.
A number of linear epitopes have been reported for this antigen, and
they have been compiled in a dataset (Pellequer et al., 1993). The
epitopes were identified by measuring the binding of peptide fragments
from the antigen sequence to antibodies. (Additional information can be
found in: Pellequer et al., 1993, Kazemi and Finkelstein, 1991 and Jacob et al., 1983.)
Have a look at the epitopes in CHO_measured_antigenicity.
The file shows the sequence of the cholera enterotoxin; the lines
that follow it mark the positions of linear epitopes (E). You will
now predict and analyze these epitopes using BepiPred.
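The annotation lines can be turned into residue ranges with a few lines of Python. This is only a sketch: it assumes one annotation character per residue, with "E" marking epitope positions and any other character marking non-epitope positions, and the example string is made up rather than taken from CHO_measured_antigenicity.

```python
from itertools import groupby


def epitope_ranges(annotation, marker="E"):
    """Return 1-based inclusive (start, end) ranges of consecutive markers."""
    ranges = []
    position = 1
    for is_epitope, run in groupby(annotation, key=lambda ch: ch == marker):
        length = len(list(run))
        if is_epitope:
            ranges.append((position, position + length - 1))
        position += length
    return ranges


# Invented 20-residue annotation line (assumption: "." means "not an epitope").
toy_annotation = "...EEEE.....EEE....."
print(epitope_ranges(toy_annotation))
```

Ranges in this form are convenient because they map directly onto the residue ranges used in PyMol selections.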
Prediction of linear B-cell epitopes using BepiPred
A number of sequence-based methods have been used in the past to
predict linear B-cell epitopes, most of them based on very simple
computational algorithms. Sequence-based methods are used when the
structure of the antigen has not been determined and cannot be
predicted well.
Recently, a number of sequence-based predictors built on more
advanced bioinformatics methods have been developed. One of them is
the BepiPred method, which was developed at CBS.
To predict epitopes and analyze them, do the following:
Download the following fasta file CHO.fsa and save it on the desktop of the computer.
Go to the homepage of the BepiPred prediction server: www.cbs.dtu.dk/services/BepiPred and upload the CHO.fsa file to the server.
Run BepiPred with the default options, and save the output file
on the desktop by right-clicking with the mouse and choosing "save page
as". Have a quick look at the prediction result in a browser window to get acquainted with the output.
Now we will try to visualize these epitopes, to see where they are located in the structure of the protein.
Start by downloading the structure of the cholera enterotoxin and save it on the desktop: CHO.pdb
Start the program called PyMol on your computer.
Click on "File->Open->.....CHO.pdb" to make PyMol locate and read the CHO.pdb structure from the desktop.
Then copy the commands below into the window named pymol tcl/tk gui. (You can copy all at once.)
hide everything
show cartoon
select pellequer, resi 33-42
color red, pellequer
select pellequer, resi 53-61
color red, pellequer
select pellequer, resi 69-84
color red, pellequer
select pellequer, resi 104-118
color red, pellequer
create bepipred1, resi 22-26 and not name c+o+n
create bepipred2, resi 51-54 and not name c+o+n
create bepipred3, resi 74-84 and not name c+o+n
create bepipred4, resi 112-115 and not name c+o+n
show sticks, bepipred1 or bepipred2 or bepipred3 or bepipred4
color blue, bepipred1 or bepipred2 or bepipred3 or bepipred4
center CHO
zoom all
The regions which are shown with a red backbone in the structure are regions annotated as epitopes.
The side chains of the residues which are predicted to be epitopes are shown in blue.
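If you want to colour a different set of regions, the selection commands can be generated instead of typed by hand. The sketch below builds a PyMol "resi" selection from a list of ranges; the helper name resi_selection is hypothetical, and the ranges are the annotated regions from the commands above.

```python
def resi_selection(ranges):
    """Build a PyMol residue selection such as 'resi 33-42+53-61'."""
    return "resi " + "+".join(f"{start}-{end}" for start, end in ranges)


# Annotated regions, copied from the PyMol commands in this exercise.
annotated = [(33, 42), (53, 61), (69, 84), (104, 118)]

commands = [
    f"select pellequer, {resi_selection(annotated)}",
    "color red, pellequer",
]
print("\n".join(commands))
```

The printed lines can be pasted into PyMol in the same way as the commands above. Joining the ranges with "+" gives one selection that covers all annotated regions, rather than reusing the same selection name four times.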
Where are the antigenic regions found in the structure?
(Is it likely that all annotated residues are able to interact with
antibodies?)
How well do the predicted epitopes fit the annotation?
What could cause differences between the prediction and the annotation?
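One way to put a number on the agreement is to count the residues shared by the annotated and predicted regions. The sketch below uses the residue ranges from the PyMol commands above; treating every residue in a range as equally important is a simplifying assumption.

```python
def residues(ranges):
    """Expand inclusive (start, end) ranges into a set of residue numbers."""
    return {r for start, end in ranges for r in range(start, end + 1)}


# Ranges copied from the PyMol commands in this exercise.
annotated = residues([(33, 42), (53, 61), (69, 84), (104, 118)])
predicted = residues([(22, 26), (51, 54), (74, 84), (112, 115)])
shared = annotated & predicted

# Fraction of annotated residues that were predicted, and
# fraction of predicted residues that are annotated.
fraction_annotated_found = len(shared) / len(annotated)
fraction_predicted_correct = len(shared) / len(predicted)

print(f"annotated: {len(annotated)}, predicted: {len(predicted)}, shared: {len(shared)}")
print(f"annotated residues found:    {fraction_annotated_found:.2f}")
print(f"predicted residues annotated: {fraction_predicted_correct:.2f}")
```

Keep in mind that the annotation itself comes from peptide binding experiments, so a residue counted here as a false prediction is not necessarily outside every epitope.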
Prediction of discontinuous cholera enterotoxin epitopes using DiscoTope
Recently, bioinformatics tools for epitope prediction have made
increasing use of protein structures, because most B-cell epitopes
are conformational and discontinuous. As more structures become
available and modelling methods for protein structure become more
advanced, the requirement for structural input will become less of
a problem. One of the new structure-based methods is the DiscoTope
method, developed at CBS. The method predicts residues which are
part of discontinuous epitopes.
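To see what "discontinuous" means in practice, the sketch below lists residue pairs that are far apart in the sequence but close together in space. It is not the DiscoTope algorithm: the 8 Å distance cutoff, the minimum sequence separation of 10 residues and the coordinates are all assumptions chosen for illustration.

```python
import math


def discontinuous_pairs(ca_coords, max_distance=8.0, min_separation=10):
    """Return residue pairs distant in sequence but close in space.

    ca_coords maps residue number -> (x, y, z) of the C-alpha atom.
    """
    pairs = []
    numbers = sorted(ca_coords)
    for i, a in enumerate(numbers):
        for b in numbers[i + 1:]:
            far_in_sequence = b - a >= min_separation
            close_in_space = math.dist(ca_coords[a], ca_coords[b]) <= max_distance
            if far_in_sequence and close_in_space:
                pairs.append((a, b))
    return pairs


# Invented C-alpha coordinates in Angstrom, not taken from any PDB entry.
toy_coords = {
    5: (0.0, 0.0, 0.0),
    6: (3.8, 0.0, 0.0),
    40: (0.0, 6.0, 0.0),
    90: (30.0, 0.0, 0.0),
}
print(discontinuous_pairs(toy_coords))
```

Residues 5 and 6 are neighbours in the sequence and are therefore skipped, while residue 40 lies close to both of them in space although it is far away in the sequence. A sequence-based method cannot see this kind of relationship.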
The cholera enterotoxin is a pentamer in its native form. Now, we
will try to predict residues in discontinuous epitopes using the
structure of the pentamer, which has the Protein Data Bank code 1fgb.
Go to the homepage of the DiscoTope prediction server: www.cbs.dtu.dk/services/DiscoTope
In part 1. after "Entry name (in four letter code):", write the pdb code 1fgb.
Leave the field "Chain(s)" empty to predict epitopes using all chains in the pentamer.
Run DiscoTope with the default options, and save the output file on the desktop.
By clicking on the link "View results in Jmol (please be patient...requires Jmol applet download)"
you can see a graphical representation of the predicted epitopes.
Where are the predicted residues located in the structure?
Is there an overlap between residues predicted by DiscoTope and the annotated epitopes?
Could some of the annotated linear epitopes be part of discontinuous epitopes?
Prediction of epitopes in the malaria protein AMA1
The malaria protein Apical Membrane Antigen 1 (AMA1) is a candidate
component of a malaria vaccine. In the ectodomain of this protein,
two epitopes have been identified. One of the epitopes was identified
using X-ray crystallography, and the other by measuring the
binding of mutants of the protein to antibodies. Additionally, sequence
analysis of the protein has shown that a number of the residues are
polymorphic, and it has been suggested that this polymorphism helps
the parasite escape the human immune system. Now we will analyze the
predictions of these epitopes.
Prediction of linear B-cell epitopes using BepiPred
Download the following fasta file AMA1.fsa and save it on the desktop of the computer.
Go to the homepage of the BepiPred prediction server: www.cbs.dtu.dk/services/BepiPred and upload the AMA1.fsa file to the server.
In the field after "Score threshold for epitope assignment" enter 0.9, which corresponds to a specificity of 91% and a sensitivity of 25%.
Run BepiPred and save the output file on the desktop by right-clicking with the mouse and choosing "save page as".
Have a quick look at the prediction result in a browser window to get acquainted with the output.
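The score threshold in the steps above trades sensitivity against specificity. The sketch below shows how the two values are computed for a given threshold. The scores and epitope labels are invented and are not BepiPred output, and counting a residue as predicted when its score is strictly above the threshold is an assumption that mirrors the "b > 0.9" selection used in the PyMol commands of this exercise.

```python
def sensitivity_specificity(scores, is_epitope, threshold):
    """Return (sensitivity, specificity) when score > threshold means 'epitope'."""
    tp = fp = tn = fn = 0
    for score, truth in zip(scores, is_epitope):
        predicted = score > threshold
        if predicted and truth:
            tp += 1
        elif predicted:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Invented per-residue scores and labels, for illustration only.
toy_scores = [1.2, 0.5, 0.4, -0.3, 1.0, 0.1, 0.7, -0.8]
toy_epitope = [True, False, True, False, True, False, True, False]

for threshold in (0.35, 0.9):
    sens, spec = sensitivity_specificity(toy_scores, toy_epitope, threshold)
    print(f"threshold {threshold}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Raising the threshold removes false positives, which increases specificity, but it also discards true epitope residues, which lowers sensitivity.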
Now we will try to visualize these epitopes, to see where they are located in the structure of the protein.
Start by downloading the structure of AMA1 with BepiPred predictions and save it on the desktop: AMA1_bepi.pdb
Start PyMol on your computer.
Click on "File->Open->....AMA1_bepi.pdb" to make PyMol locate and read the AMA1_bepi.pdb structure on the desktop.
Then copy the commands below into the window named pymol tcl/tk gui. (You can copy all at once.)
hide everything
bg_color white
show cartoon
color white
select B_gt_0.9, c. a and b > 0.9
select B_lteq_0.9, c. a and (b < 0.9 or b = 0.9)
color red, B_gt_0.9
create 4G2, c. a and resi 348+351+352+354+355+356+385+388+389 and not name c+o+n
color black, 4G2
show sticks, 4G2
create 1F9, c. a and resi 188+197+200+201+204+223+225 and not name c+o+n
color black, 1F9
show sticks, 1F9
create poly, c. a and resi 187+230+243
show sticks, poly
color yellow, poly
set_view (\
-0.575032651, -0.008835237, 0.818080187,\
-0.307444185, -0.924317956, -0.226084813,\
0.758166254, -0.381521791, 0.528795898,\
0.000043839, -0.000048044, -211.527801514,\
2.967290878, 15.267670631, 48.428859711,\
60.298721313, 362.747131348, 0.000000000 )
Now you see the residues which are predicted to be part of epitopes
with the backbone in red. The two epitopes are shown with sidechains in
black. Polymorphic positions are shown with sidechains in yellow.
Where are the predicted residues located in the structure?
How well do the epitopes predicted by BepiPred correspond to the annotated epitopes?
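The selections above work because the prediction score of each residue is stored in the B-factor column of the PDB file, which is what PyMol reads as "b". The sketch below reads that column directly, using the fixed column positions of PDB ATOM records. The example lines are generated by a small hypothetical helper and are not taken from AMA1_bepi.pdb; using the upper-case chain identifier "A" is also an assumption.

```python
def atom_line(serial, name, res_name, chain, res_seq, b_factor):
    """Build a minimal PDB ATOM record (toy data helper, coordinates set to 0)."""
    return (
        f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b_factor:6.2f}"
    )


def residue_scores(pdb_lines, chain="A"):
    """Map residue number -> B-factor value, read from the C-alpha atoms."""
    scores = {}
    for line in pdb_lines:
        is_ca = line.startswith("ATOM") and line[12:16].strip() == "CA"
        if is_ca and line[21] == chain:
            # Columns 23-26 hold the residue number, 61-66 the B-factor.
            scores[int(line[22:26])] = float(line[60:66])
    return scores


# Invented records: three residues in chain A and one in chain B.
toy_lines = [
    atom_line(1, " CA ", "ASP", "A", 187, 0.42),
    atom_line(2, " CA ", "GLU", "A", 188, 1.10),
    atom_line(3, " CA ", "LYS", "A", 189, 0.95),
    atom_line(4, " CA ", "GLY", "B", 188, 2.00),
]

scores = residue_scores(toy_lines)
above = sorted(resi for resi, score in scores.items() if score > 0.9)
print(above)
```

To use the sketch on the real file, replace toy_lines with the lines read from AMA1_bepi.pdb.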
Prediction of discontinuous epitopes using DiscoTope
The structure of part of the AMA1 ectodomain has been determined and
is deposited in the Protein Data Bank under the code 1Z40. We will use
this structure to predict residues which are part of discontinuous
epitopes.
Go to the homepage of the DiscoTope prediction server: www.cbs.dtu.dk/services/DiscoTope
In part 1. after "Entry name (in four letter code):", write the pdb code 1Z40.
In the field after "Chain(s)" write "A" to specify that we are looking at chain A in the pdb file.
In the field after "Specify the threshold for epitope
identification:" enter -4.7, which corresponds to a specificity of 90%
and a sensitivity of 24%.
Submit the query.
By clicking on the link "View results in Jmol (please be patient...requires Jmol applet download)"
you can see a graphical representation of the predicted epitopes.
Now we will compare the DiscoTope prediction to the annotated epitopes using PyMol:
Start by downloading the structure of AMA1 with DiscoTope predictions and save it on the desktop: AMA1_disco.pdb
Start PyMol on your computer.
Click on "File->Open->....AMA1_disco.pdb" to make PyMol locate and read the AMA1_disco.pdb structure on the desktop.
Then copy the commands below into the window named pymol tcl/tk gui. (You can copy all at once.)
hide everything
bg_color white
show cartoon
color white
select B_gt_4.7, c. a and b > -4.7
select B_lteq_4.7, c. a and (b < -4.7 or b = -4.7)
color green, B_gt_4.7
create 4G2, c. a and resi 348+351+352+354+355+356+385+388+389 and not name c+o+n
color black, 4G2
show sticks, 4G2
create 1F9, c. a and resi 188+197+200+201+204+223+225 and not name c+o+n
color black, 1F9
show sticks, 1F9
create poly, c. a and resi 187+230+243
show sticks, poly
color yellow, poly
set_view (\
-0.575032651, -0.008835237, 0.818080187,\
-0.307444185, -0.924317956, -0.226084813,\
0.758166254, -0.381521791, 0.528795898,\
0.000043839, -0.000048044, -211.527801514,\
2.967290878, 15.267670631, 48.428859711,\
60.298721313, 362.747131348, 0.000000000 )
Now you see the residues which are predicted to be part of epitopes
with the backbone in green. The two epitopes are shown with sidechains
in black. Polymorphic positions are shown with sidechains in yellow.
Where are the predicted residues located in the structure?
How well do the epitopes predicted by DiscoTope correspond to the annotated epitopes?
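The comparison can also be made residue by residue. The sketch below reports which residues of the two annotated epitopes score above the threshold. The epitope residue numbers are those used in the PyMol commands above; the scores are invented and are not DiscoTope output.

```python
# Epitope residues copied from the PyMol commands in this exercise.
EPITOPES = {
    "4G2": {348, 351, 352, 354, 355, 356, 385, 388, 389},
    "1F9": {188, 197, 200, 201, 204, 223, 225},
}


def recovered(scores, threshold):
    """Return, per epitope, the annotated residues scoring above threshold."""
    predicted = {resi for resi, score in scores.items() if score > threshold}
    return {name: sorted(members & predicted) for name, members in EPITOPES.items()}


# Invented residue scores; note that the threshold is negative.
toy_scores = {100: -1.0, 188: -2.1, 197: -6.0, 200: -4.7, 351: -3.3, 389: -9.5}
print(recovered(toy_scores, -4.7))
```

Because the comparison is strict, a residue scoring exactly -4.7 is not counted as predicted, which matches the "b > -4.7" selection used for the green colouring.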