
Immunological Bioinformatics Exercise 1: Prediction of B-cell epitopes



Overview

In this exercise you will use bioinformatics tools to predict the location of B-cell epitopes. Because B-cell epitopes are generally structural, and most are formed by amino acids that are not adjacent in the sequence, predicting B-cell epitopes is a much harder task than predicting T-cell epitopes. For many proteins the 3D structure is unknown, so only sequence-based prediction methods can be used, which means that only linear epitopes can be predicted. In this exercise, however, we will work on epitopes in proteins of known structure, to give you a better understanding of the predicted epitopes.

The exercise consists of two parts, each based on epitope predictions using a method developed at CBS:

  1. Prediction of linear B-cell epitopes using BepiPred
  2. Prediction of discontinuous B-cell epitopes using DiscoTope

For this exercise you will need PyMol installed and an internet browser running on your computer.

The questions below are meant to guide you through the exercise and point to the most interesting topics. Find at least one partner to discuss the exercise with.


Prediction of epitopes in the Cholera Enterotoxin

We are now going to have a look at epitopes in Cholera Enterotoxin. A number of linear epitopes have been reported for this antigen, and they have been compiled in a dataset (Pellequer et al., 1993). The epitopes were identified by measuring the binding of peptide fragments from the antigen sequence to antibodies. (Additional information can be found in Pellequer et al., 1993; Kazemi and Finkelstein, 1991; and Jacob et al., 1983.) Have a look at the epitopes in CHO_measured_antigenicity. The file shows the sequence of the cholera enterotoxin, and the lines that follow it indicate the positions of linear epitopes (E). You will now predict and analyze these epitopes using BepiPred.
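If you want the annotated positions as residue ranges rather than as a line of letters, a small script can do the conversion. The Python sketch below is only an illustration: it assumes that each annotation line has one character per residue, with `E` marking an epitope position and any other character (here `.`) marking a non-epitope position. The function name `mask_to_ranges`, the `first_residue` parameter and the example line are hypothetical and are not taken from the CHO_measured_antigenicity file.

```python
def mask_to_ranges(mask, first_residue=1):
    """Convert an annotation line such as '..EEE...EE' into (start, end) ranges.

    Assumption: one character per residue, 'E' marks an epitope position and
    every other character marks a non-epitope position.
    """
    ranges = []
    start = None
    for offset, char in enumerate(mask):
        position = first_residue + offset
        if char == "E" and start is None:
            start = position                      # a new epitope stretch begins
        elif char != "E" and start is not None:
            ranges.append((start, position - 1))  # the stretch ended one residue earlier
            start = None
    if start is not None:                         # stretch runs to the end of the line
        ranges.append((start, first_residue + len(mask) - 1))
    return ranges


# Made-up example line, not real data from the exercise file.
example_mask = "..EEE...EE"
print(mask_to_ranges(example_mask))  # [(3, 5), (9, 10)]
```

Ranges in this form are easy to compare with the `resi` ranges used in the PyMol commands later in the exercise.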

Prediction of linear B-cell epitopes using BepiPred

Sequence-based methods are used to predict linear B-cell epitopes when the structure of the antigen has not been determined or cannot be predicted well. Earlier methods of this kind were mostly based on very simple computational algorithms. More recently, a number of sequence-based predictors built on more advanced bioinformatics methods have been developed. One of them is BepiPred, which was developed at CBS.

To predict epitopes and analyze them, do the following:

  • Download the following fasta file CHO.fsa and save it on the desktop of the computer.
  • Go to the homepage of the BepiPred prediction server: www.cbs.dtu.dk/services/BepiPred and upload the CHO.fsa file to the server.
  • Run BepiPred with the default options, and save the output file on the desktop by right-clicking and choosing "save page as".
  • Have a quick look at the prediction result in a browser window to get acquainted with the output.
  • Now we will try to visualize these epitopes, to see where they are located in the structure of the protein.

  • Start by downloading the structure of the cholera enterotoxin here and save it on the desktop: CHO.pdb
  • Start PyMol on your computer.
  • Click on "File->Open->.....CHO.pdb" to make PyMol locate and read the CHO.pdb structure from the desktop.
  • Then copy the commands below into the window named pymol tcl/tk gui. (You can copy all at once.)
    hide everything
    show cartoon
    select pellequer, resi 33-42
    color red, pellequer
    select pellequer, resi 53-61
    color red, pellequer
    select pellequer, resi 69-84
    color red, pellequer
    select pellequer, resi 104-118
    color red, pellequer
    create bepipred1, resi 22-26 and not name c+o+n
    create bepipred2, resi 51-54 and not name c+o+n
    create bepipred3, resi 74-84 and not name c+o+n
    create bepipred4, resi 112-115 and not name c+o+n
    show sticks, bepipred1 or bepipred2 or bepipred3 or bepipred4
    color blue, bepipred1 or bepipred2 or bepipred3 or bepipred4
    center CHO
    zoom all
    

    The regions which are shown with a red backbone in the structure are regions annotated as epitopes. The side chains of the residues which are predicted to be epitopes are shown in blue.

  • Where are the antigenic regions found in the structure? (Is it likely that all annotated residues are able to interact with antibodies?)
  • How well do the predicted epitopes fit the annotation?
  • What could cause differences between the prediction and the annotation?
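One way to put numbers on the second question is to compare the residue ranges used in the PyMol commands above. The following Python sketch expands the annotated ranges (33-42, 53-61, 69-84, 104-118) and the BepiPred ranges (22-26, 51-54, 74-84, 112-115) into residue sets and counts the overlap. The helper name `expand` is only illustrative, and counting per-residue overlap is just one of several reasonable ways to score the agreement.

```python
def expand(ranges):
    """Turn a list of inclusive (start, end) ranges into a set of residue numbers."""
    residues = set()
    for start, end in ranges:
        residues.update(range(start, end + 1))
    return residues


# Residue ranges copied from the PyMol commands in this exercise.
annotated = expand([(33, 42), (53, 61), (69, 84), (104, 118)])
predicted = expand([(22, 26), (51, 54), (74, 84), (112, 115)])

shared = annotated & predicted

# Fraction of annotated residues that were predicted, and fraction of
# predicted residues that fall inside an annotated epitope.
recovered = len(shared) / len(annotated)
confirmed = len(shared) / len(predicted)

print(f"annotated: {len(annotated)}, predicted: {len(predicted)}, shared: {len(shared)}")
print(f"annotated residues recovered: {recovered:.2f}")
print(f"predicted residues confirmed: {confirmed:.2f}")
```

With these ranges, 17 of the 50 annotated residues are predicted (0.34), and 17 of the 24 predicted residues lie in an annotated epitope (0.71). Keep in mind that the annotation comes from peptide binding experiments, so residues outside the annotated ranges are not necessarily non-epitopes.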
Prediction of discontinuous cholera enterotoxin epitopes using DiscoTope

    Protein structures are increasingly used in bioinformatics tools for epitope prediction, because most B-cell epitopes are conformational and discontinuous. As more structures become available and protein structure modelling methods become more advanced, the requirement for structural input will become less of a limitation. One of the newer structure-based methods is DiscoTope, developed at CBS, which predicts residues that are part of discontinuous epitopes.

    The cholera enterotoxin is a pentamer in its native form. We will now predict residues in discontinuous epitopes using the structure of the pentamer, which has the Protein Data Bank code 1fgb.

  • Go to the homepage of the DiscoTope prediction server: www.cbs.dtu.dk/services/DiscoTope
  • In part 1. after "Entry name (in four letter code):", write the pdb code 1fgb.
  • Leave the field "Chain(s)" empty to predict epitopes using all chains in the pentamer.
  • Run DiscoTope with the default options, and save the output file on the desktop.
  • By clicking on the link "View results in Jmol (please be patient...requires Jmol applet download)" you can see a graphical representation of the predicted epitopes.
  • Where are the predicted residues located in the structure?
  • Is there an overlap between residues predicted by DiscoTope and the annotated epitopes?
  • Could some of the annotated linear epitopes be part of discontinuous epitopes?

Prediction of epitopes in the malaria protein AMA1

    The malaria protein Apical Membrane Antigen 1 (AMA1) is a candidate component of a malaria vaccine. In the ectodomain of this protein, two epitopes have been identified. One of the epitopes was identified using X-ray crystallography, and the other by measuring the binding of mutants of the protein to antibodies. Additionally, sequence analysis of the protein has shown that a number of the residues are polymorphic, and it has been suggested that this polymorphism helps the parasite escape the human immune system. We will now analyze the predictions of these epitopes.

    Prediction of linear B-cell epitopes using BepiPred

  • Download the following fasta file AMA1.fsa and save it on the desktop of the computer.
  • Go to the homepage of the BepiPred prediction server: www.cbs.dtu.dk/services/BepiPred and upload the AMA1.fsa file to the server.
  • In the field after "Score threshold for epitope assignment" write 0.9 to set a specificity of 91% and a sensitivity of 25%.
  • Run BepiPred and save the output file on the desktop by right-clicking and choosing "save page as".
  • Have a quick look at the prediction result in a browser window to get acquainted with the output.
  • Now we will try to visualize these epitopes, to see where they are located in the structure of the protein.

  • Start by downloading the structure of the AMA1 with BepiPred predictions and save it on the desktop: AMA1_bepi.pdb
  • Start PyMol on your computer.
  • Click on "File->Open->....AMA1_bepi.pdb" to make PyMol locate and read the AMA1_bepi.pdb structure on the desktop.
  • Then copy the commands below into the window named pymol tcl/tk gui. (You can copy all at once.)

    hide everything
    bg_color white
    show cartoon
    color white
    select B_gt_0.9, c. a and b > 0.9
    select B_lteq_0.9, c. a and (b < 0.9 or b = 0.9)
    color red, B_gt_0.9
    create 4G2, c. a and resi 348+351+352+354+355+356+385+388+389 and not name c+o+n
    color black, 4G2
    show sticks, 4G2
    create 1F9, c. a and resi 188+197+200+201+204+223+225 and not name c+o+n
    color black, 1F9
    show sticks, 1F9
    create poly, c. a and resi 187+230+243
    show sticks, poly
    color yellow, poly 
    set_view (\
        -0.575032651,   -0.008835237,    0.818080187,\
        -0.307444185,   -0.924317956,   -0.226084813,\
         0.758166254,   -0.381521791,    0.528795898,\
         0.000043839,   -0.000048044, -211.527801514,\
         2.967290878,   15.267670631,   48.428859711,\
        60.298721313,  362.747131348,    0.000000000 )
    
    

    Now you see the residues which are predicted to be part of epitopes with the backbone in red. The two epitopes are shown with sidechains in black. Polymorphic positions are shown with sidechains in yellow.
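The `b > 0.9` selection in the commands above works on the B-factor column, which suggests that AMA1_bepi.pdb carries the per-residue prediction scores in that column. As a rough sketch of what the selection does, the Python below reads the fixed columns of PDB `ATOM` records (chain in column 22, residue number in columns 23-26, B-factor in columns 61-66) and lists the residues of one chain whose value exceeds a threshold. The names `residues_above` and `make_atom_line` and the toy atoms are hypothetical; the toy values are not taken from AMA1_bepi.pdb.

```python
def residues_above(pdb_lines, chain, threshold):
    """Return sorted residue numbers in `chain` having an atom with B-factor > threshold."""
    selected = set()
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue                              # skip HETATM, REMARK, etc.
        if line[21].upper() != chain.upper():     # chain identifier, column 22
            continue
        b_factor = float(line[60:66])             # B-factor, columns 61-66
        if b_factor > threshold:
            selected.add(int(line[22:26]))        # residue number, columns 23-26
    return sorted(selected)


def make_atom_line(serial, chain, resseq, b_factor):
    """Build a minimal CA ATOM record with fixed-width PDB columns (toy data only)."""
    return (
        f"ATOM  {serial:5d} {' CA ':<4s} {'ALA':3s} {chain:1s}{resseq:4d} "
        f"   {0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b_factor:6.2f}"
    )


# Toy records standing in for a file with scores in the B-factor column.
toy_lines = [
    make_atom_line(1, "A", 10, 0.50),
    make_atom_line(2, "A", 11, 0.95),
    make_atom_line(3, "A", 12, 1.20),
    make_atom_line(4, "B", 11, 1.50),
]

print(residues_above(toy_lines, "A", 0.9))  # [11, 12]
```

To try it on the downloaded structure, you could pass `open("AMA1_bepi.pdb")` in place of `toy_lines`. The same function applies to the DiscoTope file with a threshold of -4.7, again assuming the scores are stored in the B-factor column.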

  • Where are the predicted residues located in the structure?
  • How well do the epitopes predicted by BepiPred correspond to the annotated epitopes?

Prediction of discontinuous epitopes using DiscoTope

    Part of the AMA1 ectodomain structure has been determined and is deposited in the Protein Data Bank with the code 1Z40. We will use this structure to predict residues that are part of discontinuous epitopes.

  • Go to the homepage of the DiscoTope prediction server: www.cbs.dtu.dk/services/DiscoTope
  • In part 1. after "Entry name (in four letter code):", write the pdb code 1Z40.
  • In the field after "Chain(s)" write "A" to specify that we are looking at chain A in the pdb file.
  • In the field after "Specify the threshold for epitope identification:" write -4.7 to set a specificity of 90% and a sensitivity of 24%.
  • Submit the query.
  • By clicking on the link "View results in Jmol (please be patient...requires Jmol applet download)" you can see a graphical representation of the predicted epitopes.
  • Now we will compare the DiscoTope prediction to the annotated epitopes using PyMol:

  • Start by downloading the structure of the AMA1 with DiscoTope predictions and save it on the desktop: AMA1_disco.pdb
  • Start PyMol on your computer.
  • Click on "File->Open->....AMA1_disco.pdb" to make PyMol locate and read the AMA1_disco.pdb structure on the desktop.
  • Then copy the commands below into the window named pymol tcl/tk gui. (You can copy all at once.)

    hide everything
    bg_color white
    show cartoon
    color white
    select B_gt_4.7, c. a and b > -4.7
    select B_lteq_4.7, c. a and (b < -4.7 or b = -4.7)
    color green, B_gt_4.7
    create 4G2, c. a and resi 348+351+352+354+355+356+385+388+389 and not name c+o+n
    color black, 4G2
    show sticks, 4G2
    create 1F9, c. a and resi 188+197+200+201+204+223+225 and not name c+o+n
    color black, 1F9
    show sticks, 1F9
    create poly, c. a and resi 187+230+243
    show sticks, poly
    color yellow, poly
    set_view (\
        -0.575032651,   -0.008835237,    0.818080187,\
        -0.307444185,   -0.924317956,   -0.226084813,\
         0.758166254,   -0.381521791,    0.528795898,\
         0.000043839,   -0.000048044, -211.527801514,\
         2.967290878,   15.267670631,   48.428859711,\
        60.298721313,  362.747131348,    0.000000000 )
    

    Now you see the residues which are predicted to be part of epitopes with the backbone in green. The two epitopes are shown with sidechains in black. Polymorphic positions are shown with sidechains in yellow.

  • Where are the predicted residues located in the structure?
  • How well do the epitopes predicted by DiscoTope correspond to the annotated epitopes?
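Both servers were run with a score threshold chosen for high specificity and low sensitivity (0.9 for BepiPred, -4.7 for DiscoTope). The Python sketch below illustrates, on made-up numbers, how the two quantities are computed and how they move in opposite directions as the threshold changes. The function name, the toy scores and the toy labels are all hypothetical. A residue is called an epitope when its score is strictly greater than the threshold, mirroring the `b > ...` selections in the PyMol commands; whether the servers themselves use a strict comparison is an assumption.

```python
def sensitivity_specificity(scores, is_epitope, threshold):
    """Return (sensitivity, specificity) when residues with score > threshold are called epitopes.

    Assumes the labels contain at least one epitope and one non-epitope residue.
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, is_epitope):
        predicted = score > threshold
        if label and predicted:
            tp += 1        # epitope residue, predicted as epitope
        elif label:
            fn += 1        # epitope residue, missed
        elif predicted:
            fp += 1        # non-epitope residue, wrongly predicted
        else:
            tn += 1        # non-epitope residue, correctly rejected
    return tp / (tp + fn), tn / (tn + fp)


# Toy per-residue scores and labels; not real server output.
toy_scores = [1.4, 1.0, 0.6, 0.2, 1.1, 0.3, -0.1, 0.8, -0.5, 0.0]
toy_labels = [True, True, True, True, False, False, False, False, False, False]

for threshold in (0.0, 0.9):
    sens, spec = sensitivity_specificity(toy_scores, toy_labels, threshold)
    print(f"threshold {threshold}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

On the toy data, raising the threshold from 0.0 to 0.9 lowers the sensitivity from 1.00 to 0.50 and raises the specificity from 0.50 to 0.83. This is worth keeping in mind when judging how many annotated epitope residues the predictions missed.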