
HCV vaccine development

Rational Vaccine design

You shall use bioinformatics tools to make a rational selection of peptides with high potential as vaccine candidates against HCV in Guinea-Bissau. You shall tailor the selection towards the prevalent HLA-A, HLA-B, and HLA-DR molecules in the population of Guinea-Bissau.

In detail, you shall:

  1. Identify the prevalent HLA-A, HLA-B and HLA-DR alleles in Guinea-Bissau.
  2. Download the prevalent HCV genotype in Africa.
  3. Select a minimal set of peptides tailored to bind to the prevalent HLA types.
  4. Check if the selected peptides are cross-reactive towards other HCV subtypes.

Background

HCV is a major cause of acute hepatitis and chronic liver disease, including cirrhosis and liver cancer. Globally, an estimated 170 million persons are chronically infected with HCV and 3 to 4 million persons are newly infected each year. HCV is spread primarily by direct contact with human blood. The major causes of HCV infection worldwide are use of unscreened blood transfusions, and re-use of needles and syringes that have not been adequately sterilized.

No vaccine is currently available to prevent hepatitis C and treatment for chronic hepatitis C is too costly for most persons in developing countries to afford. Thus, from a global perspective, the greatest impact on hepatitis C disease burden will likely be achieved by focusing efforts on reducing the risk of HCV transmission from nosocomial exposures (e.g. blood transfusions, unsafe injection practices) and high-risk behaviours (e.g. injection drug use).

Hepatitis C virus (HCV) is one of the hepatitis viruses (A, B, C, D, and E) that together account for the vast majority of cases of viral hepatitis. It is an enveloped RNA virus in the Flaviviridae family and appears to have a narrow host range. Humans and chimpanzees are the only known species susceptible to infection, with both species developing similar disease.

An important feature of the virus is the relative mutability of its genome, which in turn is probably related to its high propensity (80%) to induce chronic infection. HCV is clustered into several distinct genotypes, which may be important in determining the severity of the disease and the response to treatment.

For more details on HCV see link: WHO


The exercise

Identification of prevalent HLA-A, HLA-B, and HLA-DR alleles in Guinea-Bissau.

Go to the Allele Frequency database

From the left hand menu select "HLA" -> submenu "HLA allele freq (classical)"

Set the HLA locus to A.
Set the population to Guinea-Bissau (note: only Guinea-Bissau).

Set "Level of resolution" to >=2.
Set "Sort by" to Population, Highest to Lowest Frequency.

Leave the other options at their defaults and press the search button.

  • Q1 Which are the 5 most prevalent HLA-A alleles in Guinea-Bissau? Note that you only need to report the first four digits of the HLA typing, e.g. A*02:01.

    Repeat this analysis for the HLA-B and HLA-DRB1 loci.

  • Q2 Which are the 5 most prevalent HLA-B alleles in Guinea-Bissau?
  • Q3 Which are the 5 most prevalent HLA-DRB1 alleles in Guinea-Bissau?
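Allele names in a frequency table may carry more than two fields (for example A*02:01:01). As a minimal sketch, assuming names follow the colon-separated nomenclature shown in Q1, the following Python helper trims a name to its first two fields (the "first four digits"). The function name and the example names are illustrative assumptions, not part of the database interface.

```python
def to_four_digit(allele: str) -> str:
    """Keep only the first two colon-separated fields of an HLA allele name."""
    fields = allele.strip().split(":")
    return ":".join(fields[:2])


if __name__ == "__main__":
    # Example names used only to illustrate the trimming.
    for name in ["A*02:01:01", "B*53:01", "DRB1*13:04:01:02"]:
        print(name, "->", to_four_digit(name))
```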

    Identification of binding motif similarities

    Many HLA alleles share large overlaps in their binding motif preferences. You shall use MHCcluster to select the three most dissimilar HLA alleles from your five previously selected alleles for each of the HLA-A, HLA-B, and HLA-DRB1 loci.

    Go to the MHCcluster web-site. This server clusters MHC molecules based on predicted binding motifs for MHC class I molecules from human, non-human primates and mouse, as well as human HLA-DR class II molecules.

    Select the 5 HLA-A alleles (either by simply typing the allele names or by selecting the names from the lists). Next, press submit. The calculation takes a little while (minutes), so be patient. The output from the MHCcluster algorithm is a specificity tree and a heat-map displaying the similarities between the selected MHC molecules. Select the 3 alleles with the most different binding specificities, so that the overlap in binding specificity between them is low. Use the Advanced TreeViewer option to place the binding motifs (sequence logos) on the tree, and justify to yourself that some of the HLA molecules share binding specificity.

    Repeat this analysis and select 3 of the 5 HLA-B and 3 of the 5 HLA-DR alleles.
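If you prefer to make the choice of three alleles reproducible rather than reading it off the heat-map by eye, one option is to score every possible triplet by its smallest pairwise distance and keep the best one. The sketch below assumes you have transcribed pairwise distances from the MHCcluster output yourself; the allele labels and distance values shown are hypothetical placeholders, and the "maximise the minimum distance" criterion is one reasonable choice, not the method used by MHCcluster.

```python
from itertools import combinations


def most_dissimilar(alleles, dist, k=3):
    """Return the k alleles whose smallest pairwise distance is largest.

    dist maps frozenset({a, b}) to the distance between alleles a and b.
    """
    def smallest_distance(group):
        return min(dist[frozenset(pair)] for pair in combinations(group, 2))

    return max(combinations(alleles, k), key=smallest_distance)


if __name__ == "__main__":
    # Hypothetical labels and distances, for illustration only.
    alleles = ["allele_a", "allele_b", "allele_c", "allele_d"]
    dist = {
        frozenset({"allele_a", "allele_b"}): 0.10,
        frozenset({"allele_a", "allele_c"}): 0.90,
        frozenset({"allele_a", "allele_d"}): 0.80,
        frozenset({"allele_b", "allele_c"}): 0.85,
        frozenset({"allele_b", "allele_d"}): 0.70,
        frozenset({"allele_c", "allele_d"}): 0.95,
    }
    print(most_dissimilar(alleles, dist))
```

When several triplets score equally, the function returns the first one found, which mirrors the note in Q4 that the answer is not unique for all loci.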

    Here are links to the MHCcluster output in case the server is down
    HLA-A
    HLA-B
    HLA-DR

  • Q4 Which alleles did you select for the three loci? Note that the answer is not unique for all loci.

    Download the prevalent HCV genotype in Africa.

    You now need to identify a representative genomic sequence for the most abundant HCV genotype in the selected population. In this case it is genotype 2. We need the sequence in protein fasta format.

    Go to GenBank.
    Set the 'search' drop-down menu to Genome and enter "Hepatitis C virus genotype 2" into the search field.
    Click on the first entry number (NC_009823).
    Click on the link to the protein ID, and click on the "FASTA" link to get the proteome in FASTA format.


    Identification of CTL epitopes

    You shall use the NetMHCcons prediction-server to identify potential CTL epitopes that will bind to your prevalent HLA-A and HLA-B molecules.

    Go to the NetMHCcons prediction-server and copy-paste the HCV genotype 2 protein sequence in FASTA format. Type in the HLA-A and HLA-B alleles you found in question Q4, separated by commas (without blank spaces!), select peptide length as 9mers, select Save prediction to xls file, and press Submit. The calculation might take some minutes, so please be patient.

    Once the calculation is completed, the output is displayed. Three prediction scores are provided for each peptide:MHC interaction (1-log50k(aff), Affinity(nM), %Rank). You can click on "Explain the output" on the prediction output page for details on these scores. Peptides are classified as weak binders (WB) if the affinity is less than 500 nM or the %Rank is less than or equal to 2. Likewise, peptides with an affinity less than 50 nM or a %Rank less than or equal to 0.5 are classified as strong binders (SB).
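The binder classification described above can be written as a short rule. The following Python sketch encodes the stated thresholds (500 nM or %Rank 2 for weak binders, 50 nM or %Rank 0.5 for strong binders); the function name and argument names are illustrative assumptions and do not correspond to anything in the server itself.

```python
def classify_binder(affinity_nm: float, rank: float) -> str:
    """Classify a peptide:MHC prediction as 'SB', 'WB', or '' (non-binder).

    Thresholds follow the exercise text: strong binders have affinity < 50 nM
    or %Rank <= 0.5; weak binders have affinity < 500 nM or %Rank <= 2.
    """
    if affinity_nm < 50 or rank <= 0.5:
        return "SB"
    if affinity_nm < 500 or rank <= 2:
        return "WB"
    return ""


if __name__ == "__main__":
    # Made-up (affinity, rank) pairs used only to exercise the rule.
    for affinity, rank in [(30.0, 5.0), (400.0, 10.0), (5000.0, 1.5), (5000.0, 10.0)]:
        print(affinity, rank, repr(classify_binder(affinity, rank)))
```

Note that a peptide qualifies if either criterion is met, so a peptide with a poor affinity value can still be a binder by %Rank.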

    At the bottom of the results page you will find a Link to output xls file. Open this file in Excel.

    In the file you will have three prediction scores for each peptide for each allele (1-log50k, nM, and Rank). The last column (NB) in the file contains the number of alleles to which the peptide is classified as a weak or strong binder.

    If NetMHCcons does not complete, here is a link to an output file: NetMHCcons xls output file.

    MS Excel notes:
    Make sure that the decimal separator of your version of Excel is set to '.' (period).

  • Q5 Identify 5 peptides that will give you the broadest allelic coverage.
  • Q6 For the five peptides, how many binders do you find for each allele?

    Identification of T-helper epitopes

    To identify potential T-helper epitopes you shall use the NetMHCIIpan prediction-server.

    Go to the NetMHCIIpan prediction-server and upload the HCV protein sequence. Type in the three HLA-DRB1 alleles you found in question Q4, separated by commas (without blank spaces!), select Save prediction to xls file and Use fast mode (recommended for large calculations; this makes slightly less accurate predictions but runs 10 times faster), and press Submit. The calculation might take some minutes.

    Like NetMHCcons, the NetMHCIIpan method provides three prediction scores for each peptide:MHC interaction (1-log50k(aff), Affinity(nM), %Rank), and the interpretation is the same. Peptides are classified as weak binders (WB) if the affinity is less than 500 nM or the %Rank is less than or equal to 2. Likewise, peptides with an affinity less than 50 nM or a %Rank less than or equal to 0.5 are classified as strong binders (SB).

    At the bottom of the results page you will find a Link to output xls file. Open this file in Excel. If NetMHCIIpan does not complete, here is a link to an output file: NetMHCIIpan xls output file.

    In the file, you have three prediction scores for each peptide for each allele (1-log50k, nM, and Rank). The last column (NB) in the file contains the number of alleles to which the peptide is classified as a weak or strong binder.

  • Q7 Are any of the CTL epitopes you found in question Q5 part of a 15mer T-helper epitope for the three HLA-DRB1 alleles?

  • Q8 Select the smallest number of 15mer peptides so that these peptides cover all HLA-A, HLA-B, and HLA-DRB1 alleles in your prevalent selection.
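Choosing a small set of peptides that together cover every allele is an instance of the set-cover problem. A common heuristic is the greedy approach: repeatedly pick the peptide that covers the most alleles not yet covered. The Python sketch below illustrates this under stated assumptions: the mapping from peptide to the alleles it covers is something you would build yourself from the two xls files (for example, treating a 15mer as covering a class I allele when it contains a predicted 9mer binder), and the peptide and allele labels shown are hypothetical placeholders. The greedy result is usually small but is not guaranteed to be the true minimum.

```python
def greedy_cover(binders: dict[str, set[str]], alleles: set[str]) -> list[str]:
    """Greedily pick peptides until all alleles are covered (or no progress).

    binders maps each peptide to the set of alleles it is predicted to cover.
    """
    uncovered = set(alleles)
    chosen: list[str] = []
    while uncovered and binders:
        # Peptide covering the most still-uncovered alleles.
        best = max(binders, key=lambda pep: len(binders[pep] & uncovered))
        gained = binders[best] & uncovered
        if not gained:
            break  # remaining alleles cannot be covered by any peptide
        chosen.append(best)
        uncovered -= gained
    return chosen


if __name__ == "__main__":
    # Hypothetical peptides and alleles, for illustration only.
    binders = {
        "peptide_x": {"allele_1", "allele_2"},
        "peptide_y": {"allele_3"},
        "peptide_z": {"allele_2", "allele_3", "allele_4"},
    }
    alleles = {"allele_1", "allele_2", "allele_3", "allele_4"}
    print(greedy_cover(binders, alleles))
```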

    Known CTL epitopes

    Now you should go into the lab and investigate whether any of the predicted epitopes are indeed CTL and/or T-helper epitopes. However, before doing this, it might be informative to check the public epitope database for information on whether other groups have analyzed the peptides. Go to the Immune Epitope database (IEDB). Upload each of the selected 15mer peptides (as "Linear Peptide:"), select "Substring" similarity, source organism as "Hepatitis C virus", and Immune Recognition Context as "T cell Response".

  • Q9 Are any of the 15mer peptides described in the IEDB? If yes, do the restriction element and epitope match the predictions?

    Peptide conservation in other HCV subtypes

    You shall now investigate whether any of your selected peptides are also present in another HCV genotype. If this is the case, the peptides might be important for vaccine development against HCV not only in Guinea-Bissau, but also in other parts of the world.

    Go to GenBank.
    Set the 'search' drop-down menu to Genome and enter "Hepatitis C virus" into the search field.
    Click on the first entry number (NC_004102).
    Click on the link to the protein ID, and click on the "FASTA" link to get the proteome in FASTA format.

  • Q10 Are any of the 15mers selected in question Q8 present in the HCV genotype 1 genome?
    Hint: Copy-paste the translated sequence into an empty Word document and remove all paragraph breaks and white space. Then search for the 15mer sequence in the document.
  • Q11 What does this imply for the worldwide coverage of your vaccine?
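As an alternative to the word-processor approach in the Q10 hint, the same search can be done in a few lines of Python: drop the FASTA header, join the sequence lines into one string without white space, and look for the peptide as a substring. This is a minimal sketch; the function names are illustrative assumptions, and the short sequence in the example is a toy input used only to show that matches spanning a line break are found.

```python
def read_fasta_sequence(fasta_text: str) -> str:
    """Concatenate all sequence lines of a FASTA text, skipping '>' headers."""
    parts = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            continue
        parts.append("".join(line.split()))  # remove any white space
    return "".join(parts)


def find_peptide(sequence: str, peptide: str) -> list[int]:
    """Return 1-based start positions of every occurrence of peptide."""
    positions = []
    start = sequence.find(peptide)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(peptide, start + 1)
    return positions


if __name__ == "__main__":
    # Toy FASTA text; replace with the downloaded polyprotein sequence.
    toy_fasta = ">toy_sequence\nMSTNPKPQRK\nTKRNTNRRPQ\n"
    sequence = read_fasta_sequence(toy_fasta)
    print(find_peptide(sequence, "RKTKR"))  # match spans the line break
```

An empty list means the peptide does not occur exactly in the sequence; this check does not detect near-matches with one or two substitutions.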

    Now you are done!! You can try to sell your peptides to big pharma and get rich.