Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction.
Larsen MV, Lundegaard C, Lamberth K, Buus S, Lund O, Nielsen M.
BMC Bioinformatics. 2007 Oct 31;8:424.
BACKGROUND: Reliable predictions of cytotoxic T
lymphocyte (CTL) epitopes are essential for rational vaccine design.
Most importantly, they can minimize the experimental effort needed to
identify epitopes. NetCTL is a web-based tool designed for predicting
human CTL epitopes in any given protein. It does so by integrating
predictions of proteasomal cleavage, TAP transport efficiency, and MHC
class I affinity. At least four other methods have been developed
recently that likewise attempt to predict CTL epitopes: EpiJen, MAPPP,
MHC-pathway, and WAPP. In order to compare the performance of
prediction methods, objective benchmarks and standardized performance
measures are needed. Here, we develop such a large-scale benchmark and
corresponding performance measures and report the performance of an
updated version 1.2 of NetCTL in comparison with the four other
methods.
RESULTS: We define a number of performance measures that can
handle the different types of output data from the five methods. We use
two evaluation datasets consisting of known HIV CTL epitopes and their
source proteins. The source proteins are split into all possible 9-mers
and, except for the annotated epitopes, all 9-mers are considered
non-epitopes. In the RANK measure, we compare two methods at a time and
count how often each of the methods ranks the epitope highest. In
another measure, we find the specificity of the methods at three
predefined sensitivity values. Lastly, for each method, we calculate
the percentage of known epitopes that rank within the 5% of peptides with
the highest predicted score.
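
The evaluation measures described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the dictionary-based score format (peptide to predicted score, which collapses repeated 9-mers), the pooling of scores for the specificity measure, and the handling of ties are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

Scores = Dict[str, float]  # hypothetical format: 9-mer -> predicted score


def split_into_kmers(protein: str, k: int = 9) -> List[str]:
    """Return every overlapping k-mer of the protein, in sequence order."""
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]


def epitope_rank_fraction(scores: Scores, epitope: str) -> float:
    """Fraction of peptides scoring strictly higher than the epitope (0 = top)."""
    higher = sum(1 for s in scores.values() if s > scores[epitope])
    return higher / len(scores)


def sensitivity_in_top(cases: List[Tuple[Scores, str]],
                       fraction: float = 0.05) -> float:
    """Share of epitopes that rank within the top `fraction` of their protein."""
    hits = sum(1 for scores, epitope in cases
               if epitope_rank_fraction(scores, epitope) < fraction)
    return hits / len(cases)


def rank_comparison(cases: List[Tuple[Scores, Scores, str]]) -> Tuple[int, int, int]:
    """Pairwise comparison: count how often method A or method B ranks the
    epitope higher, plus ties (tie handling is an assumption)."""
    wins_a = wins_b = ties = 0
    for scores_a, scores_b, epitope in cases:
        rank_a = epitope_rank_fraction(scores_a, epitope)
        rank_b = epitope_rank_fraction(scores_b, epitope)
        if rank_a < rank_b:
            wins_a += 1
        elif rank_b < rank_a:
            wins_b += 1
        else:
            ties += 1
    return wins_a, wins_b, ties


def specificity_at_sensitivity(epitope_scores: List[float],
                               non_epitope_scores: List[float],
                               target_sensitivity: float) -> float:
    """Specificity at the highest threshold that still reaches the target
    sensitivity, using pooled epitope and non-epitope scores."""
    ranked = sorted(epitope_scores, reverse=True)
    n_needed = max(1, math.ceil(target_sensitivity * len(ranked)))
    threshold = ranked[n_needed - 1]  # epitopes scoring >= threshold are called positive
    true_negatives = sum(1 for s in non_epitope_scores if s < threshold)
    return true_negatives / len(non_epitope_scores)
```

For example, a 14-residue toy sequence yields six overlapping 9-mers via `split_into_kmers`; an annotated epitope among them would be the positive, and the remaining five would be treated as non-epitopes.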
CONCLUSION: NetCTL-1.2 is demonstrated to
have a higher predictive performance than EpiJen, MAPPP, MHC-pathway,
and WAPP on all performance measures. The higher performance of
NetCTL-1.2 as compared to EpiJen and MHC-pathway is, however, not
statistically significant on all measures. In the large-scale benchmark
calculation consisting of 216 known HIV epitopes covering all 12
recognized HLA supertypes, the NetCTL-1.2 method was shown to have a
sensitivity above 0.72 among the 5% top-scoring peptides. On this
dataset, the best of the other methods achieved a sensitivity of 0.64.
The NetCTL-1.2 method is available at
http://www.cbs.dtu.dk/services/NetCTL. All used datasets are available