hERG Classification Model Based on a Combination of Support Vector Machine
Method and GRIND Descriptors.
Li Q1, Jørgensen FS2, Oprea T3,
Brunak S1 and Taboureau O1.
Mol Pharm: 4;5(1):117-127, 2008.
1Center for Biological Sequence Analysis, Department of Systems
Biology, Technical University of Denmark, DK-2800 Lyngby, Denmark
2Department of Medicinal Chemistry, The Faculty of Pharmaceutical
Sciences, University of Copenhagen, Universitetsparken 2,
DK-2100 Copenhagen, Denmark
3Division of Biocomputing, Department of Biochemistry and Molecular
Biology, University of New Mexico School of Medicine, MSC11 6145,
Albuquerque, New Mexico 87131
The human Ether-a-go-go Related Gene (hERG) potassium channel is one of the
major factors associated with QT interval prolongation and the development of
the arrhythmia known as Torsades de Pointes (TdP). It has become a growing
concern for both regulatory agencies and the pharmaceutical industry, which
invest substantial effort in assessing the cardiac toxicity of drugs. The
development of in silico tools to filter out potential hERG channel inhibitors
in early stages of the drug discovery process is of considerable interest.
Here, we describe binary classification models based on a large and diverse
library of 495 compounds. The models combine pharmacophore-based GRIND
descriptors with a support vector machine (SVM) classifier in order to
discriminate between hERG blockers and nonblockers. Our models were applied at
different thresholds from 1 to 40 µM and achieved an overall accuracy of up to
94% with a Matthews correlation coefficient (MCC) of 0.86 (F-measure of 0.90
for blockers and 0.95 for nonblockers). The model at a 40 µM threshold
showed the best performance and was validated internally (MCC of 0.40 and
F-measure of 0.57 for blockers and 0.81 for nonblockers, using a leave-one-out
cross-validation). On an external set of 66 compounds, 72% of the set was
correctly predicted (F-measure of 0.86 and 0.34 for blockers and nonblockers,
respectively). Finally, the model was also tested on a large set of hERG
bioassay data recently made publicly available on PubChem, achieving about
73% accuracy (F-measure of 0.30 and 0.83 for blockers and nonblockers,
respectively). Although some limitations remain in the assessment of hERG
blockers, our model shows an improvement of between 10% and 20% in the
prediction of blockers compared to other methods, which can be useful in
filtering potential hERG channel inhibitors.
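
The evaluation measures quoted above (MCC and per-class F-measure at a chosen
activity threshold) can be illustrated with a minimal Python sketch. This is
not the authors' code: the function names, the example IC50 values, the
example predictions, and the convention that a compound counts as a blocker
when its IC50 is at or below the threshold are all assumptions made for
illustration.

```python
import math

def label_blockers(ic50_um, threshold_um):
    """Binarize activities: True (blocker) if IC50 <= threshold (assumed convention)."""
    return [value <= threshold_um for value in ic50_um]

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn), treating True (blocker) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp, tn, fp, fn

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def f_measure(tp, fp, fn):
    """F-measure (harmonic mean of precision and recall) for one class."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical IC50 values (in µM) and hypothetical classifier output.
ic50_values = [0.5, 3.0, 12.0, 55.0, 100.0, 8.0]
y_true = label_blockers(ic50_values, threshold_um=40.0)
y_pred = [True, True, False, False, True, True]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
mcc_value = mcc(tp, tn, fp, fn)
f_blockers = f_measure(tp, fp, fn)
# For the nonblocker class the roles of the two classes are swapped.
f_nonblockers = f_measure(tn, fn, fp)
```

Reporting the F-measure separately for blockers and nonblockers, as the
abstract does, exposes weak performance on the minority class that overall
accuracy alone can hide.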