NetNGlyc Abstract
Contrary to widespread belief, acceptor sites for N-linked
glycosylation on protein sequences are not well
characterised. The consensus sequence, Asn-Xaa-Ser/Thr
(where Xaa is not Pro), is known to be a prerequisite for
the modification. However, not all of these sequons are
modified, so the consensus alone does not discriminate between
glycosylated and non-glycosylated asparagines. We train
artificial neural networks on the surrounding sequence
context, in an attempt to discriminate between acceptor and
non-acceptor sequons. In cross-validation, the
networks could identify 86% of the glycosylated and 61% of
the non-glycosylated sequons, with an overall accuracy of
76%. The method can be optimised for high specificity
or high sensitivity. Apart from characterising individual
proteins, the prediction method can rapidly
scan complete proteomes.
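To make the consensus rule concrete, the following is a minimal sketch that lists candidate sequons (Asn-Xaa-Ser/Thr, Xaa not Pro) in a one-letter amino acid sequence. It illustrates only the prerequisite pattern described above, not the neural network predictor; the function name find_sequons and the example sequence are hypothetical and chosen purely for illustration.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr.
# The zero-width lookahead lets overlapping sequons both be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based Asn position, tripeptide) for each consensus sequon."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(seq)]


if __name__ == "__main__":
    # Made-up sequence, for illustration only.
    for position, tripeptide in find_sequons("MNGTNPSANNSS"):
        print(position, tripeptide)
```

Every position reported this way is only a candidate: as noted above, many such sequons are not glycosylated, which is why the surrounding sequence context is passed to the networks.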
Glycosylation is an important post-translational
modification, and is known to influence protein folding,
localisation and trafficking, protein solubility,
antigenicity, biological activity and half-life, as well as
cell-cell interactions. We investigate the spread of known
and predicted N-glycosylation sites across functional
categories of the human proteome.
An N-glycosylation site predictor for human proteins is
available at http://www.cbs.dtu.dk/services/NetNGlyc/
CURRENT NETWORK
The network will be updated, so predictions may differ between
versions. The network is balanced to give optimal predictions whether or not you
submit sequences with homology to the known N-glycosylated proteins.
If, however, the submitted sequence is very similar or identical to a
sequence in our training dataset, the accuracy can be expected to be higher
than reported above.
FEEDBACK, COMMENTS AND SUGGESTIONS:
We would appreciate any confirmation or refutation of our predictions. Since
an expanded data set with additional N-glycosylated sequences would
increase the performance of the network, we are very interested in receiving
such material. User feedback is the only way we will learn to
enhance the performance of the method. Any other comments regarding
the predictions or the data may be sent to:
Ramneek Gupta
CORRESPONDENCE
Ramneek Gupta,