N-glycosylation is known to occur on Asparagines that lie within the
Asn-Xaa-Ser/Thr stretch (where Xaa is any amino acid except Proline).
While this consensus tripeptide (also called the N-glycosylation sequon
in many texts) may be a requirement, it is not always sufficient for the
Asparagine to be glycosylated. Furthermore, there are a few known instances
of N-glycosylation occurring within Asn-Xaa-Cys (a Cysteine instead of a
Serine/Threonine at the N+2 position), e.g. plasma protein C (PRTC_HUMAN)
and von Willebrand factor (VWF_HUMAN).
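The sequon definition above can be illustrated with a short scan of a protein sequence. This is a minimal sketch, not NetNGlyc's own code: the function name find_sequons and the two pattern names are hypothetical, and the scan only locates candidate tripeptides without predicting whether they are glycosylated.

```python
import re

# Lookahead patterns so that overlapping tripeptides (e.g. in "NNSS") are all found.
# SEQUON: Asn, then any residue except Pro, then Ser or Thr.
SEQUON = re.compile(r"(?=(N[^P][ST]))")
# CYS_VARIANT: the rarer Asn-Xaa-Cys form mentioned in the text.
CYS_VARIANT = re.compile(r"(?=(N[^P]C))")


def find_sequons(sequence, pattern=SEQUON):
    """Return (1-based Asn position, tripeptide) for every match in the sequence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    print(find_sequons("MNGSANPTNNSSC"))
    print(find_sequons("ANACD", CYS_VARIANT))
```

Note that the Asn-Pro-Thr triplet in the first example sequence is skipped, because Proline at Xaa is excluded by the pattern.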
NetNGlyc attempts to distinguish glycosylated sequons from non-glycosylated
ones. By default, predictions are only shown on Asn-Xaa-Ser/Thr sequons. If you choose
to predict on all Asparagines, then please be careful while interpreting the output.
From what we know so far, only Asparagines within Asn-Xaa-Ser/Thr
(and in some cases, Asn-Xaa-Cys) are N-glycosylated in vivo.
In the sequence output above, Asn-Xaa-Ser/Thr sequons are highlighted in
blue, and N-glycosylated Asparagines in red. In the listing of scores for
each position, Asn-Xaa-Ser/Thr sequons can be identified (in case prediction
is made on all Asparagines) by a 'SEQUON' note in the right margin.
A Proline just after the Asparagine is known to preclude N-linked
glycosylation in most cases by rendering the Asparagine inaccessible.
NetNGlyc has been trained to ignore this Proline position (so that it can
pick up other sequence signals). Thus, Asn-Pro-Ser/Thr triplets might be
predicted as glycosylated, but a warning is generated. Such sites may only be
worth considering if there is additional confirmatory evidence.
Thresholds and confidence
Any potential crossing the default threshold of 0.5 represents a predicted
glycosylated site (as long as it occurs in the required sequon Asn-Xaa-Ser/Thr
without Proline at Xaa). The 'potential' score is the averaged output of
nine neural networks. For further information, the jury agreement column
indicates how many of the nine networks support the prediction. The
N-Glyc Result column shows one of the following outputs for predictions
indicating glycosylated sites:
+     Potential > 0.5
++    Potential > 0.5 AND Jury agreement (9/9) OR Potential > 0.75
+++   Potential > 0.75 AND Jury agreement
++++  Potential > 0.90 AND Jury agreement
and non-glycosylated sites:
-     Potential < 0.5
--    Potential < 0.5 AND Jury agreement (all nine < 0.5)
---   Potential < 0.32 AND Jury agreement
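The scoring rules listed above can be sketched in a few lines. This is an illustration under stated assumptions, not the NetNGlyc implementation: the function name summarize is hypothetical, the '++' rule is read as "(Potential > 0.5 AND Jury agreement) OR Potential > 0.75", and a potential of exactly 0.5 (which the table does not cover) is treated as negative.

```python
def summarize(network_outputs):
    """Return (potential, jury string, result symbol) from nine network outputs."""
    if len(network_outputs) != 9:
        raise ValueError("expected the outputs of nine networks")

    # The potential is the average of the nine network outputs.
    potential = sum(network_outputs) / 9

    if potential > 0.5:
        # Jury: how many networks support the positive call.
        votes = sum(1 for x in network_outputs if x > 0.5)
        unanimous = votes == 9
        if unanimous and potential > 0.90:
            symbol = "++++"
        elif unanimous and potential > 0.75:
            symbol = "+++"
        elif unanimous or potential > 0.75:  # assumed reading of the '++' rule
            symbol = "++"
        else:
            symbol = "+"
    else:
        # Jury: how many networks support the negative call.
        # Assumption: potential == 0.5 is handled here as negative.
        votes = sum(1 for x in network_outputs if x < 0.5)
        unanimous = votes == 9
        if unanimous and potential < 0.32:
            symbol = "---"
        elif unanimous:
            symbol = "--"
        else:
            symbol = "-"

    return potential, f"({votes}/9)", symbol


if __name__ == "__main__":
    print(summarize([0.9] * 8 + [0.4]))
    print(summarize([0.2] * 9))
```

In the first call eight of nine networks agree and the average exceeds 0.75, so the site is reported as '++' rather than '+++'.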
For picking up N-glycosylation sites with high specificity (Asparagines
very likely to be glycosylated), use only (++) predictions (and better) for
Asparagines that occur within the Asn-Xaa-Ser/Thr triplet (no Proline at the Xaa position).
Note that identifying sites this way would compromise sensitivity (you may
lose some positive sites).
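The high-specificity selection described above could be applied to a list of predictions roughly as follows. This is only a sketch: the function name high_specificity and the (position, tripeptide, result) tuple layout are hypothetical and do not correspond to the actual NetNGlyc output file format.

```python
def high_specificity(predictions):
    """Keep positions with a '++' or better result inside Asn-Xaa-Ser/Thr (Xaa != Pro).

    predictions: iterable of (position, tripeptide, result) tuples (assumed layout).
    """
    kept = []
    for position, tripeptide, result in predictions:
        in_sequon = (
            len(tripeptide) == 3
            and tripeptide[0] == "N"
            and tripeptide[1] != "P"
            and tripeptide[2] in ("S", "T")
        )
        # '++', '+++' and '++++' all start with '++'; a single '+' does not.
        strong = result.startswith("++")
        if in_sequon and strong:
            kept.append(position)
    return kept


if __name__ == "__main__":
    example = [
        (10, "NGS", "+++"),
        (25, "NPT", "++"),   # Proline at Xaa: dropped
        (40, "NAT", "+"),    # below '++': dropped
        (61, "NKA", "++"),   # not a sequon: dropped
        (77, "NVS", "--"),   # negative: dropped
    ]
    print(high_specificity(example))
```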
Warnings and notes in the right margin
SEQUON ASN-XAA-SER/THR.
If you request a prediction on all Asparagines
(instead of the default to predict only on Asn-Xaa-Ser/Thr sequons), then this note will
appear for Asparagine positions which do occur within the Asn-Xaa-Ser/Thr sequon.
WARNING: PRO-X1.
Proline occurs just after the Asparagine residue. This makes it highly unlikely
that the Asparagine is glycosylated, presumably due to conformational constraints.
WARNING: PRO-X2.
Proline occurs at the 3rd position C-terminal to the Asparagine in question
(2nd 'X' in NX[ST]X). This makes it somewhat unlikely that the Asparagine is glycosylated,
but this condition is not as harsh as the PRO-X1 condition.
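The three margin notes can be derived from the residues following an Asparagine. The sketch below is an assumption-laden illustration: the function name margin_notes is hypothetical, and the exact conditions under which NetNGlyc emits each note (for example, whether PRO-X2 is only reported when Ser/Thr is present at the N+2 position) may differ from this purely positional check.

```python
def margin_notes(sequence, asn_index):
    """Return the notes for the Asparagine at 0-based index asn_index."""
    seq = sequence.upper()
    if seq[asn_index] != "N":
        raise ValueError("residue at asn_index is not an Asparagine")

    def at(offset):
        # Residue at the given offset C-terminal to the Asn, or "" past the end.
        i = asn_index + offset
        return seq[i] if i < len(seq) else ""

    notes = []
    # Asn-Xaa-Ser/Thr with no Proline at Xaa.
    if at(1) != "P" and at(2) in ("S", "T"):
        notes.append("SEQUON ASN-XAA-SER/THR.")
    # Proline immediately after the Asparagine.
    if at(1) == "P":
        notes.append("WARNING: PRO-X1.")
    # Proline at the 3rd position C-terminal to the Asparagine (2nd X in NX[ST]X).
    if at(3) == "P":
        notes.append("WARNING: PRO-X2.")
    return notes


if __name__ == "__main__":
    print(margin_notes("ANGSP", 1))
    print(margin_notes("ANPTA", 1))
```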