N-glycosylation is known to occur on Asparagines that lie within the
Asn-Xaa-Ser/Thr stretch (where Xaa is any amino acid except Proline).
While this consensus tripeptide (also called the N-glycosylation sequon
in many texts) may be a requirement, it is not always sufficient for the
Asparagine to be glycosylated. Furthermore, there are a few known instances
of N-glycosylation occurring within Asn-Xaa-Cys (a Cysteine as opposed to a Serine/Threonine
at the N+2 position), e.g. plasma protein C (PRTC_HUMAN) and
von Willebrand factor (VWF_HUMAN). NetNGlyc
attempts to distinguish glycosylated sequons from non-glycosylated
ones. By default, predictions are only shown on Asn-Xaa-Ser/Thr sequons. If you choose
to predict on all Asparagines, then please be careful while interpreting the output.
From what we know so far, only Asparagines within Asn-Xaa-Ser/Thr
(and in some cases, Asn-Xaa-Cys) are N-glycosylated in vivo.
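As an illustration of the sequon definition above, the following is a minimal sketch that locates candidate Asparagines in a one-letter protein sequence. The function name `find_sequons` and the `allow_cys` flag are hypothetical and are not part of NetNGlyc; the sketch only finds sequons and makes no prediction about whether they are glycosylated.

```python
import re

# Lookahead patterns so that overlapping sequons (e.g. "NNTT") are all reported.
# [^P] encodes "Xaa is any amino acid except Proline".
SEQUON = re.compile(r"(?=N[^P][ST])")
SEQUON_WITH_CYS = re.compile(r"(?=N[^P][STC])")  # also accepts the rare Asn-Xaa-Cys


def find_sequons(seq, allow_cys=False):
    """Return 1-based positions of Asn residues that start a sequon.

    Assumes `seq` is a plain one-letter amino acid string without
    whitespace or header lines.
    """
    pattern = SEQUON_WITH_CYS if allow_cys else SEQUON
    return [m.start() + 1 for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    print(find_sequons("ANASNPTNGC"))
    print(find_sequons("ANASNPTNGC", allow_cys=True))
```

The lookahead is a design choice: a plain `N[^P][ST]` pattern would consume three residues per match and could miss an Asparagine sitting at the Xaa position of the previous sequon.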
In the sequence output above, Asn-Xaa-Ser/Thr sequons are highlighted, and
N-glycosylated Asparagines are marked separately. In the per-position scores,
Asn-Xaa-Ser/Thr sequons can be identified (when prediction is made on all
Asparagines) by a 'SEQUON' note in the right margin.
A Proline immediately after the Asparagine is known to
preclude N-linked glycosylation in most cases by rendering the Asparagine
inaccessible. NetNGlyc has been trained to ignore this Proline position (to be
able to pick up other sequence signals). Thus, Asn-Pro-Ser/Thr triplets might be
predicted as glycosylated but a warning is generated. Such sites may only be
worth considering if there is additional confirmatory evidence.
Thresholds and confidence
Any potential crossing the default threshold of 0.5 represents a predicted
glycosylated site (as long as it occurs in the required sequon Asn-Xaa-Ser/Thr
without Proline at Xaa). The 'potential' score is the averaged output of
nine neural networks. For further information, the jury agreement column
indicates how many of the nine networks support the prediction. The
N-Glyc Result column shows one of the following outputs for predictions:
+ Potential > 0.5
++ Potential > 0.5 AND Jury agreement (9/9) OR Potential > 0.75
+++ Potential > 0.75 AND Jury agreement
++++ Potential > 0.90 AND Jury agreement
- Potential < 0.5
-- Potential < 0.5 AND Jury agreement (all nine < 0.5)
--- Potential < 0.32 AND Jury agreement
For picking up N-glycosylation sites with high specificity (Asparagines
very likely to be glycosylated), use only predictions rated (++) or better for
Asparagines that occur within the Asn-Xaa-Ser/Thr triplet (no Proline at the Xaa position).
Note that identifying sites this way would compromise sensitivity (you may
lose some positive sites).
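The result categories in this section can be sketched in code as follows. This is an illustrative reading of the rules, not NetNGlyc's actual implementation: the function names `summarize` and `classify` are hypothetical, "Jury agreement" is taken to mean that all nine networks fall on the same side of 0.5, and a potential of exactly 0.5 (which the rules do not cover) is assumed to be reported as '-'.

```python
from statistics import mean


def summarize(outputs, threshold=0.5):
    """Average the network outputs and count how many exceed the threshold."""
    potential = mean(outputs)
    votes_for = sum(1 for o in outputs if o > threshold)
    return potential, votes_for


def classify(potential, votes_for, n_networks=9):
    """Map a potential and jury count to an N-Glyc Result symbol.

    Assumption: jury agreement means unanimity (9/9 for positives,
    0/9 for negatives). Strongest categories are tested first.
    """
    all_for = votes_for == n_networks
    all_against = votes_for == 0
    if potential > 0.5:
        if potential > 0.90 and all_for:
            return "++++"
        if potential > 0.75 and all_for:
            return "+++"
        if all_for or potential > 0.75:
            return "++"
        return "+"
    # Assumption: potential == 0.5 falls through to the negative side.
    if potential < 0.32 and all_against:
        return "---"
    if all_against:
        return "--"
    return "-"


if __name__ == "__main__":
    nets = [0.6, 0.6, 0.6, 0.6, 0.6, 0.4, 0.4, 0.4, 0.7]
    print(classify(*summarize(nets)))
```

For a high-specificity selection, one would keep only sites whose symbol is '++', '+++' or '++++' and which lie in an Asn-Xaa-Ser/Thr triplet without Proline at Xaa.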
Warnings and notes in the right margin
SEQUON: If you request a prediction on all Asparagines
(instead of the default to predict only on Asn-Xaa-Ser/Thr sequons), then this note will
appear for Asparagine positions which do occur within the Asn-Xaa-Ser/Thr sequon.
PRO-X1: Proline occurs just after the Asparagine residue. This makes it highly unlikely
that the Asparagine is glycosylated, presumably due to conformational constraints.
PRO-X2: Proline occurs at the 3rd position C-terminal to the Asparagine in question
(2nd 'X' in NX[ST]X). This makes it somewhat unlikely that the Asparagine is glycosylated,
but this condition is not as harsh as the PRO-X1 condition.
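The two Proline conditions can be checked directly on the sequence. The sketch below is only an illustration: the function name `proline_notes` is hypothetical, and the label `PRO-X2` for the second condition is used here by analogy with `PRO-X1`.

```python
def proline_notes(seq, asn_index):
    """Return Proline-related margin notes for the Asn at 0-based `asn_index`.

    Assumes `seq` is a plain one-letter amino acid string.
    """
    seq = seq.upper()
    if seq[asn_index] != "N":
        raise ValueError("position is not an Asparagine")
    notes = []
    # Proline directly after the Asn, i.e. at the Xaa of Asn-Xaa-Ser/Thr.
    if asn_index + 1 < len(seq) and seq[asn_index + 1] == "P":
        notes.append("PRO-X1")
    # Proline three residues C-terminal to the Asn (2nd 'X' in NX[ST]X).
    if asn_index + 3 < len(seq) and seq[asn_index + 3] == "P":
        notes.append("PRO-X2")
    return notes


if __name__ == "__main__":
    print(proline_notes("ANPSA", 1))
    print(proline_notes("ANGSP", 1))
```

Both notes can apply to the same Asparagine, so the function returns a list rather than a single label.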