PTM prediction tools

This is a survey of publicly available web resources, databases and classification/prediction servers for post-translational modifications (PTMs).

Resource   Classification method/Database      Reference      Information/PMID
General PTM-related         
ELM   Consensus      1      Predicts Eukaryotic Linear Motifs (ELMs) based on
   patterns            consensus patterns. Applies context-based rules
            and logical filters. PMID:12824381
PROSITE   Consensus      2      Curated database of consensus patterns for many
   patterns            types of PTMs, including phosphorylation sites.
               MotifScan feature allows for scanning of query
               sequence. PMID:12230035
HPRD   Database      3      Human Protein Reference Database (HPRD).
               Highly curated database of disease-related
               proteins and their PTMs. PMID:14681466
RESID   Database      4      Part of the PIR protein database. Comprehensive
               collection of annotations and structures for PTMs.
Scansite   Weight matrix      5      Based on peptide library studies. Predicts kinase-
               specific motifs and other types of motifs involved
               in signal transduction, e.g. SH2 domain binding.
PREDIKIN   Expert system      6      The program produces a prediction of substrates
               for S/T protein kinases based on the primary
               sequence of a protein kinase catalytic domain.
NetPhos   Neural network      7      Predicts general phosphorylation status based on
               sets of experimentally validated S, T and Y
               phosphorylation sites. PMID:10600390
NetPhosK   Neural network      8      Predicts kinase-specific phosphorylation sites
               based on sets of experimentally validated S, T
               and Y phosphorylation sites. PMID:15174133
PhosphoELM   Database      9      Curated database of validated phosphorylation
               sites. PMID:9847189
Phosphosite   Database            Curated database of in vivo validated
               phosphorylation sites.
bigPI   Weight matrix      10      Predicts GPI-modification sites using a composite
               prediction function including a weight matrix
               and physical model. PMID:10497036
DGPI   Weight matrix            Predicts the GPI cleavage site (GPI anchor)
               in a protein.
GlycoMod   Look-up table      11      Predicts glycan structure from its experimentally
               determined mass. PMID:11680880
NetOGlyc   Neural network      12      Predicts mucin-type GalNAc O-glycosylation sites
               in mammalian proteins. PMID:15385431
NetNGlyc    Neural network      13      Predicts N-glycosylation sites in human proteins
               by examining the sequence context of Asn-
               Xaa-Ser/Thr sequons. In preparation.
DictyOGlyc   Neural network      14      Predicts GlcNAc O-glycosylation sites in
               Dictyostelium discoideum proteins. PMID:10521537
YinOYang   Neural network            Predicts O-beta-GlcNAc attachment sites in eukaryotic
               protein sequences.
O-GlycBase   Database      15      Database of O-glycosylated proteins with
               experimentally verified glycosylation sites. PMID:9847232
Sulfinator   HMM      16      Predicts tyrosine sulfation sites using a combination
               of HMM models. PMID:12050077
ProP   Neural network      17      Predicts arginine and lysine propeptide cleavage sites
               in protein sequences. PMID:14985543
NetPicoRNA   Neural network      18      Predicts picornaviral protease cleavage sites in
               protein sequences. PMID:8931139
NetCorona   Neural network      19      Predicts coronavirus 3C-like proteinase (or protease)
               cleavage sites in protein sequences. PMID:15180906
Myristoylator   Neural network      20      Predicts N-terminal myristoylation of protein sequences.
NMT   Weight matrix      21      Predicts myristoylation sites in protein sequences.
SUMOplot   Weight matrix            Predicts sumoylation sites in protein sequences.
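
Several entries in the table rely on consensus patterns, for example the Asn-Xaa-Ser/Thr sequon examined by NetNGlyc. The following is only a minimal sketch of what a plain consensus-pattern scan might look like. It is not the method used by NetNGlyc, PROSITE or any other listed server. The function name find_sequons, the toy sequence, and the exclusion of proline at the Xaa position are assumptions made for illustration.

```python
import re

# Asn-Xaa-Ser/Thr sequon. Excluding Pro at the Xaa position is an
# assumption added for this sketch; it is not stated in the table above.
# The lookahead lets overlapping sequons be reported as well.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based position of the Asn, matched tripeptide) pairs."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(seq)]


# Hypothetical toy sequence, not a real protein.
print(find_sequons("MKNGSANPTLNNTS"))
# → [(3, 'NGS'), (11, 'NNT'), (12, 'NTS')]
```

A match from such a scan only marks a candidate site. Whether the site is actually modified depends on sequence context, which is what the neural network and weight matrix predictors in the table attempt to model.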

Part of the table has been taken from the review:

"Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence"

Blom, N., Sicheritz-Pontén, T., Gupta, R., Gammeltoft, S. and Brunak, S.
Proteomics 2004, 4, 1633-1649. PMID:15174133

1. Puntervoll, P., Linding, R., Gemund, C., Chabanis-Davidson, S. et al., Nucleic Acids Res. 2003, 31, 3625-3630.
2. Sigrist, C., Cerutti, L., Hulo, N., Gattiker, A. et al., Brief Bioinform. 2002, 3, 265-274.
3. Peri, S., Navarro, J., Amanchy, R., Kristiansen, T. et al., Genome Res. 2003, 13, 2363-2371.
4. Garavelli, J., Nucleic Acids Res. 2003, 31, 499-501.
5. Yaffe, M., Leparc, G., Lai, J., Obata, T. et al., Nat. Biotechnol. 2001, 19, 348-353.
6. Brinkworth, R., Breinl, R., Kobe, B., Proc. Natl. Acad. Sci. USA 2003, 100, 74-79.
7. Blom, N., Gammeltoft, S., Brunak, S., J. Mol. Biol. 1999, 294, 1351-1362.
8. Blom, N., Sicheritz-Pontén, T., Gupta, R., Gammeltoft, S. et al., Proteomics 2004, 4, 1633-1649.
9. Kreegipuu, A., Blom, N., Brunak, S., Nucleic Acids Res. 1999, 27, 237-239.
10. Eisenhaber, B., Bork, P., Eisenhaber, F., J. Mol. Biol. 1999, 292, 741-758.
11. Cooper, C., Gasteiger, E., Packer, N., Proteomics 2001, 1, 340-349.
12. Julenius, K., Mølgaard, A., Gupta, R., Brunak, S., Glycobiology 2005, 15, 153-164.
13. Gupta, R., Jung, E., Brunak, S., In preparation, 2004.
14. Gupta, R., Jung, E., Gooley, A., Williams, K. et al., Glycobiology 1999, 9, 1009-1022.
15. Gupta, R., Birch, H., Rapacki, K., Brunak, S., Hansen, J., Nucleic Acids Res. 1999, 27, 365-370.
16. Monigatti, F., Gasteiger, E., Bairoch, A., Jung, E., Bioinformatics 2002, 18, 769-770.
17. Duckert, P., Brunak, S., Blom, N., Protein Eng. 2004, 17, 107-112.
18. Blom, N., Hansen, J., Blaas, D., Brunak, S., Protein Sci. 1996, 5, 2203-2216.
19. Kiemer, L., Lund, O., Brunak, S., Blom, N., BMC Bioinformatics 2004, 5, 72.
20. Bologna, G., Yvon, S., Veuthey, A.L., Proteomics 2004, 4, 1626-1632.
21. Maurer-Stroh, S., Eisenhaber, B., Eisenhaber, F., J. Mol. Biol. 2002, 317, 541-557.

This page is part of the BioSapiens Network, funded by the European Commission within its FP6 programme under the thematic area "Life sciences, genomics and biotechnology for health", contract number LSHG-CT-2003-503265.


Zenia M. Larsen,