This site contains a downloadable version of, and supplementary information for, the paper
On this webpage, all websites mentioned in the protocol are shown as active links, to make it easier for the reader to try all the online tools.
Accessing the paper
If you have access to an institutional subscription to Nature Protocols, it will be most convenient to download the paper from the journal website:
Otherwise, you can download a so-called ePrint of the paper here:
NEW: An automated implementation
A group in Tampere, Finland, have developed an automated implementation of our protocol for animal (including human) sequences. The method is called PROlocalizer and is available here:
The paper is available through open access here:
PROlocalizer: integrated web service for protein subcellular localization prediction
Sample data sets
Here are two example data sets, one prokaryotic and one eukaryotic, that the reader may use for testing the methods described in the protocol. They are in the FASTA format required by all the servers.
The names of the proteins are their UniProt/Swiss-Prot identifiers. The annotated subcellular location is indicated after each name. The full annotation of a protein can be retrieved from the UniProt Knowledgebase by inserting its identifier into the search field at the top of the page.
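For readers who want to inspect the sample files before submitting them to a server, the following minimal Python sketch reads FASTA text and separates each identifier from the annotation that follows it. It assumes that the identifier is the first whitespace-delimited word on the header line and that the rest of the line is the annotated location; the exact header layout of the sample files should be checked against the files themselves. The function names and the two example records are hypothetical and are not taken from the data sets.

```python
def split_header(header):
    """Split a FASTA header (without '>') into (identifier, annotation).

    Assumption: the identifier is the first word and everything after it
    is the annotated subcellular location.
    """
    parts = header.split(None, 1)
    identifier = parts[0] if parts else ""
    annotation = parts[1].strip() if len(parts) > 1 else ""
    return identifier, annotation


def parse_fasta(text):
    """Yield (identifier, annotation, sequence) for each record in FASTA text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield split_header(header) + ("".join(chunks),)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines may be wrapped
    if header is not None:
        yield split_header(header) + ("".join(chunks),)


# Hypothetical records, for illustration only.
EXAMPLE = """>PROT1_HUMAN Secreted
MKTLLVAG
LLAAS
>PROT2_HUMAN Mitochondrion matrix
MLSRAV
"""

records = list(parse_fasta(EXAMPLE))
for identifier, annotation, sequence in records:
    print(identifier, "|", annotation, "|", len(sequence), "residues")
```

Keeping the annotation next to each identifier makes it straightforward to compare a server's prediction with the annotated location afterwards.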
Links from PROCEDURE
Here are links to all the websites mentioned in the PROCEDURE section of the protocol, shown together with the step number(s) in which they occur. Programs hosted at the Center for Biological Sequence Analysis are marked "(CBS)".
1-4. TargetP 1.1 (CBS): Secretory signal peptides, mitochondrial targeting peptides and chloroplast transit peptides in eukaryotes.
5-6. ChloroP 1.1 (CBS): Chloroplast transit peptides in plants.
7-9. SignalP 3.0 (CBS): Secretory signal peptides in eukaryotes, Gram-negative and Gram-positive bacteria.
10. TatP (CBS): Twin-arginine translocation signal peptides in bacteria.
13 B. LipoP (CBS): Lipoprotein signal peptides in bacteria.
14. PROSITE: Scan your sequence and watch for pattern PS00014 / ER_TARGET, indicating endoplasmic reticulum lumenal retention motifs in eukaryotes. Golgi predictor: Golgi retention signals in eukaryotes.
17. NCBI BLAST: Database searches - click "Protein-protein BLAST (blastp)" and choose "swissprot" or "nr" as the database.
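As a quick local cross-check for step 14, a C-terminal motif of the KDEL type can also be tested with a regular expression. The sketch below assumes that pattern PS00014 reads [KRHQSA]-[DENQ]-E-L> in PROSITE syntax, where ">" anchors the match to the C-terminus. That pattern is written from memory and should be verified against the PROSITE entry, which remains the authoritative source. The function name is hypothetical, and this check is not a substitute for the PROSITE scan.

```python
import re

# Assumed PROSITE pattern PS00014 (ER_TARGET): [KRHQSA]-[DENQ]-E-L>
# The '>' in PROSITE syntax means "C-terminal end", hence the '$' anchor.
ER_TARGET = re.compile(r"[KRHQSA][DENQ]EL$")


def has_er_retention_motif(sequence):
    """Return True if the sequence ends with a motif matching the assumed pattern."""
    # Remove whitespace, normalise case, and drop a trailing stop symbol if present.
    cleaned = "".join(sequence.split()).upper().rstrip("*")
    return ER_TARGET.search(cleaned) is not None


# Hypothetical sequences, for illustration only.
print(has_er_retention_motif("MKTAYIAKDEL"))   # motif at the C-terminus
print(has_er_retention_motif("MKDELTAYIA"))    # motif present, but not at the C-terminus
```

Because the pattern is anchored to the end of the sequence, a truncated or C-terminally extended sequence will not match, so the input should be the complete protein.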
Links from TROUBLESHOOTING
Here are links to all the websites mentioned in the TROUBLESHOOTING section of the protocol, shown together with the step number in which they occur. Programs hosted at the Center for Biological Sequence Analysis are marked "(CBS)".
1. SecretomeP (CBS): Signal peptide-less secretion in mammals and bacteria.
2. NetStart (CBS): Start codon prediction in eukaryotic DNA sequences.
3. ProP (CBS): Propeptide cleavage in eukaryotes.
4. Phobius: Combined prediction of signal peptides and transmembrane alpha-helices.
Links from BOX 1
Here are links to all the non-CBS multicategory localisation prediction programs mentioned in BOX 1 (see step 18 of the PROCEDURE).
WoLF PSORT: 11 locations in animals and plants, and 10 in fungi.
PSORTb v.2.0: 5 locations in Gram-negative bacteria and 4 in Gram-positive bacteria.
iPSORT: 4 locations in plants and 3 in other eukaryotes (TargetP data).
LOCtree: 6 locations in plants, 5 in other eukaryotes and 3 in bacteria.
BaCelLo: 5 locations in plants and 4 in other eukaryotes.
Protein Prowler v. 1.2: 4 locations in plants and 3 in other eukaryotes (TargetP data).
CELLO v.2.5: 12 locations in eukaryotes, 5 in Gram-negative bacteria and 4 in Gram-positive bacteria.
PA-SUB v2.5: 9 locations in animals, 10 in plants, 9 in fungi, 6 in Gram-negative bacteria and 4 in Gram-positive bacteria.
TargetLoc: 4 locations in plants and 3 in other eukaryotes (TargetP data).
MultiLoc: 9 locations in animals, 10 in plants and 9 in fungi.