This site contains a downloadable version of, and
supplementary information for, the paper
Locating proteins in the cell using TargetP,
SignalP, and related tools
Olof Emanuelsson, Søren Brunak, Gunnar von Heijne, Henrik Nielsen
Nature Protocols 2, 953-971 (2007).
On this webpage, all websites mentioned in the protocol
are shown as active links, to make it easier for the reader to try
the online tools.
Accessing the paper
If you have access to an institutional subscription to Nature
Protocols, it will be most convenient to download the paper from the
journal website:
Otherwise, you can download a so-called ePrint of the paper
here:
- NATURE_169477_CP.exe
- Note: this is an executable file, which only works under Windows.
Mac or Linux users, please contact Henrik Nielsen.
- Instructions: Right-click the link and save the file to your
computer. To open the saved ePrint, double-click its file icon.
You must be connected to the Internet the first time you open your
ePrint. This one-time Internet connection allows your ePrint to
validate itself.
- Note: the ePrint can be printed, but only twice per computer.
It is not possible to print to a PDF with Acrobat Distiller or
equivalent programs.
- If you have questions or problems, consult the
ePrint FAQ or write to Henrik Nielsen.
NEW: An automated implementation
A group in Tampere, Finland, has developed
an automated implementation of our protocol for animal (including human)
sequences. The method is called PROlocalizer and is available here:
The paper is available through open access here:
PROlocalizer: integrated web service for protein subcellular
localization prediction
Kirsti Laurila and Mauno Vihinen
Amino Acids 2010 Sep 2. [Epub ahead of print]
(PubMed link)
Sample data sets
Here are two example data sets, one prokaryotic and one eukaryotic, that
the reader may use for testing the methods described in the protocol. They
are in the FASTA format required by all the servers.
The names of the proteins are their UniProt/Swiss-Prot identifiers.
The annotated subcellular location is indicated after each name.
The full annotation of a protein can be retrieved from the
UniProt Knowledgebase by
inserting its identifier into the search field at the top of the page.
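As a rough illustration of how such a file could be read programmatically, here is a minimal Python sketch. It assumes that each header line has the form `>IDENTIFIER location text`, i.e. the identifier is the first whitespace-delimited token and everything after it is the annotated location; the exact header layout of the sample files is not specified here, so this is an assumption. The names `FastaRecord` and `parse_fasta`, as well as the records in the usage example, are hypothetical and not part of the protocol.

```python
from typing import Iterator, NamedTuple


class FastaRecord(NamedTuple):
    identifier: str  # first token of the header (assumed to be the UniProt ID)
    location: str    # rest of the header (assumed to be the annotated location)
    sequence: str    # residues joined across all sequence lines


def parse_fasta(text: str) -> Iterator[FastaRecord]:
    """Yield one FastaRecord per '>' header found in FASTA-formatted text."""
    identifier = None
    location = ""
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if identifier is not None:
                yield FastaRecord(identifier, location, "".join(chunks))
            parts = line[1:].strip().split(None, 1)
            identifier = parts[0] if parts else ""
            location = parts[1] if len(parts) > 1 else ""
            chunks = []
        elif identifier is not None:
            chunks.append(line)
    # Emit the final record once the input is exhausted.
    if identifier is not None:
        yield FastaRecord(identifier, location, "".join(chunks))


if __name__ == "__main__":
    # Invented records for illustration only; not taken from the sample sets.
    example = ">PROT1_TEST secreted\nMKT\nLLA\n>PROT2_TEST\nMAA\n"
    for record in parse_fasta(example):
        print(record.identifier, record.location or "(none)", len(record.sequence))
```

Splitting the header on the first run of whitespace keeps multi-word locations intact. If the sample files use a different separator, only that one line needs to change.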
Links from PROCEDURE
Here are links to all the websites mentioned in the PROCEDURE section of the
protocol, shown together with the step number(s) in which they occur. Programs
hosted at the Center for
Biological Sequence Analysis are marked "(CBS)".
1-4. TargetP 1.1
(CBS): Secretory signal peptides, mitochondrial
targeting peptides and chloroplast transit peptides in eukaryotes.
5-6. ChloroP 1.1
(CBS): Chloroplast transit peptides in plants.
7-9. SignalP 3.0
(CBS): Secretory signal peptides in eukaryotes,
Gram-negative and Gram-positive bacteria.
10. TatP
(CBS): Twin-arginine translocation signal peptides in
bacteria.
11. TMHMM 2.0
(CBS): Transmembrane alpha-helices. (A list of other transmembrane
alpha-helix predictors is found at the PSORT
main page).
12. B2TMR and
HMM-B2TMR: Transmembrane beta-barrels. (A list of other transmembrane
beta-barrel predictors is found at the PSORT
main page).
13 A. big-Pi:
GPI membrane anchors in eukaryotes;
NMT and Myristoylator:
Myristoyl membrane anchors in eukaryotes.
13 B. LipoP
(CBS): Lipoprotein signal peptides in bacteria.
14. PROSITE:
Scan your sequence and watch for pattern
PS00014 / ER_TARGET,
indicating endoplasmic reticulum
lumenal retention motifs in eukaryotes.
Golgi predictor:
Golgi retention signals in eukaryotes.
15. PredictNLS and NucPred:
Nuclear localisation signals in eukaryotes.
NetNES
(CBS): Leucine-rich nuclear export signals in eukaryotes.
16. PeroxiP and PTS1:
C-terminal peroxisomal targeting signals in eukaryotes.
17. NCBI BLAST:
Database searches - click "Protein-protein BLAST
(blastp)" and choose "swissprot" or "nr" as the database.
18. See the methods listed in BOX 1 below,
or consult the longer list at the PSORT main page.
Links from TROUBLESHOOTING
Here are links to all the websites mentioned in the TROUBLESHOOTING
section of the
protocol, shown together with the step number in which they occur. Programs
hosted at the Center for
Biological Sequence Analysis are marked "(CBS)".
1. SecretomeP
(CBS): Signal peptide-less secretion in mammals and bacteria.
2. NetStart
(CBS): Start codon prediction in eukaryotic DNA sequences.
3. ProP
(CBS): Propeptide cleavage in eukaryotes.
4. Phobius:
Combined prediction of signal peptides and
transmembrane alpha-helices.
Links from BOX 1
Here are links to all the non-CBS multicategory localisation prediction
programs mentioned in BOX 1 (see step 18 of the PROCEDURE).
WoLF PSORT:
11 locations in animals and plants, and 10 in fungi.
PSORTb v.2.0:
5 locations in Gram-negative bacteria and
4 in Gram-positive bacteria.
iPSORT:
4 locations in plants and 3 in other eukaryotes (TargetP data).
LOCtree:
6 locations in plants, 5 in other eukaryotes and 3 in bacteria.
BaCelLo:
5 locations in plants and 4 in other eukaryotes.
Protein Prowler v. 1.2:
4 locations in plants and 3 in other eukaryotes (TargetP data).
CELLO v.2.5:
12 locations in eukaryotes, 5 in Gram-negative bacteria and
4 in Gram-positive bacteria.
PA-SUB v2.5:
9 locations in animals, 10 in plants, 9 in fungi, 6 in Gram-negative
bacteria and 4 in Gram-positive bacteria.
TargetLoc:
4 locations in plants and 3 in other eukaryotes (TargetP data).
MultiLoc:
9 locations in animals, 10 in plants and 9 in fungi.
CORRESPONDENCE
Henrik Nielsen,