The dataset used for training, validating, and testing TargetP 2.0 (using nested cross-validation) can be found here.
The sequences are in FASTA format, with the UniProt AC as the sequence name: Download
The annotations are in a tab-separated file in which each line contains three fields: the UniProt AC, the type of targeting peptide,
and the length of the targeting peptide.
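The two files described above can be joined on the UniProt AC. The following is a minimal Python sketch of one way to do that, not the authors' own tooling. The function names, the accession "P00000", the sequence, and the type label "TYPE_A" are all made-up placeholders, and the sketch assumes that the annotation file has no header row and that the targeting peptide occupies the first residues of the sequence.

```python
import csv
import io


def read_fasta(lines):
    """Return {accession: sequence} from FASTA lines.

    Assumes the first whitespace-delimited token after '>' is the UniProt AC.
    """
    parts_by_ac = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            current = line[1:].split()[0]
            parts_by_ac[current] = []
        elif current is not None:
            parts_by_ac[current].append(line)  # sequence may span several lines
    return {ac: "".join(parts) for ac, parts in parts_by_ac.items()}


def read_annotations(lines):
    """Return {accession: (peptide_type, peptide_length)} from tab-separated lines.

    Rows that do not have exactly three fields, or whose third field is not
    a non-negative integer, are skipped.
    """
    annotations = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) != 3 or not row[2].strip().isdigit():
            continue
        ac, peptide_type, length = row
        annotations[ac] = (peptide_type, int(length))
    return annotations


# Made-up in-memory stand-ins for the two downloaded files (placeholders only).
fasta_demo = io.StringIO(">P00000\nMKTLLA\nGGSA\n")
tsv_demo = io.StringIO("P00000\tTYPE_A\t5\n")

seqs = read_fasta(fasta_demo)
annots = read_annotations(tsv_demo)

for ac, (peptide_type, peptide_length) in annots.items():
    # Assumption: the targeting peptide is the N-terminal prefix of the sequence.
    peptide = seqs[ac][:peptide_length]
    print(ac, peptide_type, peptide_length, peptide)
```

To use the sketch on the real files, the `io.StringIO` objects would be replaced by file handles opened on the downloaded FASTA and annotation files (for the annotation file, opening with `newline=""` follows the `csv` module's recommendation).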
The type can be