DeepLoc-1.0: Eukaryotic protein subcellular localization predictor

Training and testing data sets

The dataset used to train and test the DeepLoc-1.0 server is available here: deeploc_dataset

It is a FASTA file in which each entry consists of a header and a sequence. The header contains the UniProt accession number, the annotated subcellular localization and, optionally, a description field indicating that the protein was part of the test set. The subcellular localization carries an additional label appended after a hyphen (for example, Mitochondrion-S), where S indicates soluble, M membrane and U unknown.

>Q3E7A9 Mitochondrion-S
MSNPCQKEACAIQDCLLSHQYDDAKCAKVIDQLYICCSKFYNDNGKDSRSPCCPLPSLLELKMKQRKLTPGDS
>Q9SMX3 Mitochondrion-M test
MVKGPGLYTEIGKKARDLLYRDYQGDQKFSVTTYSSTGVAITTTGTNKGSLFLGDVATQVKNNNFTADVKVST
DSSLLTTLTFDEPAPGLKVIVQAKLPDHKSGKAEVQYFHDYAGISTSVGFTATPIVNFSGVVGTNGLSLGTDV
AYNTESGNFKHFNAGFNFTKDDLTASLILNDKGEKLNASYYQIVSPSTVVGAEISHNFTTKENAITVGTQHAL
DPLTTVKARVNNAGVANALIQHEWRPKSFFTVSGEVDSKAIDKSAKVGIALALKP
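
The header layout described above can be read with a few lines of Python. The following is a minimal sketch, not part of the DeepLoc distribution: the names `Record`, `parse_header` and `read_records` are hypothetical, and it assumes that header fields are separated by whitespace, that the S/M/U label follows the last hyphen of the second field, and that the optional description field is the literal word `test`, as in the example entries.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class Record(NamedTuple):
    accession: str   # UniProt accession number
    location: str    # annotated subcellular localization
    membrane: str    # "S" (soluble), "M" (membrane) or "U" (unknown)
    is_test: bool    # True if the header marks the protein as test set
    sequence: str    # amino acid sequence, joined across lines


def parse_header(header: str) -> Tuple[str, str, str, bool]:
    """Split a header such as '>Q9SMX3 Mitochondrion-M test' into its parts."""
    fields = header.lstrip(">").split()
    accession = fields[0]
    # Split on the LAST hyphen so a hyphen inside the localization survives.
    location, _, membrane = fields[1].rpartition("-")
    # Assumption: the optional description field is the word "test".
    is_test = "test" in fields[2:]
    return accession, location, membrane, is_test


def read_records(lines: Iterable[str]) -> Iterator[Record]:
    """Yield one Record per FASTA entry from an iterable of text lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield Record(*parse_header(header), "".join(chunks))
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield Record(*parse_header(header), "".join(chunks))
```

With a local copy of the dataset (the file name is up to the user), the training and test partitions could then be separated by iterating over `read_records(open(path))` and checking `is_test` on each record.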

