TargetP predictions of unannotated A. thaliana and H. sapiens sequences


The plant version of TargetP was tested on the two fully sequenced Arabidopsis thaliana chromosomes: chromosome 2, sequenced by TIGR, and chromosome 4, sequenced by The European Union Arabidopsis Sequencing Consortium et al. (both as of 2000-01-07). The non-plant version was tested on the Ensembl Homo sapiens set (as of 2000-01-14).

As seen in the table below, the predicted cTP abundance in the two A. thaliana sets was approximately 14%. The predicted mTP abundance was about 10% in all three sets, while the predicted SP abundance was about 17% in the A. thaliana sets and about 13% in the H. sapiens set. TMHMM and SignalP-HMM were then used to further elucidate the fate of the sequences predicted to have an SP. With predicted transmembrane proteins excluded, the estimates of actually secreted proteins were lowered to approximately 11% for the Arabidopsis sets and to 8% for the Homo set (column marked (*)).

                          No.    -- Predicted abundance, % --
Data set                 seqs    cTP     mTP      SP   SP(*)
------------------------------------------------------------
A. thaliana chr. 2       4054   13.1    10.5    16.7    11.1
A. thaliana chr. 4       3744   13.9    10.1    17.2    11.6
H. sapiens (Ensembl)    10228      -     9.3    12.8     8.0
------------------------------------------------------------
(*) predicted transmembrane proteins excluded (as described)
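
The effect of the transmembrane filter can be estimated directly from the SP and SP(*) columns. The following is a minimal Python sketch, not part of TargetP: the percentages are copied from the table above, and the names sp_abundance and removed_share are illustrative assumptions.

```python
# Percentages copied from the table above: (SP, SP(*)) per data set.
sp_abundance = {
    "A. thaliana chr. 2": (16.7, 11.1),
    "A. thaliana chr. 4": (17.2, 11.6),
    "H. sapiens (Ensembl)": (12.8, 8.0),
}


def removed_share(sp, sp_filtered):
    """Percent of the SP predictions that the filter excluded."""
    return 100.0 * (sp - sp_filtered) / sp


for name, (sp, sp_filtered) in sp_abundance.items():
    share = removed_share(sp, sp_filtered)
    print(f"{name}: about {share:.0f}% of SP predictions excluded")
```

Since the inputs are rounded percentages, the result is only approximate: about one third of the SP predictions are excluded in the Arabidopsis sets, and somewhat more in the human set.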

The predictions split into reliability classes (RC), given as actual numbers of sequences:
A. thaliana chr. 2       A. thaliana chr. 4       H. sapiens (Ensembl)

RC  --Pred. cat.--       RC  --Pred. cat.--       RC  --Pred. cat.--
    cTP  mTP    SP           cTP  mTP    SP             mTP      SP
------------------       ------------------       ------------------
1    94   17   295       1    77   26   307       1      44     527
2   102   61   143       2   120   52   125       2     130     282
3    99   73    85       3   103   67    81       3     182     140
4   101  127    69       4   110  104    70       4     272     189
5   135  148    84       5   122  137    74       5     328     176
------------------       ------------------       ------------------
For a description of the reliability classes, see Output format.
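
The counts per reliability class can be related back to the predicted abundances by summing over the classes and dividing by the number of sequences in the set. The sketch below does this for A. thaliana chromosome 2 and the H. sapiens set, with all numbers copied from the two tables. The helper names abundance and share_up_to are illustrative assumptions, and the second helper assumes that RC 1 is the most reliable class.

```python
# Counts per reliability class (RC 1 to 5), copied from the table above.
rc_counts = {
    "A. thaliana chr. 2": {
        "cTP": [94, 102, 99, 101, 135],
        "mTP": [17, 61, 73, 127, 148],
        "SP": [295, 143, 85, 69, 84],
    },
    "H. sapiens (Ensembl)": {
        "mTP": [44, 130, 182, 272, 328],
        "SP": [527, 282, 140, 189, 176],
    },
}

# Number of sequences per data set, copied from the abundance table.
n_seqs = {"A. thaliana chr. 2": 4054, "H. sapiens (Ensembl)": 10228}


def abundance(counts, total):
    """Predicted abundance in percent: summed RC counts over the set size."""
    return 100.0 * sum(counts) / total


def share_up_to(counts, max_rc):
    """Percent of one category's predictions in RC 1..max_rc.

    Assumes RC 1 is the most reliable class.
    """
    return 100.0 * sum(counts[:max_rc]) / sum(counts)


for name, categories in rc_counts.items():
    for category, counts in categories.items():
        print(
            f"{name} {category}: "
            f"{abundance(counts, n_seqs[name]):.1f}% of sequences, "
            f"{share_up_to(counts, 2):.0f}% of these in RC 1-2"
        )
```

For chromosome 2 the summed counts give 13.1% cTP, 10.5% mTP and 16.7% SP, in agreement with the abundance table.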

Here are the detailed results of the predictions:
A. thaliana chromosome 2
A. thaliana chromosome 4
H. sapiens Ensembl set

FTP sites for the sequences:
A. thaliana chromosome 2
A. thaliana chromosome 4
H. sapiens Ensembl set




CORRESPONDENCE

Olof Emanuelsson,