
TargetP predictions of unannotated A. thaliana and H. sapiens sequences


The plant version of TargetP was tested on the two fully sequenced Arabidopsis thaliana chromosomes: chromosome 2, sequenced by TIGR, and chromosome 4, sequenced by the European Union Arabidopsis Sequencing Consortium et al. (as of 2000-01-07). The non-plant version was tested on the Ensembl Homo sapiens set (as of 2000-01-14).

As seen in the table, the predicted cTP abundance in the two A. thaliana sets was 13 to 14%, and the predicted mTP abundance was about 10% in all three sets. The predicted SP abundance was about 17% in the A. thaliana sets and about 13% in the H. sapiens set. TMHMM and SignalP-HMM were then used to further elucidate the fate of the sequences predicted to have an SP; with predicted transmembrane proteins excluded, the estimates of actually secreted proteins dropped to approximately 11% for the A. thaliana sets and to 8% for the H. sapiens set (column marked (*)).

                        No. of  -- Predicted abundance, % --
Data set                seqs     cTP     mTP     SP    SP(*) 
------------------------------------------------------------
A. thaliana chr. 2       4054   13.1    10.5    16.7    11.1
A. thaliana chr. 4       3744   13.9    10.1    17.2    11.6
H. sapiens (Ensembl)    10228   -        9.3    12.8     8.0
------------------------------------------------------------
(*) predicted transmembrane proteins excluded (as described) 
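
The effect of the transmembrane filter can be read off the last two columns of the table. The following is a minimal sketch of that arithmetic, not part of TargetP itself; the names `TABLE` and `excluded_share` are made up for this illustration, and the percentages are copied from the table above.

```python
# SP and SP(*) percentages copied from the abundance table above.
# "SP_filtered" is a hypothetical key name for the SP(*) column.
TABLE = {
    "A. thaliana chr. 2": {"SP": 16.7, "SP_filtered": 11.1},
    "A. thaliana chr. 4": {"SP": 17.2, "SP_filtered": 11.6},
    "H. sapiens (Ensembl)": {"SP": 12.8, "SP_filtered": 8.0},
}


def excluded_share(sp, sp_filtered):
    """Return the share (in %) of SP predictions dropped by the filter."""
    return 100.0 * (sp - sp_filtered) / sp


for name, row in TABLE.items():
    share = excluded_share(row["SP"], row["SP_filtered"])
    print(f"{name}: {share:.1f}% of SP predictions excluded")
```

In each of the three sets, roughly a third of the sequences predicted to have an SP are excluded as predicted transmembrane proteins.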

The predictions split into reliability classes (RC), given as numbers of sequences:
A. thaliana chr. 2       A. thaliana chr. 4       H. sapiens (Ensembl) 

RC  --Pred. cat.--       RC  --Pred. cat.--       RC  --Pred. cat.--
    cTP  mTP    SP           cTP  mTP    SP             mTP      SP
------------------       ------------------       ------------------
1    94   17   295       1    77   26   307       1      44     527
2   102   61   143       2   120   52   125       2     130     282
3    99   73    85       3   103   67    81       3     182     140
4   101  127    69       4   110  104    70       4     272     189
5   135  148    84       5   122  137    74       5     328     176
------------------       ------------------       ------------------
For a description of the reliability classes, see Output format.
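
The per-class counts relate to the abundance percentages by a simple sum and division. The sketch below shows this for the A. thaliana chr. 2 and H. sapiens sets, with counts and totals copied from the tables above. The names `COUNTS` and `abundance` are hypothetical, and the optional `max_rc` cut-off assumes that RC 1 is the most reliable class, as in the TargetP output description.

```python
# Counts per reliability class (RC 1..5) and set sizes, copied from the
# tables above. COUNTS and abundance are illustrative names only.
COUNTS = {
    "A. thaliana chr. 2": {
        "total": 4054,
        "cTP": [94, 102, 99, 101, 135],
        "mTP": [17, 61, 73, 127, 148],
        "SP": [295, 143, 85, 69, 84],
    },
    "H. sapiens (Ensembl)": {
        "total": 10228,
        "mTP": [44, 130, 182, 272, 328],
        "SP": [527, 282, 140, 189, 176],
    },
}


def abundance(counts, total, max_rc=5):
    """Percentage of the set predicted in one category, using RC 1..max_rc."""
    return 100.0 * sum(counts[:max_rc]) / total


for name, data in COUNTS.items():
    for category in ("cTP", "mTP", "SP"):
        if category in data:  # the non-plant version has no cTP category
            pct = abundance(data[category], data["total"])
            print(f"{name} {category}: {pct:.1f}%")
```

Summed over all five classes, these counts give the percentages listed for the same two sets in the abundance table. Passing a smaller `max_rc`, for example `abundance(counts, total, max_rc=2)`, restricts the estimate to the more reliable predictions.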

Here are the detailed results of the predictions:
A. thaliana chromosome 2
A. thaliana chromosome 4
H. sapiens Ensembl set

FTP sites for the sequences:
A. thaliana chromosome 2
A. thaliana chromosome 4
H. sapiens Ensembl set




CORRESPONDENCE

Olof Emanuelsson,