DNA2protSS Data Sets
This directory contains the data used in the paper:
"Protein structure and the sequential structure of mRNA: alpha-helix and beta-sheet signals at the nucleotide level"
S. Brunak and J. Engelbrecht, Proteins, 25, 237-252, 1996.
The directory contains six files. Each file contains either mRNA or amino acid sequences, together with the corresponding eight-category DSSP secondary structure assignment of the associated protein:
- nucall.seq - 719 mRNA sequences, all organisms, redundant data set
- proall.seq - 719 amino acid sequences, all organisms, redundant data set
- ebacnuc.seq - 44 mRNA enterobacterial sequences, non-redundant data set
- ebacpro.seq - 44 amino acid enterobacterial sequences, non-redundant data set
- mamnuc.seq - 77 mRNA mammalian sequences, non-redundant data set
- mampro.seq - 77 amino acid mammalian sequences, non-redundant data set
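
The file list above can be written down as a small manifest for checking a local copy of the directory. The following is a minimal Python sketch and not part of the original distribution: the names `DATASETS`, `missing_files`, and `paired_files` are hypothetical, and the code only checks that the files exist. It does not parse them, because the internal layout of the `.seq` files is not described here.

```python
from pathlib import Path

# Hypothetical manifest built from the file list above:
# file name -> (sequence type, organism group, number of sequences, redundant?)
DATASETS = {
    "nucall.seq": ("mRNA", "all organisms", 719, True),
    "proall.seq": ("amino acid", "all organisms", 719, True),
    "ebacnuc.seq": ("mRNA", "enterobacterial", 44, False),
    "ebacpro.seq": ("amino acid", "enterobacterial", 44, False),
    "mamnuc.seq": ("mRNA", "mammalian", 77, False),
    "mampro.seq": ("amino acid", "mammalian", 77, False),
}


def missing_files(directory):
    """Return the sorted names of listed files that are not in `directory`."""
    root = Path(directory)
    return sorted(name for name in DATASETS if not (root / name).is_file())


def paired_files():
    """Map each organism group to its (mRNA file, amino acid file) pair."""
    pairs = {}
    for name, (seq_type, group, _count, _redundant) in DATASETS.items():
        slot = 0 if seq_type == "mRNA" else 1
        pairs.setdefault(group, [None, None])[slot] = name
    return {group: tuple(files) for group, files in pairs.items()}
```

Calling `missing_files(".")` from inside the data directory should return an empty list when all six files are present, and `paired_files()` gives the mRNA and amino acid file that belong to the same data set.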