
DNA2protSS Data Sets


This directory contains the data used in the paper:

"Protein structure and the sequential structure of mRNA: alpha-helix and beta-sheet signals at the nucleotide level"
S. Brunak and J. Engelbrecht, Proteins, 25, 237-252, 1996.

The directory contains six files. Each file contains either the mRNA or the amino acid sequences, together with the corresponding eight-category DSSP secondary structure assignment of the associated protein:

  • nucall.seq - 719 mRNA sequences, all organisms, redundant data set
  • proall.seq - 719 amino acid sequences, all organisms, redundant data set
  • ebacnuc.seq - 44 mRNA enterobacterial sequences, non-redundant data set
  • ebacpro.seq - 44 amino acid enterobacterial sequences, non-redundant data set
  • mamnuc.seq - 77 mRNA mammalian sequences, non-redundant data set
  • mampro.seq - 77 amino acid mammalian sequences, non-redundant data se