Exercise M12

Exercise 1

EMBOSS is a collection of software tools that is freely available at http://emboss.sourceforge.net/. Have a look: see if you can find the list of tools, then scroll through it to get a feel for what you can do with them.

Exercise 2

Using “dottup” from the EMBOSS package, see how colinear the following sequence pairs are:
  • AE013218 and AE016826
  • L43967 and AE017199

Based on the results, which pair of genomes is more similar to each other? What happens if you change some of the dottup parameters (e.g. the word size)?
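To get a feel for why the word size matters, it may help to look at a minimal Python sketch of the idea behind a word-match dot plot: a dot is drawn at (i, j) whenever the word starting at position i in the first sequence is identical to the word starting at position j in the second. This is only an illustration of the principle, not the dottup implementation; the function name `word_matches` and the toy sequences are made up for this example.

```python
from collections import defaultdict


def word_matches(seq_a, seq_b, word_size=10):
    """Return (i, j) pairs where seq_a[i:i+word_size] == seq_b[j:j+word_size].

    Each pair corresponds to one dot in a word-match dot plot.
    Illustrative sketch only; not the algorithm used by any specific tool.
    """
    seq_a, seq_b = seq_a.upper(), seq_b.upper()

    # Index every word of seq_b by its start position.
    index = defaultdict(list)
    for j in range(len(seq_b) - word_size + 1):
        index[seq_b[j:j + word_size]].append(j)

    # Look up every word of seq_a in that index.
    matches = []
    for i in range(len(seq_a) - word_size + 1):
        for j in index.get(seq_a[i:i + word_size], ()):
            matches.append((i, j))
    return matches


if __name__ == "__main__":
    a = "ACGTACGT"  # toy sequences, not real genome data
    b = "ACGTACGT"
    for w in (2, 4, 8):
        print(w, len(word_matches(a, b, word_size=w)))
```

Short words match by chance more often, so a small word size gives a noisier plot, while a large word size keeps only long exact matches. Colinear regions show up as runs of dots along a diagonal.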

Exercise 3

Determine the %G+C, the dinucleotide relative abundance and codon statistics (codon usage and Nc, the effective number of codons) for AE017244, AE017243 and U00089, using programs from the EMBOSS package or any other tools you prefer. Based on the results, which pair of sequences is most similar? The procedure is described in Coenye & Vandamme (2003); if you don’t have the paper, download it first. Also check the “deltarho-website” at http://deltarho.amc.nl/cgi-bin/bin/index.cgi.
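If you want to check the first two statistics by hand, the following Python sketch may be useful. It assumes the usual definition of dinucleotide relative abundance, rho(XY) = f(XY) / (f(X) f(Y)), and compares two sequences by the mean absolute difference of their 16 rho values. This is a simplified single-strand version; published analyses usually also take the reverse complement strand into account, so the numbers will not exactly match those from the paper or the website. All function names are made up for this example, and the sequences are assumed to be plain strings that you have already loaded.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def gc_percent(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    total = sum(seq.count(b) for b in BASES)
    return 100.0 * (seq.count("G") + seq.count("C")) / total


def dinucleotide_rho(seq):
    """Return {dinucleotide: rho} with rho(XY) = f(XY) / (f(X) * f(Y)).

    Single-strand sketch: the reverse complement is NOT included here.
    """
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono = sum(mono.values())
    n_di = sum(di.values())

    rho = {}
    for x, y in product(BASES, repeat=2):
        expected = (mono[x] / n_mono) * (mono[y] / n_mono)
        # rho is undefined if one of the two bases never occurs.
        rho[x + y] = (di[x + y] / n_di) / expected if expected > 0 else float("nan")
    return rho


def delta(seq_a, seq_b):
    """Mean absolute difference between the 16 rho values of two sequences."""
    rho_a = dinucleotide_rho(seq_a)
    rho_b = dinucleotide_rho(seq_b)
    return sum(abs(rho_a[k] - rho_b[k]) for k in rho_a) / 16


if __name__ == "__main__":
    s1 = "ACGTTGCAAGCTGGCATTACG"  # toy sequences, not real genome data
    s2 = "AATTATATTAGCATTTAAACG"
    print(gc_percent(s1), gc_percent(s2))
    print(delta(s1, s2))
```

A rho value close to 1 means the dinucleotide occurs about as often as expected from the base composition; the smaller the delta between two genomes, the more similar their dinucleotide signatures.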

Exercise 4

Try to find as many software packages as possible that would allow you to make “supertrees”. Make sure to have a look at the CLANN software (http://bioinf.nuim.ie/software/clann/). Browse through some of the software packages you found.