The purpose of this exercise is to explore AT content in a comparative genomics context.
We will not worry about statistics, but limit ourselves to exploratory data analysis.
You will find interesting patterns and try to explain these phenomena using common sense and biological
knowledge. Apart from writing parsers, this is perhaps the most valuable skill when doing bioinformatics.
Key tools used in this exercise:
The only tools needed for this exercise are a web browser and an open mind.
Comparing genome features
To begin this exercise, point your browser to
A more comprehensive genome list can be found at
For the purposes of this exercise we will stick to the former, that is, the CBS website, since it has some
features which we will be using.
Start by sorting the list by AT content and scroll up and down a bit.
Take note of any interesting observations and correlations.
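For reference, the AT content that the list is sorted by is simply the fraction of A and T bases in a genome sequence. The following Python sketch shows one way to compute it. The function name at_content and the choice to ignore ambiguous bases such as N are assumptions made for illustration, not a description of how the CBS website calculates its values.

```python
def at_content(seq: str) -> float:
    """Return the fraction of A and T among the unambiguous bases in seq.

    Assumption: characters other than A, C, G and T (for example N)
    are left out of both the numerator and the denominator.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return at / (at + gc)


if __name__ == "__main__":
    # Toy sequence for illustration only, not taken from a real genome.
    print(at_content("ATGCAT"))
```

Multiplying the result by 100 gives a percentage, which is how AT content is usually reported in genome lists.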
Using scatter plots
Now we will try to take a more systematic approach.
Select the Compare Within Search button on the atlas webpage.
This can be used to make scatter plots of a variety of sequence-derived properties.
First we will look at the AT content on a large scale.
Try to make a scatter plot of genome length vs. AT content.
Do you see a correlation?
The correlation of genome length and AT content is a topic that is still being
studied. It seems to be a rather complex relationship, influenced by growth
temperature and several other factors. Do you have an idea for an underlying
principle for this relationship?
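Although this exercise stays exploratory, a correlation coefficient can serve as a purely descriptive summary of what a scatter plot shows. The sketch below computes the Pearson correlation coefficient in plain Python. The function name pearson is hypothetical, and the numbers in the usage lines are invented placeholders, not real genome data. You would need to copy genome lengths and AT contents from the genome list yourself.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long number lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented placeholder values, for illustration only.
    lengths_mbp = [1.0, 2.0, 3.0, 4.0]
    at_percent = [70.0, 60.0, 55.0, 40.0]
    print(pearson(lengths_mbp, at_percent))
```

A value near +1 or -1 indicates a strong linear trend and a value near 0 indicates none. It says nothing about why the trend exists, which is the question posed above.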
To help with the above question, try to plot DNA melting energy vs. AT content.
You should see a very clear correlation.
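One way to build intuition for this plot is a crude rule of thumb for short oligonucleotides, the Wallace rule, which assigns roughly 2 degrees C of melting temperature per A or T and 4 degrees C per G or C. The sketch below uses it as a stand-in only. How the atlas itself calculates DNA melting energy is not stated here, and the function name wallace_tm is hypothetical.

```python
def wallace_tm(seq: str) -> int:
    """Wallace-rule melting temperature estimate in degrees C.

    Assumption: this rule of thumb is meant for short oligonucleotides
    and is used here only to illustrate the trend, not to reproduce
    the melting energy values shown on the atlas webpage.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # Toy sequences of equal length, ordered from AT-rich to GC-rich.
    for oligo in ("ATATATAT", "ATGCATGC", "GCGCGCGC"):
        print(oligo, wallace_tm(oligo))
```

For sequences of the same length, the estimate falls steadily as the share of A and T rises, which is the kind of relationship to look for in the scatter plot.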
Try to make a plot with AT content on the x-axis and local repeats on the y-axis.
What kind of correlation do you see?
After doing this for several types of local repeats, try to do the same for global repeats.
What do you see?
Can you explain what you see? Why is there a difference between the local
level and the global level?
When you are done with this exercise, you can try to continue where you left off with exercise M3.
When everyone is done, we will have a summary of the exercise.