Exercise in Comparative Genomic Hybridization - Part 2
Written by: Peter Hallin
This exercise deals with the visualization of data from Comparative Genomic Hybridization (CGH) and the comparison of multiple genomes.
Gene conservation: The BLASTatlas
In this part of the exercise, we will use the 'GeneWiz' software, developed here
at CBS. It is designed to visualize genomic properties, e.g. DNA structural
parameters and base composition. A recent Web Services-based implementation
called the BLASTatlas enables users outside of CBS to run the GeneWiz software
programmatically and create an atlas visualization of any DNA sequence.
You will use a pre-written Perl script, which is a Web Services client, to
visualize CGH data alongside a BLAST comparison of the reference against
GeneWiz reads numerical values - one per nucleotide position in the genome -
applies a user-defined color scale, and plots the binned data on a circular
representation. The BLASTatlas method goes a step further by also allowing the
user to include maps showing the homology between the reference genome ('query')
and any number of other genomes ('databases'). Each 'database' appears in
its own lane. We have downloaded the proteomes of two E. coli strains and one
Shigella strain, and the Web Services script will read these sequences
and upload them to the BLASTatlas server.
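To make the binning step concrete, here is a minimal shell sketch. It is not the GeneWiz implementation; it only assumes one numeric value per line (one per nucleotide position) and averages the values over fixed-size windows. The function name bin_means, the window size, and the toy input are hypothetical, and the color scale is left out.

```shell
# bin_means WINDOW
# Reads one numeric value per line on stdin (one value per nucleotide
# position) and prints "start end mean" for each window of WINDOW positions.
# A final, shorter window is printed if the input does not divide evenly.
bin_means() {
  LC_ALL=C awk -v w="$1" '
    { sum += $1; n++ }
    n == w { printf "%d %d %.2f\n", NR - n + 1, NR, sum / n; sum = 0; n = 0 }
    END    { if (n > 0) printf "%d %d %.2f\n", NR - n + 1, NR, sum / n }
  '
}

# Hypothetical toy input: six positions, binned in windows of three.
printf '%s\n' 1 2 3 10 20 30 | bin_means 3
```

Each output line corresponds to one bin, which a plotting tool could then map to a color.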
For further reading, please consult the BLASTatlas homepage or see the
QUESTION I: Familiarize yourself with the plot - try to
identify one or two regions where only the three C. botulinum strains
display conservation at the protein level. That is, other Clostridium species
should have little or no conservation.
- BLASTatlas example: identifying conserved genomic regions
Take a minute to look at the
Clostridium BLASTatlas example. On page 2 of the PDF, you
will find the legend.
You will now submit the three proteomes and the CGH maps, which contain data from STX
negative, STX positive (STX: Shiga-like toxin), and E. coli EDL933.
QUESTION II: Do you see a general correlation between
the BLAST lanes of the other O157 strain (Sakai) and the reference strain?
- Submit your job to the BLASTatlas service
This step takes roughly 2-4 minutes to execute. Log in to the computer system using SSH with X11 forwarding enabled. Copy the exercise directory to your home directory, enter it, and run the Web Services script to produce a full view of the K12 genome. When the script has finished, view the output PostScript file.
cp -r /home/projects/pfh/2006-03-01_PhD/exercises/cgh2 ~/
cd ~/cgh2
perl full.pl > full.ps
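Since the job runs on a remote server, it can be worth checking that the output file really contains PostScript before opening it. The following is a minimal sketch that relies only on the convention that PostScript files begin with the characters %!; the helper name is_postscript is hypothetical and not part of the exercise files.

```shell
# is_postscript FILE
# Succeeds if FILE starts with "%!", the conventional PostScript header.
is_postscript() {
  [ "$(head -c 2 "$1")" = '%!' ]
}

# Usage (after the job has finished):
#   is_postscript full.ps && echo "looks like PostScript" || head full.ps
```

If the check fails, the first lines of the file will usually show whatever message was written instead of the plot.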
QUESTION III: Is the
correlation between the two O157 strains validated by CGH data?
QUESTION IV: Repeat the method from above to produce a plot for stx2AB (see table below). Is the output as you would expect?
By editing the provided script (full.pl), you can zoom in on individual
regions of interest. We will now look for the Shiga-like toxins (stx1AB).
Do not modify the full.pl script directly; make a copy and change that
instead. Find the commented-out zoom specification at lines 39-40
(you can enable line numbers in the menu: Preferences->Show Line Numbers) and
uncomment it. You must also uncomment the window specification at line 223.
cp full.pl stx1.pl
perl stx1.pl > stx1.ps
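If you prefer not to edit the copy by hand, the uncommenting can be sketched with sed. This assumes that the commented-out lines in full.pl start with a Perl '#' comment marker, which you should check in your own copy first; the helper name uncomment_lines is made up for this example. You may still need to set the zoom coordinates to the region you want.

```shell
# uncomment_lines FILE RANGE...
# Prints FILE with one leading "#" removed from each given line or line
# range (sed address syntax, e.g. 39,40 or 223). Indentation in front of
# the "#" is kept. Assumes "#"-style comments, as in Perl.
uncomment_lines() {
  file=$1; shift
  # Turn every RANGE argument into a "-e <sed command>" pair.
  for range in "$@"; do
    set -- "$@" -e "${range}s/^\([[:space:]]*\)#/\1/"
    shift
  done
  sed "$@" "$file"
}

# Usage, with the line numbers given in the text:
#   uncomment_lines full.pl 39,40 223 > stx1.pl
```

Writing to a new file keeps full.pl untouched, in line with the advice to work on a copy.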
|Genes|Region (bp)|
|stx2A, stx2B|1352000 .. 1354000|
|FlgX|1573000 .. 1586000|
|FlhX, MotAB|2639000 .. 2656000|
|FliX|2696500 .. 2724000|
|stx1A, stx1B|2995000 .. 2997000|
When determining the properties of a bacterial pathogen, the presence of toxins is not the only relevant factor. Other features, such as motility and the ability to secrete proteins, are also important. The table lists regions you may find interesting - feel free to browse around. To
get a detailed list of the coordinates of the annotated features, you can view the GenBank record of E. coli EDL933.
Hallin PF, Binnewies TT, Ussery DW: The genome BLASTatlas - a GeneWiz extension for visualization of
whole-genome homology. Mol Biosyst 4:363-71 (2008)