
Day 7: Pangenome trees

Pangenome Trees

In this exercise we will compute pan-genome trees. A pan-genome tree is a gene-content tree that illustrates the similarities and differences between the genomes within a pan-genome.
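To make "gene content" concrete, here is a minimal sketch of how the distance between two genomes could be derived from the gene families they share. It is not the method used by the course scripts, which may use a different measure. The file names and family IDs (famA, famB, ...) are made up for illustration, and the Jaccard distance is just one common choice.

```shell
# Hypothetical gene-family lists for two genomes: one family ID per line, sorted.
workdir=$(mktemp -d)
printf '%s\n' famA famB famC famD > "$workdir/genome1.txt"
printf '%s\n' famB famC famD famE famF > "$workdir/genome2.txt"

# Families present in both genomes (comm requires sorted input).
shared=$(comm -12 "$workdir/genome1.txt" "$workdir/genome2.txt" | wc -l | tr -d ' ')
# Families present in at least one of the two genomes.
total=$(sort -u "$workdir/genome1.txt" "$workdir/genome2.txt" | wc -l | tr -d ' ')

# Jaccard distance = 1 - shared/total; awk does the floating-point division.
distance=$(LC_ALL=C awk -v s="$shared" -v t="$total" 'BEGIN { printf "%.2f", 1 - s / t }')
echo "shared=$shared total=$total distance=$distance"
```

Computing such a distance for every pair of genomes gives a distance matrix, and a tree can then be built from that matrix by clustering.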

The following is a condensed version of a guide that Lars has written for you. Keep the full guide open on your screen as well, because it also explains the whys and wherefores of what you are doing.

  1. Log in

    Log in to the computer using SSH or PuTTY.

    # log in to the computers again as before, then:
    ssh -Y ibiology
    umask 022
    setenv MAKEFILES /home/people/pfh/bin/Makefile

  2. Create directory for this exercise

    # create directory 
    mkdir panTree
    cd panTree

  3. Copy the Buchnera pangenome family tree data into your directory and unpack it

    # copy the file containing the data you will need into your directory 
    cp /home/people/larssn/pantree/pantree.tar.gz .
    gunzip pantree.tar.gz
    tar xvf pantree.tar

  4. Go into the blast directory and start R

    # go into the blast directory and start R
    cd pantree/buchnera/blast
    R

    Once you have started R, you will see that the prompt (the characters at the beginning of each line) changes. This means that you are inside the program R. To exit the program, type q() and answer yes.

  5. Run the script which will prepare the fasta files

    # run the script that prepares the fasta files
    source( "script_fastaprep.R" )

  6. Run the script that will compare all proteins with all of the other proteins

    In this case the organisms that we are comparing are fairly small, so this will take little time.

    # run the script that does the blasting, and then quit R
    source( "script_blastall.R" )
    q()

  7. Restart R in a new directory

    IMPORTANT: you have to quit R (see above) before changing directory; otherwise the next part will not work.

    # move up one directory and start R again
    cd ..
    R

  8. Prepare gene families
    # run the script that creates gene families
    source( "script_panMatrix.R" )
  9. Create pan gene family tree plot
    # create tree
    source( "script_panTree.R" )
    You should now have a plot on your screen. 

  10. Save plot 

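    # save as postscript file (put a file name of your choice between the quotes)
    dev.copy(device = postscript, file = "")
    # close the postscript device so that the file is written completely
    dev.off()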

  11. Quit R

    # quit R
    q()
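
A note on step 3: the two commands gunzip and tar xvf can usually be combined into a single tar call. The sketch below shows this on a small stand-in archive, so that it can be tried without touching the course data. The name demo_pantree and its contents are hypothetical, and the z option is assumed to be supported by the installed tar (it is in GNU and BSD tar).

```shell
# Build a tiny stand-in archive in a temporary directory (hypothetical data).
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p demo_pantree/buchnera/blast
echo "placeholder" > demo_pantree/buchnera/blast/readme.txt
tar czf demo_pantree.tar.gz demo_pantree
rm -r demo_pantree

# Unpack in one step: x = extract, z = decompress with gzip, f = read from this file.
tar xzf demo_pantree.tar.gz
ls demo_pantree/buchnera/blast
```

Unlike gunzip followed by tar xvf, this leaves the compressed .tar.gz file in place. Add v (tar xzvf) to list the files as they are extracted.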