gp120 is the key envelope protein of HIV: it facilitates binding to, and fusion with, the target cell (human CD4+ lymphocytes).
Take a look at the HIV sequence data in FASTA format: the file gp120.fasta contains 27 gp120 (envelope protein) sequences from HIV-1, HIV-2 and SIV. As you can see, they do not all have the same length.
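If you would rather check the sequence lengths programmatically than by eye, the following is a minimal sketch of a FASTA reader. The function name read_fasta and the toy records are made up for illustration; to use it on the real data you would open gp120.fasta instead of the in-memory demo text.

```python
from io import StringIO


def read_fasta(handle):
    """Parse FASTA text from a file-like object into {name: sequence}.

    The name is the first whitespace-delimited word after '>'.
    """
    records = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            words = line[1:].split()
            name = words[0] if words else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)  # sequence may span several lines
    return {n: "".join(parts) for n, parts in records.items()}


# Toy input (NOT real gp120 data); replace with open("gp120.fasta") for the exercise.
demo = StringIO(">seqA demo record\nMRVKEKYQ\nHLWRW\n>seqB\nMRVKGIR\n")
records = read_fasta(demo)
lengths = {name: len(seq) for name, seq in records.items()}

for name, n in lengths.items():
    print(name, n)
```

With the real file, len(records) should come out as 27 if the file is as described above.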
Regarding the alignment, it is of interest that gp120 contains 9 conserved disulfide bridges. Also relevant is the so-called V3 loop. This is a surface-exposed, highly immunogenic, antibody-binding and hypervariable (immunological escape) region of gp120, which has been extensively sequenced. The location of the V3 loop in the gp120 molecule can be seen in this schematic visualization.
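Since each disulfide bridge involves two cysteines, one rough sanity check on an alignment is to look for columns where every sequence has a C. The sketch below assumes the aligned sequences are already available as equal-length strings with "-" for gaps; the function name conserved_columns and the toy alignment are hypothetical and not part of the exercise data.

```python
def conserved_columns(aligned, residue="C"):
    """Return 0-based column indices where every aligned sequence has `residue`."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    return [
        i for i in range(length)
        if all(seq[i] == residue for seq in aligned)
    ]


# Toy alignment (NOT real gp120 data); '-' marks a gap.
toy_alignment = [
    "AC-DCK",
    "GCTDCR",
    "ACT-CK",
]
cys_columns = conserved_columns(toy_alignment)
print(cys_columns)
```

A fully conserved cysteine column does not by itself prove that the residue takes part in a bridge, so treat the result as a hint about alignment quality rather than a structural conclusion.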
If you want, you can repeat the alignment with different option settings. You can read about the options in the HELP file.
In this part of the exercise we will use a number of programs from the PHYLIP package, available on a server at the Institut Pasteur, Paris.
First, we use the program protdist to compute a distance matrix from the alignment we made, and then we use the program neighbor to calculate Neighbor-joining and UPGMA trees based on this matrix.
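To make the two steps concrete, here is a simplified sketch of the idea, not of the PHYLIP programs themselves. It assumes a plain p-distance (the fraction of differing residues over ungapped columns) as a stand-in for the model-based distances that protdist computes, and it builds only a UPGMA topology without branch lengths. All names and the toy alignment are made up for illustration.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("sequences share no ungapped columns")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(aligned):
    """aligned: {name: aligned sequence}. Returns {frozenset({a, b}): distance}."""
    return {
        frozenset((a, b)): p_distance(aligned[a], aligned[b])
        for a, b in combinations(aligned, 2)
    }


def upgma_topology(dist, names):
    """Cluster by UPGMA and return the topology as a Newick string (no branch lengths)."""
    sizes = {name: 1 for name in names}  # cluster label -> number of leaves
    d = dict(dist)
    while len(sizes) > 1:
        pair = min(d, key=d.get)         # closest pair of clusters
        a, b = sorted(pair)
        merged = f"({a},{b})"
        na, nb = sizes.pop(a), sizes.pop(b)
        del d[pair]
        for k in sizes:
            # Size-weighted average of the distances to the two merged clusters.
            dak = d.pop(frozenset((a, k)))
            dbk = d.pop(frozenset((b, k)))
            d[frozenset((merged, k))] = (na * dak + nb * dbk) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes)) + ";"


# Toy alignment (NOT real gp120 data).
toy = {"A": "MKVLC", "B": "MKVLS", "C": "MRILS", "D": "QRIFS"}
dm = distance_matrix(toy)
tree = upgma_topology(dm, list(toy))
print(tree)
```

UPGMA implicitly assumes a constant rate of evolution along all lineages, whereas neighbor-joining does not, which is one reason the two trees from the same matrix can differ.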
This exercise can also be carried out using the ClustalW server or the ClustalW and Jalview server at EBI (European Bioinformatics Institute). Jalview is a Java-based multiple alignment editor, which shows your alignment in cool colours and makes it possible to do manual post-processing of the alignment. However, this service is slower than the others, and Jalview itself tends to run painfully slowly on some computers.
PAUP is a widely used package of programs for inferring phylogenies, and exists in versions for most computer platforms.
TreeView is a simple program for displaying phylogenies on Apple Macintosh and Windows PCs.
The BCM Search Launcher has a multiple alignment page offering several programs other than ClustalW. Note: these other algorithms tend to be slower and have stricter size limitations.
For more links concerning multiple alignments, see the Multiple Alignment Resource Page from the GNA-VSNS Biocomputing course.
For more links concerning phylogenetics servers on the WWW, see this list from the PHYLIP WWW site.