Below, you see two protein sequences. They are both globins from a midge, Chironomus thummi thummi, a very small and annoying insect.
>GLB7_CHITH 145 P02226 GLOBIN CTT-VIIA.
APLSADQASLVKSTWAQVRNSEVEILAAVFTAYPDIQARFPQFAGKDVAS
IKDTGAFATHAGRIVGFVSEIIALIGNESNAPAVQTLVGQLAASHKARGI
SQAQFNEFRAGLVSYVSSNVAWNAAAESAWTAGLDNIFGLLFAAL
>GLBP_CHITH 152 P11582 GLOBIN CTT-E/E' PRECURSOR.
MKFIILALCVAAASALSGDQIGLVQSTYGKVKGDSVGILYAVFKADPTIQ
AAFPQFVGKDLDAIKGGAEFSTHAGRIVGFLGGVIDDLPNIGKHVDALVA
THKPRGVTHAQFNNFRAAFIAYLKGHVDYTAAVEAAWGATFDAFFGAVFA
KM

These sequences are in the FASTA format, a very widely used input format for bioinformatics programs: a line beginning with a ">" contains the name of a sequence plus optional comments, while the following lines, up to the next ">", contain the sequence itself.
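The FASTA layout described above is simple enough to read with a few lines of code. The following is a minimal sketch, not a full-featured parser: the function name parse_fasta is made up for this illustration, and it assumes that the second word of each header line is the declared sequence length, as it is in the two entries on this page.

```python
def parse_fasta(text):
    """Split FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence line: drop any spaces used to group residues.
            chunks.append("".join(line.split()))
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


EXAMPLE = """>GLB7_CHITH 145 P02226 GLOBIN CTT-VIIA.
APLSADQASLVKSTWAQVRNSEVEILAAVFTAYPDIQARFPQFAGKDVAS
IKDTGAFATHAGRIVGFVSEIIALIGNESNAPAVQTLVGQLAASHKARGI
SQAQFNEFRAGLVSYVSSNVAWNAAAESAWTAGLDNIFGLLFAAL
>GLBP_CHITH 152 P11582 GLOBIN CTT-E/E' PRECURSOR.
MKFIILALCVAAASALSGDQIGLVQSTYGKVKGDSVGILYAVFKADPTIQ
AAFPQFVGKDLDAIKGGAEFSTHAGRIVGFLGGVIDDLPNIGKHVDALVA
THKPRGVTHAQFNNFRAAFIAYLKGHVDYTAAVEAAWGATFDAFFGAVFA
KM
"""

records = parse_fasta(EXAMPLE)
for header, seq in records:
    name, declared = header.split()[:2]
    # Compare the length stated in the header with the residues actually read.
    print(name, "declared:", declared, "parsed:", len(seq))
```

Checking the parsed length against the declared one is a quick way to confirm that nothing was lost when copying and pasting a sequence.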
Do a global alignment of these two protein sequences, using the ALIGN service at the GENESTREAM network server, IGH, Montpellier, France. Hint: you can copy the sequences and sequence names from this page and paste them into the input windows at the French site.
Take a look at the result. Note that there is a gap in GLBP_CHITH. What is the corresponding sequence in GLB7_CHITH? This is an authentic example (nature truly is fascinating...)! If you don't believe me, retrieve the original database entries for GLBP_CHITH and GLB7_CHITH from the SWISS-PROT database.
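To see what a global alignment program does internally, here is a minimal sketch of the classic Needleman-Wunsch dynamic programming method. It is an illustration only and is not claimed to be how the ALIGN service works: the function name global_align and the scoring values (match +1, mismatch -1, a linear gap cost of -2 per position) are assumptions chosen for simplicity, whereas real servers typically score residue pairs with a substitution matrix.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap cost.

    Returns (score, aligned_a, aligned_b); '-' marks a gap position.
    """
    def sub(x, y):
        return match if x == y else mismatch

    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),  # pair residues
                score[i - 1][j] + gap,                          # gap in b
                score[i][j - 1] + gap,                          # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and score[i][j] == score[i - 1][j - 1] + sub(a[i - 1], b[j - 1])):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


total, top, bottom = global_align("ACGT", "AGT")
print(total)
print(top)
print(bottom)
```

Because a global alignment must cover both sequences from end to end, any length difference has to show up as gap characters somewhere in the output.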
Now try a local alignment of the same two sequences, using the LALIGN service instead. Compare the output with that of ALIGN. You will get the ten best-scoring local alignments, sorted by decreasing similarity score. Instead of LALIGN, you can also try the SIM alignment tool for protein sequences, or the ACNUC/LFasta tool for nucleic acid sequences.
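The difference between local and global alignment can be made concrete with a minimal sketch of the Smith-Waterman method. Again, this is an illustration under stated assumptions rather than a description of LALIGN or SIM: the function name local_align and the scoring values (match +2, mismatch -1, linear gap cost -2) are invented for the example, and the sketch reports only the single best local alignment, not the ten best.

```python
def local_align(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment with a linear gap cost.

    Returns (score, aligned_a, aligned_b) for one best-scoring local alignment.
    """
    def sub(x, y):
        return match if x == y else mismatch

    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Unlike the global case, a cell may restart at 0, so a poorly
            # matching prefix never drags the score down.
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        if score[i][j] == score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


best_score, seg_a, seg_b = local_align("XXXGATTACAYYY", "ZZGATTACAWW")
print(best_score)
print(seg_a)
print(seg_b)
```

Only the well-matching region is reported; the unrelated flanking residues are left out, which is exactly what distinguishes the LALIGN output from that of ALIGN.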
Repeat the above procedure, using ALIGN and LALIGN (or SIM or ACNUC/LFasta) on the following pairs of protein and nucleotide sequences. Below, sequences in FASTA format are included as links to local files.
At the SIM and ACNUC/LFasta servers, you can set the gap penalties, and at SIM you can also choose among a number of PAM and BLOSUM substitution matrices. Try repeating some of the local alignments you have just made while varying these alignment parameters. Observe how the alignment length and % identity of the local alignments depend on the matrix entropy. Tip: to avoid drowning in output, you can set the number of alignments to be computed to 1.
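When comparing runs with different parameters, it helps to compute the alignment length and % identity in the same way each time. The following is a small sketch for doing so from two gapped alignment rows pasted out of a server's output. The function name alignment_stats is hypothetical, and the definition used here (identical pairs divided by the number of alignment columns, gaps included) is an assumption: programs differ in which denominator they use, so the value may not match a server's own figure exactly.

```python
def alignment_stats(aligned_a, aligned_b, gap_char="-"):
    """Summarise a pairwise alignment given as two equal-length gapped rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have the same length")
    length = len(aligned_a)
    # A column counts as identical only if both residues match and neither is a gap.
    identities = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap_char
    )
    # A column counts as gapped if either row has a gap character there.
    gaps = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == gap_char or y == gap_char
    )
    percent_identity = 100.0 * identities / length if length else 0.0
    return {
        "length": length,
        "identities": identities,
        "gaps": gaps,
        "percent_identity": percent_identity,
    }


stats = alignment_stats("ACGT", "A-GT")
print(stats)
```

Tabulating these numbers for each matrix and gap penalty setting makes the trend in alignment length versus % identity easier to spot.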