Written by: Rasmus
When comparing DNA and protein sequences from different species it is
important to keep in mind that all living organisms at some point in
time have shared a common ancestor. Some organisms are closely related
and have recently diverged from a common ancestor (e.g. Human and
Chimpanzee, which diverged 5-10 million years ago) and some are more
distantly related (e.g. Human and mouse, which diverged 100-150 million
years ago).
The more closely related two organisms are, the less sequence divergence
will be found (say, when comparing the Alpha Globin gene from each of
the organisms), and the more likely it will be that similar-looking
genes from each organism still have the same function (MUCH more about
this when we get to pairwise alignment and BLAST searches).
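To make "sequence divergence" a bit more concrete, here is a minimal Python sketch that counts the fraction of differing positions between two already aligned sequences. The three sequences are made-up toy strings, not real Alpha Globin data, and the function name proportion_different is just chosen for this illustration.

```python
def proportion_different(seq_a, seq_b):
    """Fraction of aligned positions where the two sequences differ.

    Assumes the sequences are already aligned (same length).
    Positions with a gap ("-") in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


# Made-up toy sequences (NOT real gene data), 20 positions each.
reference = "ATGGTGCTGTCTCCTGCCGA"
close_relative = "ATGGTGCTGTCTCCTGCTGA"    # 1 position changed
distant_relative = "ATGGTACTCTCACCAGCTGA"  # 5 positions changed

print(proportion_different(reference, close_relative))    # 0.05
print(proportion_different(reference, distant_relative))  # 0.25
```

The closer relative shows the smaller value, which is the pattern described above. Real comparisons need a proper alignment first, which is covered in the pairwise alignment exercises.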
[Figure: taxonomy of the Great Apes (Human, Chimp, Gorilla, Orangutan)]
Phylogeny vs. taxonomy
As we discussed in the lecture, all life is organised in a hierarchical taxonomical system, which
approximates the "true" underlying phylogeny to a large degree. It's
therefore often important to know where a specific organism is placed
in the taxonomical system - this type of information will also always
be included along with DNA/Protein sequences from the big databases
such as GenBank and UniProt.
Today we will explore various ways to look up and compare taxonomy.
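As an illustration of how taxonomy travels along with a sequence record, the sketch below pulls the species name, the lineage and the TaxID out of a short text excerpt laid out like a GenBank flat file. The excerpt is hand-typed for this example (it is not a downloaded record, and the location "1..100" is a placeholder), and parse_taxonomy is a hypothetical helper name.

```python
import re

# Hand-typed excerpt in the style of a GenBank flat file (illustration only).
RECORD_EXCERPT = '''\
SOURCE      Felis catus (domestic cat)
  ORGANISM  Felis catus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Laurasiatheria; Carnivora; Feliformia;
            Felidae; Felinae; Felis.
FEATURES             Location/Qualifiers
     source          1..100
                     /organism="Felis catus"
                     /db_xref="taxon:9685"
'''


def parse_taxonomy(record_text):
    """Return (species, lineage list, TaxID) from a flat-file style excerpt."""
    lines = record_text.splitlines()
    # The ORGANISM line holds the species name.
    start = next(i for i, line in enumerate(lines)
                 if line.startswith("  ORGANISM"))
    species = lines[start].split(None, 1)[1].strip()
    # The lineage follows on continuation lines indented by 12 spaces.
    lineage_lines = []
    for line in lines[start + 1:]:
        if not line.startswith(" " * 12):
            break
        lineage_lines.append(line.strip())
    joined = " ".join(lineage_lines).rstrip(".")
    lineage = [group.strip() for group in joined.split(";")]
    # The TaxID is stored as a cross-reference on the source feature.
    match = re.search(r'/db_xref="taxon:(\d+)"', record_text)
    tax_id = int(match.group(1)) if match else None
    return species, lineage, tax_id


species, lineage, tax_id = parse_taxonomy(RECORD_EXCERPT)
print(species)
print(" > ".join(lineage))
print(tax_id)
```

For real work you would normally use an existing parser rather than hand-written string handling; the point here is only to show where the taxonomy sits in a record.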
A word about Wikipedia
The free online encyclopedia Wikipedia (and other similar resources) is
a GREAT way to start out when
you need to look up information about a new topic - in this case
taxonomy. Almost all species entries in Wikipedia have a "Scientific classification" box
which includes taxonomical information (for example see the entry on Orangutan or Fucus vesiculosus
(Bladder wrack / Blæretang)).
HOWEVER: Keep in mind that Wikipedia is NOT a reliable source of
information, even if most entries are of a very good quality. The facts
in the Wikipedia entries have not been verified by taxonomy experts and
can potentially be wrong (anybody can go in and edit the text). You need
to look up the taxonomy in an official database (in this case we'll be
using NCBI Taxonomy) before you can state it as a fact.
You CANNOT quote Wikipedia as the only source of your information -
you'll need to find the original primary source of the information or
look it up in an official database.
Let me repeat that: "You CANNOT quote
Wikipedia as the only source of your information".
The "Tree of Life" browser.
For the first part of the exercise we will be using the Tree of Life
("ToL") website to explore the
taxonomy of various organisms. It's easy to step up and down in the
evolutionary tree, and browse for interesting topics. Often a fair bit
of in-depth information is provided at the various higher levels of
taxonomy (for example "Mammals" or "Primates") - not only at the
species level. ToL is also well illustrated with a lot of pictures.
1) Open the Tree of Life website in a new browser window/tab: http://www.tolweb.org
2) Spend a few minutes investigating the general layout of the website
and getting a feeling for what kinds of information are available.
Notice that specific branches of the overall Tree of Life can be
investigated by clicking directly on the tree on the main page.
3) Top-Down task: Investigate
the taxonomical position of the domestic cat (Felis silvestris) by starting at
the root of the tree and progressively going deeper into the
sub-branches. (Danish note: "mammal" = "pattedyr"; "placental mammals"
are the mammals that are not marsupials ("pungdyr")).
While you go from branch to branch, scroll down the webpage to notice
what kinds of information are provided (for example about evolution).
- Do you encounter any extinct animal groups along your route to the
cat?
- How many species are listed within the genus Felis?
4) Let's consider the following situation: you know that the scientific
name of the domestic pig
is "Sus scrofa" and you want
to find out where it is placed in the taxonomical system.
- Search for "Sus scrofa" in the search box at the top of the main
webpage, and locate the page entitled "Sus scrofa".
- Now start walking "backwards" in the tree by clicking "Containing group".
- QUESTION 2a: What is the
name of the first higher-level group? Does this make sense considering
the scientific naming scheme (the binomial names in Latin)?
- NOTICE: You can click the small tree icon in the upper left side
to activate the "quick navigation"
menu. By "mousing over" the icon, a tool-tip about the function will be
shown.
- QUESTION 2b: Continue
navigating "backwards" until you encounter the first taxonomical
group that includes animals that are clearly not pigs. What is the name
of this group?
- QUESTION 2c: Navigate all
the way back to Eutheria
(Placental mammals). Which (surprising?) group is the "sister group" to
the one containing the pigs? (A sister group is the neighboring group
in the tree - the most closely related group).
5) Lastly we'll go dinosaur hunting in ToL. The task is to locate the
famous Tyrannosaurus rex, and
during the search we'll encounter an animal group that may be a bit of
a surprise.
- Search for "Dinosauria".
- QUESTION 3a: There will
be three sub-groups within Dinosauria.
Are they all extinct?
- Continue "up" the tree (high level -> low level) until you
reach Tyrannosauridae (there
is a lot of information about what defines a "Tyrant dinosaur" on this
page). During the walk through the tree notice what kinds of animals are
included at the various levels of taxonomy - especially notice which
groups are extinct and which are not.
- QUESTION 3b: Based on your
observations: are all dinosaurs extinct?
- QUESTION 3c: Is the Blackbird
a dinosaur - in the taxonomical sense?
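The "Containing group" walk and the idea of a sister group used in the tasks above can be sketched with a small lookup table in Python. The table below is a simplified, hand-typed version of the Great Apes (Pan = chimpanzees, Pongo = orangutans); it is not copied from ToL, some intermediate groups are left out, and the function names are made up for this illustration.

```python
# Simplified, hand-typed table: group -> the group that contains it.
# Some intermediate groups are deliberately left out.
PARENT = {
    "Homo": "Hominini",
    "Pan": "Hominini",
    "Hominini": "Homininae",
    "Gorilla": "Homininae",
    "Homininae": "Hominidae",
    "Pongo": "Hominidae",
}


def containing_groups(group):
    """Walk 'backwards' in the tree: list every group containing `group`."""
    path = []
    while group in PARENT:
        group = PARENT[group]
        path.append(group)
    return path


def sister_groups(group):
    """Return the other groups that share the same containing group."""
    parent = PARENT.get(group)
    if parent is None:
        return []  # the root of this small table has no sister group
    return sorted(child for child, p in PARENT.items()
                  if p == parent and child != group)


print(containing_groups("Homo"))  # ['Hominini', 'Homininae', 'Hominidae']
print(sister_groups("Homo"))      # ['Pan']
print(sister_groups("Hominini"))  # ['Gorilla']
```

Each click on "Containing group" corresponds to one step of the loop in containing_groups, and the sister group is simply the neighbour found under the same containing group.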
NCBI Taxonomy Database
In the final part of the exercise we will be using the NCBI Taxonomy
database. NCBI Tax is a more "dry" and technical database, which
contains accurate (standardized!)
hierarchical taxonomy of around 180,000 organisms. The NCBI Tax database
also provides the numerical enumeration of species (and other
taxonomical levels) that is used and referenced in most sequence
databases, such as GenBank (DNA) and UniProt (Protein). For example
human (Homo sapiens) has the
ID "9606" and yeast (Saccharomyces cerevisiae) has the ID
NCBI Tax is not a database you would browse for fun (as you might with
ToL). It's good for looking up definitions, and for comparing the
taxonomical position of multiple organisms (since the information is so
standardized).
1) Open the NCBI Taxonomy webpage in a new browser window/tab: http://www.ncbi.nlm.nih.gov/Taxonomy/
2) Notice: a number of "popular" model organisms are linked directly
from the main page.
3) Choose "Homo sapiens".
- Don't Panic: An enormous
amount of information is shown - for example about genome sequences. In
this case we only need to look at the taxonomical information presented
at the top of the page.
- Notice the Taxonomy ID - 9606 as mentioned above.
- "Lineage": Here the
entire hierarchical taxonomy is presented as a densely written list. Each
taxonomical group on this list can be clicked and further investigated:
first an overview page also containing 3 levels of sub-groups will be
shown - click the name again to get to the page dedicated to that entry.
Notice that a large number of taxonomical groups are listed (including
many "in-between" levels such as "Craniata" (sub-phylum) and "Gnathostomata" (super-class)). Most
of these groups are simply left out in ToL for brevity.
You can switch between showing ALL subgroups and a more condensed list
by clicking "Lineage" - it will switch between "full" and "abbreviated".
IMPORTANT: You can investigate the taxonomical rank of any group without
leaving the page by "mousing over" the text.
- QUESTION 4: What is the
TaxID of "Metazoa"?
- Notice: It is in principle possible to browse the taxonomy by
navigating deeper into the tree one step at a time (like with ToL) -
however, finding Human from the "Primates" level would require a lot of
detailed knowledge about taxonomy and Latin names.
Click "Primates" from the Human entry and try to find out which group
contains humans - you are on the right track once you get to the "Old
4) Comparing taxonomy using NCBI Tax.
Besides being the official database behind the
TaxIDs used in GenBank (and other databases), NCBI Tax actually makes
it easy to compare taxonomy.
Let's take the situation where you have read an interesting paper
comparing a DNA sequence between the following four organisms: Homo sapiens (Human), Mus musculus (Mouse), Danio rerio (Zebrafish) and Drosophila melanogaster
(Fruit fly), but you have no idea about the relationship between the
four organisms. We can look this up in NCBI Tax:
- Open two browser windows/tabs (http://www.ncbi.nlm.nih.gov/Taxonomy/)
and search for Homo sapiens
and Mus musculus.
- By comparing the "lineage"
text it will be easy to find out at which taxonomical level human and
mouse diverge.
- QUESTION 5: Turn on "abbreviated" lineage information
and find the lowest-ranking common group for human and mouse - what is the
name and what is the rank?
- In order to get more information than just a Latin name and a
taxonomy rank, you can try to look up the group in a different
database, such as ToL (NCBI will not reveal more than "placentals" if
you investigate it further).
- NOTICE: Since a "user friendly" database such as Tree of Life
doesn't contain as many taxonomical groups, it may be
necessary to pick a group with higher rank if the first one is not found.
- Open a few more browser windows/tabs and find the information for
Zebrafish and Fruit fly as well.
Remember to turn on "abbreviated" lineage information for easy
comparison.
- QUESTION 6: Which ranked group connects Human and
Zebrafish (ignore "no rank" groups)? Which rank? (You can look up this
group in ToL to find out more information).
- QUESTION 7: Which ranked group connects Human and
Fruit fly (ignore "no rank" groups)? Which rank? (As before, further
information can be gained from ToL).
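The side-by-side lineage comparison used in the questions above can also be written as a few lines of Python: walk down both lineages from the root and stop at the first group that differs. The lineages below are hand-typed and heavily shortened, and the rank labels are illustrative assumptions that may not match what NCBI Tax currently displays. Other animals than the four in the exercise are used on purpose, so the answers are not given away. The function names are made up for this sketch.

```python
# Hand-typed, shortened lineages as (group name, rank) pairs, root first.
# Rank labels are illustrative and may differ from the NCBI Tax pages.
MAMMAL_PREFIX = [
    ("Eukaryota", "superkingdom"),
    ("Metazoa", "kingdom"),
    ("Chordata", "phylum"),
    ("Mammalia", "class"),
    ("Theria", "no rank"),
]
CAT = MAMMAL_PREFIX + [
    ("Eutheria", "no rank"), ("Carnivora", "order"),
    ("Felidae", "family"), ("Felis", "genus"),
]
DOG = MAMMAL_PREFIX + [
    ("Eutheria", "no rank"), ("Carnivora", "order"),
    ("Canidae", "family"), ("Canis", "genus"),
]
KANGAROOS = MAMMAL_PREFIX + [
    ("Metatheria", "no rank"), ("Diprotodontia", "order"),
    ("Macropodidae", "family"),
]


def shared_groups(lineage_a, lineage_b):
    """Groups the two lineages have in common, counted from the root."""
    shared = []
    for group_a, group_b in zip(lineage_a, lineage_b):
        if group_a != group_b:
            break
        shared.append(group_a)
    return shared


def lowest_common_group(lineage_a, lineage_b, ignore_unranked=False):
    """Deepest shared group; optionally skip groups marked 'no rank'."""
    shared = shared_groups(lineage_a, lineage_b)
    if ignore_unranked:
        shared = [g for g in shared if g[1] != "no rank"]
    return shared[-1] if shared else None


print(lowest_common_group(CAT, DOG))
# ('Carnivora', 'order')
print(lowest_common_group(CAT, KANGAROOS))
# ('Theria', 'no rank')
print(lowest_common_group(CAT, KANGAROOS, ignore_unranked=True))
# ('Mammalia', 'class')
```

The ignore_unranked switch mirrors the instruction to ignore "no rank" groups in the questions: the deepest shared group may be unranked, in which case the answer is the nearest ranked group above it.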