Exercise 2: Data Integration
STEP 1: Start Cytoscape and load the network “galFiltered.sif” that you have downloaded from here. It can also be found in the sampleData directory of your Cytoscape installation. Your network will contain a combination of protein-protein (pp) and protein-DNA (pd) interactions.
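For orientation, a SIF file is plain text in which each line names a source node, an interaction type, and one or more target nodes. The following is a minimal Python sketch, not part of Cytoscape, that counts interactions by type. The sample lines and node names are made up for illustration and are not copied from galFiltered.sif, and splitting on any whitespace is a simplification.

```python
from collections import Counter


def parse_sif(text):
    """Return a list of (source, interaction, target) edges from SIF text."""
    edges = []
    for line in text.splitlines():
        fields = line.split()  # simplification: split on any whitespace
        if len(fields) < 3:
            continue  # blank line, or an isolated node with no edges
        source, interaction, targets = fields[0], fields[1], fields[2:]
        for target in targets:
            edges.append((source, interaction, target))
    return edges


# Illustrative sample only (hypothetical node names).
sample = """nodeA pp nodeB
nodeA pd nodeC nodeD
nodeE
"""

edges = parse_sif(sample)
counts = Counter(interaction for _, interaction, _ in edges)
print("pp:", counts["pp"], "pd:", counts["pd"])
```

Running the same function over the text of your own copy of the file gives a quick check of the pp/pd mix before loading.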
STEP 2: Import the expression data table: File -> Import -> Attribute/Expression Matrix…, and select the “galExpData.pvals” file that you have also downloaded from here or found in your sampleData directory. This file contains gene expression measurements for three knock-out perturbation experiments. In each experiment, expression was measured in a strain with a different transcription factor knocked out. The “XXXexp” attributes are the average log-ratios, while the “XXXsig” attributes are the corresponding p-values for the replicated expression study.
After a brief load, a status window will appear, indicating how many experimental conditions were found (three) and what type of significance values were included.
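To make the “XXXexp” / “XXXsig” pairing concrete, here is a small Python sketch that splits a table of this kind into log-ratios and p-values. The layout is an assumption (whitespace-delimited, a header row, gene ID and common name in the first two columns), the sample row is invented, and the names “gal1RGsig” and “gal4RGsig” are inferred from the “XXXsig” naming pattern rather than taken from the file.

```python
def parse_expression_table(text):
    """Split each row into log-ratio ("exp") and p-value ("sig") attributes."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    table = {}
    for line in lines[1:]:
        fields = line.split()
        gene = fields[0]
        # Assumed layout: columns 0 and 1 are identifiers, the rest are numbers.
        values = dict(zip(header[2:], map(float, fields[2:])))
        table[gene] = {
            "exp": {k: v for k, v in values.items() if k.endswith("exp")},
            "sig": {k: v for k, v in values.items() if k.endswith("sig")},
        }
    return table


# Invented sample row; not real data from galExpData.pvals.
sample = """GENE COMMON gal1RGexp gal4RGexp gal80Rexp gal1RGsig gal4RGsig gal80Rsig
geneX NAMEX -0.1 0.2 1.5 0.5 0.3 0.0001
"""

table = parse_expression_table(sample)
print(len(table["geneX"]["exp"]), "conditions found")
```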
STEP 3: Now we will use the ‘Data Panel’ to browse through the expression data (node attributes), as follows.
i. Select some node in the Cytoscape canvas.
ii. In the Data Panel, click the Select Attributes button (top left table icon of the panel), and select the attributes ‘gal1RGexp’, ‘gal4RGexp’, and ‘gal80Rexp’.
It is
common to use expression data in Cytoscape to set the visual attributes of the
nodes in a network. The steps for doing this are as follows:
STEP 4: To set visual properties: select the “VizMapper”
in the Control Panel.

STEP 5: In the VizMapper manager window, click the button to create a new visual style (see figure) and name it something like “Gal80”. This duplicates the default style.
STEP 6: Set the “Node Color” attribute as follows:
i. In the pull-down next to “Node Color”, select “gal80Rexp” (see figure).
ii. Under the associated “Mapping Type”, select “Continuous Mapping”.
iii. Click on the “Graph View” field to bring up the mapping editor (see figure).
iv. Add 3 break points and move them to -1, 0 and 1 respectively. You can do this with the help of the “Range Setting”.
v. By double-clicking on the range handles (small triangles), set the colors (you should only need 2 colors), then close the window.
vi. Note that the default node color of pink may fall within this spectrum. A useful trick is to choose a color outside this spectrum to distinguish nodes with no expression value defined. Under Defaults, click anywhere on the image to open the default editor. Then set the “NODE_FILL_COLOR” default to grey and click “Apply”.
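The idea behind a continuous color mapping can be sketched in a few lines of Python: values between two break points get a blend of the colors set at those points, and nodes with no value fall back to the default color. This is an illustration only, not Cytoscape's implementation. The green-white-red palette is an assumption, and so is the choice to give values beyond the outer break points the nearest end color.

```python
GREY = (128, 128, 128)  # default color for nodes with no expression value


def lerp_color(c0, c1, t):
    """Linearly blend two RGB tuples; t=0 gives c0, t=1 gives c1."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))


def node_color(value, points):
    """Map a value to a color given sorted (break_point, rgb) pairs."""
    if value is None:
        return GREY
    if value <= points[0][0]:
        return points[0][1]  # assumed: clamp to the lowest color
    if value >= points[-1][0]:
        return points[-1][1]  # assumed: clamp to the highest color
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if x0 <= value <= x1:
            return lerp_color(c0, c1, (value - x0) / (x1 - x0))


# Assumed palette: green (repressed) -> white (unchanged) -> red (induced).
points = [(-1.0, (0, 255, 0)), (0.0, (255, 255, 255)), (1.0, (255, 0, 0))]

for v in (-2.0, 0.0, 0.5, None):
    print(v, node_color(v, points))
```

Because grey lies outside the green-white-red range, nodes without a gal80Rexp value stay visually distinct from nodes with a log-ratio near zero.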

Here, we will use expression values and p-values together in setting visual properties.
STEP 7: Expression log-ratios range from about -3 to +3 in this study (log10, so 1/1000X to 1000X fold-change), and the p-values range from 0 to 1, as they should. Select some nodes and look at their expression values and p-values in the Data Panel. You can sort the list (ascending or descending) by clicking on a column heading.
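The conversion between a log10 ratio and a fold-change is simply a power of ten, as this short Python check shows (the function name is illustrative):

```python
def fold_change(log10_ratio):
    """Convert a log10 expression ratio to a linear fold-change."""
    return 10 ** log10_ratio


for ratio in (-3, 0, 3):
    print(ratio, "->", fold_change(ratio))
```

A log-ratio of 1 therefore already means a ten-fold induction, which is worth keeping in mind when reading the break points at -1, 0 and 1.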

STEP 8: Now, we will explore setting node sizes according to p-values.
i. Double-click the Node Size tab in the VizMapper setting window.
ii. In the Map Attribute pull-down menu, select “gal80Rsig”.
iii. In the pull-down menu under Mapping, select Continuous Mapping.
iv. Click anywhere on the “Graphic view” row to bring up the mapping editor. The y-axis represents the node size, while the x-axis is the range of the attribute being mapped (0-1 in this case).
v. First click “Add” twice to create 2 break points. Double-click on the lower bound handle (solid red square) and set its size to 55. Set the upper bound size to 20. Slide the lower break point to 0.001 using the black triangle or by using the “Range Setting”. Set the upper break point to 0.1. Set the lower break point size to 50 by double-clicking on the open square. Set the upper break point size to 20. You should see something like the following figure.

Close the mapping editor dialog.
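The settings above amount to a piecewise-linear function from p-value to node size. A minimal Python sketch, using the numbers from this step, is given below. It assumes that the bound handles apply to values outside the two break points and that sizes are interpolated linearly between them, so treat it as an approximation of what the editor does.

```python
def node_size(p_value):
    """Piecewise-linear size mapping sketched from the Step 8 settings."""
    lo_x, hi_x = 0.001, 0.1          # break point positions (p-values)
    lo_size, hi_size = 50.0, 20.0    # sizes set at the two break points
    below, above = 55.0, 20.0        # bound sizes, assumed to apply outside
    if p_value < lo_x:
        return below
    if p_value > hi_x:
        return above
    t = (p_value - lo_x) / (hi_x - lo_x)
    return lo_size + (hi_size - lo_size) * t


for p in (0.0001, 0.001, 0.05, 0.1, 0.5):
    print(p, node_size(p))
```

The effect is that highly significant genes (small p-values) are drawn large, and everything with a p-value above 0.1 shrinks to the minimum size.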
This section presents one scenario for how expression data can be combined with network data to tell a biological story. But first we need to load more readable gene names.
STEP 9: Load the “ORF2name.na” file from the data files table here: choose File -> Import -> Node Attributes…, browse to where you saved the file (your Desktop, perhaps), and then click Open.
STEP 10: In the VizMapper, find the Node Label attribute and set it to
“GeneName” using a "Passthrough Mapper" (this is the attribute you just loaded).
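A passthrough mapping simply copies the attribute value onto the visual property. The Python sketch below illustrates this with a tiny attribute file. The file layout (attribute name on the first line, then “id = value” lines) is an assumption about the .na format, the sample entries are invented, and the empty-label fallback for nodes without a value is also an assumption.

```python
def parse_node_attributes(text):
    """Parse 'id = value' lines; the first line is taken as the attribute name."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    name = lines[0].split()[0]
    values = {}
    for line in lines[1:]:
        node_id, _, value = line.partition("=")
        values[node_id.strip()] = value.strip()
    return name, values


def passthrough_label(node_id, attribute):
    """Passthrough: the label is the attribute value itself, unchanged."""
    return attribute.get(node_id, "")  # assumed fallback: empty label


# Invented sample; not the contents of ORF2name.na.
sample = """GeneName
orf1 = NAME1
orf2 = NAME2
"""

name, values = parse_node_attributes(sample)
print(name, passthrough_label("orf1", values))
```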
STEP 11: Now select the neighborhood of GAL4 and create a new sub-network.
i. In the Control Panel, select the Filter and create a new filter (“NodeName” for example). Select the Attribute “node.GeneName” and Add the filter. Type GAL4 in the new text box and then click Apply.
ii. To focus the view on the selected node, click the “Zoom Selected Region” button in the menu bar. Then zoom out with the ‘-’ magnifying glass. You should see something like the following figure.

iii. While the GAL4 node is selected: Select -> Nodes -> ‘First neighbors of selected nodes’
iv. Create a child sub-network: File -> New -> Network -> ‘From selected nodes, all edges’
v. In the new sub-network, apply a graph layout algorithm using the yFiles Hierarchic layout.
vi. Use the VizMapper to change the Edge Color attribute with a Discrete Mapping on the “interaction” attribute. This will distinguish regulatory interactions, “pd”, from protein-protein interactions, “pp”.

Now set the Edge Target Arrow Shape in the VizMapper with a Discrete Mapping on the “interaction” attribute again.
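What the “first neighbors” selection and the “From selected nodes, all edges” sub-network do can be sketched in Python as follows. The node names and edges are hypothetical, and the colors and arrow shapes in EDGE_STYLE are arbitrary choices standing in for whatever you pick in the discrete mappings.

```python
def first_neighbors(edges, selected):
    """Return the selected nodes plus every node sharing an edge with them."""
    result = set(selected)
    for source, _, target in edges:
        if source in selected:
            result.add(target)
        if target in selected:
            result.add(source)
    return result


def induced_subnetwork(edges, nodes):
    """Keep every edge whose two endpoints are both in `nodes`."""
    return [e for e in edges if e[0] in nodes and e[2] in nodes]


# Assumed discrete mapping: interaction type -> (edge color, target arrow).
EDGE_STYLE = {"pp": ("blue", "none"), "pd": ("red", "arrow")}

# Hypothetical toy network as (source, interaction, target) triples.
edges = [
    ("hub", "pd", "t1"),
    ("hub", "pp", "t2"),
    ("t1", "pp", "t2"),
    ("t2", "pp", "far"),
    ("far", "pp", "other"),
]

selected = first_neighbors(edges, {"hub"})
sub = induced_subnetwork(edges, selected)
for source, interaction, target in sub:
    print(source, target, EDGE_STYLE[interaction])
```

Note that the sub-network keeps the edge between the two neighbors (t1 and t2), not only the edges that touch the originally selected node.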

Notice that all three dark red nodes (highly induced genes)
are in the same region of the graph. With a little exploration in the node
attribute browser, you should see the following:
i. The two genes that interact with all three highly induced genes are GAL11 (a general transcription cofactor with many interactions) and GAL4.
ii. Both GAL4 and GAL11 show small changes in expression, and neither change is statistically significant, suggesting that the critical change affecting GAL1, GAL7, and GAL10 might be somewhere else in the network.
iii. GAL4 interacts with GAL80, which shows a significantly lower level (GAL80 was deleted, after all).
Q6: If GAL80 levels are low (or absent) but most of the other genes linked to GAL4 show significant levels of induction, what does this say about the role of Gal80p? Is Gal80p activating or inhibiting the activity of Gal4p?
Exercise 3: Active Modules
A method
for finding “active modules” in interaction networks using gene expression data
was published in Ideker T, et al. “Discovering
regulatory and signaling circuits in molecular interaction networks.”
Bioinformatics. 2002. jActiveModules
is the plugin implementation of this module search and scoring method. It
requires that node p-value attributes
have been imported (as we have done in the previous exercise).
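The core of the scoring idea in the cited paper can be sketched in Python: each gene's p-value is converted to a z-score with the inverse normal CDF, and the z-scores of the k genes in a candidate module are summed and divided by the square root of k. This is a simplified illustration, not the plugin's code. It leaves out the calibration against randomly sampled gene sets and the search procedure, and the clamping constant eps is an assumption added to keep the z-score finite at p = 0 or 1.

```python
from math import sqrt
from statistics import NormalDist


def z_score(p_value, eps=1e-12):
    """Convert a p-value to a z-score via the inverse normal CDF."""
    p = min(max(p_value, eps), 1 - eps)  # avoid infinite z at p = 0 or 1
    return NormalDist().inv_cdf(1 - p)


def aggregate_score(p_values):
    """Aggregate z-score of a candidate module with k genes."""
    zs = [z_score(p) for p in p_values]
    return sum(zs) / sqrt(len(zs))


# Hypothetical p-values for two candidate modules of four genes each.
significant = [0.001, 0.002, 0.01, 0.0005]
unremarkable = [0.4, 0.6, 0.2, 0.8]
print(aggregate_score(significant) > aggregate_score(unremarkable))
```

A module made of genes with small p-values scores high, while a module of genes with p-values scattered around 0.5 scores near zero.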
STEP 1: Set the parameters that jActiveModules will use to score modules. Choose Plugins -> jActiveModules to create the tab in the Control Panel. Select all three available p-value attributes under Expression Attributes For Analysis (this can be tricky; try ctrl-click starting from the bottom).
STEP 2: Make sure the main galFiltered network is selected (this can be done in the Network tab of the Control Panel), then run the search algorithm by clicking Find Modules in the jActiveModules panel. A results window should appear when the search is finished.

STEP 3: Select a module result by
selecting a network row in the jActiveModules results window. This will select
the corresponding nodes in the larger graph.
STEP 4: Select the second ranking module
(with 14 nodes) and create a new sub-network. File -> New -> Network -> From selected nodes, all edges
STEP 5: Layout this sub-network with the
method of your choice.

Q7: What can we guess about the
activity of Rap1p in the gal80 deletion data?