squash is an add-on package for the R statistical environment. This package provides functions for color-based visualization of multivariate data, i.e. colorgrams or heatmaps. Lower-level functions are provided to map numeric values to colors, display a matrix as an array of colors, and draw color keys. Higher-level plotting functions are provided to generate a bivariate histogram, a dendrogram aligned with a color-coded matrix, a triangular distance matrix, and more.

The current version is 1.0.1 (2011-08-15).

As with many R packages, squash can be obtained from CRAN, or downloaded and installed automatically by entering the following at the R prompt:

  install.packages('squash')

Previous versions are here.

Please send questions or comments about squash to Aron.


Aug 15, 2011: squash is now available from CRAN. I have fixed numerous bugs, added new functions, and enhanced functionality of existing functions. For anyone already using squash, I should point out that I have made some major interface changes, and I have renamed some functions for clarity, or to avoid conflicts. Sorry for any inconvenience; hopefully this will be the last time I do this.



The bivariate histogram

"hist2" is a useful alternative to a scatter plot when the number of points is large, because overlapping points are summarized as counts per bin.

  x <- rnorm(10000) 
  y <- rnorm(10000) + x
  hist2(x, y)
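
To make the idea concrete, here is a minimal base-R sketch of what a bivariate histogram computes: each point is assigned to an x bin and a y bin, and the points in each cell are counted. It is only an illustration and not how "hist2" is implemented; the bin count (nbins) and the heat.colors palette are arbitrary choices made for this example.

```r
# Base-R sketch of bivariate binning (an illustration, not squash's code).
set.seed(1)
x <- rnorm(10000)
y <- rnorm(10000) + x

nbins <- 25  # arbitrary bin count, chosen for this example
xbreaks <- seq(min(x), max(x), length.out = nbins + 1)
ybreaks <- seq(min(y), max(y), length.out = nbins + 1)

# Assign each point to an x bin and a y bin, then count points per cell.
xbin <- cut(x, xbreaks, include.lowest = TRUE)
ybin <- cut(y, ybreaks, include.lowest = TRUE)
counts <- table(xbin, ybin)

# Draw the counts as colors; empty cells are set to NA so they stay blank.
z <- unclass(counts)
z[z == 0] <- NA
image(xbreaks, ybreaks, z, col = heat.colors(12), xlab = 'x', ylab = 'y')
```

Every point lands in exactly one cell, so the counts sum to the number of points.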

Generation and application of color maps

Here, 3-dimensional (x, y, z) points are plotted with x and y in the graph plane and z indicated by color.

"makecmap" defines a mapping from numbers to colors.

"jet" is a color palette.

"cmap" does the conversion from numbers to colors, using the previously defined mapping.

"hkey" draws a horizontal color key.

  map <- makecmap(iris$Petal.Length, colFn = jet)
  plot(iris[,1:2], pch = 16, 
    col = cmap(iris$Petal.Length, map = map),
    main = 'Iris data')
  hkey(map, 'Petal length')
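
The three steps above (define a mapping, pick a palette, convert numbers to colors) can be sketched in base R as follows. This is an assumption-laden illustration of the concept, not the code behind "makecmap" or "cmap": the break points come from pretty(), and the three-color ramp is a stand-in for a palette such as "jet".

```r
# Base-R sketch of a number-to-color mapping (an illustration, not squash's code).
values <- iris$Petal.Length

# 1. Choose break points that cover the range of the data.
breaks <- pretty(values, n = 10)

# 2. Generate one color per interval between consecutive breaks.
palette.fn <- colorRampPalette(c('blue', 'yellow', 'red'))  # stand-in palette
colors <- palette.fn(length(breaks) - 1)

# 3. Find the interval each value falls into and take that interval's color.
idx <- cut(values, breaks, labels = FALSE, include.lowest = TRUE)
point.col <- colors[idx]

plot(iris[, 1:2], pch = 16, col = point.col, main = 'Iris data')
```

Keeping the breaks and colors together as one object is what lets the same mapping be reused for both the points and the color key.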

The squashgram, for visual exploration of 3-dimensional data

Given a large number of 3-dimensional points (x, y, z), how does z vary as a function of x and y?

The "squashgram" is similar to a 2-dimensional histogram, except that the color indicates a summary (in this case, the mean) of the z values of all points falling into the bin.

  # quakes is R's built-in data set of earthquakes off Fiji
  with(quakes, squashgram(depth ~ long + lat, FUN = mean,
    main = 'Earthquakes off Fiji'))
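
The per-bin summary can be sketched in base R with cut() and tapply(). This assumes the data come from R's built-in quakes data set and uses an arbitrary 10 x 10 grid; it illustrates the computation only and is not the code "squashgram" runs.

```r
# Base-R sketch of a binned summary (an illustration, not squash's code).
nbins <- 10  # arbitrary grid size, chosen for this example
xbin <- cut(quakes$long, nbins)
ybin <- cut(quakes$lat, nbins)

# Mean depth of the earthquakes in each (long, lat) cell; empty cells are NA.
zmean <- tapply(quakes$depth, list(xbin, ybin), mean)

# Number of earthquakes in each cell.
n <- table(xbin, ybin)
```

Any function that reduces a vector to a single number (median, max, sd) could be used in place of mean.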

Same as above, with the number of observations indicated by rectangle size

A larger rectangle indicates that more points fall into that bin, and thus greater confidence in the summary value shown.

  # quakes is R's built-in data set of earthquakes off Fiji
  with(quakes, squashgram(depth ~ long + lat, FUN = mean,
    main = 'Earthquakes off Fiji', shrink = 5))

Display a numeric matrix using colors

"colorgram" is similar to the built-in R function "image" but offers several additional features: 1. An optional color key is added. 2. A color can be specified for missing values, and for values outside the range of the color scale. 3. The size of each grid rectangle can be specified to convey additional information.

"blueorange" is a color palette.

  x <- y <- seq(-10, 10, length.out = 29)
  f <- function(x, y) { r <- sqrt(x^2 + y^2); 10 * sin(r) / (r + 1) }
  z <- outer(x, y, f)
  map <- makecmap(z, colFn = blueorange, n = 20, symm = TRUE)
  colorgram(x, y, z, map = map)
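
The example above asks for a symmetric color map (symm = TRUE), which suits data that take both signs. One way to get that effect in base R is to make the break points symmetric about zero, as sketched below. This is an illustration under my own assumptions, not the logic inside "makecmap"; the blue-white-orange ramp is a stand-in for "blueorange".

```r
# Base-R sketch of a color scale symmetric about zero (not squash's code).
x <- y <- seq(-10, 10, length.out = 29)
z <- outer(x, y, function(x, y) { r <- sqrt(x^2 + y^2); 10 * sin(r) / (r + 1) })

n <- 20                    # number of color intervals
limit <- max(abs(z))       # largest magnitude in the data
breaks <- seq(-limit, limit, length.out = n + 1)

# Diverging stand-in palette: the middle of the ramp corresponds to zero.
colors <- colorRampPalette(c('blue', 'white', 'orange'))(n)

image(x, y, z, breaks = breaks, col = colors)
```

With symmetric breaks, values of equal magnitude and opposite sign sit equally far from the middle of the palette.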

Display a matrix of RGB colors

"cimage" is similar to "colorgram", except that there is no number-to-color mapping. Instead, we pass the function a matrix of colors, here built from RGB values.

  red <- green <- 0:255
  rg <- outer(red, green, rgb, blue = 1, maxColorValue = 255)
  cimage(red, green, zcol = rg)
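
It may help to look at what the color matrix contains before plotting it. The short base-R check below rebuilds the same matrix and inspects a few entries; each element is a hexadecimal color string, with rows varying in red and columns in green.

```r
# Rebuild the color matrix from the example and inspect it (base R only).
red <- green <- 0:255
rg <- outer(red, green, rgb, blue = 1, maxColorValue = 255)

dim(rg)       # 256 256
rg[1, 1]      # "#000001"  (red = 0,   green = 0)
rg[256, 1]    # "#FF0001"  (red = 255, green = 0)
rg[1, 256]    # "#00FF01"  (red = 0,   green = 255)
```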

Display a dendrogram with a color matrix underneath

The colors indicate characteristics of each item being clustered.

  us.dend <- hclust(dist(scale(state.x77)))
  income <- state.x77[, 'Income']
  frost <- state.x77[, 'Frost']
  murder <- state.x77[, 'Murder']
  ## generate color maps
  income.cmap <- makecmap(income, n = 5, colFn = colorRampPalette(c('black', 'green')))
  frost.cmap <- makecmap(frost, n = 5, colFn = colorRampPalette(c('black', 'blue')))
  murder.cmap <- makecmap(murder, n = 5, colFn = colorRampPalette(c('black', 'red')))
  us.mat <- data.frame(Frost = cmap(frost, frost.cmap),
                       Murder = cmap(murder, murder.cmap),
                       Income = cmap(income, income.cmap))
  par(mar = c(5,4,4,3)+0.1)  # make space for color keys
  dendromat(us.dend, us.mat,
    ylab = 'Distance', main = 'US states')
  vkey(frost.cmap, 'Frost')
  vkey(murder.cmap, 'Murder', y = 0.3)
  vkey(income.cmap, 'Income', y = 0.7)
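
For the colors to line up with the dendrogram, each item's colors have to be drawn in the same left-to-right order as the leaves. The base-R sketch below shows where that order comes from in an hclust object. It illustrates the alignment idea only and is not taken from the "dendromat" source.

```r
# Base-R sketch: leaf order of a clustering (an illustration, not squash's code).
us.dend <- hclust(dist(scale(state.x77)))

# Leaves are drawn left to right in this order (indices into the original rows).
leaf.order <- us.dend$order
head(rownames(state.x77)[leaf.order])

# Per-state values rearranged to match the leaves, e.g. income in leaf order.
income.in.leaf.order <- state.x77[leaf.order, 'Income']
```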

Display a color-coded triangular distance matrix

  distogram(eurodist, title = 'Distance (km)')  

Color palettes

We provide a few functions to generate continuous color palettes. The code below displays them alongside the palettes built into R.

  squash.palettes <- c('rainbow2', 'jet', 'grayscale', 'heat', 'coolheat', 'blueorange', 'bluered', 'darkbluered')
  R.palettes <- c('rainbow', 'heat.colors', 'terrain.colors', 'topo.colors', 'cm.colors')
  plot(0:8, type = 'n', ann = FALSE, axes = FALSE)
  ## draw one color key per built-in R palette
  for (i in 1:5) {
    p <- R.palettes[i]
    hkey(makecmap(c(0, 9), colFn = get(p)), 
      title = p, x = 2, y = i - 1)
  }
  ## draw one color key per squash palette
  for (i in 1:8) {
    p <- squash.palettes[i]
    hkey(makecmap(c(0, 9), colFn = get(p)), 
      title = p, x = 6, y = i - 1)
  }
  text(3, 8, 'R palettes', font = 2)
  text(7, 8, 'squash palettes', font = 2)
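
The dendrogram example earlier passes the result of colorRampPalette as colFn, which suggests that any function taking a number n and returning n colors can serve as a palette. A minimal base-R sketch of such a function follows; the name my.palette and the three anchor colors are made up for this example.

```r
# A palette function maps a count n to a vector of n colors.
# my.palette and its anchor colors are hypothetical, chosen for illustration.
my.palette <- colorRampPalette(c('navy', 'white', 'firebrick'))
cols <- my.palette(9)

# Display the nine colors as a strip using base graphics.
image(matrix(1:9, ncol = 1), col = cols, axes = FALSE, main = 'my.palette(9)')
```
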
Last modified 2011-08-15