squash

squash is an add-on package for the R statistical environment. This package provides functions for color-based visualization of multivariate data, i.e. colorgrams or heatmaps. Lower-level functions are provided to map numeric values to colors, display a matrix as an array of colors, and draw color keys. Higher-level plotting functions are provided to generate a bivariate histogram, a dendrogram aligned with a color-coded matrix, a triangular distance matrix, and more.

The current version is 1.0.6 (2014-08-04).

As with many R packages, squash can be obtained from CRAN, or downloaded and installed automatically by entering the following at the R prompt:

install.packages('squash')

Previous versions are here.

News

July 30, 2015: squash is now on GitHub.

Aug 5, 2014: Version 1.0.6 is now available from CRAN. Bug fixes and minor updates only.

Examples

library(squash)

The bivariate histogram

"hist2" is a useful alternative to a scatter plot, if the number of points is large.

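For intuition about what a bivariate histogram computes, here is a rough sketch in base R only. It is not squash's implementation: the bin count of 20, the seed, and the use of "heat.colors" are assumptions made for illustration.

```r
# Base-R sketch of 2D binning (not squash's implementation).
set.seed(1)                      # arbitrary seed, for reproducibility
x <- rnorm(10000)
y <- rnorm(10000) + x

nbins <- 20                      # assumed bin count, chosen for illustration
xbreaks <- seq(min(x), max(x), length.out = nbins + 1)
ybreaks <- seq(min(y), max(y), length.out = nbins + 1)

# Assign each point to an x-bin and a y-bin, then count points per cell.
xbin <- cut(x, xbreaks, include.lowest = TRUE)
ybin <- cut(y, ybreaks, include.lowest = TRUE)
counts <- table(xbin, ybin)

# Display the counts as colors; empty cells are set to NA so they stay blank.
z <- matrix(as.numeric(counts), nrow = nbins, ncol = nbins)
z[z == 0] <- NA
image(xbreaks, ybreaks, z, col = heat.colors(12), xlab = 'x', ylab = 'y')
```

"hist2" does the binning and the plotting in a single call, as in the example that follows.
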
x <- rnorm(10000)
y <- rnorm(10000) + x
hist2(x, y)

Generation and application of color maps

Here, 3-dimensional (x, y, z) points are plotted with x and y in the graph plane and z indicated by color.

"makecmap" defines a mapping from numbers to colors.

"jet" is a color palette.

"cmap" does the conversion from numbers to colors, using the previously defined mapping.

"hkey" draws a horizontal color key.

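The two steps described above (define a mapping, then apply it) can be sketched in base R alone. This only illustrates the idea and is not how squash implements it; the three-color ramp and the variable names are assumptions.

```r
# Base-R sketch of a number-to-color mapping (not squash's implementation).
values <- iris$Petal.Length

# 1. Define the mapping: break points over the data range, one color per interval.
breaks <- pretty(values, n = 10)
colors <- colorRampPalette(c('blue', 'yellow', 'red'))(length(breaks) - 1)

# 2. Apply the mapping: find each value's interval and look up its color.
idx <- cut(values, breaks, include.lowest = TRUE, labels = FALSE)
point.colors <- colors[idx]

plot(iris[, 1:2], pch = 16, col = point.colors, main = 'Iris data')
```

In squash, "makecmap" plays the role of step 1 and "cmap" the role of step 2, so the same map can be reused for the color key.
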
map <- makecmap(iris$Petal.Length, colFn = jet)
plot(iris[,1:2], pch = 16,
     col = cmap(iris$Petal.Length, map = map),
     main = 'Iris data')
hkey(map, 'Petal length')

The squashgram, for visual exploration of 3-dimensional data

Given a large number of 3-dimensional points (x, y, z), how does z vary as a function of x and y?

The "squashgram" is similar to a 2-dimensional histogram, except that the color indicates a summary (in this case, the mean) of the z values of all points falling into the bin.
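
As a rough illustration of binning with a per-bin summary, the following base-R sketch computes the mean depth in each (long, lat) cell of the "quakes" data. It is not squash's implementation, and the bin count of 10 is an assumption.

```r
# Base-R sketch of a binned summary of z over (x, y); not squash's implementation.
nbins <- 10                                  # assumed bin count, for illustration
xbin <- cut(quakes$long, nbins)
ybin <- cut(quakes$lat, nbins)

# Mean depth of the earthquakes in each (long, lat) cell; empty cells are NA.
mean.depth <- tapply(quakes$depth, list(xbin, ybin), mean)

# Number of earthquakes contributing to each cell.
n.obs <- table(xbin, ybin)
```

Here "mean.depth" holds the values that would be shown as colors, while "n.obs" records how many points support each cell.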

attach(quakes)
squashgram(depth ~ long + lat, FUN = mean,
           main = 'Earthquakes off Fiji')

Same as above, with the number of observations indicated by rectangle size

A larger rectangle indicates that more points fall into that bin, and thus greater confidence in the summary value.

squashgram(depth ~ long + lat, FUN = mean,
           main = 'Earthquakes off Fiji', shrink = 5)

Display a numeric matrix using colors

"colorgram" is similar to the built-in R function "image" but offers several additional features:

1. An optional color key is added.
2. A color can be specified for missing values, and for values outside the range of the color scale.
3. The size of each grid rectangle can be specified to convey additional information.

"blueorange" is a color palette.

x <- y <- seq(-10, 10, length.out = 29)
f <- function(x,y) { r <- sqrt(x^2+y^2); 10 * sin(r)/(r+1) }
z <- outer(x, y, f)
map <- makecmap(z, colFn = blueorange, n = 20, symm = TRUE)
colorgram(x, y, z, map = map)

Display a matrix of RGB colors

"cimage" is similar to "colorgram", except that there is no number-to-color mapping. Instead, we pass the function a matrix of RGB values.

red <- green <- 0:255
rg <- outer(red, green, rgb, blue = 1, maxColorValue = 255)
cimage(red, green, zcol = rg)

Display a dendrogram with a color matrix underneath

The colors indicate characteristics of each item being clustered.

us.dend <- hclust(dist(scale(state.x77)))

income <- state.x77[, 'Income']
frost <- state.x77[, 'Frost']
murder <- state.x77[, 'Murder']

## generate color maps
income.cmap <- makecmap(income, n = 5, colFn = colorRampPalette(c('black', 'green')))
frost.cmap <- makecmap(frost, n = 5, colFn = colorRampPalette(c('black', 'blue')))
murder.cmap <- makecmap(murder, n = 5, colFn = colorRampPalette(c('black', 'red')))

us.mat <- data.frame(Frost = cmap(frost, frost.cmap),
                     Murder = cmap(murder, murder.cmap),
                     Income = cmap(income, income.cmap))

par(mar = c(5,4,4,3)+0.1)  # make space for color keys
dendromat(us.dend, us.mat,
          ylab = 'Distance', main = 'US states')

vkey(frost.cmap, 'Frost')
vkey(murder.cmap, 'Murder', y = 0.3)
vkey(income.cmap, 'Income', y = 0.7)

Display a color-coded triangular distance matrix

distogram(eurodist, title = 'Distance (km)')

Color palettes

We provide a few functions to generate contiguous color palettes.

squash.palettes <- c('rainbow2', 'jet', 'grayscale', 'heat', 'coolheat', 'blueorange', 'bluered', 'darkbluered')
R.palettes <- c('rainbow', 'heat.colors', 'terrain.colors', 'topo.colors', 'cm.colors')

plot(0:8, type = 'n', ann = FALSE, axes = FALSE)
for (i in 1:5) {
  p <- R.palettes[i]
  hkey(makecmap(c(0, 9), colFn = get(p)),
       title = p, x = 2, y = i - 1)
}
for (i in 1:8) {
  p <- squash.palettes[i]
  hkey(makecmap(c(0, 9), colFn = get(p)),
       title = p, x = 6, y = i - 1)
}
text(3, 8, 'R palettes', font = 2)
text(7, 8, 'squash palettes', font = 2) 