Today, we will see how to plot 3D histograms in R using the hist3D function from the plot3D package. As an example, I will use the built-in iris dataset, counting how many flowers share each combination of petal width and sepal width.
Load the plot3D package
library(plot3D)
Load the iris dataset
data("iris")
Count observations for each variable combination
x = "Petal.Width"
y = "Sepal.Width"
contingency = as.data.frame(table(iris[, c(x, y)]), stringsAsFactors = FALSE)
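To make the shape of this table concrete, here is a minimal sketch on a made-up data frame (the names toy, a and b are hypothetical and not part of the tutorial). It suggests that table() returns a zero count for a combination of observed levels that never occurs together, but says nothing about values that are absent from the data altogether.

```r
# Hypothetical toy data: three observations of two measured variables
toy = data.frame(a = c(0.1, 0.1, 0.2), b = c(2, 3, 3))

# Same call pattern as above: one row per combination of observed levels
long = as.data.frame(table(toy[, c("a", "b")]), stringsAsFactors = FALSE)
print(long)

# The levels come back as character strings, not numbers
str(long)
```

The combination a = 0.2, b = 2 appears with Freq = 0, while a value such as a = 0.15 cannot appear at all, which is why missing levels have to be added by hand further down.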
Reshape data from long to wide
change_format = function(x_level, contingency, x, y_levels){
line = t(contingency[contingency[, x] == x_level, "Freq"])
rownames(line) = x_level
colnames(line) = y_levels
return(line)
}
x_levels = unique(contingency[, x])
y_levels = unique(contingency[, y])
mat = do.call("rbind", lapply(x_levels, change_format, contingency, x, y_levels))
Add the missing variable combinations
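# Convert the levels to numbers first: min() and max() on character strings compare alphabetically
# round() keeps floating-point noise out of the labels
x_num = as.numeric(x_levels)
y_num = as.numeric(y_levels)
x_levels_all = as.character(round(seq(min(x_num), max(x_num), by = 0.1), 1))
y_levels_all = as.character(round(seq(min(y_num), max(y_num), by = 0.1), 1))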
mat = as.data.frame(mat)
mat[, y_levels_all[!y_levels_all %in% y_levels]] = rep(0, nrow(mat))
mat[x_levels_all[!x_levels_all %in% x_levels],] = rep(0, ncol(mat))
mat = as.matrix(mat[x_levels_all, y_levels_all])
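As an alternative to reshaping and padding by hand, the same kind of count matrix can probably be built in one step by giving table() factors with explicit levels. This is a sketch of a different technique, not the method used above; the names x_grid, y_grid and mat_alt are introduced here only for illustration.

```r
data("iris")
x = "Petal.Width"
y = "Sepal.Width"

# Full 0.1-spaced grids; round() keeps floating-point noise out of the labels
x_grid = as.character(round(seq(min(iris[, x]), max(iris[, x]), by = 0.1), 1))
y_grid = as.character(round(seq(min(iris[, y]), max(iris[, y]), by = 0.1), 1))

# Factors with explicit levels make table() report zero counts for unobserved values
counts = table(factor(iris[, x], levels = x_grid),
               factor(iris[, y], levels = y_grid))

# Plain numeric matrix with the grid values as row and column names
mat_alt = matrix(as.vector(counts), nrow = length(x_grid),
                 dimnames = list(x_grid, y_grid))
dim(mat_alt)
sum(mat_alt)
```

The result has one row per petal width and one column per sepal width on the grid, so it can be passed to hist3D in the same way as mat.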
3D histogram with default parameters
hist3D(x = as.numeric(rownames(mat)), y = as.numeric(colnames(mat)), z = mat,
xlab = x, ylab = y, zlab = "Number of observations")
3D histogram with customized parameters
Now, I will customize the following parameters:
- ticktype = "detailed" to draw labeled axis ticks (default: ticktype = "simple")
- bty = "b2" to add grid lines (default: bty = "b")
- theta = 120 to rotate the azimuthal viewing direction (default: theta = 40)
- phi = 10 to lower the colatitude of the viewing direction (default: phi = 40)
- d = 2 to lessen the perspective effect (default: d = 1)
- space = 0.5 to add space between bars (default: space = 0)
- lighting = TRUE to illuminate the facets (default: lighting = FALSE)
hist3D(x = as.numeric(rownames(mat)), y = as.numeric(colnames(mat)), z = mat,
xlab = x, ylab = y, zlab = "Number of observations",
ticktype = "detailed", bty = "b2", theta = 120, phi = 10,
d = 2, space = 0.5, lighting = TRUE)
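To get a feel for what theta and phi do, it can help to draw the same bars from two viewpoints side by side. The sketch below uses a small made-up matrix (z is hypothetical toy data, not the iris counts) and assumes the plot3D package is installed.

```r
library(plot3D)

# Hypothetical toy counts, only for illustrating the viewing parameters
z = matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)

# Two panels: the default viewpoint on the left, the customized one on the right
old_par = par(mfrow = c(1, 2))
hist3D(z = z, theta = 40, phi = 40, main = "theta = 40, phi = 40")
hist3D(z = z, theta = 120, phi = 10, main = "theta = 120, phi = 10")
par(old_par)
```

Changing one parameter at a time in this way is a quick way to settle on a viewpoint before applying it to the real data.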
3D histogram with customized colors
Finally, I will customize the 3D histogram coloring. Regarding the color palette, I will use the cm.colors palette (instead of the jet.col one which is used by default in hist3D):
barplot(rep(1, 3), col = cm.colors(3), main = "cm.colors")
Moreover, I will prepare a coloring matrix which has the same dimensions as the one containing the observations. I will define three categories in this matrix:
- the observations greater than 0 but less than or equal to 1 (that is, exactly 1, since these are counts) will be colored with the first color
- the observations equal to 0 will be colored with the second color
- the observations greater than 1 will be colored with the third color
colors = ifelse(mat > 0, 1, 2)
colors = ifelse(mat > 1, 3, colors)
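The two ifelse calls can be checked on a small example. The sketch below applies the same logic to a made-up matrix (toy and toy_colors are hypothetical names used only here) holding one zero, one count of 1 and two larger counts.

```r
# Hypothetical toy counts: 0, 1, 2 and 5, filled column by column
toy = matrix(c(0, 1, 2, 5), nrow = 2)

# First pass: positive counts get category 1, zeros get category 2
toy_colors = ifelse(toy > 0, 1, 2)
# Second pass: counts above 1 are moved to category 3
toy_colors = ifelse(toy > 1, 3, toy_colors)

print(toy_colors)
#      [,1] [,2]
# [1,]    2    3
# [2,]    1    3
```

Because ifelse keeps the dimensions of its test argument, the result lines up cell by cell with the count matrix.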
Finally, I will pass the palette and the coloring matrix to hist3D and remove the color legend, by changing the following parameters:
- col = cm.colors(3) to change the color palette (default: col = NULL)
- colvar = colors to use other coloring values than the observations (default: colvar = z)
- colkey = FALSE to remove the color legend (default: colkey = NULL)
hist3D(x = as.numeric(rownames(mat)), y = as.numeric(colnames(mat)), z = mat,
xlab = x, ylab = y, zlab = "Number of observations",
ticktype = "detailed", bty = "b2", theta = 120, phi = 10,
d = 2, space = 0.5, lighting = TRUE,
col = cm.colors(3), colvar = colors, colkey = FALSE)
Conclusion
3D histograms are controversial, but if you really need to make one, the hist3D function from the plot3D R package is really useful! Have you ever made 3D histograms?