The Joy of R Base Graphics
Maybe this makes me an R hipster, but I've recently rediscovered the joy of base graphics. I still use the awesome ggplot2 for EDA, but for handcrafted, artisanal production quality graphics, it's hard to beat the expressive power of base graphics. However, it is tedious and slow to create and tinker with plots. Even worse, plots that look terrific in the quartz graphics device are often too small and not proportioned correctly in other graphics devices.
Most of my base graphics (and lattice) knowledge was lost in the great ggplot years of 2010-2016. As I recollect base tricks, I'm jotting them down here.
All adjustable par() settings should be stored before tinkering with par():
opar <- par(no.readonly=TRUE)
# ... your plot here ...
par(opar)
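When the plot code ends up wrapped in a function, one option is to let on.exit() do the restoring, so the settings come back even if the plot code errors halfway through. A minimal sketch; draw_grid is a made-up name for illustration:

```r
# Hypothetical helper (not from the original notes): draws a 2x2 grid of
# empty boxes and restores the caller's par() settings on exit.
draw_grid <- function() {
  opar <- par(no.readonly=TRUE)
  on.exit(par(opar))   # runs when the function returns or fails
  par(mfrow=c(2, 2), mar=c(0, 0, 0, 0))
  for (i in 1:4) {
    plot.new()
    box()
  }
  invisible(NULL)
}

before <- par("mar")   # margins before the call
draw_grid()
after <- par("mar")    # should match `before` again
```

The before/after comparison is only there to show that the margins set inside the function do not leak out.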
mfrow: use when you need to divide a plot into equally-sized columns and rows:
par(mfrow=c(nrows, ncols))
par(mfrow=c(2, 2))
# plot 1
plot.new()
box()
# plot 2
plot.new()
box()
# plot 3
plot.new()
box()
# plot 4
plot.new()
box()
The margins look off. We can kill all margins:
par(mfrow=c(2, 2), mar=c(0, 0, 0, 0))
for (i in 1:4) {
plot.new()
box()
}
Each margin specified in mar is per plot. For y- and x- labels, we need more space:
par(mfrow=c(2, 2), mar=c(2, 2, 0, 0))
for (i in 1:4) {
plot.new()
box()
}
Or if we care about the margins of the entire plot:
par(mfrow=c(2, 2), oma=c(1, 1, 1, 1), mar=c(0, 0, 0, 0))
for (i in 1:4) {
plot.new()
box()
}
It's hard to see the margin on the white background, but it's there.
par(mfrow=c(2, 2), oma=c(1, 1, 1, 1), mar=c(3, 3, 0, 0))
for (i in 1:4) {
plot.new()
axis(1)
axis(2)
}
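Since the margins are hard to see on a white background, one way to make them visible while tinkering is box() with its which argument, which can outline the figure, inner, and outer regions. A rough sketch, with arbitrary colors:

```r
opar <- par(no.readonly=TRUE)
par(mfrow=c(2, 2), oma=c(1, 1, 1, 1), mar=c(3, 3, 0, 0))
for (i in 1:4) {
  plot.new()
  axis(1)
  axis(2)
  # edge of this panel's figure region (plot region plus its mar)
  box(which="figure", col="blue", lty=2)
}
# boundary between the block of panels and the outer margin (oma)
box(which="inner", col="darkgreen")
# edge of the whole device, outside the outer margin
box(which="outer", col="red")
oma_used <- par("oma")   # recorded only so the setting can be inspected later
par(opar)
```

The gap between the green and red boxes is the oma; the gap between each dashed blue box and the axes is the mar.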
line is which margin line (counted outward from the plot edge, within the space set by e.g. mar) to put the mtext label on:
plot.new()
axis(1)
axis(2)
# line 0
mtext("y-axis", side=2, col='darkgreen', line=0)
mtext("x-axis", side=1, col='darkgreen', line=0)
# line 1
mtext("y-axis", side=2, col='darkblue', line=1)
mtext("x-axis", side=1, col='darkblue', line=1)
# line 2
mtext("y-axis", side=2, col='purple', line=2)
mtext("x-axis", side=1, col='purple', line=2)
# line 3
mtext("y-axis", side=2, col='orange', line=3)
mtext("x-axis", side=1, col='orange', line=3)
# line -1 (negative lines fall inside the plot region)
mtext("y-axis", side=2, col='red', line=-1)
mtext("x-axis", side=1, col='red', line=-1)
par(mfrow=c(2, 2), oma=c(1, 1, 1, 1), mar=c(3, 3, 0, 0))
for (i in 1:4) {
plot.new()
axis(1)
mtext(sprintf("plot %d x-axis", i), side=1, line=2)
axis(2)
mtext(sprintf("plot %d y-axis", i), side=2, line=2)
}
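If the panels share axes, a single label for the whole figure can go in the outer margin instead, using mtext() with outer=TRUE. A small sketch, assuming an oma of three lines on the labelled sides is enough room:

```r
opar <- par(no.readonly=TRUE)
# leave three outer-margin lines on the bottom and left for the shared labels
par(mfrow=c(2, 2), oma=c(3, 3, 1, 1), mar=c(2, 2, 0, 0))
for (i in 1:4) {
  plot.new()
  axis(1)
  axis(2)
}
# outer=TRUE places the text in the outer margin, centered on the whole figure
mtext("shared x-axis", side=1, line=1, outer=TRUE)
mtext("shared y-axis", side=2, line=1, outer=TRUE)
oma_used <- par("oma")   # recorded only so the setting can be inspected later
par(opar)
```

Here line counts lines of the outer margin rather than of mar.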
Many journals require subfigure labels like A, B, etc. Here's how I do it:
plot(rnorm(100), rnorm(100), bty='n')
mtext("A", # the subfigure label
font=2, # make the label bold
side=2, # above the y-axis (side 2)
las=1, # rotate text to be upright
at=par("yaxp")[2]*1.10, # place the text 10% higher than the last tick
line=-1.8, # inset the label
cex=1.4, # make the font size larger
col='gray30') # black is too harsh visually.
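For more than one subfigure, the same mtext() call can be wrapped in a small function and fed letters from LETTERS. This is only a sketch: add_panel_label is a made-up name, and the placement values are simply the ones from the call above.

```r
# Hypothetical helper wrapping the mtext() settings used above.
add_panel_label <- function(label) {
  mtext(label,
        font=2,                   # bold
        side=2,                   # above the y-axis (side 2)
        las=1,                    # upright text
        at=par("yaxp")[2]*1.10,   # 10% higher than the last tick
        line=-1.8,                # inset the label
        cex=1.4,
        col='gray30')
}

opar <- par(no.readonly=TRUE)
par(mfrow=c(1, 2))
labels <- LETTERS[1:2]   # "A", "B"
for (lab in labels) {
  plot(rnorm(100), rnorm(100), bty='n')
  add_panel_label(lab)
}
par(opar)
```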
When drafting plots, I use capital variables to adjust size and other parameters. Finalized plots usually get wrapped in functions.
For example, CEX <- 1 is good for looking at plots in quartz, but for postscript you usually need to make everything much larger. This can be done by setting everything to CEX <- 6. All of your plot functions should use arguments like cex=CEX. I find this easier than using par(), since if I do wrap the plot code in a function, it's easier to change CEX to an argument.
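setEPS()
HEIGHT <- 24
CEX <- 2
LWD <- 2*CEX
# assumption: phi is the golden ratio; it is not defined in the original snippet
phi <- (1 + sqrt(5)) / 2
RATIO <- phi^2
# graphdir() is not defined here; presumably a helper that returns the output file path
postscript(graphdir("fig-name"), width=RATIO*HEIGHT, height=HEIGHT)
# plot code
dev.off()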
If there's a cleaner way, let me know.
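As a sketch of the "capital variable becomes a function argument" step: draw_scatter below is a hypothetical plot function, and the output goes to a temporary file rather than a real figure directory. The sizes are made up for the example.

```r
# Hypothetical finalized plot function: CEX and LWD have become arguments.
draw_scatter <- function(cex=1, lwd=2*cex) {
  plot(rnorm(100), rnorm(100), bty='n',
       cex=cex, lwd=lwd, cex.axis=cex, cex.lab=cex)
}

HEIGHT <- 6    # inches; arbitrary values for this sketch
RATIO <- 1.6
outfile <- tempfile(fileext=".eps")   # stand-in for a real output path

setEPS()
postscript(outfile, width=RATIO*HEIGHT, height=HEIGHT)
draw_scatter(cex=2)   # larger text and symbols for the postscript device
dev.off()
```

On screen, the same function can be called as draw_scatter(cex=1) without touching par().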
Recently I've been building a bunch of panel plots (e.g. lattice plots, or "small multiples" in Tufte's terms). Often I develop a sketch with ggplot2's facet_grid, but because I'm OCD, I like using base R graphics for final plots. Previously I had used par(mfrow=c(nrows, ncols)) to set up multiple plots, but lately I've realized this is a bad way of doing it. The reason is that most panel plots need components positioned relative to the total plot, not the individual sub-plots. Figuring out where these go usually takes tedious adjustments to push text() or legend() elements into the outer margins. Lattice is a cool package and the framework for customizing it using trellis settings is cool, but you can't easily do simple stuff like have panel row labels ("strips" in lattice lexicon) on the outside right side rather than left (stupidly, useOuterStrips() from latticeExtra only adds outer strips to the left, gah!!). It should be easier: we should just say center with respect to total plot width, and that's that.
The way I build up panel plots now is like so. Say I want a 3 x 2 panel plot with a legend and y and x labels. I first build the main plotting area:
> ncol <- 3; nrow <- 2
> npans <- ncol*nrow
> lmat <- matrix(1:npans, ncol=ncol, nrow=nrow, byrow=TRUE)
> lmat
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
This specifies the positions of each panel, for panels 1, 2, 3, ... 6. Now imagine two elements of a matrix have the same number, e.g.
1 2 3 10
4 5 6 10
7 8 9 10
Here the column of 10s is one whole plot. Its x/y dimensions can be retrieved with par('usr'), and then the global row center across all other panels can easily be found for a legend. Let's work up to a real example of adding a legend, starting with just the panels (recopying all previous code for a self-contained example):
ncol <- 3; nrow <- 2
npans <- ncol*nrow
lmat <- matrix(1:npans, ncol=ncol, nrow=nrow, byrow=TRUE)
opar <- par(no.readonly=TRUE)
par(oma=rep(1, 4), mar=rep(2, 4))
layout(lmat)
n <- 30 # num random data points
for (i in 1:npans) {
# random data
x <- rnorm(n)
y <- rnorm(1, 0, 10) * x + rnorm(n, 0, 3)
plot(x, y, xlab='', ylab='')
}
par(opar)
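Before plotting into a layout, it can help to check that the matrix does what you expect. layout.show() outlines and numbers each region. A small sketch using the 3 x 3 panels plus a column of 10s described above (the 0.4 width is just an example value):

```r
# nine panels plus one tall region (number 10) spanning all rows on the right
lmat <- cbind(matrix(1:9, nrow=3, byrow=TRUE), 10)

opar <- par(no.readonly=TRUE)
# layout() returns the number of figures in the layout
nfig <- layout(lmat, widths=c(1, 1, 1, 0.4))
layout.show(nfig)   # draw the outline and number of every region
par(opar)
```

Region 10 shows up as a single tall box because the same number appears in every row of the last column.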
Next, we add a column to the right to make room for a legend. The legend has less width than the panels, so now we specify layout's widths and heights:
ncol <- 3; nrow <- 2
npans <- ncol*nrow
lmat <- matrix(1:npans, ncol=ncol, nrow=nrow, byrow=TRUE)
# now, add a new plot region for the legend
lmat <- cbind(lmat, npans + 1)
# we need to adjust the width and heights.
# the height vector stays same, as we aren't adding new row.
lw <- rep(1, ncol)
lh <- rep(1, nrow)
# we added a column, so we need a new width
lw <- c(lw, 0.4)
opar <- par(no.readonly=TRUE)
par(oma=rep(1, 4), mar=rep(2, 4))
layout(lmat, widths=lw, heights=lh)
n <- 30 # num random data points
for (i in 1:npans) {
# random data
x <- rnorm(n)
y <- rnorm(1, 0, 10) * x + rnorm(n, 0, 3)
plot(x, y, xlab='', ylab='')
}
plot.new()
legend('center', legend=c('some', 'stuff'), fill=c('red', 'green'))
par(opar)
Here's the final product:
Finally, we add an x-axis label. This is a new row in the layout matrix. But if we want its width to span only the panels, and not include the legend, we create a dummy panel with nothing in it:
ncol <- 3; nrow <- 2
npans <- ncol*nrow
lmat <- matrix(1:npans, ncol=ncol, nrow=nrow, byrow=TRUE)
# now, add a new plot region for the legend
lmat <- cbind(lmat, npans + 1)
# we need to adjust the width and heights.
# the height vector stays same, as we aren't adding new row.
lw <- rep(1, ncol)
lh <- rep(1, nrow)
# we added a column, so we need a new width
lw <- c(lw, 0.4)
# now, add a x-label
lmat <- rbind(lmat, npans + 2)
lh <- c(lh, 0.08)
# now, we want the plot centered within the panels, NOT including the legend.
# so we create a dummy panel 9 in the last element
lmat[row(lmat) == 3 & col(lmat) == 4] <- 9
opar <- par(no.readonly=TRUE)
par(oma=rep(2, 4), mar=rep(1, 4))
layout(lmat, widths=lw, heights=lh)
n <- 30 # num random data points
for (i in 1:npans) {
# random data
x <- rnorm(n)
y <- rnorm(1, 0, 10) * x + rnorm(n, 0, 3)
plot(x, y, xlab='', ylab='')
}
par(mar=rep(0, 4))
plot.new()
legend('center', legend=c('some', 'stuff'), fill=c('red', 'green'))
plot.new()
coords <- par('usr')
text(mean(coords[1:2]), coords[3], 'x axis label', xpd=NA)
plot.new() ## dummy panel
par(opar)
Let's look at that layout matrix:
> lmat
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    7
[2,]    4    5    6    7
[3,]    8    8    8    9
Now, use a similar approach for the y axis.
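Here is one way the y-axis step could look, as a sketch rather than the definitive version: a narrow column is added on the left for the label, with a second dummy cell below it so the label is centered on the panel rows only. The 0.08 width and the region numbering are choices made for this example.

```r
ncol <- 3; nrow <- 2
npans <- ncol*nrow
lmat <- matrix(1:npans, ncol=ncol, nrow=nrow, byrow=TRUE)
lmat <- cbind(lmat, npans + 1)                           # legend column (7)
lmat <- rbind(lmat, c(rep(npans + 2, ncol), npans + 3))  # x label (8) + dummy (9)
lmat <- cbind(c(rep(npans + 4, nrow), npans + 5), lmat)  # y label (10) + dummy (11)
lw <- c(0.08, rep(1, ncol), 0.4)
lh <- c(rep(1, nrow), 0.08)

opar <- par(no.readonly=TRUE)
par(oma=rep(2, 4), mar=rep(1, 4))
layout(lmat, widths=lw, heights=lh)
n <- 30 # num random data points
for (i in 1:npans) {
  x <- rnorm(n)
  y <- rnorm(1, 0, 10) * x + rnorm(n, 0, 3)
  plot(x, y, xlab='', ylab='')
}
par(mar=rep(0, 4))
plot.new()                                   # region 7: legend
legend('center', legend=c('some', 'stuff'), fill=c('red', 'green'))
plot.new()                                   # region 8: x label
coords <- par('usr')
text(mean(coords[1:2]), coords[3], 'x axis label', xpd=NA)
plot.new()                                   # region 9: dummy panel
plot.new()                                   # region 10: y label
coords <- par('usr')
text(coords[1], mean(coords[3:4]), 'y axis label', srt=90, xpd=NA)
plot.new()                                   # region 11: dummy panel
par(opar)
```

The plots are filled in numeric order, so the calls after the loop have to follow the numbering in lmat.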
mtext() is convenient, but its behavior is odd. It doesn't obey srt to rotate text, meaning you cannot rotate panel plot labels on the right side to read top to bottom (ridiculous!!). text() is better behaved.
Usually I use par('usr') to get the plot coordinates, and create a centered distance off this:
# dims, col_lab, row_lab, panel_col, panel_row and title_col are defined elsewhere
for (i in seq_len(prod(dims))) {
  # ... plot the ith panel here ...
  corners <- par("usr") # get plot coordinates
  # iterating over the ith panel, these are the panels in the first row:
  if (i <= dims[2]) {
    # plot panel col label, centered (mean of x min/max) and 4% past the max y value
    text(x=mean(corners[1:2]), y=corners[4]*(1+0.04),
         latex2exp::TeX(paste0(col_lab, prettyNum(panel_col))),
         font=2, cex=1.6, xpd=NA, col=title_col)
  }
  # and this is the last column:
  if (i %% dims[2] == 0) {
    # plot panel row label, centered (mean of y min/max) and at the max x value (this looks right to me)
    text(x = corners[2], y = mean(corners[3:4]),
         latex2exp::TeX(paste0(row_lab, prettyNum(panel_row))),
         srt = 270, font=2, cex=1.6, xpd=NA, col=title_col)
  }
}
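The fragment above depends on variables from my own plotting code, so here is a self-contained sketch of the same idea with made-up values: dims is assumed to be c(rows, columns), the labels are plain strings instead of TeX, and the offsets are added as a fraction of the axis range rather than multiplied, which is a swap I made so it still works when the axis maximum is near zero or negative.

```r
dims <- c(2, 3)              # assumed meaning: c(rows, columns)
col_labs <- character(0)     # labels drawn, kept only for inspection
row_labs <- character(0)

opar <- par(no.readonly=TRUE)
par(mfrow=dims, oma=c(1, 1, 2, 3), mar=c(2, 2, 1, 1))
for (i in seq_len(prod(dims))) {
  plot(rnorm(30), rnorm(30), xlab='', ylab='')
  corners <- par("usr")      # c(xmin, xmax, ymin, ymax) of this panel
  # panels in the first row get a column label above them
  if (i <= dims[2]) {
    lab <- sprintf("col %d", i)
    text(x=mean(corners[1:2]),
         y=corners[4] + 0.06*diff(corners[3:4]),
         lab, font=2, xpd=NA)
    col_labs <- c(col_labs, lab)
  }
  # panels in the last column get a row label on the right, reading top to bottom
  if (i %% dims[2] == 0) {
    lab <- sprintf("row %d", i %/% dims[2])
    text(x=corners[2] + 0.06*diff(corners[1:2]),
         y=mean(corners[3:4]),
         lab, srt=270, font=2, xpd=NA)
    row_labs <- c(row_labs, lab)
  }
}
par(opar)
```

xpd=NA is what lets text() draw outside the plot region, into the margins.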