Adapting the small multiples principle to fields that are not exactly parallel.
Faceting is useful whenever you want different rows of your dataset to appear in different panels. What if you want to compare different columns, or work with several datasets? A more general alternative is to use concatenation or repetition.
We’re going to illustrate this using vega-lite, but the principles also apply to ggplot2 [1]. Suppose we want to plot several weather variables next to one another. For this, we can use hconcat. The idea here is to construct each plot separately and then combine them only at the very end.
{
  const width = 180,
        height = 130;
  // maximum temperature plot
  const temp_max = vl.markLine()
    .data(weather)
    .encode(
      vl.x().month("date"),
      vl.y().average("temp_max"),
      vl.color().fieldN("location")
    )
    .width(width)
    .height(height);
  // precipitation plot
  const precip = vl.markLine()
    .data(weather)
    .encode(
      vl.x().month("date"),
      vl.y().average("precipitation"),
      vl.color().fieldN("location")
    )
    .width(width)
    .height(height);
  // wind plot
  const wind = vl.markLine()
    .data(weather)
    .encode(
      vl.x().month("date"),
      vl.y().average("wind"),
      vl.color().fieldN("location")
    )
    .width(width)
    .height(height);
  // combine the three finished plots side by side
  return vl.hconcat(temp_max, precip, wind)
    .data(weather)
    .render()
}
robservable("@krisrs1128/examples-of-repetition", include = 4, height = 220)
This implementation is straightforward but clumsy. Whenever you find yourself copying and pasting code, you should ask yourself whether there is a more elegant way to implement the same idea. In this case there is: we can reuse the same template for everything except the \(y\)-axis encoding.
{
  const width = 180,
        height = 130;
  // shared template: everything except the y-axis encoding
  const base = vl.markLine()
    .data(weather)
    .encode(
      vl.x().month("date"),
      vl.color().fieldN("location")
    )
    .width(width)
    .height(height);
  const temp_max = base.encode(vl.y().average("temp_max")),
        precip = base.encode(vl.y().average("precipitation")),
        wind = base.encode(vl.y().average("wind"));
  return vl.hconcat(temp_max, precip, wind).render()
}
robservable("@krisrs1128/examples-of-repetition", include = 5, height = 220)
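The template idea extends to any list of fields. As a rough sketch, the plain JavaScript below builds a comparable Vega-Lite JSON specification directly, without the vl API, by mapping a list of field names over a shared base encoding. The helper names (panelTemplate, withY, concatSpec) and the data URL "weather.csv" are hypothetical and chosen only for illustration; the spec keys (hconcat, mark, encoding, timeUnit, aggregate) are standard Vega-Lite.

```javascript
// Hypothetical data source; substitute wherever the weather table actually lives.
const weatherData = { url: "weather.csv" };

// Shared template: everything except the y-axis encoding.
const panelTemplate = {
  mark: "line",
  width: 180,
  height: 130,
  encoding: {
    x: { field: "date", timeUnit: "month", type: "temporal" },
    color: { field: "location", type: "nominal" }
  }
};

// Return a copy of the template with a y encoding that averages `field`.
// The template itself is left unmodified.
function withY(template, field) {
  return {
    ...template,
    encoding: {
      ...template.encoding,
      y: { field: field, aggregate: "mean", type: "quantitative" }
    }
  };
}

// One panel per field, laid out side by side.
function concatSpec(fields) {
  return {
    data: weatherData,
    hconcat: fields.map((field) => withY(panelTemplate, field))
  };
}

const concatenated = concatSpec(["temp_max", "precipitation", "wind"]);
console.log(JSON.stringify(concatenated, null, 2));
```

Adding a fourth panel then only requires adding one more field name to the list.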
That’s better, but we can be even more concise by using repetition. This lets us reuse the same template by referring to an abstract vl.repeat() object in the encoding.
{
  const width = 180,
        height = 130;
  return vl.markLine()
    .data(weather)
    .encode(
      vl.x().month("date"),
      vl.color().fieldN("location"),
      vl.y().average(vl.repeat("column"))
    )
    .width(width)
    .height(height)
    .repeat({"column": ["temp_max", "precipitation", "wind"]})
    .render()
}
robservable("@krisrs1128/examples-of-repetition", include = 6, height = 220)
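To see what the abstract reference stands for, it can help to look at a raw Vega-Lite specification of roughly this chart. In the sketch below, the y field is not a column name but the placeholder {repeat: "column"}, which is filled in once per entry of the repeat list. The function name repeatSpec and the data URL "weather.csv" are hypothetical; this is an approximation written by hand, not the exact output of the vl API.

```javascript
// Build a repeated line chart: one column of the layout per field name.
// `dataUrl` is a hypothetical placeholder for the weather table's location.
function repeatSpec(fields, dataUrl) {
  return {
    data: { url: dataUrl },
    // Each entry here becomes one panel.
    repeat: { column: fields },
    // The template drawn in every panel.
    spec: {
      mark: "line",
      width: 180,
      height: 130,
      encoding: {
        x: { field: "date", timeUnit: "month", type: "temporal" },
        // Placeholder that resolves to the current entry of repeat.column.
        y: { field: { repeat: "column" }, aggregate: "mean", type: "quantitative" },
        color: { field: "location", type: "nominal" }
      }
    }
  };
}

const repeated = repeatSpec(["temp_max", "precipitation", "wind"], "weather.csv");
console.log(JSON.stringify(repeated, null, 2));
```

Unlike concatenation, there is only one panel definition here, so a change to the template automatically applies to every panel.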
Let’s use this idea to generate a scatterplot matrix, a type of plot that shows a scatterplot for every pair of columns in a dataset. This type of plot is often useful for revealing correlations between fields.
{
  const width = 140,
        height = 140;
  return vl.markPoint({filled: true, size: 3, opacity: 0.8})
    .data(weather)
    .encode(
      vl.x().fieldQ(vl.repeat("column")),
      vl.y().fieldQ(vl.repeat("row")),
      vl.color().fieldN("location")
    )
    .width(width)
    .height(height)
    .repeat({
      column: ["temp_max", "precipitation", "wind"],
      row: ["temp_max", "precipitation", "wind"],
    })
    .render()
}
robservable("@krisrs1128/examples-of-repetition", include = 7)
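Because rows and columns are repeated independently, a scatterplot matrix over k fields has k × k panels, including the diagonal panels where a field is plotted against itself. The sketch below, in plain JavaScript, builds a comparable raw Vega-Lite specification for an arbitrary field list and enumerates which pair of fields each panel shows. The names splomSpec and panelPairs and the data URL "weather.csv" are hypothetical, introduced only for this illustration.

```javascript
// Build a repeat spec whose rows and columns both range over `fields`.
// `dataUrl` is a hypothetical placeholder for the weather table's location.
function splomSpec(fields, dataUrl) {
  return {
    data: { url: dataUrl },
    repeat: { row: fields, column: fields },
    spec: {
      mark: { type: "point", filled: true, size: 3, opacity: 0.8 },
      width: 140,
      height: 140,
      encoding: {
        x: { field: { repeat: "column" }, type: "quantitative" },
        y: { field: { repeat: "row" }, type: "quantitative" },
        color: { field: "location", type: "nominal" }
      }
    }
  };
}

// List the (row, column) field pair shown in each panel, row by row.
function panelPairs(fields) {
  return fields.flatMap((row) => fields.map((column) => ({ row, column })));
}

const splomFields = ["temp_max", "precipitation", "wind"];
const splom = splomSpec(splomFields, "weather.csv");
const pairs = panelPairs(splomFields);
console.log(pairs.length); // 9 panels for 3 fields
```

The quadratic growth in panels is worth keeping in mind: with many fields, it is usually better to repeat over a selected subset.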
[1] For ggplot2 plots, the analogous function is called grid.arrange, from the gridExtra package.