ggblanket is a package of ggplot2 wrapper functions.
The primary objective is to simplify ggplot2 visualisation.
Secondary objectives relate to:
library(dplyr)
library(ggplot2)
library(ggblanket)
library(patchwork)
penguins2 <- palmerpenguins::penguins |>
mutate(sex = stringr::str_to_sentence(sex)) |>
tidyr::drop_na(sex)
Each gg_* function wraps a ggplot2 ggplot(aes(...)) function with the applicable ggplot2 geom_*() function. Each gg_* function is named after the geom_* function it wraps.
The colour and fill aesthetics of ggplot2 are merged into a single concept represented by the col argument. This argument means that everything should be coloured according to it, i.e. all points, lines and polygon interiors.
# ggplot2
p1 <- penguins2 |>
ggplot() +
geom_point(aes(x = flipper_length_mm,
y = body_mass_g,
colour = species))
p2 <- penguins2 |>
ggplot() +
geom_density(aes(x = flipper_length_mm,
fill = species)) +
labs(fill = "Species")
p1 / p2
# ggblanket
p1 <- penguins2 |>
gg_point(
x = flipper_length_mm,
y = body_mass_g,
col = species)
p2 <- penguins2 |>
gg_density(
x = flipper_length_mm,
col = species)
p1 / p2
The pal argument is used to customise the colours of the geom. A user can provide a vector of colours to this argument, named or unnamed. It works in a consistent way, regardless of whether a col argument is added or not. A named palette can be used to make individual colours stick to particular values. ggblanket also sets alpha defaults intended to make outputs look good.
# ggplot2
p1 <- penguins2 |>
ggplot() +
geom_histogram(aes(x = body_mass_g),
fill = "#414D6B")
p2 <- penguins2 |>
ggplot() +
geom_jitter(aes(x = species,
y = body_mass_g,
colour = sex)) +
scale_colour_manual(values = c("#1B9E77", "#9E361B"))
p1 / p2
# ggblanket
p1 <- penguins2 |>
gg_histogram(
x = body_mass_g,
pal = "#414D6B")
p2 <- penguins2 |>
gg_jitter(
x = species,
y = body_mass_g,
col = sex,
pal = c("#1B9E77", "#9E361B"))
p1 / p2
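As a minimal sketch of the named palette described above: the vector names are assumed to match the values of sex in penguins2 ("Female" and "Male" after the sentence-case conversion), and sex_pal is a hypothetical name chosen for illustration. It relies on the pal argument as documented here; other ggblanket versions may name this argument differently.

```r
library(dplyr)
library(ggblanket)

penguins2 <- palmerpenguins::penguins |>
  mutate(sex = stringr::str_to_sentence(sex)) |>
  tidyr::drop_na(sex)

# Names tie each colour to a value of sex, whatever order the levels are in
sex_pal <- c("Male" = "#9E361B", "Female" = "#1B9E77")

p <- penguins2 |>
  gg_jitter(
    x = species,
    y = body_mass_g,
    col = sex,
    pal = sex_pal)
```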
Faceting is treated as if it were an aesthetic. Users just provide an unquoted variable to facet by. If a single facet (or facet2) variable is provided, it will default to a "wrap" layout. Users can change this with a facet_layout = "grid" argument.
# ggplot2
penguins2 |>
ggplot() +
geom_violin(aes(x = sex,
y = body_mass_g)) +
facet_wrap(vars(species))
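A ggblanket counterpart to the ggplot2 code above might look like the following. This is a sketch based on the facet argument as described in this section, not a verbatim example from the package.

```r
library(dplyr)
library(ggblanket)

penguins2 <- palmerpenguins::penguins |>
  mutate(sex = stringr::str_to_sentence(sex)) |>
  tidyr::drop_na(sex)

# A single facet variable gives a "wrap" layout by default
p <- penguins2 |>
  gg_violin(
    x = sex,
    y = body_mass_g,
    facet = species)
```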
A facet2 argument is also provided for extra functionality and flexibility. If both facet and facet2 variables are provided, then it will default to a "grid" layout of facet by facet2. Users can change this with a facet_layout = "wrap" argument.
# ggplot2
penguins2 |>
ggplot() +
geom_histogram(aes(x = flipper_length_mm)) +
facet_grid(rows = vars(sex), cols = vars(species))
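The same grid might be written with ggblanket as below. This is a sketch that assumes facet maps to the columns and facet2 to the rows, in line with the "facet by facet2" description above.

```r
library(dplyr)
library(ggblanket)

penguins2 <- palmerpenguins::penguins |>
  mutate(sex = stringr::str_to_sentence(sex)) |>
  tidyr::drop_na(sex)

# Supplying both facet and facet2 gives a "grid" layout by default
p <- penguins2 |>
  gg_histogram(
    x = flipper_length_mm,
    facet = species,
    facet2 = sex)
```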
Unspecified x, y, and col titles are converted to sentence case with snakecase::to_sentence_case. All titles can be manually changed using the *_title arguments. The default conversion is intended to produce titles that can often be left as is. Use *_title = "" to remove a title.
# ggplot2
penguins2 |>
ggplot() +
geom_point(aes(x = flipper_length_mm,
y = body_mass_g,
colour = sex)) +
facet_wrap(vars(species)) +
scale_x_continuous(breaks = scales::breaks_pretty(n = 3))
# ggblanket
penguins2 |>
gg_point(
x = flipper_length_mm,
y = body_mass_g,
col = sex,
facet = species)
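To illustrate the title handling described above, the sketch below first shows what sentence-case conversion does to a variable name, then overrides the titles with the *_title arguments. The title strings are illustrative choices only.

```r
library(dplyr)
library(ggblanket)

penguins2 <- palmerpenguins::penguins |>
  mutate(sex = stringr::str_to_sentence(sex)) |>
  tidyr::drop_na(sex)

# Sentence-case conversion of a snake_case variable name
default_title <- snakecase::to_sentence_case("flipper_length_mm")
default_title
#> [1] "Flipper length mm"

# Manual titles, with the col title removed by an empty string
p <- penguins2 |>
  gg_point(
    x = flipper_length_mm,
    y = body_mass_g,
    col = sex,
    x_title = "Flipper length (mm)",
    y_title = "Body mass (g)",
    col_title = "")
```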
Prefixed arguments are available to customise titles, scales, guides, and faceting. These prefixes organise the adjustments by whether they relate to x, y, col or facet.
# ggplot2
penguins2 |>
ggplot() +
geom_jitter(aes(x = species,
y = body_mass_g,
colour = sex)) +
expand_limits(y = 0) +
scale_x_discrete(labels = \(x) stringr::str_sub(x, 1, 1)) +
scale_y_continuous(breaks = scales::breaks_width(1500),
labels = scales::label_number(big.mark = " "),
expand = expansion(mult = c(0, 0.05)),
trans = "sqrt") +
labs(x = "Species", y = "Body mass (g)", col = NULL) +
theme(legend.position = "top") +
theme(legend.justification = "left") +
scale_colour_manual(values = scales::hue_pal()(2),
guide = ggplot2::guide_legend(title.position = "top"))
# ggblanket
penguins2 |>
gg_jitter(
x = species,
y = body_mass_g,
col = sex,
x_labels = \(x) stringr::str_sub(x, 1, 1),
y_include = 0,
y_breaks = scales::breaks_width(1500),
y_labels = scales::label_number(big.mark = " "),
y_expand = expansion(mult = c(0, 0.05)),
y_trans = "sqrt",
y_title = "Body mass (g)",
col_legend_place = "t",
col_title = "")
These prefixed arguments work nicely with the RStudio autocomplete, if users:
gg_* functions.
With these settings and use of the pipe, users can type the prefix, and then use the tab and arrow keys to assist in finding and selecting the arguments they need to adjust.
Users can use the theme argument in a gg_* function for the theme of a plot. Alternatively, users can set the theme globally using the ggplot2::theme_set function, such that all subsequent plots will use this by default.
ggblanket provides two complete ggplot2 theme functions called light_mode (the default) and dark_mode. The first argument is the base_size. This sets the size of all text, except that the title is 10% larger and the caption is 10% smaller. In Quarto, it is likely that users will want to set the *_mode theme to have a larger base_size (e.g. ggplot2::theme_set(light_mode(11))).
Note that theme_set(theme_grey()) resets the set theme for ggplot2 code to theme_grey and for ggblanket gg_* functions to light_mode(). If you want ggblanket gg_* functions to default to using theme_grey(), then you must modify the base_size slightly (e.g. theme_set(theme_grey(11.01))).
# ggblanket
# theme_set(dark_mode(10))
penguins2 |>
gg_point(
x = flipper_length_mm,
y = body_mass_g,
col = sex,
title = "Penguins body mass by flipper length",
subtitle = "Palmer Archipelago, Antarctica",
caption = "Source: Gorman, 2020",
theme = dark_mode(10))
Note that the gg_* function will by default adjust which gridlines are present and the placement of the legend. Therefore, if you are providing a theme other than light_mode or dark_mode, ggblanket works best if this theme has both vertical and horizontal gridlines. If users want everything adjusted as per the theme, then they can + their theme onto the plot instead.
# ggblanket
p1 <- penguins2 |>
gg_point(
x = flipper_length_mm,
y = body_mass_g,
col = sex,
x_breaks = scales::breaks_pretty(n = 3),
theme = theme_grey(),
title = "theme= theme_grey()")
p2 <- penguins2 |>
gg_point(
x = flipper_length_mm,
y = body_mass_g,
col = sex,
x_breaks = scales::breaks_pretty(n = 3),
title = "+ theme_grey()") +
theme_grey()
p1 + p2
...
The ... argument is placed in the gg_* function within the wrapped ggplot2::geom_* function. This means all other arguments in the geom_* function are available to users. Common arguments from ... to add are size, linewidth and width.
# ggblanket
penguins2 |>
gg_smooth(
x = flipper_length_mm,
y = body_mass_g,
col = sex,
linewidth = 0.5, #accessed via geom_smooth
level = 0.99) #accessed via geom_smooth
Where the orientation is normal (i.e. vertical):
It does the opposite where the orientation is horizontal.
Note this symmetry approach does not apply:
* if a transformation other than identity or reverse is applied to x or y scales.
* for gg_raster, gg_contour_filled or gg_density_2d_filled
In some circumstances the ggplot2 approach to default scales may be preferable. In these cases, users can revert to the ggplot2 approach by using *_limits = c(NA, NA) and *_expand = c(0.05, 0.05) (or add scale_*_continuous()).
# ggplot2
penguins2 |>
group_by(species, sex) |>
summarise(body_mass_g = mean(body_mass_g)) |>
ggplot() +
geom_col(aes(x = body_mass_g,
y = species,
fill = sex),
position = "dodge",
width = 0.66)
# ggblanket
penguins2 |>
group_by(species, sex) |>
summarise(body_mass_g = mean(body_mass_g)) |>
gg_col(
x = body_mass_g,
y = species,
col = sex,
position = "dodge",
width = 0.66)
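Reverting to the ggplot2 approach to scales, as described above, might be sketched as follows. It assumes the *_limits and *_expand arguments accept exactly the values quoted in the text, applied here to both the x and y scales.

```r
library(dplyr)
library(ggblanket)

penguins2 <- palmerpenguins::penguins |>
  mutate(sex = stringr::str_to_sentence(sex)) |>
  tidyr::drop_na(sex)

# Limits of c(NA, NA) let the data set the range; expand values are as quoted above
p <- penguins2 |>
  gg_point(
    x = flipper_length_mm,
    y = body_mass_g,
    x_limits = c(NA, NA),
    x_expand = c(0.05, 0.05),
    y_limits = c(NA, NA),
    y_expand = c(0.05, 0.05))
```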
Sometimes with small plots or faceted plots, the labels can be too squashed. Making the breaks width bigger can waste space, due to the aforementioned approach of ggblanket to making pretty scales. An alternative approach is to use the str_keep_seq function with the *_labels arguments to only keep every 2nd (or nth) label.
# ggblanket
penguins2 |>
group_by(species, sex) |>
summarise(body_mass_g = mean(body_mass_g)) |>
ungroup() |>
gg_col(
y = body_mass_g,
x = species,
col = sex,
position = "dodge",
width = 0.5,
x_labels = \(x) stringr::str_sub(x, 1, 1),
y_labels = \(x) str_keep_seq(x),
title = "Keep every 2nd label",
theme = light_mode(title_face = "plain"))
Users can make plots with multiple layers with ggblanket by adding on ggplot2::geom_* layers.
The gg_* function puts the aesthetic variables (i.e. x, y, col) within the wrapped ggplot function. Therefore, these aesthetics will be inherited by any subsequent layers added.
Where there are multiple geom layers in a desired plot, users should choose the gg_* function with care:
The gg_* function should be appropriate to be the bottom layer of the plot. This is because the geoms will plot in order.
# ggblanket + ggplot2
p1 <- ggplot2::economics |>
slice_min(order_by = date, n = 10) |>
gg_line(
x = date,
y = unemploy,
pal = guardian()[1],
x_title = "",
y_title = "Unemployment",
y_include = 0,
linewidth = 1,
x_breaks = scales::breaks_width("3 months"),
title = "gg_line + geom_point",
theme = light_mode(title_face = "plain")) +
geom_point(colour = guardian()[2])
p2 <- ggplot2::economics |>
slice_min(order_by = date, n = 10) |>
gg_point(
x = date,
y = unemploy,
pal = guardian()[2],
x_title = "",
y_title = "Unemployment",
y_include = 0,
x_breaks = scales::breaks_width("3 months"),
title = "gg_point + geom_line",
theme = light_mode(title_face = "plain")) +
geom_line(colour = guardian()[1], linewidth = 1)
p1 + p2
If some geom layers have a col aesthetic and some do not, then a gg_* function should be chosen that has a col argument in it. This will enable ggblanket legend placement and access to col_* arguments. It is also a more reliable approach. If later layers do not require the col aesthetic, then the inherit.aes = FALSE argument should be used.
In some situations, gg_blank may be required.
If users are building a horizontal plot that includes multiple geoms, it is recommended that users build the plot vertically with ggblanket, and then use ggplot2::coord_flip to make it horizontal.
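The inherit.aes = FALSE advice above could look like this in practice. This is a sketch: the choice of a single black linear trend line over species-coloured points is only an illustration.

```r
library(dplyr)
library(ggplot2)
library(ggblanket)

penguins2 <- palmerpenguins::penguins |>
  mutate(sex = stringr::str_to_sentence(sex)) |>
  tidyr::drop_na(sex)

p <- penguins2 |>
  gg_point(
    x = flipper_length_mm,
    y = body_mass_g,
    col = species) +
  # This layer ignores the inherited col aesthetic, so it draws one overall line.
  # x and y must be restated because no aesthetics are inherited.
  geom_smooth(
    aes(x = flipper_length_mm, y = body_mass_g),
    inherit.aes = FALSE,
    method = "lm",
    formula = y ~ x,
    se = FALSE,
    colour = "black")
```

The layer still uses the plot's data, since inherit.aes only controls whether aesthetics are inherited.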
Users need to ensure that the scales built by their gg_* function are appropriate for subsequent layers. Plot scales are built by the gg_* function based on the data, x, y, *_limits, *_include, stat, position and coord arguments in the gg_* function.
# ggblanket + ggplot2
d <- penguins2 |>
group_by(species) |>
summarise(body_mass_g = mean(body_mass_g)) |>
mutate(lower = body_mass_g * 0.95) |>
mutate(upper = body_mass_g * 1.2)
p1 <- d |>
gg_col(
x = species,
y = body_mass_g,
col = species,
width = 0.75,
y_include = c(0, max(d$upper)),
y_labels = \(x) x / 1000,
y_title = "Body mass kg",
col_legend_place = "n") +
geom_errorbar(aes(ymin = lower, ymax = upper),
colour = "black",
width = 0.1) +
coord_flip()
p2 <- d |>
gg_col(
x = species,
y = body_mass_g,
col = species,
colour = "#d3d3d3",
fill = "#d3d3d3",
width = 0.75,
y_include = c(0, max(d$upper)),
y_labels = \(x) x / 1000,
y_title = "Body mass kg",
col_legend_place = "n") +
geom_errorbar(aes(ymin = lower, ymax = upper),
width = 0.1) +
coord_flip()
p1 / p2
ggblanket requires unquoted variables only for x, y, col, facet and facet2. You cannot wrap these in a function. Instead you need to apply the function to the relevant variable in the data prior to plotting, for example when reordering or reversing a factor, or dropping NAs.
p1 <- diamonds |>
count(color) |>
gg_col(
x = n,
y = color,
width = 0.75,
x_labels = \(x) x / 1000,
x_title = "Count (thousands)",
title = "Default y",
theme = light_mode(title_face = "plain")
)
p2 <- diamonds |>
count(color) |>
mutate(color = forcats::fct_rev(color)) |>
gg_col(
x = n,
y = color,
width = 0.75,
x_labels = \(x) x / 1000,
x_title = "Count (thousands)",
title = "Reverse y",
theme = light_mode(title_face = "plain")
)
p3 <- diamonds |>
count(color) |>
mutate(color = forcats::fct_reorder(color, n)) |>
gg_col(
x = n,
y = color,
width = 0.75,
x_labels = \(x) x / 1000,
x_title = "Count (thousands)",
title = "Reordered y ascending by x",
theme = light_mode(title_face = "plain")
)
p4 <- diamonds |>
count(color) |>
mutate(color = color |>
forcats::fct_reorder(n) |>
forcats::fct_rev()) |>
gg_col(
x = n,
y = color,
width = 0.75,
x_labels = \(x) x / 1000,
x_title = "Count (thousands)",
title = "Reordered y descending by x",
theme = light_mode(title_face = "plain")
)
(p1 + p2) / (p3 + p4)
ggblanket keeps unused factor levels in the plot. If users wish to drop unused levels they should likewise do it in the data prior to plotting.
p1 <- diamonds |>
count(color) |>
filter(color %in% c("E", "G", "I")) |>
gg_point(
x = n,
y = color,
x_labels = \(x) x / 1000,
x_title = "Count (thousands)",
title = "A factor filtered",
theme = light_mode(title_face = "plain"))
p2 <- diamonds |>
count(color) |>
filter(color %in% c("E", "G", "I")) |>
mutate(color = forcats::fct_drop(color)) |>
gg_point(
x = n,
y = color,
x_labels = \(x) x / 1000,
x_title = "Count (thousands)",
title = "A factor filtered & unused levels dropped",
theme = light_mode(title_face = "plain"))
p1 + p2
ggblanket uses different defaults for colouring. The default pal is:
* #357BA2 or mako(9)[5] for no col variable
* ggblanket::guardian for a discrete col variable with 4 or fewer levels (or unique values if a character). This palette is colourblind safe.
* scales::hue_pal for a discrete col variable with 5 or more levels (or unique values if not ordered).
* viridis::mako reversed for a continuous col variable
* "#bebebe" or grey for NA
ggblanket uses different alpha defaults for the different gg_* functions. Polygons that generally have no gap or overlap default to 1: gg_bin_2d, gg_contour_filled, gg_density_2d_filled, gg_hex, as well as gg_sf for polygons with a col aesthetic. Polygons that generally overlap default to 0.5: gg_density. Polygons that generally have key lines within them also default to 0.5: gg_boxplot, gg_crossbar, gg_ribbon and gg_smooth. gg_label defaults to 0.05. gg_blank has no alpha argument. Other polygons default to 0.9: gg_area, gg_bar, gg_col, gg_histogram, gg_polygon, gg_rect, gg_tile and gg_violin. For all other contexts, alpha defaults to 1.
By default, ggblanket keeps values outside of the limits (*_oob = scales::oob_keep) in calculating the geoms and scales to plot. It also does not clip anything outside the cartesian coordinate space by default (coord = ggplot2::coord_cartesian(clip = "off")).
ggplot2 by default drops values outside of the limits in calculating the geoms and scales to plot (scales::oob_censor), and clips anything outside the cartesian coordinate space (coord = ggplot2::coord_cartesian(clip = "on")).
Users should be particularly careful when setting limits for stats other than identity.
p1 <- economics |>
gg_smooth(
x = date,
y = unemploy,
y_labels = \(x) str_keep_seq(x),
title = "No x_limits set",
theme = light_mode(title_face = "plain")) +
geom_vline(xintercept = c(lubridate::ymd("1985-01-01", "1995-01-01")),
col = guardian(n = 1),
linetype = 3) +
geom_point(col = guardian(n = 1), alpha = 0.05)
p2 <- economics |>
gg_smooth(
x = date,
y = unemploy,
x_limits = c(lubridate::ymd("1985-01-01", "1995-01-01")),
x_labels = \(x) stringr::str_sub(x, 3, 4),
y_labels = \(x) str_keep_seq(x),
title = "x_limits set",
theme = light_mode(title_face = "plain")) +
geom_point(col = guardian(n = 1), alpha = 0.1)
p3 <- economics |>
gg_smooth(
x = date,
y = unemploy,
x_limits = c(lubridate::ymd("1985-01-01", "1995-01-01")),
x_labels = \(x) stringr::str_sub(x, 3, 4),
y_labels = \(x) str_keep_seq(x),
coord = coord_cartesian(clip = "on"),
title = "x_limits set & cartesian space clipped",
theme = light_mode(title_face = "plain")) +
geom_point(col = guardian(n = 1), alpha = 0.1)
p4 <- economics |>
gg_smooth(
x = date,
y = unemploy,
x_limits = c(lubridate::ymd("1985-01-01", "1995-01-01")),
x_labels = \(x) stringr::str_sub(x, 3, 4),
x_oob = scales::oob_censor,
y_labels = \(x) str_keep_seq(x),
title = "x_limits set & x_oob censored",
theme = light_mode(title_face = "plain")) +
geom_point(col = guardian(n = 1), alpha = 0.1)
p5 <- economics |>
filter(between(date, lubridate::ymd("1985-01-01"), lubridate::ymd("1995-01-01"))) |>
gg_smooth(
x = date,
y = unemploy,
x_labels = \(x) stringr::str_sub(x, 3, 4),
y_labels = \(x) str_keep_seq(x),
title = "x data filtered",
theme = light_mode(title_face = "plain")) +
geom_point(col = guardian(n = 1), alpha = 0.1)
p1 / (p2 + p3) / (p4 + p5)
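The difference between the two out-of-bounds functions named above can be seen directly on a small vector. This sketch calls the scales functions on their own, outside any plot; the values and limits are made up for illustration.

```r
library(scales)

values <- c(1, 5, 10)
limits <- c(2, 8)

# oob_censor replaces values outside the limits with NA (the ggplot2 default)
censored <- oob_censor(values, range = limits)
censored
#> [1] NA  5 NA

# oob_keep leaves values untouched (the ggblanket default described above)
kept <- oob_keep(values, range = limits)
kept
#> [1]  1  5 10
```

Because censored values become NA before the stat is computed, a stat such as a smooth is fitted only to the data inside the limits, whereas with oob_keep it is fitted to all of the data.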
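ggblanket is much slower than ggplot2 in computational speed, due to the hack that underlies ggblanket to make its pretty continuous scales. In the benchmark below, building a simple scatterplot has a median time of 123ms with ggblanket against 2.18ms with ggplot2.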
bench::mark({
penguins2 |>
gg_point(x = flipper_length_mm,
y = body_mass_g,
col = species)
}, iterations = 10)
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch> <bch:> <dbl> <bch:byt> <dbl>
#> 1 { gg_point(penguins2, x = flipper_l… 121ms 123ms 8.13 804KB 8.13
bench::mark({
penguins2 |>
ggplot() +
geom_point(aes(x = flipper_length_mm, y = body_mass_g, colour = species))
}, iterations = 10)
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:> <bch:> <dbl> <bch:byt> <dbl>
#> 1 { ggplot(penguins2) + geom_point(a… 2.12ms 2.18ms 439. 12.6KB 0
* *_title is equivalent to ggplot2::labs(* = ...) or ggplot2::scale_*(name = ...).
* TRUE always comes before FALSE.
* *_include works in a similar way to ggplot2::expand_limits(* = ...).
See the ggblanket website for further information, including articles and function reference.