| Title: | Marimekko Plots for 'ggplot2' |
|---|---|
| Description: | Create marimekko (mosaic) plots as a 'ggplot2' layer. Column widths encode marginal proportions of one categorical variable and segment heights encode conditional proportions of a second categorical variable. Based on the mosaic display method by Hartigan and Kleiner (1981) <doi:10.1007/978-1-4613-9464-8_37>. |
| Authors: | Dawid Kałędkowski [aut, cre] (ORCID: <https://orcid.org/0000-0001-9533-457X>) |
| Maintainer: | Dawid Kałędkowski <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-05-17 07:14:38 UTC |
| Source: | https://github.com/gogonzo/marimekko |
Compute marimekko tile rectangles as a data frame

    fortify_marimekko(
      data,
      formula,
      weight = NULL,
      gap = 0.01,
      gap_x = NULL,
      gap_y = NULL,
      standardize = FALSE
    )

| Argument | Description |
|---|---|
| `data` | A data frame. |
| `formula` | A one-sided formula specifying the mosaic hierarchy, using the same syntax as `geom_marimekko()`. |
| `weight` | Name of the weight variable (unquoted or string), or `NULL` (the default). |
| `gap` | Numeric. Size of gap between tiles. Default `0.01`. |
| `gap_x` | Numeric. Horizontal gap. Overrides `gap`. Default `NULL`. |
| `gap_y` | Numeric. Vertical gap. Overrides `gap`. Default `NULL`. |
| `standardize` | Logical. Equal-width columns. Default `FALSE`. |

A data frame with columns for each formula variable, plus
fill, colour, xmin, xmax, ymin, ymax, x, y,
weight, .proportion, .marginal, and .residuals.

    titanic <- as.data.frame(Titanic)
    fortify_marimekko(titanic, formula = ~ Class | Survived, weight = Freq)

    # 3-variable formula
    fortify_marimekko(titanic, formula = ~ Class | Survived | Sex, weight = Freq)

Generalized mosaic plot with formula-based variable nesting

    geom_marimekko(
      mapping = NULL,
      data = NULL,
      formula = NULL,
      gap = 0.01,
      gap_x = NULL,
      gap_y = NULL,
      colour = NULL,
      alpha = 0.9,
      show_percentages = FALSE,
      na.rm = FALSE,
      show.legend = NA,
      inherit.aes = TRUE,
      ...
    )

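| Argument | Description |
|---|---|
| `mapping` | Aesthetic mapping. |
| `data` | A data frame. |
| `formula` | A one-sided formula specifying the mosaic hierarchy. The formula syntax is explained in detail after this table. Quick reference: the pipe operator separates nesting levels and alternates the split direction; `+` groups variables at the same nesting level. |
| `gap` | Numeric. Gap between tiles as fraction of plot area. Default `0.01`. |
| `gap_x` | Numeric. Horizontal gap override. Default `NULL`. |
| `gap_y` | Numeric. Vertical gap override. Default `NULL`. |
| `colour` | Tile border colour. Default `NULL`. |
| `alpha` | Tile transparency. Default `0.9`. |
| `show_percentages` | Logical. If `TRUE`, percentage labels are shown on the x-axis. Default `FALSE`. |
| `na.rm` | Logical. Remove missing values. Default `FALSE`. |
| `show.legend` | Logical. Show legend. Default `NA`. |
| `inherit.aes` | Logical. Inherit aesthetics from `ggplot()`. Default `TRUE`. |
| `...` | Additional arguments passed to the layer. |
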
A list of ggplot2 layers (geom + axis scales).
The formula uses two operators to encode the full partitioning hierarchy in a single expression:

- `|` (pipe): Separates nesting levels. Each `|` switches the splitting direction, alternating horizontal, vertical, horizontal, vertical, and so on. The first variable (or group) listed is the outermost split: it partitions the entire plot area. Each subsequent level partitions the tiles created by the previous level.
- `+` (plus): Groups variables at the same nesting level. All variables joined by `+` share the same splitting direction and are applied sequentially within that level. The first `+` variable partitions the current tiles, then the second `+` variable further subdivides those tiles, still in the same direction.

The formula is read left to right, from the coarsest (outermost) partition to the finest (innermost):

- `~ a | b`: First split the plot horizontally by `a` (columns whose widths reflect marginal proportions of `a`). Then, within each column, split vertically by `b` (rows whose heights reflect conditional proportions of `b` given `a`). This is the classic two-variable marimekko / mosaic plot.
- `~ a | b | c`: Horizontal by `a`, then vertical by `b`, then horizontal again by `c`. Three levels of nesting with alternating directions (h v h).
- `~ a + b | c`: Horizontal by `a`, then horizontal again by `b` (same direction because `+` groups them), then vertical by `c`. This is the double decker pattern: all horizontal splits first, with a single vertical split at the end.
- `~ a | b + c`: Horizontal by `a`, then vertical by `b`, then vertical again by `c`. Two vertical variables nested within each column.

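The two-level case `~ a | b` can be worked through by hand. The following is a minimal base-R sketch of that geometry for `~ Class | Survived` on the Titanic data, assuming a unit square and no gaps between tiles. It illustrates the layout rule only and is not the package's own implementation; the names `counts`, `width`, `height`, and `tiles` are made up for this example.

```r
# Illustrative geometry for ~ Class | Survived (unit square, no gaps).
# This is a sketch of the layout rule, not code from the marimekko package.
titanic <- as.data.frame(Titanic)

# Weighted cell counts: rows are Class levels, columns are Survived levels
counts <- xtabs(Freq ~ Class + Survived, data = titanic)

# Level 1 (horizontal): column widths are marginal proportions of Class
width <- unname(rowSums(counts)) / sum(counts)
xmax <- cumsum(width)
xmin <- xmax - width

# Level 2 (vertical): heights are proportions of Survived within each Class
height <- prop.table(counts, margin = 1)
ymax <- t(apply(height, 1, cumsum))
ymin <- ymax - height

# One row per tile; matrices are flattened column by column,
# so Class varies fastest and Survived slowest
tiles <- data.frame(
  Class = rep(rownames(counts), times = ncol(counts)),
  Survived = rep(colnames(counts), each = nrow(counts)),
  xmin = rep(xmin, times = ncol(counts)),
  xmax = rep(xmax, times = ncol(counts)),
  ymin = as.vector(ymin),
  ymax = as.vector(ymax)
)
tiles
```

Each further `|` level would repeat the same step inside every existing tile with the direction flipped, while a `+` level would repeat it without flipping.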
The stat computes the following variables, which can be accessed with `ggplot2::after_stat()`:

- `.proportion`: Conditional proportion of the tile within its immediate parent. For a formula `~ a | b`, this is the proportion of `b` within each level of `a`, i.e. $P(b \mid a)$. Values sum to 1 within each parent tile. Useful for mapping to alpha to fade tiles by their local share: `aes(alpha = after_stat(.proportion))`.
- `.marginal`: Joint (marginal) proportion of the tile relative to the whole dataset, i.e. $P(a, b)$. Values sum to 1 across all tiles. Used internally for x-axis percentage labels when `show_percentages = TRUE`, and can be mapped to aesthetics to emphasise cells by overall frequency.
- `.residuals`: Pearson residual measuring departure from statistical independence between the horizontal and vertical variable groups. Computed as $(O - E) / \sqrt{E}$, where $O$ is the observed cell count and $E$ is the count expected under independence. Positive values indicate the cell is more frequent than expected; negative values indicate less frequent. When only one direction (all horizontal or all vertical) is present, `.residuals` is set to 0. Map to alpha or fill to highlight deviations: `aes(alpha = after_stat(abs(.residuals)))`.

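To make the three quantities concrete, here is a small base-R sketch that computes them for `~ Class | Survived` on the Titanic data. It follows the definitions given above and uses only base and stats functions. The variable names (`proportion`, `marginal`, `pearson`) are chosen for this example, and the package may compute or order its values differently.

```r
# Hand computation of the quantities described above for ~ Class | Survived.
# A sketch based on the textbook definitions, not the package's own code.
titanic <- as.data.frame(Titanic)
counts <- xtabs(Freq ~ Class + Survived, data = titanic)

# Conditional proportion P(Survived | Class): each Class row sums to 1
proportion <- prop.table(counts, margin = 1)

# Joint proportion P(Class, Survived): all cells together sum to 1
marginal <- prop.table(counts)

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# Pearson residuals: (observed - expected) / sqrt(expected)
pearson <- (counts - expected) / sqrt(expected)

round(pearson, 2)
```

The same residuals are available from `chisq.test(counts)$residuals`, which can serve as a cross-check on the hand computation.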

    library(ggplot2)

    titanic <- as.data.frame(Titanic)

    # 2-variable mosaic
    ggplot(titanic) +
      geom_marimekko(
        aes(fill = Survived, weight = Freq),
        formula = ~ Class | Survived
      )

    # 3-variable mosaic (h -> v -> h)
    ggplot(titanic) +
      geom_marimekko(
        aes(fill = Survived, weight = Freq),
        formula = ~ Class | Survived | Sex
      )

    # Multi-variable fill with interaction()
    ggplot(titanic) +
      geom_marimekko(
        aes(fill = interaction(Sex, Survived), weight = Freq),
        formula = ~ Class | Sex + Survived
      )

    # Fade tiles by conditional proportion
    ggplot(titanic) +
      geom_marimekko(
        aes(fill = Survived, alpha = after_stat(.proportion), weight = Freq),
        formula = ~ Class | Survived
      ) +
      guides(alpha = "none")

    # Highlight cells that deviate from independence
    ggplot(titanic) +
      geom_marimekko(
        aes(fill = Survived, alpha = after_stat(abs(.residuals)), weight = Freq),
        formula = ~ Class | Survived
      ) +
      guides(alpha = "none")

Add labels with background to a marimekko plot

    geom_marimekko_label(
      mapping = NULL,
      data = NULL,
      position = "identity",
      ...,
      size = 3.5,
      colour = "black",
      fill = alpha("white", 0.7),
      label.padding = unit(0.15, "lines"),
      na.rm = FALSE,
      show.legend = FALSE,
      inherit.aes = FALSE
    )

| Argument | Description |
|---|---|
| `mapping` | Set of aesthetic mappings. |
| `data` | A data frame. Default `NULL`. |
| `position` | Position adjustment. Default `"identity"`. |
| `...` | Additional arguments passed to the layer. |
| `size` | Text size. Default `3.5`. |
| `colour` | Text colour. Default `"black"`. |
| `fill` | Label background colour. Default `alpha("white", 0.7)`. |
| `label.padding` | Amount of padding around label. Default `unit(0.15, "lines")`. |
| `na.rm` | Logical. Remove missing values. Default `FALSE`. |
| `show.legend` | Logical. Show legend. Default `FALSE`. |
| `inherit.aes` | Logical. Inherit aesthetics. Default `FALSE`. |

A ggplot2 layer.

    library(ggplot2)

    titanic <- as.data.frame(Titanic)

    ggplot(titanic) +
      geom_marimekko(
        aes(fill = Survived, weight = Freq),
        formula = ~ Class | Survived
      ) +
      geom_marimekko_label(aes(label = after_stat(weight)))

Add text labels to a marimekko plot

    geom_marimekko_text(
      mapping = NULL,
      data = NULL,
      position = "identity",
      ...,
      size = 3.5,
      colour = "white",
      na.rm = FALSE,
      show.legend = FALSE,
      inherit.aes = FALSE
    )

| Argument | Description |
|---|---|
| `mapping` | Set of aesthetic mappings. |
| `data` | A data frame. Default `NULL`. |
| `position` | Position adjustment. Default `"identity"`. |
| `...` | Additional arguments passed to the layer. |
| `size` | Text size. Default `3.5`. |
| `colour` | Text colour. Default `"white"`. |
| `na.rm` | Logical. Remove missing values. Default `FALSE`. |
| `show.legend` | Logical. Show legend. Default `FALSE`. |
| `inherit.aes` | Logical. Inherit aesthetics. Default `FALSE`. |

A ggplot2 layer.

    library(ggplot2)

    titanic <- as.data.frame(Titanic)

    ggplot(titanic) +
      geom_marimekko(
        aes(fill = Survived, weight = Freq),
        formula = ~ Class | Survived
      ) +
      geom_marimekko_text(aes(label = after_stat(weight)))

A character vector of 8 bold colours inspired by Marimekko's iconic Unikko poppy pattern. Vibrant, high-contrast tones suited for categorical data visualisation.

    marimekko_pal

An object of class character of length 8.
Retrieve computed tile positions from a marimekko layer

    StatMarimekkoTiles

An object of class StatMarimekkoTiles (inherits from Stat, ggproto, gg) of length 2.
Use StatMarimekkoTiles as the stat argument in ggplot2::layer()
to pair the tile data with any geom. The only requirement is that
geom_marimekko() must appear before the custom layer so that
tile positions are computed first.
geom_marimekko(), geom_marimekko_text(),
geom_marimekko_label(), fortify_marimekko()

    library(ggplot2)

    titanic <- as.data.frame(Titanic)

    # Bubble overlay: point size encodes tile count
    ggplot(titanic) +
      geom_marimekko(
        aes(fill = Survived, weight = Freq),
        formula = ~ Class | Survived,
        alpha = 0.4
      ) +
      layer(
        stat = StatMarimekkoTiles,
        geom = GeomPoint,
        mapping = aes(size = after_stat(weight)),
        data = titanic,
        position = "identity",
        show.legend = FALSE,
        inherit.aes = FALSE,
        params = list(colour = "white", alpha = 0.7)
      ) +
      scale_size_area(max_size = 12)

    # Residual markers: colour and size show deviation from independence
    ggplot(titanic) +
      geom_marimekko(
        aes(fill = Survived, weight = Freq),
        formula = ~ Class | Survived
      ) +
      layer(
        stat = StatMarimekkoTiles,
        geom = GeomPoint,
        mapping = aes(
          size = after_stat(abs(.residuals)),
          colour = after_stat(ifelse(.residuals > 0, "over", "under"))
        ),
        data = titanic,
        position = "identity",
        show.legend = TRUE,
        inherit.aes = FALSE,
        params = list(alpha = 0.8)
      ) +
      scale_colour_manual(
        values = c(over = "tomato", under = "steelblue"),
        name = "Deviation"
      ) +
      scale_size_continuous(range = c(1, 8), name = "|Residual|")

Removes x-axis gridlines and adjusts spacing for mosaic plots. Also applies the marimekko_pal fill scale.

    theme_marimekko(base_size = 12, ...)

| Argument | Description |
|---|---|
| `base_size` | Base font size. Default `12`. |
| `...` | Arguments passed to the underlying ggplot2 theme function. |

A ggplot2 theme.

    library(ggplot2)

    titanic <- as.data.frame(Titanic)

    ggplot(titanic) +
      geom_marimekko(
        aes(fill = Survived, weight = Freq),
        formula = ~ Class | Survived
      ) +
      theme_marimekko()