All the new features in {gt} 0.10.0

2024-01-23

The gt package (the one that helps you make beautiful, publication-quality tables in R) has been updated to version 0.10.0. Now, is this a big release? Sure, of course it is. But it’s not as big as the 0.9.0 release, which needed a whole series of blog posts just to cover everything that was new. This one post will cover, as best it can, all of the big new features in gt 0.10.0. Let’s get started by looking at nanoplots.

Nanoplots, tiny interactive plots in your gt table

Plots in a table need to be somewhat simple by design. Generally, there isn’t a lot of space to work with! We had these basic design requirements when we started developing the feature known as nanoplots:

  1. compact display of marks and labels
  2. basic interactivity
  3. different plot types
  4. customizability

Through some iteration we arrived at something that satisfies all of the design criteria. The new cols_nanoplot() function is the entry point to generating nanoplots in a gt table. Let’s introduce you to it by way of an example (using the new illness dataset). The cols_nanoplot() function can take input data from any number of columns you specify in the columns argument. In the example below, the columns that start with ‘day’ (seven columns in total) each have a single numeric value. The values are taken in the order of column specification in columns and are used in that order in every nanoplot (we’re making the default "line" plot type here). A new column will be generated for the nanoplot, and we’re giving it a specific name ("nanoplots") and also a label (with md("*Progression*"); yes, we can use Markdown here).

library(gt)

illness |>
  dplyr::slice_head(n = 10) |>
  gt(rowname_col = "test") |>
  tab_header("Partial summary of daily tests performed on YF patient") |>
  tab_stubhead(label = md("**Test**")) |>
  cols_hide(columns = c(starts_with("norm"), units)) |>
  cols_nanoplot(
    columns = starts_with("day"),
    new_col_name = "nanoplots",
    new_col_label = md("*Progression*")
  ) |>
  cols_align(align = "center", columns = nanoplots) |>
  tab_footnote(
    footnote = "Measurements from Day 3 through to Day 8.",
    locations = cells_column_labels(columns = nanoplots)
  )
[Rendered table: “Partial summary of daily tests performed on YF patient” — ten tests in the stub (Viral load, WBC, Neutrophils, RBC, Hb, PLT, ALT, AST, TBIL, DBIL), each with a line nanoplot in the Progression column; footnote: “Measurements from Day 3 through to Day 8.”]

If you hover over the data points in a nanoplot, you’ll see each point’s value. The values are automatically formatted to be compact (limited space here!), and there are advanced options to help you control the formatting (customizability!). Also, hovering over the left edge of the nanoplot plot area will show the value range.

That was a pretty simple example that barely scratches the surface of what you can do with nanoplots. Right now, there are three types of nanoplots available: "line", "bar", and "boxplot". Here’s an example of the "bar" type of nanoplot, which uses the sza dataset to visualize solar altitude angles.

sza |>
  dplyr::filter(latitude == 20 & tst <= "1200") |>
  dplyr::select(-latitude) |>
  dplyr::filter(!is.na(sza)) |>
  dplyr::mutate(saa = 90 - sza) |>
  dplyr::select(-sza) |>
  tidyr::pivot_wider(
    names_from = tst,
    values_from = saa,
    names_sort = TRUE
  ) |>
  gt(rowname_col = "month") |>
  tab_header(
    title = "Solar Altitude Angles",
    subtitle = "Average values every half hour from 05:30 to 12:00"
  ) |>
  cols_nanoplot(
    columns = matches("0"),
    plot_type = "bar",
    missing_vals = "zero",
    new_col_name = "saa",
    plot_height = "2.5em",
    options = nanoplot_options(
      data_bar_stroke_color = "GoldenRod",
      data_bar_fill_color = "DarkOrange",
      y_val_fmt_fn = function(x) paste0(vec_fmt_number(x, decimals = 1), "&deg;") 
    )
  ) |>
  tab_options(
    table.width = px(400),
    column_labels.hidden = TRUE
  ) |>
  cols_align(
    align = "center",
    columns = everything()
  )
[Rendered table: “Solar Altitude Angles” with subtitle “Average values every half hour from 05:30 to 12:00” — one bar nanoplot per month, jan through dec.]

This example demonstrates the use of the nanoplot_options() helper function, which is supplied to the options argument of cols_nanoplot(). Through that helper, layers of the nanoplots can be selectively removed, the aesthetics of the remaining plot components can be modified, and display values can even be customized. We modified the display of the bar values (shown on hover) with the y_val_fmt_fn argument by supplying a function to perform that numeric formatting.

Let’s have one more example with nanoplots, this time involving box plots. For that, we use plot_type = "boxplot". We’ll take a slice of the pizzaplace dataset and create a simple table that displays a box plot of pizza sales for a selection of days. If you can get string-based data of the form "2.6,3.6,0,1.5" in a column, that’s valid input for a nanoplot. This is easy to do with dplyr::summarize(), and the preparatory work in the following example does just that.

pizzaplace |>
  dplyr::filter(date <= "2015-01-14") |>
  dplyr::mutate(time = as.numeric(hms::as_hms(time))) |>
  dplyr::summarize(time = paste(time, collapse = ","), .by = date) |>
  dplyr::mutate(is_weekend = lubridate::wday(date) %in% 6:7) |>
  gt() |>
  tab_header(title = "Pizza Sales in Early January 2015") |>
  fmt_date(columns = date, date_style = 2) |>
  cols_nanoplot(
    columns = time,
    plot_type = "boxplot",
    options = nanoplot_options(y_val_fmt_fn = function(x) hms::as_hms(x))
  ) |>
  cols_hide(columns = is_weekend) |>
  cols_align(align = "center", columns = nanoplots) |>
  cols_align(align = "left", columns = date) |>
  tab_style(
    style = cell_borders(
      sides = "left", color = "gray"),
    locations = cells_body(columns = nanoplots)
  ) |>
  tab_style_body(
    style = cell_fill(color = "#E5FEFE"),
    values = TRUE,
    targets = "row"
  ) |>
  tab_options(column_labels.hidden = TRUE)
[Rendered table: “Pizza Sales in Early January 2015” — one box-plot nanoplot per day, January 1 to January 14, 2015, with weekend rows highlighted.]

The other trick was to convert the string-based 24-hour-clock time values (e.g., "11:38:36") to the number of seconds elapsed in a day. Doing so gives us continuous values that can be incorporated into each box plot. And, by supplying a function to the y_val_fmt_fn argument within nanoplot_options(), we can transform the integer seconds values back to clock times for display on hover.
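The conversion itself is a simple round trip with the hms package. Here’s a minimal check, outside of any table code (the specific time value is our own example, not from the dataset):

```r
library(hms)

# Clock time -> seconds elapsed since midnight
secs <- as.numeric(as_hms("11:38:36"))
secs
#> 41916

# ...and back to a clock time for display
as_hms(secs)
#> 11:38:36
```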

These examples only show part of what’s possible with the feature. We intend to go much further with nanoplots in future releases. If you’d like to see a few more examples, take a look at the docs for cols_nanoplot().

Add columns/rows to your table, even start from an empty table

The nanoplot examples showed us something new in gt: making new columns. This wasn’t possible before but is very possible now. We can add new columns to a table with the cols_add() function, and it works quite a bit like dplyr’s mutate() function. You supply name-value pairs where the name is the new column name and the value describes the data that will go into the column. The value can: (1) be a vector whose length matches the number of rows in the data table, (2) be a single value (which will be repeated all the way down), or (3) involve other columns in the table (as they represent vectors of the correct length).

The new columns are added to the end of the column series by default but can instead be added internally by using either the .before or .after arguments. If entirely empty (i.e., all NA) columns need to be added, you can use any of the NA types (e.g., NA, NA_character_, NA_real_, etc.) for such columns.

Let’s look at a simple example using a subset of the exibble dataset. We’ll add a single column to the right of all the existing columns and call it country. This new column needs eight values, and these will be supplied when using cols_add().

exibble |>
  dplyr::select(num, char, datetime, currency, group) |>
  gt(rowname_col = "row") |>
  cols_add(
    country = c("TL", "PY", "GL", "PA", "MO", "EE", "CO", "AU")
  )
num char datetime currency group country
1.111e-01 apricot 2018-01-01 02:22 49.950 grp_a TL
2.222e+00 banana 2018-02-02 14:33 17.950 grp_a PY
3.333e+01 coconut 2018-03-03 03:44 1.390 grp_a GL
4.444e+02 durian 2018-04-04 15:55 65100.000 grp_a PA
5.550e+03 NA 2018-05-05 04:00 1325.810 grp_b MO
NA fig 2018-06-06 16:11 13.255 grp_b EE
7.770e+05 grapefruit 2018-07-07 05:22 NA grp_b CO
8.880e+06 honeydew NA 0.440 grp_b AU

We can add multiple columns with a single use of cols_add(). The columns generated can be formatted and otherwise manipulated just as any column could be in a gt table. The following example extends the first one by adding more columns and immediately using them in various function calls like fmt_flag() and fmt_scientific().

exibble |>
  dplyr::select(num, char, datetime, currency, group) |>
  gt(rowname_col = "row") |>
  cols_add(
    country = c("TL", "PY", "GL", "PA", "MO", "EE", "CO", "AU"),
    empty_col = NA_character_,
    big_num = num ^ 3
  ) |>
  fmt_flag(columns = country) |>
  sub_missing(columns = empty_col, missing_text = "EMPTINESS") |>
  fmt_scientific(columns = big_num)
num char datetime currency group country empty_col big_num
1.111e-01 apricot 2018-01-01 02:22 49.950 grp_a EMPTINESS 1.37 × 10−3
2.222e+00 banana 2018-02-02 14:33 17.950 grp_a EMPTINESS 1.10 × 101
3.333e+01 coconut 2018-03-03 03:44 1.390 grp_a EMPTINESS 3.70 × 104
4.444e+02 durian 2018-04-04 15:55 65100.000 grp_a EMPTINESS 8.78 × 107
5.550e+03 NA 2018-05-05 04:00 1325.810 grp_b EMPTINESS 1.71 × 1011
NA fig 2018-06-06 16:11 13.255 grp_b EMPTINESS NA
7.770e+05 grapefruit 2018-07-07 05:22 NA grp_b EMPTINESS 4.69 × 1017
8.880e+06 honeydew NA 0.440 grp_b EMPTINESS 7.00 × 1020

It is possible to start with an empty table (i.e., no columns and no rows) and add one or more columns to it. The first cols_add() call for an empty table can have columns of arbitrary length, but subsequent uses of cols_add() must supply new columns of the same length as the existing ones. Here, we start from nothing and then add two columns of values:

dplyr::tibble() |>
  gt() |>
  cols_add(
    numbers = 1:5,
    spelled = vec_fmt_spelled_num(1:5)
  ) |>
  tab_header("Starting from Scratch.")
Starting from Scratch.
numbers spelled
1 one
2 two
3 three
4 four
5 five

Rows can be added too, with the new rows_add() function. We supply the new row data through name-value pairs or two-sided formula expressions. The new rows are added to the bottom of the table by default but can be added internally by using either the .before or .after arguments. Let’s have an example of this:

exibble |>
  gt(rowname_col = "row") |>
  rows_add(
    row = "row_9",
    num = 9.999E7,
    char = "ilama",
    fctr = "nine",
    group = "grp_b"
  )
num char fctr date time datetime currency group
row_1 1.111e-01 apricot one 2015-01-15 13:35 2018-01-01 02:22 49.950 grp_a
row_2 2.222e+00 banana two 2015-02-15 14:40 2018-02-02 14:33 17.950 grp_a
row_3 3.333e+01 coconut three 2015-03-15 15:45 2018-03-03 03:44 1.390 grp_a
row_4 4.444e+02 durian four 2015-04-15 16:50 2018-04-04 15:55 65100.000 grp_a
row_5 5.550e+03 NA five 2015-05-15 17:55 2018-05-05 04:00 1325.810 grp_b
row_6 NA fig six 2015-06-15 NA 2018-06-06 16:11 13.255 grp_b
row_7 7.770e+05 grapefruit seven NA 19:10 2018-07-07 05:22 NA grp_b
row_8 8.880e+06 honeydew eight 2015-08-15 20:20 NA 0.440 grp_b
row_9 9.999e+07 ilama nine NA NA NA NA grp_b

This adds a single row, but you can use vectors having multiple values to add multiple rows with a single use of the function.
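As a sketch of that multi-row form (the added values here are our own examples, not from the dataset; columns left unspecified become NA), two rows could be added in one call:

```r
library(gt)

# Supplying length-2 vectors in rows_add() adds two rows at once;
# any exibble columns not named here are filled with NA
exibble |>
  gt(rowname_col = "row") |>
  rows_add(
    row = c("row_9", "row_10"),
    num = c(9.999E7, 1.111E8),
    char = c("ilama", "jackfruit"),
    group = c("grp_b", "grp_b")
  )
```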

Another way to use rows_add() is to start from virtually nothing (really, just the definition of columns) and build up a table through repeated calls of rows_add(). This might be useful in interactive or programmatic applications. Here’s an example where two columns are defined with dplyr’s tibble() function (and no rows are present initially); with two calls of rows_add(), two separate rows are added:

dplyr::tibble(
  time = lubridate::POSIXct(),
  event = character(0)
) |>
  gt() |>
  rows_add(
    time = lubridate::ymd_hms("2022-01-23 12:36:10"),
    event = "start"
  ) |>
  rows_add(
    time = lubridate::ymd_hms("2022-01-23 13:41:26"),
    event = "completed"
  )
time event
2022-01-23 12:36:10 start
2022-01-23 13:41:26 completed

Adding columns and rows while in the gt API is actually pretty convenient. While the examples here are limited in showing everything that’s possible, you can find a few more in the docs for cols_add() and rows_add().

Units notation provides a simple way to express measurement units

Measurement units are something you’ll often see in tables. These are typically found in the column labels, and they let the reader know what units the values below have (this is DRY for display tables). Previously, you could provide simple units, but it wasn’t easy to formulate those that involved more specialized typesetting. We now have a better solution for this in gt with what we call units notation. With this syntax, gt will ensure that any measurement units are formatted correctly no matter what the output type is. We can now format units in the table body with fmt_units(), we can attach units to column labels with cols_units(), and we can use units notation in the already-available cols_label() and tab_spanner() functions.

The units notation is a shorthand for writing units that feels familiar and is fine-tuned for the task at hand. Each unit is treated as a separate entity (parentheses and other symbols included), and adding subscript text and exponents is flexible and relatively easy.
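As a quick illustration (these particular strings are our own examples, not from the release post), a few unit strings rendered in the table body with fmt_units():

```r
library(gt)

# In units notation, "^" creates exponents, "_" creates subscripts,
# "*...*" sets italics, and keywords like ":degrees:" insert
# special symbols
dplyr::tibble(
  units = c("m s^-1", "x10^6 kg", ":degrees:C", "J Hz^-1 mol^-1")
) |>
  gt() |>
  fmt_units(columns = units)
```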

The new cols_units() function lets you attach units to column labels, setting off the measurement units from the column label with a comma and a space (and this can be customized with .units_pattern). Here’s an example of that with a table generated from a summarized version of the pizzaplace dataset.

pizzaplace |>
  dplyr::mutate(month = lubridate::month(date, label = TRUE, abbr = TRUE)) |>
  dplyr::group_by(month) |>
  dplyr::summarize(
    n_sold = dplyr::n(),
    rev = sum(price)
  ) |>
  dplyr::mutate(chg = (rev - dplyr::lag(rev)) / dplyr::lag(rev)) |>
  dplyr::mutate(month = as.character(month)) |>
  gt(rowname_col = "month") |>
  fmt_integer(columns = n_sold) |>
  fmt_currency(columns = rev, use_subunits = FALSE) |>
  fmt_percent(columns = chg) |>
  sub_missing() |>
  cols_label(
    n_sold = "Number of Pizzas Sold",
    rev = "Revenue Generated",
    chg = "Monthly Changes in Revenue"
  ) |>
  cols_units(
    n_sold = "units month^-1",
    rev = "USD month^-1",
    chg = "% change *m*/*m*"
  )
Number of Pizzas Sold, units month−1 Revenue Generated, USD month−1 Monthly Changes in Revenue, % change m/m
Jan 4,232 $69,793
Feb 3,961 $65,160 −6.64%
Mar 4,261 $70,397 8.04%
Apr 4,151 $68,737 −2.36%
May 4,328 $71,403 3.88%
Jun 4,107 $68,230 −4.44%
Jul 4,392 $72,558 6.34%
Aug 4,168 $68,278 −5.90%
Sep 3,890 $64,180 −6.00%
Oct 3,883 $64,028 −0.24%
Nov 4,266 $70,395 9.95%
Dec 3,935 $64,701 −8.09%

If you have a column that already contains text in units notation, that column can be formatted and subsequently rendered nicely by using the new fmt_units() function. It so happens that the illness dataset has a column (units) with values in the correct format. We’ll point fmt_units() toward that column, and that’ll make the rendered measurement units fit for publication.

illness |>
  gt() |>
  fmt_units(columns = units) |>
  sub_missing(columns = -starts_with("norm")) |>
  sub_missing(columns = c(starts_with("norm"), units), missing_text = "") |>
  sub_large_vals(rows = test == "MYO", threshold = 1200) |>
  fmt_number(
    decimals = 2,
    drop_trailing_zeros = TRUE
  ) |>
  tab_header(title = "Laboratory Findings for the YF Patient") |>
  tab_spanner(label = "Day", columns = starts_with("day")) |>
  cols_label_with(fn = ~ gsub("day_", "", .)) |>
  cols_merge_range(col_begin = norm_l, col_end = norm_u) |>
  cols_label(
    starts_with("norm") ~ "Normal Range",
    test ~ "Test",
    units ~ "Units"
  ) |>
  tab_style(
    style = cell_text(align = "center"),
    locations = cells_column_labels(columns = starts_with("day"))
  ) |>
  tab_style(
    style = cell_fill(color = "aliceblue"),
    locations = cells_body(columns = c(test, units))
  ) |>
  opt_vertical_padding(scale = 0.4) |>
  opt_align_table_header(align = "left") |>
  tab_options(heading.padding = px(10))
[Rendered table: “Laboratory Findings for the YF Patient” — columns for Test, Units (set via fmt_units()), a “Day” spanner over days 3 to 9, and a merged Normal Range column.]

You can use units notation in cols_label() as well; this approach lets us express both the label text and the measurement units in a single string. To mark text as units notation, we wrap it in "{{" and "}}". Here’s an example of that using a portion of the towny dataset.

towny |>
  dplyr::select(
    name, population_2021, density_2021, land_area_km2, latitude, longitude
  ) |>
  dplyr::filter(population_2021 > 100000) |>
  dplyr::arrange(desc(population_2021)) |>
  dplyr::slice_head(n = 10) |>
  gt() |>
  fmt_integer(columns = population_2021) |>
  fmt_number(
    columns = c(density_2021, land_area_km2),
    decimals = 1
  ) |>
  fmt_number(columns = latitude, decimals = 2) |>
  fmt_number(columns = longitude, decimals = 2, scale_by = -1) |>
  cols_label(
    starts_with("population") ~ "Population",
    starts_with("density") ~ "Density, {{*persons* km^-2}}",
    land_area_km2 ~ "Area, {{km^2}}",
    latitude ~ "Latitude, {{:degrees:N}}",
    longitude ~ "Longitude, {{:degrees:W}}"
  )
name Population Density, persons km−2 Area, km2 Latitude, °N Longitude, °W
Toronto 2,794,356 4,427.8 631.1 43.74 79.37
Ottawa 1,017,449 364.9 2,788.2 45.42 75.69
Mississauga 717,961 2,452.6 292.7 43.60 79.65
Brampton 656,480 2,469.0 265.9 43.69 79.76
Hamilton 569,353 509.1 1,118.3 43.26 79.87
London 422,324 1,004.3 420.5 42.97 81.23
Markham 338,503 1,604.8 210.9 43.88 79.26
Vaughan 323,103 1,186.0 272.4 43.83 79.50
Kitchener 256,885 1,877.7 136.8 43.42 80.47
Windsor 229,660 1,572.8 146.0 42.28 83.00

This can similarly be done with tab_spanner(). Simply use a string that has both label text and text in units notation in the label argument. Here is a towny-based example that shows how it’s done:

towny |>
  dplyr::select(
    name, ends_with("2001"), ends_with("2006"), matches("2001_2006")
  ) |>
  dplyr::filter(population_2001 > 100000) |>
  dplyr::arrange(desc(pop_change_2001_2006_pct)) |>
  dplyr::slice_head(n = 10) |>
  gt() |>
  fmt_integer() |>
  fmt_percent(columns = matches("change"), decimals = 1) |>
  tab_spanner(
    label = "Population",
    columns = starts_with("population")
  ) |>
  tab_spanner(
    label = "Density, {{*persons* km^-2}}",
    columns = starts_with("density")
  ) |>
  cols_label(
    ends_with("01") ~ "2001",
    ends_with("06") ~ "2006",
    matches("change") ~ md("Population Change,<br>2001 to 2006")
  )
name Population Density, persons km−2 Population Change,
2001 to 2006
2001 2006 2001 2006
Brampton 325,428 433,806 1,224 1,632 33.3%
Vaughan 182,022 238,866 668 877 31.2%
Markham 208,615 261,573 989 1,240 25.4%
Barrie 103,710 128,430 1,047 1,297 23.8%
Richmond Hill 132,030 162,704 1,310 1,614 23.2%
Oakville 144,738 165,613 1,042 1,192 14.4%
Mississauga 612,925 668,599 2,094 2,284 9.1%
Cambridge 110,372 120,371 977 1,065 9.1%
Burlington 150,836 164,415 810 883 9.0%
Guelph 106,170 114,943 1,214 1,315 8.3%

The notation here provides several conveniences for defining units, and it gives us nicely formatted units no matter what the table output format might be (i.e., HTML, LaTeX, RTF, etc.). Look for the “How to use gt’s units notation” section in the documentation for functions that handle it (here is one instance of that in the cols_units() docs).

The from_column() helper function lets you get formatting parameters from adjacent columns

A very useful new helper function, from_column(), has been added so you can fetch values (for compatible arguments) from a column in the input table. For example, if you are using fmt_scientific(), and the number of significant figures should vary across the values to be formatted, a column containing those values for the n_sigfig argument can be referenced by from_column().

The new constants dataset contains data values that are either very small or very large, so scientific formatting is a strong requirement here. The dataset values also greatly differ in the degree of measurement precision. Two separate columns (sf_value and sf_uncert) account for this and contain the exact number of significant figures for each measurement value and the associated uncertainty value. We can use the n_sigfig argument of fmt_scientific() in conjunction with the from_column() helper to get the correct number of significant digits for each value.

constants |>
  dplyr::filter(grepl("Planck", name)) |>
  gt() |>
  fmt_scientific(
    columns = value,
    n_sigfig = from_column(column = "sf_value")
  ) |>
  fmt_scientific(
    columns = uncert,
    n_sigfig = from_column(column = "sf_uncert")
  ) |>
  cols_hide(columns = starts_with("sf")) |>
  fmt_units(columns = units) |>
  sub_missing(missing_text = "")
[Rendered table: the Planck-related rows of the constants dataset, with the value and uncert columns in scientific notation (each value using its own number of significant figures) and the units column formatted via fmt_units().]

We simply couldn’t use a static value for n_sigfig in fmt_scientific() here; doing so would result in the presentation of misleading values.

We can use from_column() in tab_style(). Well, inside the stylizing helper functions like cell_text() that are used in tab_style(). Here’s a really nice sp500-based example that shows this in conjunction with cols_add():

sp500 |>
  dplyr::filter(date > "2015-01-01") |>
  dplyr::arrange(date) |>
  dplyr::slice_head(n = 5) |>
  dplyr::select(date, open, close) |>
  gt(rowname_col = "date") |>
  fmt_currency(columns = c(open, close)) |>
  cols_add(dir = ifelse(close < open, "red", "forestgreen")) |>
  cols_label(dir = "") |>
  text_case_match(
    "red" ~ fontawesome::fa("arrow-down"),
    "forestgreen" ~ fontawesome::fa("arrow-up")
  ) |>
  tab_style(
    style = cell_text(color = from_column("dir")),
    locations = cells_body(columns = dir)
  )
open close
2015-01-02 $2,058.90 $2,058.20
2015-01-05 $2,054.44 $2,020.58
2015-01-06 $2,022.15 $2,002.61
2015-01-07 $2,005.55 $2,025.90
2015-01-08 $2,030.61 $2,062.14

Most of the formatting functions (fmt_*()) work with from_column(). To find out which arguments can be used with from_column(), look for the Compatibility of arguments with the from_column() helper function section in the formatting function’s documentation (here is one instance of that in the fmt_scientific() docs).

In closing

There’s so much great new stuff in gt, and we’ll keep working to make things better and easier for you. We are always listening to what you want, and we have a few ways you can reach us. Found something strange in gt? Have a cool idea? Then file an issue! Want to ask a question or discuss improvements before filing an issue? Try out the Discussions page in the gt repository for that.

For news on gt and other table packages (like Great Tables), follow the engaging @gt_package account on X/Twitter! We also have a Discord server which has a more casual atmosphere (and there’s plenty of table talk on there); we’d love to see you there!