All the new features in {gt} 0.10.0

The gt package (the one that helps you make beautiful, publication-quality tables in R) has been updated to version 0.10.0. Now, is this a big release? Sure, of course, it is. But it’s not as big as the 0.9.0 one, which had tons of blog posts just to cover everything that was new. This one post is going to cover, as best as it can, all of the big new features in gt 0.10.0. Let’s get this started by looking at nanoplots.

Nanoplots, tiny interactive plots in your gt table

Plots in a table need to be somewhat simple by design. Generally, there isn’t a lot of space to work with! We had these basic design requirements when we started developing the feature known as nanoplots:

Through some iteration we arrived at something that satisfies all of the design criteria. The new cols_nanoplot() function is the entry point to generating nanoplots in a gt table. Let’s introduce you to it by way of an example (using the new illness dataset). The cols_nanoplot() function can take input data from any number of columns you specify in the columns argument. In the example below, the columns that start with ‘day’ (seven columns in total) each have a single numeric value. The values are taken in the order of column specification in columns and are used in that order in every nanoplot (we’re making the default "line" plot type here). A new column will be generated for the nanoplot, and we’re giving it a specific name ("nanoplots") and also a label (with md("*Progression*"); yes, we can use Markdown here).

library(gt)

illness |>
  dplyr::slice_head(n = 10) |>
  gt(rowname_col = "test") |>
  tab_header("Partial summary of daily tests performed on YF patient") |>
  tab_stubhead(label = md("**Test**")) |>
  cols_hide(columns = c(starts_with("norm"), units)) |>
  cols_nanoplot(
    columns = starts_with("day"),
    new_col_name = "nanoplots",
    new_col_label = md("*Progression*")
  ) |>
  cols_align(align = "center", columns = nanoplots) |>
  tab_footnote(
    footnote = "Measurements from Day 3 through to Day 8.",
    locations = cells_column_labels(columns = nanoplots)
  )

Partial summary of daily tests performed on YF patient
Test	Progression¹
Viral load
WBC
Neutrophils
RBC
Hb
PLT
ALT
AST
TBIL
DBIL
¹ Measurements from Day 3 through to Day 8.

If you hover over the data points in a nanoplot, you’ll see values for the data points. They are automatically formatted to be compact (limited space here!), and we also have advanced options to help you control the formatting (customizability!). Also, hovering over the left edge of the nanoplot plot area will show the value range.

That was a pretty simple example that barely scratches the surface of what you can do with nanoplots. Right now, there are three types of nanoplots available: "line", "bar", and "boxplot". Here’s an example of the "bar" type of nanoplot, which uses the sza dataset to visualize solar altitude angles.

sza |>
  dplyr::filter(latitude == 20 & tst <= "1200") |>
  dplyr::select(-latitude) |>
  dplyr::filter(!is.na(sza)) |>
  dplyr::mutate(saa = 90 - sza) |>
  dplyr::select(-sza) |>
  tidyr::pivot_wider(
    names_from = tst,
    values_from = saa,
    names_sort = TRUE
  ) |>
  gt(rowname_col = "month") |>
  tab_header(
    title = "Solar Altitude Angles",
    subtitle = "Average values every half hour from 05:30 to 12:00"
  ) |>
  cols_nanoplot(
    columns = matches("0"),
    plot_type = "bar",
    missing_vals = "zero",
    new_col_name = "saa",
    plot_height = "2.5em",
    options = nanoplot_options(
      data_bar_stroke_color = "GoldenRod",
      data_bar_fill_color = "DarkOrange",
      y_val_fmt_fn = function(x) paste0(vec_fmt_number(x, decimals = 1), "&deg;") 
    )
  ) |>
  tab_options(
    table.width = px(400),
    column_labels.hidden = TRUE
  ) |>
  cols_align(
    align = "center",
    columns = everything()
  )

Solar Altitude Angles
Average values every half hour from 05:30 to 12:00
jan
feb
mar
apr
may
jun
jul
aug
sep
oct
nov
dec

This example demonstrates the use of the nanoplot_options() helper function, which is to be invoked at the options argument of cols_nanoplot(). Through that helper, layers of the nanoplots can be selectively removed, the aesthetics of the remaining plot components can be modified, and display values can even be customized. We were able to modify the display of the bar values (on hover) with the y_val_fmt_fn argument; we just had to supply a function to perform that numeric formatting.
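As a rough sketch of the layer removal described above, the following reuses the illness table from the first example and keeps only the data line. It assumes the show_data_points, show_data_area, and data_line_stroke_color arguments of nanoplot_options(); the column name "trend" and the color are arbitrary choices for illustration.

```r
library(gt)

# Keep only the data line: switch off the point and area layers,
# then recolor the line that remains.
line_only <-
  illness |>
  dplyr::slice_head(n = 5) |>
  gt(rowname_col = "test") |>
  cols_hide(columns = c(starts_with("norm"), units)) |>
  cols_nanoplot(
    columns = starts_with("day"),
    new_col_name = "trend", # illustrative name, not from the post
    options = nanoplot_options(
      show_data_points = FALSE,
      show_data_area = FALSE,
      data_line_stroke_color = "steelblue"
    )
  )

line_only
```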

Let’s have one more example with nanoplots, this time involving box plots. For that, we use plot_type = "boxplot". We’ll take a slice of the pizzaplace dataset and create a simple table that displays a box plot of pizza sales for a selection of days. If you can get string-based data of the form "2.6,3.6,0,1.5" in a column, that’s valid input for a nanoplot. This is easy to do with dplyr::summarize(), and the preparatory work in the following example does just that.

pizzaplace |>
  dplyr::filter(date <= "2015-01-14") |>
  dplyr::mutate(time = as.numeric(hms::as_hms(time))) |>
  dplyr::summarize(time = paste(time, collapse = ","), .by = date) |>
  # week_start = 1 numbers the days Monday = 1 ... Saturday = 6, Sunday = 7
  dplyr::mutate(is_weekend = lubridate::wday(date, week_start = 1) %in% 6:7) |>
  gt() |>
  tab_header(title = "Pizza Sales in Early January 2015") |>
  fmt_date(columns = date, date_style = 2) |>
  cols_nanoplot(
    columns = time,
    plot_type = "boxplot",
    options = nanoplot_options(y_val_fmt_fn = function(x) hms::as_hms(x))
  ) |>
  cols_hide(columns = is_weekend) |>
  cols_align(align = "center", columns = nanoplots) |>
  cols_align(align = "left", columns = date) |>
  tab_style(
    style = cell_borders(
      sides = "left", color = "gray"),
    locations = cells_body(columns = nanoplots)
  ) |>
  tab_style_body(
    style = cell_fill(color = "#E5FEFE"),
    values = TRUE,
    targets = "row"
  ) |>
  tab_options(column_labels.hidden = TRUE)

Pizza Sales in Early January 2015
Thursday, January 1, 2015
Friday, January 2, 2015
Saturday, January 3, 2015
Sunday, January 4, 2015
Monday, January 5, 2015
Tuesday, January 6, 2015
Wednesday, January 7, 2015
Thursday, January 8, 2015
Friday, January 9, 2015
Saturday, January 10, 2015
Sunday, January 11, 2015
Monday, January 12, 2015
Tuesday, January 13, 2015
Wednesday, January 14, 2015

The other trick was to convert the string-based 24-hour-clock time values (e.g., "11:38:36") to the number of seconds elapsed in a day. Doing so gives us continuous values that can be incorporated into each box plot. And, by supplying a function to the y_val_fmt_fn argument within nanoplot_options(), we can transform the integer seconds values back to clock times for display on hover.
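The round trip between clock times and seconds can be checked in isolation. This is a minimal sketch using the same hms functions as the example above; the variable name secs is just for illustration.

```r
# Clock time -> seconds elapsed since midnight
secs <- as.numeric(hms::as_hms("11:38:36"))
secs # 41916, i.e. 11 * 3600 + 38 * 60 + 36

# Seconds -> clock time again, as done inside y_val_fmt_fn
hms::as_hms(secs)
```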

These examples only show part of what’s possible with the feature. We intend to go much further with nanoplots in future releases. If you’d like to see a few more examples, take a look at the docs for cols_nanoplot().

Add columns/rows to your table, even start from an empty table

The nanoplots examples showed us something new in gt: making new columns. This wasn’t possible before but is very possible now. We can add new columns to a table with the cols_add() function, and it works quite a bit like the dplyr mutate() function. You supply name-value pairs where the name is the new column name, and the value part describes the data that will go into the column. The latter can: (1) be a vector whose length equals the number of rows in the data table, (2) be a single value (which will be repeated all the way down), or (3) involve other columns in the table (as they represent vectors of the correct length).

The new columns are added to the end of the column series by default but can instead be added internally by using either the .before or .after arguments. If entirely empty (i.e., all NA) columns need to be added, you can use any of the NA types (e.g., NA, NA_character_, NA_real_, etc.) for such columns.
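Here is a small sketch of those options working together, assuming a three-column subset of exibble. The new column names (half_num, source, notes) are made up for illustration.

```r
library(gt)

added <-
  exibble |>
  dplyr::select(num, char, currency) |>
  gt() |>
  cols_add(
    half_num = num / 2,    # computed from an existing column
    source = "exibble",    # single value, repeated down the column
    notes = NA_character_, # entirely empty column
    .after = num           # place the new columns right after `num`
  )

added
```

Without .before or .after, the three new columns would land after currency instead.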

Let’s look at a simple example using a subset of the exibble dataset. We’ll add a single column to the right of all the existing columns and call it country. This new column needs eight values, and these will be supplied when using cols_add().

exibble |>
  dplyr::select(num, char, datetime, currency, group) |>
  gt(rowname_col = "row") |>
  cols_add(
    country = c("TL", "PY", "GL", "PA", "MO", "EE", "CO", "AU")
  )

num	char	datetime	currency	group	country
1.111e-01	apricot	2018-01-01 02:22	49.950	grp_a	TL
2.222e+00	banana	2018-02-02 14:33	17.950	grp_a	PY
3.333e+01	coconut	2018-03-03 03:44	1.390	grp_a	GL
4.444e+02	durian	2018-04-04 15:55	65100.000	grp_a	PA
5.550e+03	NA	2018-05-05 04:00	1325.810	grp_b	MO
NA	fig	2018-06-06 16:11	13.255	grp_b	EE
7.770e+05	grapefruit	2018-07-07 05:22	NA	grp_b	CO
8.880e+06	honeydew	NA	0.440	grp_b	AU

We can add multiple columns with a single use of cols_add(). The columns generated can be formatted and otherwise manipulated just as any column could be in a gt table. The following example extends the first one by adding more columns and immediately using them in various function calls like fmt_flag() and fmt_scientific().

exibble |>
  dplyr::select(num, char, datetime, currency, group) |>
  gt(rowname_col = "row") |>
  cols_add(
    country = c("TL", "PY", "GL", "PA", "MO", "EE", "CO", "AU"),
    empty_col = NA_character_,
    big_num = num ^ 3
  ) |>
  fmt_flag(columns = country) |>
  sub_missing(columns = empty_col, missing_text = "EMPTINESS") |>
  fmt_scientific(columns = big_num)

num	char	datetime	currency	group	empty_col	big_num
1.111e-01	apricot	2018-01-01 02:22	49.950	grp_a	EMPTINESS	1.37 × 10⁻³
2.222e+00	banana	2018-02-02 14:33	17.950	grp_a	EMPTINESS	1.10 × 10¹
3.333e+01	coconut	2018-03-03 03:44	1.390	grp_a	EMPTINESS	3.70 × 10⁴
4.444e+02	durian	2018-04-04 15:55	65100.000	grp_a	EMPTINESS	8.78 × 10⁷
5.550e+03	NA	2018-05-05 04:00	1325.810	grp_b	EMPTINESS	1.71 × 10¹¹
NA	fig	2018-06-06 16:11	13.255	grp_b	EMPTINESS	NA
7.770e+05	grapefruit	2018-07-07 05:22	NA	grp_b	EMPTINESS	4.69 × 10¹⁷
8.880e+06	honeydew	NA	0.440	grp_b	EMPTINESS	7.00 × 10²⁰

It is possible to start with an empty table (i.e., no columns and no rows) and add one or more columns to that. The first cols_add() call for an empty table can have columns of arbitrary length, but note that subsequent uses of cols_add() must adhere to the rule of new columns being the same length as the existing ones. Here, we start from nothing and then add two columns of values:

dplyr::tibble() |>
  gt() |>
  cols_add(
    numbers = 1:5,
    spelled = vec_fmt_spelled_num(1:5)
  ) |>
  tab_header("Starting from Scratch.")

Starting from Scratch.
numbers	spelled
1	one
2	two
3	three
4	four
5	five

Rows can be added too, and we do that with the new rows_add() function. We supply the new row data through name-value pairs or two-sided formula expressions. The new rows are added to the bottom of the table by default but can be added internally by using either the .before or .after arguments. Let’s have an example of this:

exibble |>
  gt(rowname_col = "row") |>
  rows_add(
    row = "row_9",
    num = 9.999E7,
    char = "ilama",
    fctr = "nine",
    group = "grp_b"
  )

	num	char	fctr	date	time	datetime	currency	group
row_1	1.111e-01	apricot	one	2015-01-15	13:35	2018-01-01 02:22	49.950	grp_a
row_2	2.222e+00	banana	two	2015-02-15	14:40	2018-02-02 14:33	17.950	grp_a
row_3	3.333e+01	coconut	three	2015-03-15	15:45	2018-03-03 03:44	1.390	grp_a
row_4	4.444e+02	durian	four	2015-04-15	16:50	2018-04-04 15:55	65100.000	grp_a
row_5	5.550e+03	NA	five	2015-05-15	17:55	2018-05-05 04:00	1325.810	grp_b
row_6	NA	fig	six	2015-06-15	NA	2018-06-06 16:11	13.255	grp_b
row_7	7.770e+05	grapefruit	seven	NA	19:10	2018-07-07 05:22	NA	grp_b
row_8	8.880e+06	honeydew	eight	2015-08-15	20:20	NA	0.440	grp_b
row_9	9.999e+07	ilama	nine	NA	NA	NA	NA	grp_b

This adds a single row, but you can use vectors having multiple values to add multiple rows with a single use of the function.
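A quick sketch of the multi-row case, assuming a trimmed-down exibble. The values for the two new rows are invented for illustration.

```r
library(gt)

more_rows <-
  exibble |>
  dplyr::select(row, num, char) |>
  gt(rowname_col = "row") |>
  rows_add(
    # Each vector has two elements, so two rows are appended
    row = c("row_9", "row_10"),
    num = c(9.999e7, 1.111e8),
    char = c("ilama", "jackfruit")
  )

more_rows
```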

Another way to use rows_add() is to start from virtually nothing (really, just the definition of columns) and build up a table using sporadic invocations of rows_add(). This might be useful in interactive or programmatic applications. Here’s an example where two columns are defined with dplyr’s tibble() function (and no rows are present initially); with two calls of rows_add(), two separate rows are added:

dplyr::tibble(
  time = lubridate::POSIXct(),
  event = character(0)
) |>
  gt() |>
  rows_add(
    time = lubridate::ymd_hms("2022-01-23 12:36:10"),
    event = "start"
  ) |>
  rows_add(
    time = lubridate::ymd_hms("2022-01-23 13:41:26"),
    event = "completed"
  )

time	event
2022-01-23 12:36:10	start
2022-01-23 13:41:26	completed

Adding columns and rows while in the gt API is actually pretty convenient. The examples here can’t show everything that’s possible, but you can find a few more in the docs for cols_add() and rows_add().

Units notation provides a simple way to express measurement units

Something you might see often in tables are measurement units. These are typically found in the column labels of a table, and they let the reader know what units the values below have (this is DRY for display tables). Previously, you could provide simple units, but it wasn’t easy to formulate those that involved more specialized typesetting. We now have a better solution for this in gt with what we call units notation. With this syntax, gt will ensure that any measurement units are formatted correctly no matter what the output type is. We can now format units in the table body with fmt_units(), we can attach units to column labels with cols_units(), and we can use units notation in the already-available cols_label() and tab_spanner() functions.

The units notation involves a shorthand of writing units that feels familiar and is fine-tuned for the task at hand. Each unit is treated as a separate entity (parentheses and other symbols included), and the addition of subscript text and exponents is flexible and relatively easy to formulate. Here are some examples:
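One way to explore the notation is to render a few strings with fmt_units() and compare the input with the output. The sketch below reuses notation patterns that appear in the examples of this post; the tibble and its column names (notation, rendered) are made up for illustration.

```r
library(gt)

units_demo <-
  dplyr::tibble(
    notation = c("m s^-1", "km^2", "*persons* km^-2", ":degrees:N")
  ) |>
  # Duplicate the column so the raw string and the rendered units sit side by side
  dplyr::mutate(rendered = notation) |>
  gt() |>
  fmt_units(columns = rendered)

units_demo
```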

The new cols_units() function lets you attach units to column labels, setting off the measurement units from the column label with a comma and a space (and this can be customized with .units_pattern). Here’s an example of that with a table generated from a summarized version of the pizzaplace dataset.

pizzaplace |>
  dplyr::mutate(month = lubridate::month(date, label = TRUE, abbr = TRUE)) |>
  dplyr::group_by(month) |>
  dplyr::summarize(
    n_sold = dplyr::n(),
    rev = sum(price)
  ) |>
  dplyr::mutate(chg = (rev - dplyr::lag(rev)) / dplyr::lag(rev)) |>
  dplyr::mutate(month = as.character(month)) |>
  gt(rowname_col = "month") |>
  fmt_integer(columns = n_sold) |>
  fmt_currency(columns = rev, use_subunits = FALSE) |>
  fmt_percent(columns = chg) |>
  sub_missing() |>
  cols_label(
    n_sold = "Number of Pizzas Sold",
    rev = "Revenue Generated",
    chg = "Monthly Changes in Revenue"
  ) |>
  cols_units(
    n_sold = "units month^-1",
    rev = "USD month^-1",
    chg = "% change *m*/*m*"
  )

	Number of Pizzas Sold, units month⁻¹	Revenue Generated, USD month⁻¹	Monthly Changes in Revenue, % change m/m
Jan	4,232	$69,793	—
Feb	3,961	$65,160	−6.64%
Mar	4,261	$70,397	8.04%
Apr	4,151	$68,737	−2.36%
May	4,328	$71,403	3.88%
Jun	4,107	$68,230	−4.44%
Jul	4,392	$72,558	6.34%
Aug	4,168	$68,278	−5.90%
Sep	3,890	$64,180	−6.00%
Oct	3,883	$64,028	−0.24%
Nov	4,266	$70,395	9.95%
Dec	3,935	$64,701	−8.09%
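The comma-and-space separator comes from the default .units_pattern of "{1}, {2}", where "{1}" stands for the label and "{2}" for the units. As a minimal sketch, assuming a toy two-column table invented for illustration, the units can be put in parentheses instead:

```r
library(gt)

paren_units <-
  dplyr::tibble(dist = c(1.2, 3.4), speed = c(10, 20)) |>
  gt() |>
  cols_label(dist = "Distance", speed = "Speed") |>
  cols_units(
    dist = "km",
    speed = "m s^-1",
    .units_pattern = "{1} ({2})" # label first, then units in parentheses
  )

paren_units
```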

If you should have a column that contains text values already in units notation, that column could be formatted and subsequently rendered nicely by using the new fmt_units() function. It so happens that the illness dataset has a column (units) with values in the correct format. We’ll point fmt_units() toward that column, and that’ll make the rendered measurement units fit for publication.

illness |>
  gt() |>
  fmt_units(columns = units) |>
  sub_missing(columns = -starts_with("norm")) |>
  sub_missing(columns = c(starts_with("norm"), units), missing_text = "") |>
  sub_large_vals(rows = test == "MYO", threshold = 1200) |>
  fmt_number(
    decimals = 2,
    drop_trailing_zeros = TRUE
  ) |>
  tab_header(title = "Laboratory Findings for the YF Patient") |>
  tab_spanner(label = "Day", columns = starts_with("day")) |>
  cols_label_with(fn = ~ gsub("day_", "", .)) |>
  cols_merge_range(col_begin = norm_l, col_end = norm_u) |>
  cols_label(
    starts_with("norm") ~ "Normal Range",
    test ~ "Test",
    units ~ "Units"
  ) |>
  tab_style(
    style = cell_text(align = "center"),
    locations = cells_column_labels(columns = starts_with("day"))
  ) |>
  tab_style(
    style = cell_fill(color = "aliceblue"),
    locations = cells_body(columns = c(test, units))
  ) |>
  opt_vertical_padding(scale = 0.4) |>
  opt_align_table_header(align = "left") |>
  tab_options(heading.padding = px(10))

Laboratory Findings for the YF Patient
Test	Units	Day							Normal Range
Test	Units	3	4	5	6	7	8	9	Normal Range
Viral load	copies per mL	12,000	4,200	1,600	830	760	520	250
WBC	×10⁹/L	5.26	4.26	9.92	10.49	24.77	30.26	19.03	4–10
Neutrophils	×10⁹/L	4.87	4.72	7.92	18.21	22.08	27.17	16.59	2–8
RBC	×10¹²/L	5.72	5.98	4.23	4.83	4.12	2.68	3.32	4–5.5
Hb	g/L	153	135	126	115	75	87	95	120–160
PLT	×10⁹/L	67	38.6	27.4	26.2	74.1	36.2	25.6	100–300
ALT	U/L	12,835	12,632	6,426.7	4,263.1	1,623.7	672.6	512.4	9–50
AST	U/L	23,672	21,368	14,730	8,691	2,189	1,145	782.5	15–40
TBIL	µmol/L	117.2	143.8	137.2	158.1	127.3	105.1	163.2	0–18.8
DBIL	µmol/L	71.4	104.6	94.6	143.9	117.8	83.6	126.3	0–6.8
NH3	mmol/L	115.2	135.2	131	176.7	84.2	72.4	91.9	10–47
PT	s	24.6	42.4	53.7	54	22.6	16.8	29.5	9.4–12.5
APTT	s	39.2	57.2	65.9	68.3	62.4	61.7	114.7	25.1–36.5
PTA	%	41	25	19	14	51	55	31	70–130
DD	mg/L	32.9	35.1	24.5	25.6	18.7	24.7	64.8	0–5
FDP	µg/mL	84.7	92.5	77.2	—	—	157.2	291.7	0–5
Fibrinogen	mg/dL	238.1	216.8	135	85.2	105.7	—	64.3	200–400
LDH	U/L	5,727.3	2,622.8	2,418.7	546.3	—	637.2	—	80–285
HBDH		5,971.2	5,826.9	4,826.9	2,871.2	—	1,163.6	—	74–182
CK	U/L	725	792.1	760.2	1,263.6	—	1,294.2	—	38–174
CKMB	U/L	75	71	58	65	—	68	—	–25
BNP	pg/mL	37	—	73	—	482	421	1,332	–100
MYO	ng/mL	636.6	762.1	364.6	≥1200	≥1200	≥1200	≥1200	0–140
TnI	ng/mL	0.03	0.04	0.05	0.16	0.14	2.84	8.94	0–0.03
CREA	µmol/L	705.6	683.6	523.6	374	259.6	241.8	211.4	59–104
BUN	mmol/L	20.13	25.33	13.33	7.84	4.23	3.92	3.41	1.7–8.3
AMY	U/L	—	232.8	394.6	513.7	—	642.9	538.9	0–115
LPS	U/L	—	227.6	526.9	487.9	—	437.8	414.5	5.6–51.3
K	mmol/L	4.19	4.64	4.34	4.83	4.53	4.37	5.74	3.5–5.3
Na	mmol/L	136.3	135.7	142.1	140.8	144.8	143.6	144.2	137–147
Cl	mmol/L	91.2	92.9	96.6	99.2	102.1	99.5	105.2	99–110
Ca	mmol/L	1.74	1.64	2.25	2.35	2.16	2.03	2.29	2.2–2.55
P	mmol/L	2.96	3.23	1.47	1.15	0.97	1.57	1.63	0.81–1.45
Lac	mmol/L	2.32	2.42	2.19	2.66	—	6.15	5.46	1.33–1.78
CRP	mg/L	43.6	38.6	28.6	21.5	—	4.3	6.4	0–5
PCT	ng/mL	0.57	—	1.35	2.26	1.79	3.48	5.92	–0.05
IL-6		—	—	165.9	58.3	74.6	737.2	—	–7
CD3+CD4+	T cells per µL	—	174	153	184	243	370	252	706–1,125
CD3+CD8+	T cells per µL	—	142	135	126	132	511	410	323–836

You can use units notation in cols_label(); this approach lets us express both the label text and the measurement units in a single string. To mark part of the string as units notation, we wrap it in "{{" and "}}". Here’s an example of that using a portion of the towny dataset.

towny |>
  dplyr::select(
    name, population_2021, density_2021, land_area_km2, latitude, longitude
  ) |>
  dplyr::filter(population_2021 > 100000) |>
  dplyr::arrange(desc(population_2021)) |>
  dplyr::slice_head(n = 10) |>
  gt() |>
  fmt_integer(columns = population_2021) |>
  fmt_number(
    columns = c(density_2021, land_area_km2),
    decimals = 1
  ) |>
  fmt_number(columns = latitude, decimals = 2) |>
  fmt_number(columns = longitude, decimals = 2, scale_by = -1) |>
  cols_label(
    starts_with("population") ~ "Population",
    starts_with("density") ~ "Density, {{*persons* km^-2}}",
    land_area_km2 ~ "Area, {{km^2}}",
    latitude ~ "Latitude, {{:degrees:N}}",
    longitude ~ "Longitude, {{:degrees:W}}"
  )

name	Population	Density, persons km⁻²	Area, km²	Latitude, °N	Longitude, °W
Toronto	2,794,356	4,427.8	631.1	43.74	79.37
Ottawa	1,017,449	364.9	2,788.2	45.42	75.69
Mississauga	717,961	2,452.6	292.7	43.60	79.65
Brampton	656,480	2,469.0	265.9	43.69	79.76
Hamilton	569,353	509.1	1,118.3	43.26	79.87
London	422,324	1,004.3	420.5	42.97	81.23
Markham	338,503	1,604.8	210.9	43.88	79.26
Vaughan	323,103	1,186.0	272.4	43.83	79.50
Kitchener	256,885	1,877.7	136.8	43.42	80.47
Windsor	229,660	1,572.8	146.0	42.28	83.00

This can similarly be done with tab_spanner(). Simply use a string that has both label text and text in units notation in the label argument. Here is a towny-based example that shows how it’s done:

towny |>
  dplyr::select(
    name, ends_with("2001"), ends_with("2006"), matches("2001_2006")
  ) |>
  dplyr::filter(population_2001 > 100000) |>
  dplyr::arrange(desc(pop_change_2001_2006_pct)) |>
  dplyr::slice_head(n = 10) |>
  gt() |>
  fmt_integer() |>
  fmt_percent(columns = matches("change"), decimals = 1) |>
  tab_spanner(
    label = "Population",
    columns = starts_with("population")
  ) |>
  tab_spanner(
    label = "Density, {{*persons* km^-2}}",
    columns = starts_with("density")
  ) |>
  cols_label(
    ends_with("01") ~ "2001",
    ends_with("06") ~ "2006",
    matches("change") ~ md("Population Change,<br>2001 to 2006")
  )

name	Population		Density, persons km⁻²		Population Change, 2001 to 2006
name	2001	2006	2001	2006	Population Change, 2001 to 2006
Brampton	325,428	433,806	1,224	1,632	33.3%
Vaughan	182,022	238,866	668	877	31.2%
Markham	208,615	261,573	989	1,240	25.4%
Barrie	103,710	128,430	1,047	1,297	23.8%
Richmond Hill	132,030	162,704	1,310	1,614	23.2%
Oakville	144,738	165,613	1,042	1,192	14.4%
Mississauga	612,925	668,599	2,094	2,284	9.1%
Cambridge	110,372	120,371	977	1,065	9.1%
Burlington	150,836	164,415	810	883	9.0%
Guelph	106,170	114,943	1,214	1,315	8.3%

The notation here provides several conveniences for defining units, and it gives us nicely formatted units no matter what the table output format might be (i.e., HTML, LaTeX, RTF, etc.). Look for the “How to use gt’s units notation” section in the documentation for functions that handle it (here is one instance of that in the cols_units() docs).

The from_column() helper function lets you get formatting parameters from adjacent columns

A very useful new helper function, from_column(), has been added so you can fetch values (for compatible arguments) from a column in the input table. For example, if you are using fmt_scientific(), and the number of significant figures should vary across the values to be formatted, a column containing those values for the n_sigfig argument can be referenced by from_column().

The new constants dataset contains data values that are either very small or very large, so scientific formatting is a strong requirement here. The dataset values also greatly differ in the degree of measurement precision. Two separate columns (sf_value and sf_uncert) account for this and contain the exact number of significant figures for each measurement value and the associated uncertainty value. We can use the n_sigfig argument of fmt_scientific() in conjunction with the from_column() helper to get the correct number of significant digits for each value.

constants |>
  dplyr::filter(grepl("Planck", name)) |>
  gt() |>
  fmt_scientific(
    columns = value,
    n_sigfig = from_column(column = "sf_value")
  ) |>
  fmt_scientific(
    columns = uncert,
    n_sigfig = from_column(column = "sf_uncert")
  ) |>
  cols_hide(columns = starts_with("sf")) |>
  fmt_units(columns = units) |>
  sub_missing(missing_text = "")

name	value	uncert	units
molar Planck constant	3.990312712 × 10⁻¹⁰		J Hz⁻¹ mol⁻¹
Planck constant	6.62607015 × 10⁻³⁴		J Hz⁻¹
Planck constant in eV/Hz	4.135667696 × 10⁻¹⁵		eV Hz⁻¹
Planck length	1.616255 × 10⁻³⁵	1.8 × 10⁻⁴⁰	m
Planck mass	2.176434 × 10⁻⁸	2.4 × 10⁻¹³	kg
Planck mass energy equivalent in GeV	1.220890 × 10¹⁹	1.4 × 10¹⁴	GeV
Planck temperature	1.416784 × 10³²	1.6 × 10²⁷	K
Planck time	5.391247 × 10⁻⁴⁴	6.0 × 10⁻⁴⁹	s
reduced Planck constant	1.054571817 × 10⁻³⁴		J s
reduced Planck constant in eV s	6.582119569 × 10⁻¹⁶		eV s
reduced Planck constant times c in MeV fm	1.973269804 × 10²		MeV fm

A single static value for n_sigfig in fmt_scientific() simply wouldn’t work here; it would present some of the values with misleading precision.

We can use from_column() in tab_style(). Well, inside the stylizing helper functions like cell_text() that are used in tab_style(). Here’s a really nice sp500-based example that shows this in conjunction with cols_add():

sp500 |>
  dplyr::filter(date > "2015-01-01") |>
  dplyr::arrange(date) |>
  dplyr::slice_head(n = 5) |>
  dplyr::select(date, open, close) |>
  gt(rowname_col = "date") |>
  fmt_currency(columns = c(open, close)) |>
  cols_add(dir = ifelse(close < open, "red", "forestgreen")) |>
  cols_label(dir = "") |>
  text_case_match(
    "red" ~ fontawesome::fa("arrow-down"),
    "forestgreen" ~ fontawesome::fa("arrow-up")
  ) |>
  tab_style(
    style = cell_text(color = from_column("dir")),
    locations = cells_body(columns = dir)
  )

	open	close
2015-01-02	$2,058.90	$2,058.20
2015-01-05	$2,054.44	$2,020.58
2015-01-06	$2,022.15	$2,002.61
2015-01-07	$2,005.55	$2,025.90
2015-01-08	$2,030.61	$2,062.14

Most of the formatting functions (fmt_*()) work with from_column(). To find out which arguments can be used with from_column(), look for the Compatibility of arguments with the from_column() helper function section in the formatting function’s documentation (here is one instance of that in the fmt_scientific() docs).
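The same pattern carries over to other formatters. Here is a minimal sketch with fmt_number(), assuming its decimals argument is among the compatible ones; the tibble and the column names (value, dec) are invented for illustration.

```r
library(gt)

per_row_decimals <-
  dplyr::tibble(
    value = c(3.14159, 2.71828, 1.41421),
    dec = c(1, 3, 5) # number of decimal places wanted for each row
  ) |>
  gt() |>
  fmt_number(
    columns = value,
    decimals = from_column(column = "dec")
  ) |>
  cols_hide(columns = dec) # the helper column doesn't need to be shown

per_row_decimals
```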

In closing

There’s so much great new stuff in gt, and we’ll keep working to make things better and easier for you. We are always listening to what you want, and we have a few ways you can reach us. Found something strange in gt? Have a cool idea? Then file an issue! Want to ask a question or discuss improvements before filing an issue? Try out the Discussions page in the gt repository for that.

For news on gt and other table packages (like Great Tables), follow the engaging @gt_package account on X/Twitter! We also have a Discord server which has a more casual atmosphere (and there’s plenty of table talk on there); we’d love to see you there!
