Introducing Great Tables for Python v0.1.0

Written by Posit Team
2023-12-07
[Image: four pages of the Great Tables documentation, with the Python logo next to the pages.]

With the Great Tables package, anyone can make great-looking display tables in Python. Though the package is still fairly early in its development, you can do some really great things with it today! We very recently put out our first major release of Great Tables (v0.1.0), and it’s available on PyPI. You can install it by using:

pip install great_tables

In this introductory post, we’ll walk through a few examples that touch upon the more common table-making use cases. We’ll demonstrate how you can:

  • configure the structure of the table
  • format table-cell values
  • integrate source notes
  • incorporate tables within Quarto documents.

The Great Tables package is all about making it simple to produce nice-looking display tables. Display tables? Well, yes, we are trying to distinguish between data tables (e.g., DataFrames) and those tables you’d find in a web page, a journal article, or in a magazine. We can think of display tables as output only, where we’d not want to use them as input ever again. Other features include annotations, table element styling, and text transformations that serve to communicate the subject matter more clearly.

A Basic Table

Let’s get right to making a table with the package. We’ll start by making use of the very small, but useful, exibble dataset (which is available in the package). After importing the GT class and that dataset, we’ll feed that Pandas table to GT(). This serves as the main entry point into the Great Tables API.

from great_tables import GT, exibble

# Create a display table showing the table tailor-made for examples: exibble
gt_tbl = GT(exibble)

# Now, show the gt table
gt_tbl
num        char        fctr   date        time   datetime          currency  row    group
0.1111     apricot     one    2015-01-15  13:35  2018-01-01 02:22  49.95     row_1  grp_a
2.222      banana      two    2015-02-15  14:40  2018-02-02 14:33  17.95     row_2  grp_a
33.33      coconut     three  2015-03-15  15:45  2018-03-03 03:44  1.39      row_3  grp_a
444.4      durian      four   2015-04-15  16:50  2018-04-04 15:55  65100.0   row_4  grp_a
5550.0                 five   2015-05-15  17:55  2018-05-05 04:00  1325.81   row_5  grp_b
           fig         six    2015-06-15         2018-06-06 16:11  13.255    row_6  grp_b
777000.0   grapefruit  seven              19:10  2018-07-07 05:22            row_7  grp_b
8880000.0  honeydew    eight  2015-08-15  20:20                    0.44      row_8  grp_b

That doesn’t look too bad at all! Sure, it’s basic, but we really didn’t ask for much. What we got was a proper table with column labels along with all of the cell data. Oftentimes, however, you’ll want a bit more, so we’ll endeavor to include some additional table components and flourishes in the upcoming examples.

Some More Complex Tables

Let’s take things a bit further and make a deluxe table with the included gtcars dataset. Great Tables is all about having a smörgåsbord of methods that allow you to refine the presentation until you are fully satisfied. This revamped display table will have a handy Stub component that emphasizes the row labels. Groupings of rows will be generated via categorical values in a column. We’ll add a table title (and subtitle!) with tab_header(). The numerical values will be formatted with fmt_integer() and fmt_currency(). Column labels will be enhanced via cols_label(), a spanner label will be placed over the performance columns with tab_spanner(), and a source note will be included through the use of the tab_source_note() method.

from great_tables.data import gtcars
from great_tables import md, html

gtcars_mini = gtcars[["mfr", "model", "year", "hp", "trq", "msrp"]].tail(10)

(
    GT(gtcars_mini, rowname_col="model", groupname_col="mfr")
    .tab_spanner(label=md("*Performance*"), columns=["hp", "trq"])
    .tab_header(
        title=html("Data listing from <strong>gtcars</strong>"),
        subtitle=html("A <span style='font-size:12px;'>small selection</span> of great cars."),
    )
    .cols_label(year="Year Produced", hp="HP", trq="Torque", msrp="Price (USD)")
    .fmt_integer(columns=["year", "hp", "trq"], use_seps=False)
    .fmt_currency(columns="msrp")
    .tab_source_note(source_note="Source: the gtcars dataset within the Great Tables package.")
)
Data listing from gtcars
A small selection of great cars.

                                Performance
                Year Produced   HP    Torque   Price (USD)
Mercedes-Benz
  AMG GT        2016            503   479      $129,900.00
  SL-Class      2016            329   354      $85,050.00
Tesla
  Model S       2017            259   243      $74,500.00
Porsche
  718 Boxster   2017            300   280      $56,000.00
  718 Cayman    2017            300   280      $53,900.00
  911           2016            350   287      $84,300.00
  Panamera      2016            310   295      $78,100.00
McLaren
  570           2016            570   443      $184,900.00
Rolls-Royce
  Dawn          2016            563   575      $335,000.00
  Wraith        2016            624   590      $304,350.00

Source: the gtcars dataset within the Great Tables package.

With the six different methods applied, the table looks very presentable! The rendering you’re seeing here was produced through a Quarto document, which is to say that the Great Tables package is ready to rock inside your Quarto docs.

Let’s keep going and get to deluxe example #2. For this one, we’ll use the airquality dataset (also included in the package, within the data submodule). With this table, two spanner labels will be added with tab_spanner(). It’s an easy-to-use method where you only need to provide the spanner label text and the columns for that label to span across. Columns can be freely moved around with cols_move_to_start() (there are also the cols_move_to_end() and the general cols_move() methods), which makes structuring the table much easier.

from great_tables.data import airquality

airquality_mini = airquality.head(10).assign(Year=1973)

(
    GT(airquality_mini)
    .tab_header(
        title="New York Air Quality Measurements",
        subtitle="Daily measurements in New York City (May 1-10, 1973)",
    )
    .cols_label(
        Ozone=html("Ozone,<br>ppbV"),
        Solar_R=html("Solar R.,<br>cal/m<sup>2</sup>"),
        Wind=html("Wind,<br>mph"),
        Temp=html("Temp,<br>&deg;F"),
    )
    .tab_spanner(label="Date", columns=["Year", "Month", "Day"])
    .tab_spanner(label="Measurement", columns=["Ozone", "Solar_R", "Wind", "Temp"])
    .cols_move_to_start(columns=["Year", "Month", "Day"])
)
New York Air Quality Measurements
Daily measurements in New York City (May 1-10, 1973)

Date                 Measurement
Year   Month   Day   Ozone,   Solar R.,   Wind,   Temp,
                     ppbV     cal/m²      mph     °F
1973   5       1     41.0     190.0       7.4     67
1973   5       2     36.0     118.0       8.0     72
1973   5       3     12.0     149.0       12.6    74
1973   5       4     18.0     313.0       11.5    62
1973   5       5                          14.3    56
1973   5       6     28.0                 14.9    66
1973   5       7     23.0     299.0       8.6     65
1973   5       8     19.0     99.0        13.8    59
1973   5       9     8.0      19.0        20.1    61
1973   5       10             194.0       8.6     69

That table is looking good! And the great thing about all these methods is that they can be used in virtually any order.

Formatting Table Cells

We didn’t want to skimp on formatting methods for table cells with this early release. There are 11 fmt_*() methods available right now:

  • fmt_number(): format numeric values
  • fmt_integer(): format values as integers
  • fmt_percent(): format values as percentages
  • fmt_scientific(): format values to scientific notation
  • fmt_currency(): format values as currencies
  • fmt_bytes(): format values as bytes
  • fmt_roman(): format values as Roman numerals
  • fmt_date(): format values as dates
  • fmt_time(): format values as times
  • fmt_markdown(): format Markdown text
  • fmt(): set a column format with a formatting function

The basic idea behind the formatting implementation was to make formatting an easy task but also to provide the user with a lot of power via mixing and matching several options. Most of these methods have a locale argument, which allows for numbers, dates, and times to be easily displayed in locale-specific ways. We did all this to make the formatting task broadly useful to as many users as possible. Let’s take a look at an example of this with a smaller version of the exibble dataset:

exibble_smaller = exibble[["date", "time"]].head(4)

(
    GT(exibble_smaller)
    .fmt_date(columns="date", date_style="wday_month_day_year")
    .fmt_date(columns="date", rows=[2, 3], date_style="day_month_year", locale="de-CH")
    .fmt_time(columns="time", time_style="h_m_s_p")
)
date                         time
Thursday, January 15, 2015   1:35:00 PM
Sunday, February 15, 2015    2:40:00 PM
15 März 2015                 3:45:00 PM
15 April 2015                4:50:00 PM

We support hundreds of locales, from af to zu! While there are more formatting methods yet to be added, the ones that are available all work exceedingly well.

The Documentation Site

The documentation site was built using quartodoc. It’s designed to look easy on the eyes while also providing a ton of useful information.

Check out the docs to learn more about how to create your own Great Tables.

Since we care a lot about documentation, we’ll continue working toward making this documentation site the best it can be.

We Are on the Path to Table Success

While we’re only getting started on this package, we feel things are coming along nicely! But it’s nothing without you, our users, so we’d really like to hear from you (we welcome any and all feedback).

Thanks for reading! For our part, we’ll keep working on making Great Tables live up to its name.