The Test Set: Episodes

A Posit video podcast series for data science. Hosted by Michael Chow, with Hadley Wickham and Wes McKinney.



EPISODE 1

Spreadsheets, bikes, and the accidental empire of R packages

Before Hadley Wickham became a pillar of modern data science, he was a spreadsheet-loving teenager making databases for his dad's job. In this episode, he reflects on the early days of his involvement with R, the birth of the tidyverse, and how real-world unpredictability, like a bear in a field, shapes data science.


EPISODE 2

Wes McKinney: Part 1 — Building Pandas, Arrow and a speedrunning legacy

Wes McKinney's fingerprints are all over the modern data stack — from inventing Pandas to co-creating Arrow. But before all that, Wes was organizing speedrun communities and hacking together better ways to wrangle datasets in finance. In this conversation, he shares his origin story and what makes good tools good.


EPISODE 3

Wes McKinney: Part 2 — Funding the Future of Open Source, what's next

In Part 2 of our conversation with Wes McKinney, we go beyond the code and into the complicated, mission-driven world of open source funding, community-building, and product strategy. Wes talks about what it takes to make critical tools like Arrow sustainable — from pitching to mavericks at Two Sigma to navigating the politics of Apache Software Foundation governance. Also, metal.


EPISODE 4

Mine Çetinkaya-Rundel: Teaching in the AI era — and keeping students engaged

Mine Çetinkaya-Rundel is a Professor of the Practice and the Director of Undergraduate Studies in the Department of Statistical Science and an affiliated faculty member in the Computational Media, Arts, and Cultures program at Duke University. She joins The Test Set to talk about publishing code, building narrative, and wrestling with ambiguity.


EPISODE 5

Roger Peng: Impact takes time, and it's totally worth it

Roger Peng is a professor of Statistics and Data Sciences at the University of Texas at Austin, and he's also the host of a popular data science podcast, Not So Standard Deviations. We talk about the evolution of the data science community, how the conversations have changed, and why teaching remains central to Roger's mission.


EPISODE 6

Michael Chow: From psychology and Python to constrained creativity

For this one, we turn the mic around. Wes McKinney takes over the interviewer's chair to chat with his co-host, Michael Chow. Michael's a principal software engineer at Posit, but he started out studying how people think — literally, with a PhD in cognitive psychology. Somewhere along the way, he got hooked on data science, helped build adaptive learning tools at DataCamp, and now spends his days thinking about how to make Python easier to use and more fun.


EPISODE 7

Julia Silge: Part 1 — Positron, pineapple pizza, and the art of iteration

In part one of our conversation with Julia Silge, astronomer-turned–data science leader, we explore why data science needs a different kind of IDE. Julia takes us inside Positron, Posit’s next-generation, data-scientist-first environment, and unpacks the day-to-day realities that make data science work unlike software engineering. Along the way, we get a first-hand account of a legendary pineapple-pizza protest and how to juggle multiple projects at once.


EPISODE 8

Julia Silge: Part 2 — Glue work, licensing, and open source in the age of LLMs

In part two of our conversation with Julia Silge, we discuss how work actually ships: the boundaries, the glue, and the tools that turn noise into signal. From there, we go macro and wonder what the LLM era means for humanity’s contributions, plus how licensing is evolving to protect sustainability without abandoning openness.