How we build the posit::conf() program

The posit::conf(2023) program committee members share their process for deciding on talks and sessions for the conference.
2023-05-16
Image: the Chicago River.

Since we recently notified speakers for posit::conf(2023), we thought this would be a good time to explain more about our selection process. This year, we received 303 submissions and accepted 105 talks (92 regular and 13 lightning talks), an acceptance rate of about 35%. Getting your talk accepted to conf is hard! We thought past and future submitters might find it helpful to learn a bit more about what we look for and how the process unfolds.
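For the curious, the acceptance rate comes straight from the counts above:

```r
# Acceptance rate: 92 regular + 13 lightning talks out of 303 submissions
(92 + 13) / 303
#> [1] 0.3465347
```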

Our main objective for posit::conf() is to help attendees become better data scientists and data science managers. We want attendees to leave with actionable insights that they can start using as soon as they get back to work, as well as new ways to think about problems that pay off in the longer term. Our selection process follows directly from these goals. We look for compelling talks on topics of broad interest, and there’s no special back channel for submissions: Posit employees, Posit customers, and conference sponsors all go through the same process as everyone else.

What exactly is the process? First, we assign every submission to three internal reviewers (using a design based on a Latin square). This year, we had a pool of 28 self-nominated reviewers from throughout the company, who each evaluated around 33 submissions. Each reviewer scores a submission on two dimensions: topic and delivery. A high topic score implies that the talk is likely to be of great interest to a wide swathe of the community and to have an impact on their day-to-day practice. A high delivery score means the speaker is enthusiastic and clear.
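As a rough illustration, here is a minimal sketch of how 303 submissions can be spread over 28 reviewers so that every submission gets three distinct reviewers and workloads stay nearly equal. This uses a simple cyclic rotation, not our actual assignment code; a true Latin-square design adds further balance, but the rotation captures the basic idea:

```r
n_submissions <- 303
n_reviewers   <- 28
reviews_each  <- 3

# Cycle through the reviewer pool; consecutive slots go to consecutive
# reviewers, so each submission's three reviewers are always distinct
slots <- rep(seq_len(n_reviewers), length.out = n_submissions * reviews_each)
assignment <- matrix(slots, ncol = reviews_each, byrow = TRUE)

# Sanity check: no submission draws the same reviewer twice
stopifnot(all(apply(assignment, 1, function(x) {
  length(unique(x)) == reviews_each
})))

table(assignment)  # each reviewer ends up with 32 or 33 reviews
```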

To get a sense of what a strong submission looks like, here are five examples from this year’s submissions:

[Five example submission videos are embedded here.]

Once the internal reviews are complete, the program committee does some exploratory data analysis in R, looking at the distribution of topic and delivery scores. We then pick thresholds for these scores that give us a (long) shortlist of roughly twice the desired number of talks. (Since we offer speaker training, we weigh topic more heavily than delivery.) Two members of the program committee then watch each shortlisted talk. We also rate these talks, but our main priority is to understand what each talk is about, which we capture in the form of tags.
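To make the thresholding step concrete, here is a hypothetical sketch. The column names, the 1-to-5 score scale, and the exact 2:1 weighting of topic over delivery are all illustrative assumptions; the real step is exploratory rather than a fixed formula:

```r
library(dplyr)

set.seed(2023)
# Simulated stand-in for the review data: three reviews per submission
reviews <- tibble(
  id       = rep(seq_len(303), each = 3),
  topic    = sample(1:5, 303 * 3, replace = TRUE),
  delivery = sample(1:5, 303 * 3, replace = TRUE)
)

# Average each submission's scores, weighing topic more heavily
scores <- reviews |>
  group_by(id) |>
  summarise(topic = mean(topic), delivery = mean(delivery)) |>
  mutate(overall = 2 * topic + delivery)

# Keep roughly twice the desired number of talks on the shortlist
shortlist <- scores |> slice_max(overall, n = 2 * 105)
```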

With all this data in hand, we can start making decisions and building the program. We treat the program (i.e., which talks go into which sessions) as an integral part of the talk selection process. Why don’t we just select the “best” talks and then create the program? We used to! But it didn’t work so well: we ended up with an overrepresentation of talks on certain topics and a few talks that had nothing in common (and so were hard to place into a session). Now, we build sessions as we go, working through the shortlist from highest to lowest rated (a greedy pass, sketched below). This approach helps create more coherent sessions and ensures that we cover a broad mix of topics. When there are several similarly ranked talks on the same topic, we generally favor newer presenters over old hands and external folks over internal. Lightning talks tend to be high-scoring talks that didn’t fall neatly into one of the regular sessions and whose topic is amenable to the compressed 5-minute format.
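Here is a toy sketch of that greedy pass. It assumes each shortlisted talk carries a single character `tag`, an `overall` score, and an `id` from the steps above, and that a session holds four talks; the tie-breaking rules (newer presenters, external speakers) are omitted:

```r
# A toy sketch of greedy session building, not the committee's actual
# process: walk the shortlist from highest- to lowest-rated talk and
# group talks into sessions by tag until each session fills up
build_sessions <- function(shortlist, session_size = 4) {
  sessions <- list()
  for (i in order(shortlist$overall, decreasing = TRUE)) {
    tag <- shortlist$tag[i]
    if (length(sessions[[tag]]) < session_size) {
      sessions[[tag]] <- c(sessions[[tag]], shortlist$id[i])
    }
    # High scorers that never fit a session are lightning-talk candidates
  }
  sessions
}
```

In reality this step is a human judgment call guided by the scores and tags rather than an algorithm, but the sketch shows why building sessions during selection yields a more coherent program than selecting talks first and grouping them afterwards.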

Building the program is hard because there are always many more excellent talks than we can possibly accept. Our reviewers tell us that they truly enjoy watching these videos. We’re grateful to our community for sharing this abundance of great ideas. We hope this post provides some helpful insights into the posit::conf() program, and we’d love to see you in Chicago!

The posit::conf(2023) program committee

Garrick Aden-Buie, Jenny Bryan, Michael Chow, Mine Çetinkaya-Rundel, Rachael Dempsey, Hadley Wickham