Company, events, and community

Talk recordings and workshop materials from posit::conf(2023)

Written by Posit Team
2023-12-15

What an amazing posit::conf(2023)! Inspired by Hadley Wickham’s introduction of community members with ChatGPT-created poems, here is a short recap:

In September’s warm embrace,
posit::conf lit up the place.
In Chicago, we gathered near,
Others joined from far, yet dear.

Workshops sparkled, keynotes shone,
Ideas bloomed, and knowledge was sown.
Sticker drops, meetups, cheer,
Announcements we awaited to hear.

Gratitude to all who made it bright,
Instructors, volunteers, shining light.
Attendees, near and far,
You are the conference’s brightest star.

Looking forward to Seattle’s call,
posit::conf(2024) awaits us all.
A reunion of minds, hearts, and learning,
In a world of data, ever turning.

We’re thrilled to share the resources from posit::conf(2023), including workshop materials and recordings of the talks. You can find them here:

Talk Recordings

Keynote and talk recordings are available on YouTube!


Browse the talk titles, speakers, and abstracts below. All videos have English captions for better access to our content, thanks to our volunteer team.



Why Design is Worth the Time
Laura Gast, USO
Abstract Have you ever submitted a report or other product and wished you’d been given just a bit more time to clean up the look of it? Does it make your skin crawl to hear “the data speaks for itself, don’t waste your time making that deliverable ‘pretty’”? As a complement to the many sessions this week in which you’ll hear great methods for making your work more beautiful, in this talk we’ll walk through some of the scientific research that shows why taking the time to make design improvements is critical to communicating your point with data, for dashboards, reports, and even simple tables.
YouTube Link


Documenting Things: Openly for Future Us
Julia Stewart Lowndes, Openscapes
Abstract This talk shares practical tips and tangible stories for how intentional approaches to documenting things are helping big distributed teams tackle hard challenges and change organizational culture via NASA Openscapes, NOAA Fisheries Openscapes, & beyond. I’ll share about documenting things, and how intentional approaches to documentation and onboarding are helping big distributed teams tackle hard challenges and change organizational culture. The goal is to provide concrete tips to help you document things effectively & hear stories of how putting a focus on documentation can help teams be efficient, productive, and less lonely. I’ll give a short lightning talk (inspired by Jenny Bryan’s Naming Files talk) followed by stories from NASA Openscapes, NOAA Fisheries Openscapes, and beyond.
YouTube Link


Scale Your Data Validation Workflow With {pointblank} and Posit Connect
Michael Garcia, Medable
Abstract For the Data Services team at Medable, our number one priority is to ensure the data we collect and deliver to our clients is of the highest quality. The {pointblank} package, along with Posit Connect, modernizes how we tackle data validation within Data Services.

In this talk, I will briefly summarize how we develop test code with {pointblank}, share with {pins}, execute with {rmarkdown}, and report findings with {blastula}. Finally, I will show how we aggregate data from test results across projects into a holistic view using {shiny}.

YouTube Link
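To give a flavor of the rule-based validation idea the talk describes, here is a minimal sketch in plain Python. It is not {pointblank}’s R API: the function and field names below are invented for illustration, and pointblank’s real workflow (agents, validation steps, reporting) is richer.

```python
# A minimal rule-based validation sketch in the spirit of a
# {pointblank}-style plan (all names here are invented).
def col_vals_not_null(rows, col):
    """Return the rows that fail a not-null check on `col`."""
    return [r for r in rows if r.get(col) is None]

def col_vals_between(rows, col, lo, hi):
    """Return the rows whose `col` value falls outside [lo, hi]."""
    return [r for r in rows if not (lo <= r[col] <= hi)]

def validate(rows, steps):
    """Run each named validation step and report pass/fail counts."""
    report = []
    for name, check in steps:
        failures = check(rows)
        report.append({"step": name,
                       "n_failed": len(failures),
                       "passed": not failures})
    return report

rows = [{"id": 1, "dose": 10}, {"id": 2, "dose": 250}, {"id": 3, "dose": None}]
report = validate(rows, [
    ("dose not null", lambda r: col_vals_not_null(r, "dose")),
    ("dose in 0..100", lambda r: col_vals_between(
        [x for x in r if x["dose"] is not None], "dose", 0, 100)),
])
# Each report entry records which rule failed and how often, which is the
# shape of result that can then be pinned, rendered, and emailed.
```

The aggregated-report step mentioned in the talk corresponds to collecting many such `report` objects across projects into one view.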


Developing a Prototyping Competency in a Statistical Science Organization
Daniel Woodie, Eli Lilly & Company
Abstract The introduction of new tools, methods, and processes can be a struggle within a statistical science organization. Being deliberate and investing in the creation of a prototyping competency can help in accelerating progress. Prototyping allows organizations to quickly experiment with new ideas, reduce the risk of failure, identify potential issues early, and iterate until the desired outcome is achieved.

I will highlight the key areas we have focused on accelerating, our framework for developing this competency, how we use Shiny, and the lessons we’ve learned along the way. Developing a prototyping competency is crucial for statistical science organizations that wish to stay competitive and innovative in today’s rapidly changing landscape.

YouTube Link


From Data Confusion to Data Intelligence
Elaine McVey and David Meza, EvE Bio and NASA
Abstract Data science teams operate in a unique environment, very different from the IT or software development life cycle. Hope from executives for the impact of data science is extremely high! Understanding of how to make data science efforts successful is very low! This creates an interesting set of organizational challenges for data and analytics teams. These are particularly clear when data science is being introduced at new companies, but they play out at organizations of all sizes. So, how do we navigate this dynamic? We will share some strategies for success.
YouTube Link


The ‘I’ in Team: Peer-to-Peer Best Practices for Growing Data Science Teams
Liz Roten, Metropolitan Council
Abstract R users don’t always come in sets. Often, you may be the only user in the cubicle block. But, one miraculous day, your manager finally fills the void and you welcome more folks on your team. Suddenly, the little R system you created to suit your needs, like a custom package, code styling, and file organization, isn’t just for you.

Want to suddenly overhaul that one package you wrote two years ago? It probably won’t work when your colleagues try to update it.

Your new teammates are data.table fans, but you prefer the tidyverse. Do you need to refactor? Are style choices, like indentation, important when collaborating, or are you just being persnickety?

In this talk, you will learn how to bring new teammates on board and blend your respective styles without pulling your hair out.

YouTube Link


Oops I’m a Manager - On More Effective 1-on-1 Meetings
Andrew Holz, Posit Software, PBC
Abstract As a team leader (accidental or not), it’s easy to get caught up in the daily grind and overlook the importance of 1-on-1s. Bad idea. 1-on-1s are critical for building trust, providing feedback, and ensuring that everyone is on the same page.

Keys to good 1-on-1s start with a small amount of prep and a running shared document of notes and takeaways. Another key is to rotate types of 1-on-1s. Possibilities include “heads down” on recent work, “heads up” looking further out, and career-focused sessions. After some tips on the right sort of questions and uncovering sneaky issues, I will also touch on effective feedback.

I will share resources and hope to include Seussian visuals and a few poetic lines to help the key points stick.

YouTube Link


Serenity Now, Productivity Later: Focus Your Project Stack with The Gonzalez Matrix
Patrick Tennant, Meadows Mental Health Policy Institute
Abstract How should you respond when your boss has too many good ideas for data science projects? In this talk, I’ll review our use of an adapted version of the Eisenhower Matrix that lays out our projects according to the effort required and the value they will produce. Given the functionally unlimited number of data science projects a team could do, learn how we keep our team focused on valuable work while reducing the stress of a never-ending list of projects.
YouTube Link
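The effort-vs-value quadrants described above can be sketched as a tiny classifier. This is only an illustration of the idea: the thresholds, quadrant labels, and project names below are invented, not the actual Gonzalez Matrix from the talk.

```python
# Hypothetical effort/value quadrant classifier in the spirit of an
# adapted Eisenhower Matrix (labels and thresholds invented).
def classify(effort, value, threshold=5):
    """Place a project in a quadrant by effort required and value produced
    (each scored, say, 1-10)."""
    high_effort = effort >= threshold
    high_value = value >= threshold
    if high_value and not high_effort:
        return "do next"            # quick wins
    if high_value and high_effort:
        return "plan deliberately"  # big bets worth scheduling
    if not high_value and not high_effort:
        return "fit in when idle"   # cheap, low-stakes tasks
    return "decline politely"       # high effort, low value

# Invented backlog: (effort, value) pairs.
backlog = {"dashboard refresh": (2, 8),
           "new ML platform": (9, 9),
           "one-off vanity report": (7, 2)}
ranked = {name: classify(e, v) for name, (e, v) in backlog.items()}
```

The point of the matrix is exactly this kind of explicit, shared rule: it turns "too many good ideas" into a short, defensible queue.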


From Novices to Experts: Building a Community of Engaged R Users
Natalia Andriychuk, Pfizer
Abstract At Pfizer, we have over 1500 users with R installed on their machines, along with an R community on MS Teams comprising over a thousand colleagues globally. How can we effectively engage with Pfizer R users and celebrate the successes of this community, while sharing best practices? Additionally, how do we avoid isolated groups duplicating efforts to solve R-related problems across different parts of the organization?

To address these challenges, we established the Pfizer R Center of Excellence (CoE) in early 2022. We focus our efforts on bringing together a rapidly growing community of colleagues, providing technical expertise, and offering best-practice guidance. A well-established, maintained and engaged R community promotes an inclusive and supportive learning environment that drives innovation within organizations. Our aim is to help colleagues thrive in their R journey, regardless of their expertise level.

During my talk, I will share the techniques we used to build a supportive R community, the tools employed to increase community engagement, and the successes and challenges encountered in building an engaging community of R users.

YouTube Link


Open Source Solutions to Next-Generation Submissions, After 30 Years of Industry Experience
Mike K Smith, Pfizer R&D UK Ltd
Abstract The pharmaceutical industry is undergoing rapid change, driven by a desire from both industry and regulatory agencies to move to more interactive visualizations and web applications to review data and make decisions. These changes would have been unthinkable 30 years ago when I started working at Pfizer.

In this talk, I’ll consider the drivers for these changes, how open-source tools can help achieve this, and why collaboration across the industry is vital to achieving this goal. I’ll contrast this with my experience of 30 years working in the pharma industry - when the R language had only just been released, when the internet was new, and when submissions to agencies were printed out, loaded onto trucks, and shipped to their doors.

YouTube Link


The Need for Speed - AccelerateR-ing R Adoption in GSK
Ben Arancibia, GSK
Abstract How does a risk-averse Pharma Biostatistics organization with 900+ people switch from using proprietary software to using R and other open-source tools for delivering clinical trial submissions? First slowly, then all at once. GSK started the transition of using R for its clinical trial data analysis in 2020 and now uses R for our regulatory-reviewed outputs. The AccelerateR Team, an agile pod of R experts and data scientists, rotates through GSK Biostatistics study teams sitting side by side to answer questions and mentor during this transition.

We will share our experience from AccelerateR and how other organizations can use our learnings to scale R from pilots to full enterprise adoption and contribute to open source industry R packages.

YouTube Link


Succeed in the Life Sciences with R/Python and the Cloud
Colby Ford
Abstract This talk covers best practices and lessons learned surrounding the use of R and Python by technical teams in the cloud, focusing on Posit Workbench, Azure ML, and Databricks.

In the life sciences, whether it’s pharma, biotech, research, or another type of organization, we are unique in that we blend scientific knowledge with technical skills to extract insights from large, complex datasets. In the cloud, we can architect solutions to help us scale, automate, and collaborate. Interestingly, the use of R and Python by bioinformatics, genomics, biostatistics, and data science teams can be challenging in a cloud-first world where all the data is somewhere other than your laptop (like a data lake).

In this talk, I will share best practices and lessons learned surrounding the use of R and Python by technical teams in the cloud. We’ll focus on the use of Posit Workbench and RStudio on various cloud services such as Azure ML and Databricks.

Tuple, The Cloud Genomics Company: https://tuple.xyz

YouTube Link


Reproducible Manuscripts with Quarto
Mine Çetinkaya-Rundel, Posit Software, PBC
Abstract In this talk, we present a new capability in Quarto that provides a straightforward and user-friendly approach to creating truly reproducible manuscripts that are publication-ready for submission to popular journals. This new feature, Quarto manuscripts, can bundle a standardized journal format, source documents, source computations, referenced resources, and execution information into a single output that is ingested into journal review and production processes. We’ll demo how Quarto manuscripts work and how you can incorporate them into your current manuscript development process, as well as touch on pain points in your current workflow that Quarto manuscripts help alleviate.
YouTube Link


From Journalist to Coder: Creating a Web Publication with Quarto
Brian Tarran, Royal Statistical Society
Abstract This is the story of how a Royal Statistical Society writer discovered Quarto, learned how to code (a bit), and built realworlddatascience.net, an online publication for the data science community.

In March 2022, I was tasked by the Royal Statistical Society with creating a new online publication: a data science website for data science professionals. I’ve been a print journalist for 20 years and have worked on websites in that time, but my coding ability began and ended with wrapping tags around text and images. That is until I discovered Quarto. In this talk, I describe how I explored, learned, and fell in love with the Quarto publishing system, how I used it to build a website – Real World Data Science (realworlddatascience.net) – and how the open source community mindset helped shape my thinking about what a new publication could and should be!

YouTube Link


What’s New in Quarto?
Charlotte Wickham, Posit Software, PBC
Abstract It’s been over a year since Quarto 1.0, an open-source scientific and technical publishing system, was announced at rstudio::conf(2022). In this talk, I’ll highlight some of the improvements to Quarto since then. You’ll learn about new formats, options, tools, and ways to supercharge your content. And, if you haven’t used Quarto yet, come to see some reasons to try it out.

YouTube Link


Dynamic Interactions: webR to Empower Educators & Researchers with Interactive Quarto Docs
James Balamuta, University of Illinois Urbana-Champaign
Abstract Traditional Quarto documents often lack interactivity, limiting the ability of students and researchers to fully explore and engage with the presented topic. In this talk, we propose a novel approach that utilizes webR, a WebAssembly-powered version of R, to seamlessly embed R code directly within the browser without the need for a server. We demonstrate how this approach can transform static Quarto documents into dynamic examples by leveraging webR’s capabilities through standard Quarto code cells, enabling real-time execution of R code and dynamic display of results. Our approach empowers educators and researchers alike to harness the power of interactivity and reproducibility for enhanced learning and research experiences.
YouTube Link


From Concept to Impact: Building and Launching Shiny Apps in the Workplace
Tiger Tang, CARFAX
Abstract Learn to build and launch a Shiny app like you are working on a start-up!

Unlock the potential of Shiny apps for your organization! Join Tiger as he shares insights from implementing Shiny apps at his workplace, handling over 160,000 internal requests. Discover a practical mindmap to find, build, and enhance Shiny app use cases, ensuring robustness and improved user engagement.

Materials: https://tigertang.org/posit_conf_2023/

YouTube Link


Building a Flexible, Scalable Self-Serve Reporting System with Shiny
Natalie O'Shea, BetterUp
Abstract Working in the high-touch world of consulting, our team needed to develop a reporting system that was flexible enough to be tailored to the specific needs of any given partner while still reducing the highly manual nature of populating client-ready slide decks with various metrics and data visualizations. Utilizing the extensive resources developed by the R user community, I was able to create a flexible, scalable reporting system that allows users to populate templated Google slide decks with metrics and professional-grade visualizations using data pulled directly from our database at the time of query. This streamlined approach enables our consultants to spend less time copy-pasting data from one channel to another and instead focus on what they do best: surfacing business-relevant insights and recommendations for our partners.

By sharing my approach to customizable self-serve reporting in Shiny, I hope attendees will walk away with new ideas about how to combine parameterized reporting and dashboard development to get the best of both worlds. Additionally, I hope to end by sharing how this project was pivotal in making the business case for procuring Posit products for my broader organization.

YouTube Link


How Data Scientists Broke A/B Testing (And How We Can Fix It)
Carl Vogel, Babylist
Abstract As data scientists, we care about making valid statistical inferences from experiments. And we’ve adapted well-established and well-understood statistical methods to help us do so in our A/B tests. Our stakeholders, though, care about making good product decisions efficiently. I’ll describe how the way we design A/B tests can put these goals in tension and why that often causes misalignment between how A/B tests are intended to be used and how they are actually used. I’ll also talk about how I’ve used R to implement alternative experimental approaches that have helped bridge the gap between data scientists and stakeholders.
YouTube Link


How to Win Friends and Influence People (With Data)
Joe Powers, Intuit
Abstract Too many great data science products never go into production. To persuade leaders and colleagues to adopt your data science offering, you must translate your insights into terms that are relevant and accessible to them. Attempts to persuade these audiences with proofs and model performance stats will often fall flat because the audience is left feeling overwhelmed.

This talk will demonstrate the data simulation, visualization, and story-telling techniques that I use to influence leadership and the community-building techniques I use to earn the trust and support of fellow analysts. These efforts were successful in persuading Intuit to adopt advanced analytic methods like sequential analysis that cut the duration of our AB tests by over 60%.

YouTube Link


{slushy}: A Bridge to the Future
Becca Krouse, GSK
Abstract Scaling the use of R can present complications for environment management, especially in regulated industries with a focus on traceability. One solution is controlled (aka “frozen”) environments, which are carefully curated and tested by tech teams. However, the speed of R development means the environments quickly become outdated and users are unable to benefit from the latest advances. Enter {slushy}: a team-friendly tool powered by {renv} and Posit Package Manager. Users can quickly mimic a controlled environment, with the easy ability to time travel between snapshot dates. Attendees will learn how {slushy} bolstered our R adoption efforts, and how this strategy enables tech teams and users to work in parallel towards a common future.
YouTube Link


How I Learned to Stop Worrying and Love Public Packages
Joe Roberts, Posit Software, PBC
Abstract The popularity of R and Python for data science is in no small part attributable to the vast collection of extension packages available for everything from common tasks like data cleaning to highly-specialized domain-specific functions. However, with that ease of sharing packages comes a larger target for bad actors trying to exploit them. We’ll explore these security risks along with approaches you can take to mitigate them using Posit Package Manager.
YouTube Link


CRAN-ial Expansion: Taking Your R Package Development to New Frontiers with R-Universe
Mo Athanasia Mowinckel, Center for Lifespan Changes in Brain and Cognition
Abstract Say goodbye to installation headaches and hello to a universe of possibilities with R-Universe! Take your R package development to new frontiers by organizing and sharing packages beyond the bounds of CRAN. R-Universe’s reliable package-building process strengthens installation and usage instructions, resulting in fewer support requests and an easy installation experience for users. With webpages and an API for exploring packages, R-Universe creates a streamlined and tidy ecosystem for R-package constellations. Also, you can build a custom toolchain for your users, relieving your workload and empowering users to help themselves. Join me to learn how to explore the vastness of R-Universe and expand your package development possibilities!
YouTube Link


Package Management for Data Scientists
Tyler Finethy, Posit, PBC
Abstract In my talk, “Package Management for Data Scientists,” we will discuss software dependencies for R and Python and the common issues faced during package installations. I will begin with an overview of package management, highlighting its crucial role in data science. We’ll then focus on practical strategies to prevent dependency errors and address effective troubleshooting when these problems occur. Lastly, we will look towards the future, discussing potential package management improvements, focusing on reproducibility and accessibility for those new to the field.
YouTube Link


tidymodels: Adventures in Rewriting a Modeling Pipeline
Ryan Timpe, The LEGO Group
Abstract An overview of the benefits unlocked on our data science team by adopting tidymodels.

Data science sure has changed over the past few years! Everyone’s talking about production. RStudio is now Posit. Models are now tidy.

This talk is about embracing that change and updating existing models using the tidymodels framework. I recently completed this change, letting go of our in-production code and rebuilding it with tidymodels. My team ended up with a faster, more scalable pipeline that enabled us to better automate our workflow and increase our scale while improving our stakeholders’ experiences.

I’ll share tips and tricks for adopting the tidymodels framework in existing products, best practices for learning and upskilling teams, and advice for using tidymodel packages to build more accessible data science tools.

Materials: https://www.ryantimpe.com/files/tidymodels_adventures_positconf2023.pdf

YouTube Link


Reliable Maintenance of Machine Learning Models
Julia Silge, Posit PBC
Abstract Maintaining machine learning models in production can be quite different from maintaining general software projects, because of the unique statistical characteristics of ML models.

In this talk, learn about model drift, the different ways the word “performance” is used with models, what you can monitor about a model, how feedback loops impact models, and how you can use vetiver to set yourself up for success with model maintenance. This talk will help practitioners who are already deploying models, but this is also useful knowledge for practitioners earlier in their MLOps journey; decisions made along the way can make the difference between resilient models that are easier to maintain and disappointing or misleading models.

Materials: https://github.com/juliasilge/ml-maintenance-2023

YouTube Link


Using R with Databricks Connect
Edgar Ruiz, Posit Software, PBC
Abstract Spark Connect and Databricks Connect enable remote interaction with stand-alone Spark clusters, improving our ability to perform data science at scale. We will share the work in sparklyr, and other products, that will make it easier for R users to take advantage of this new framework.
YouTube Link


Conformal Inference with Tidymodels
Max Kuhn, Posit Software, PBC
Abstract Conformal inference theory enables any model to produce probabilistic predictions, such as prediction intervals. We’ll demonstrate how these analytical methods can be used with tidymodels. Simulations will show that the results have good coverage (i.e., a 90% interval should include the real point 90% of the time).
YouTube Link
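The coverage idea behind conformal prediction can be shown with a short standard-library sketch of split conformal intervals. This is an illustration of the general method, not the tidymodels implementation (which lives in the tidymodels ecosystem and differs in detail); the toy residuals below are invented.

```python
# Split conformal prediction, sketched with the standard library only.
import math

def conformal_half_width(calib_errors, alpha=0.1):
    """Half-width of a (1 - alpha) prediction interval, computed from
    absolute residuals |y - y_hat| on a held-out calibration split.
    Uses the conformal quantile: the ceil((n + 1) * (1 - alpha))-th
    smallest residual."""
    n = len(calib_errors)
    k = math.ceil((n + 1) * (1 - alpha))
    return sorted(calib_errors)[min(k, n) - 1]

# Toy calibration residuals from 9 held-out points (invented numbers).
calib_errors = [0.1, 0.2, 0.2, 0.3, 0.5, 0.6, 0.8, 1.0, 1.5]
q = conformal_half_width(calib_errors, alpha=0.1)

# Any point prediction then gets the same +/- q band:
y_hat = 10.0
interval = (y_hat - q, y_hat + q)
```

Under the usual exchangeability assumption, intervals built this way cover the true value at least 90% of the time for alpha = 0.1, regardless of the model that produced `y_hat`, which is the distribution-free guarantee the talk demonstrates via simulation.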


Making a (Python) Web App is easy!
Marcos Huerta, Carmax
Abstract Making Python Web apps using Dash, Streamlit, and Shiny for Python

By creating and deploying an interactive web application, you can easily share your data, code, and ideas with a broad audience. I plan to talk about several Python web application frameworks and how you can use them to turn a class, function, or dataset visualization into an interactive web page to share with the world. I plan to discuss building simple web applications with Plotly Dash, Streamlit, and Shiny for Python.

Materials:

Corrections: In my live remarks, I said a Dash callback can have only one output: that is not correct, a Dash callback can update multiple outputs. I was trying to say that a Dash output can only be updated by one callback, but even that is no longer true as of Dash 2.9. https://dash.plotly.com/duplicate-callback-outputs

YouTube Link


Shiny for Python Machine Learning Apps with pandas, scikit-learn and TensorFlow
Chelsea Parlett-Pelleriti, Chapman University
Abstract With the introduction of Shiny for Python in 2022, users can now use the power of reactivity with their favorite Python packages. Shiny can be used to build interactive reports, dashboards, and web apps, that make sharing insights and results both simple and dynamic. This includes apps to display and explore popular Machine Learning models built with staple Python packages like pandas, scikit-learn, and TensorFlow. This talk will demonstrate how to build simple Shiny for Python apps that interface with these packages, and discuss some of the benefits of using Shiny for Python to build your web apps.
YouTube Link


Shiny New Tools for Scaling your Shiny Apps
Joe Kirincic, Medical Mutual of Ohio
Abstract So you have a Shiny app your org loves, but as adoption grows, performance starts getting sluggish. Profiling reveals your cool interactive plots are the culprit. What can you do to make things snappy again? We can increase the number of app instances, sure, but suppose that isn’t an option for us. Another approach is to shift the plotting work from the server onto the client.

In this talk, we’ll learn how to leverage two JavaScript projects, DuckDB-WASM and Observable’s Plot.js, in our Shiny app to create fast, flexible interactive visualizations in the browser without burdening our app’s server function. The end result is an app that can scale to more users without needing to increase compute resources.

YouTube Link


The Road to Easier Shiny App Deployments
Liam Kalita, Jumping Rivers
Abstract We’re often helping developers assess, fix, and improve their Shiny apps, and the first thing we do is see if we can deploy the app. If you can’t deploy your Shiny app, it’s a waste of time. If you can deploy it successfully, then at the very least it runs, so we’ve got something to work with.

There are a bunch of reasons why apps fail to deploy. They can be easy to fix, like hardcoded secrets, fonts, or missing libraries. Or they can be intractable and super frustrating to deal with, like manifest mismatches, resource starvation, and missing libraries.

At the end of this talk, I want you to know how to identify, investigate and proactively prevent Shiny app deployment failures from happening.

YouTube Link


What an Early 2000s Reality Show Taught Me about File Management
Reiko Okamoto, National Research Council Canada
Abstract Clutter, whether it’s physical or digital, destroys our ability to focus; home organization ideas can be extended to create a workspace where analysts feel inspired to work with data.

Ideas from home organization shows are surprisingly applicable to file management. Using a room divider to establish dedicated zones for different activities in a studio apartment is analogous to creating self-contained projects in RStudio. Likewise, swapping mismatched hangers with matching ones to tidy a closet resembles the adoption of a file naming convention to make a directory easier to navigate.

In this talk, I will share good practices in file management through the lens of home organization. We all know that clutter, whether it is in our physical space or on our machine, destroys our ability to focus. These practices will help R users of all levels create a serene, relaxing environment where they feel inspired to work with data.

Materials: https://reikookamoto.github.io/; https://github.com/reikookamoto/posit-conf-2023-neat

YouTube Link
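One concrete payoff of the file-naming-convention point above can be shown in a few lines: names that lead with ISO 8601 dates sort chronologically with no extra parsing. The file names here are invented for illustration.

```python
# Ad hoc month-day-year names (invented examples).
messy = ["report-3-1-2023.csv",    # March 1, 2023
         "report-12-15-2022.csv",  # December 15, 2022
         "report-10-2-2023.csv"]   # October 2, 2023

# The same files under a YYYY-MM-DD_description convention.
tidy = ["2022-12-15_report.csv",
        "2023-03-01_report.csv",
        "2023-10-02_report.csv"]

# ISO-dated names sort into date order automatically...
chronological = sorted(tidy)

# ...while the ad hoc names sort into a misleading order: the October 2023
# file lands first because "10" < "12" < "3" as strings.
misleading = sorted(messy)
```

This is the "matching hangers" move: a small up-front convention that makes every later `ls`, file picker, and script glob behave sensibly.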


Getting the Most Out of Git
Colin Gillespie, Jumping Rivers
Abstract Did you believe that Git would solve all of your data science worries? Instead, you’ve been plunged HEAD~1 first into merging (or is that rebasing?) chaos. Issues are ignored, branches are everywhere, main never works, and no one really knows who owns the repository.

Don’t worry! There are ways to escape this pit of despair. Over the last few years, we’ve worked with many data science teams. During this time, we’ve spotted common patterns and also common pitfalls. While one size does not fit all, there are golden rules that should be followed. At the end of this talk, you’ll understand the processes other data science teams implement to make Git work for them.

YouTube Link


How You Get Value as a 1-Person Connect Team
Sean Nguyen, S2G Ventures
Abstract Sean, a sole Posit Connect developer, shares his experience in delivering business impact. He narrates his transition from crafting one-off reports to developing and deploying robust data science web applications using Python and R with Posit Connect. Despite its common association with large enterprise teams, Sean demonstrates how Posit Connect can be effectively utilized in smaller settings. He presents his work on creating and deploying end-to-end machine learning pipelines in Python, hosting them as APIs, and seamlessly integrating with Shiny apps via Posit Connect. This talk imparts practical strategies and techniques to foster user and executive adoption of Posit Connect within lean (and large) organizations.
YouTube Link


Teaching Data Science in Adverse Circumstances: Posit Cloud and Quarto to the Rescue
Aleksander Dietrichson, Universidad de San Martin
Abstract The focus of this presentation is on the challenges faced by teachers of data science whose students are not quantitatively inclined and may face some adversity in terms of technology resources available to them and potential language barriers. I identify three main areas of challenge and show how, at Universidad Nacional de San Martín (Argentina), we addressed each area through a combination of original curriculum redesign, production of course materials appropriate for the students in question, and the use of open-source tools, including Posit products such as posit.cloud and Quarto. I show how these technologies can be used as a pedagogical tool to overcome the challenges mentioned, even on a shoestring budget.
YouTube Link


You Can Lead a Horse to Water . . . Changing the Data Science Culture for Veterinary Scientists
Jill MacKay, University of Edinburgh
Abstract A retrospective look at supporting data science skills in a research-focussed veterinary school

This is a talk about environment management, but not in the way you’re thinking. In many industries, domain-specific experts need enough understanding of data science to support their work and communicate with data scientists, but often have insufficient training in these skills, and limited time with which to obtain data science skills and practice them. This is particularly challenging for those who are interdisciplinary and have limited control over their workload, such as medics and field scientists. In this talk, an educational scientist describes the previous 10 years of supporting veterinary scientists to adopt open science practices surrounding data science. What worked, what failed miserably, and reflections on why it can be so hard to get a horse to drink.

Materials:

YouTube Link


Visualizing Data Analysis Pipelines with Pandas Tutor and Tidy Data Tutor
Sean Kross, Fred Hutchinson Cancer Center
Abstract The data frame is a fundamental data structure for data scientists using Python and R. Pandas and the tidyverse are designed around building pipelines that transform data frames. However, within these pipelines it is not always clear how each operation changes the underlying data frame. To explain each step in a pipeline, data science instructors resort to hand-drawn diagrams that illustrate the semantics of operations such as filtering, sorting, and grouping.

In this talk, I will introduce Pandas Tutor and Tidy Data Tutor, step-by-step visual representation engines of data frame transformations. Both tools illustrate the row, column, and cell-wise relationships between an operation’s input and output data frames.
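As a rough sketch of the kind of pipeline these tools visualize (this snippet is not from the talk; the data and column names are invented for illustration), the pandas code below chains the three operations mentioned above, producing the intermediate frames a tool like Pandas Tutor would render step by step:

```python
import pandas as pd

# A small frame whose intermediate states a visualizer would render
df = pd.DataFrame({
    "name":  ["ann", "bob", "cat", "dan"],
    "group": ["x", "y", "x", "y"],
    "score": [3, -1, 5, 2],
})

filtered = df[df["score"] > 0]           # filtering: drops bob's row
ordered = filtered.sort_values("score")  # sorting: reorders the remaining rows
totals = ordered.groupby("group")["score"].sum()  # grouping: one row per group

print(totals.to_dict())  # {'x': 8, 'y': 2}
```

Each assignment holds a distinct intermediate data frame, which is exactly the row- and cell-level lineage these visualization engines draw.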

YouTube Link


R! You Going?!
SherAaron Hurt, The Carpentries
Abstract 3 things to remember when starting your journey to become a data scientist

Everyone will have a different journey when becoming a data scientist. However, there are a few tips to consider to make the journey less daunting and more enjoyable. Listen as I tell my story as a data scientist and offer resources and tips to build confidence for those who are new to their journey. The tools are available; however, it is not always easy to find them.

Keywords: open science, The Carpentries, R programming language, GPS, data science journey, data science resources

Materials:

YouTube Link


dbtplyr: Bringing Column-Name Contracts from R to dbt
Emily Riederer, Capital One
Abstract starts_with(language): Translating select helpers to dbt. Translating syntax between languages transports concepts across communities. We see a case study of adapting a column-naming workflow from dplyr to dbt’s data engineering toolkit.

dplyr’s select helpers exemplify how the tidyverse uses opinionated design to push users into the pit of success. The ability to efficiently operate on names incentivizes good naming patterns and creates efficiency in data wrangling and validation.

However, in a polyglot world, users may find they must leave the pit when comparable syntactic sugar is not accessible in other languages like Python and SQL.

In this talk, I will explain how dplyr’s select helpers inspired my approach to ‘column name contracts,’ how good naming systems can help supercharge data management with packages like {dplyr} and {pointblank}, and my experience building the {dbtplyr} package to port this functionality to dbt for building complex SQL-based data pipelines.
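To make the idea of name-based select helpers concrete, here is a pure-Python sketch (the column stubs and the helper function are invented for illustration; dplyr’s and dbtplyr’s real helpers are richer than this):

```python
# Column-name contracts: encode metadata in controlled name stubs
# (e.g. "ID_" for identifiers, "DT_" for dates, "N_" for counts),
# then operate on columns by pattern instead of hand-written lists.
columns = ["ID_customer", "DT_signup", "N_orders", "N_returns", "AMT_spend"]

def starts_with(prefix, cols):
    """Mimic dplyr's starts_with() select helper over a list of names."""
    return [c for c in cols if c.startswith(prefix)]

counts = starts_with("N_", columns)
print(counts)  # ['N_orders', 'N_returns']
```

Because the naming vocabulary is a contract, downstream validation and wrangling code can target whole families of columns without knowing each name in advance.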

Materials:

YouTube Link


In-Process Analytical Data Management with DuckDB
Hannes Mühleisen, DuckDB Labs, Centrum Wiskunde & Informatica, Radboud Universiteit Nijmegen
Abstract This talk introduces DuckDB, an in-process analytical data management system that is deeply integrated into the R ecosystem.

DuckDB is an in-process analytical data management system. DuckDB supports complex SQL queries, has no external dependencies, and is deeply integrated into the R ecosystem. For example, DuckDB can run SQL queries directly on R data frames without any data transfer. DuckDB uses state-of-the-art query processing techniques like vectorised execution and automatic parallelism. DuckDB is out-of-core capable, meaning that it is possible to process datasets far bigger than main memory. DuckDB is free and open source software under the MIT license.

In this talk, we will describe the value DuckDB offers users and how it can improve their day-to-day work through automatic parallelisation, efficient operators, and out-of-core operations.

Materials:

YouTube Link


duckplyr: Tight Integration of duckdb with R and the tidyverse
Kirill Müller, cynkra
Abstract The duckplyr R package combines the convenience of dplyr with the performance of DuckDB. Better than dbplyr: Data frame in, data frame out, fully compatible with dplyr.

duckdb is the new high-performance analytical database system that works great with R, Python, and other host systems. dplyr is the grammar of data manipulation in the tidyverse, tightly integrated with R, but it works best for small or medium-sized data. The former has been designed with large or big data in mind, but currently, you need to formulate your queries in SQL.

The new duckplyr package offers the best of both worlds. It transforms a dplyr pipe into a query object that duckdb can execute, using an optimized query plan. It is better than dbplyr because the interface is “data frames in, data frames out”, and no intermediate SQL code is generated.

The talk first presents our results, a bit of the mechanics, and an outlook for this ambitious project.

Materials: https://github.com/duckdblabs/duckplyr/

YouTube Link


Siuba and duckdb: Analyzing Everything Everywhere All at Once
Michael Chow, Posit Software, PBC
Abstract Every data analysis in Python starts with a big fork in the road: which DataFrame library should I use?

The DataFrame Decision locks you into different methods, with subtly different behavior:

  • different table methods (e.g. polars .with_columns() vs pandas .assign())
  • different column methods (e.g. polars .map_dict() vs pandas .map())

In this talk, I’ll discuss how siuba (a dplyr port to Python) combines with duckdb (a crazy powerful SQL engine) to provide a unified, dplyr-like interface for analyzing a wide range of data sources, whether pandas and polars DataFrames, parquet files in a cloud bucket, or pins on Posit Connect.

Finally, I’ll discuss recent experiments to more tightly integrate siuba and duckdb.

YouTube Link


Adding a Touch of glitr: Developing a Package of Themes on Top of ggplot
Aaron Chafetz and Karishma Srikanth, USAID
Abstract Explore how our team at the US Agency for International Development (USAID) created our own data viz branding R package on top of ggplot2 and how you can too.

How do you create brand cohesion across your large team when it comes to data viz? Inspired by the BBC’s bbplot, our team at the US Agency for International Development (USAID) developed a package on top of ggplot2 to create a common look and feel for our team’s products. This effort improved not just the cohesiveness of our work, but also trustworthiness. By creating this package, we reduced the reliance on using defaults and the time spent on each project customizing numerous graphic elements. More importantly, this package provided an easier on-ramp for new teammates to adopt R. We share our journey within a federal agency developing a style guide and aim to guide and inspire other organizations who could benefit from developing their own branding package and guidance.

Materials:

YouTube Link


HTML and CSS for R Users
Albert Rapp, Ulm University
Abstract You can get the most out of popular R tools by combining them with easy-to-learn HTML & CSS commands.

It’s easy to think that R users do not need HTML and CSS. After all, R is a language designed for data analysis, right? But the reality is that these web standards are everywhere, even in R. In fact, many great tools like {ggtext}, {gt}, {shiny}, and Quarto unlock their full potential when you know a little bit of HTML & CSS. In this talk, I will demonstrate specific examples where R users can benefit from HTML and CSS and show you how to get started with these two languages.

Materials:

YouTube Link


Styling and Templating Quarto Documents
Emil Hvitfeldt, Posit Software, PBC
Abstract Quarto is a powerful engine for generating documents, slides, books, websites, and more. The default aesthetics look good, but there are times when you want and need to change how they look. This is that talk.

Whether you want your slides to stand out from the crowd, or you need your documents to fit within your corporate style guide, being able to style Quarto documents is a valuable skill.

Once you have persevered and created the perfect document, you don’t want the effort to go to waste. This is where templating comes in. Quarto makes it super easy to turn a styled document into a template to be used over and over again.

YouTube Link


A hackers guide to open source LLMs
Jeremy Howard, fast.ai
Abstract In this deeply informative video, Jeremy Howard, co-founder of fast.ai and creator of the ULMFiT approach on which all modern language models (LMs) are based, takes you on a comprehensive journey through the fascinating landscape of LMs. Starting with the foundational concepts, Jeremy introduces the architecture and mechanics that make these AI systems tick. He then delves into critical evaluations of GPT-4, illuminates practical uses of language models in code writing and data analysis, and offers hands-on tips for working with the OpenAI API. The video also provides expert guidance on technical topics such as fine-tuning, decoding tokens, and running private instances of GPT models.

As we move further into the intricacies, Jeremy unpacks advanced strategies for model testing and optimization, utilizing tools like GPTQ and Hugging Face Transformers. He also explores the potential of specialized datasets like Orca and Platypus for fine-tuning and discusses cutting-edge trends in Retrieval Augmented Generation and information retrieval. Whether you’re new to the field or an established professional, this presentation offers a wealth of insights to help you navigate the ever-evolving world of language models.

(The above summary was, of course, created by an LLM!)

For the notebook used in this talk, see https://github.com/fastai/lm-hackers.

YouTube Link


R Not Only In Production
Kara Woo, InsightRX
Abstract I will share what our team has learned from successfully integrating R in all areas of our company’s operations. InsightRX is a precision medicine company whose goal is to ensure that each patient receives the right drug at the optimal dose. At InsightRX, R is a first-class language that’s used for purposes ranging from customer-facing products to internal data infrastructure, new product prototypes, and regulatory reporting. Using R in this way has given us the opportunity to forge fruitful collaborations with other teams in which we can both learn and teach.

Join me as I share how the skills of data science and engineering can complement each other to create better products and greater impact.

YouTube Link


It’s All About Perspective: Making a Case for Generative Art
Meghan Santiago Harris, Prostate Cancer Clinical Trials Consortium - Memorial Sloan Kettering
Abstract This talk explores how to create art in the R language while highlighting some similarities between the skills required for creating generative art and those needed to perform data science tasks in R.

Because the field of data science is inherently task-oriented, it is no wonder that most people struggle to see the utility of generative art past the bounds of a casual hobby. This talk will invite the participant to learn about generative art while focusing on “why” people should create it and its potential place in data science. This talk is suitable for all disciplines and artistic abilities. Furthermore, this talk will aim to expand the participant’s perspective on generative art with the following concepts:

  • What is generative art and how can it be created in R or Python
  • Justifications for generative art within Data Science
  • Examples of programming skills that are transferable between generative art and pragmatic data science projects

Materials:

YouTube Link


How the R for Data Science (R4DS) Online Learning Community Made Me a Better Student
Lydia Gibson, California State University East Bay
Abstract Through my participation in R4DS Online Learning Community, I have advanced my R and data science skills, making me a better student than I otherwise would have been through just my studies. As a non-traditional MS Statistics student with an undergraduate background in economics, I had absolutely no experience with the R programming language prior to pursuing my Master’s degree. In July 2021, with hopes of getting a headstart on learning R before beginning my degree program, I joined the R4DS Slack Workspace. Along with helping to improve my programming skills, R4DS has connected me with scholarships, mentorship, and other opportunities, and I think that it would be beneficial for other students to know about this great resource.
YouTube Link


Solving a Secure Geocoding Problem (That Hardly Anybody Has)
Tesla DuBois, Fox Chase Cancer Center/Temple University
Abstract Due to data security concerns, the strictest health researchers won’t send patient addresses to remote servers for geocoding. The only existing methods for offline geocoding are expensive, cumbersome, or require working with code - all limiting factors for many researchers. So, a couple of classmates and I made a standalone desktop application using shell, Docker, PostGIS, and Python to geocode addresses through a simple GUI without ever sending them off the local machine. Come for the technical ins and outs and stay for the anecdotes about how my R background played into the daunting, frustrating, but ultimately successful task of creating a data science tool using unfamiliar technologies.
YouTube Link


Small Package, Broad Impact: How I Discovered the Ultimate New Hire Hack
Trang Le, Bristol Myers Squibb
Abstract Onboarding new hires can be a challenging process, but taking a problem-focused approach can make it more meaningful and rewarding. In this talk, I will share how I discovered the ultimate new hire hack by creating two small packages that gave me the confidence I needed when I started at BMS. Through building these packages, I not only learned R things like using bslib and making font files available for published dashboards, but also gained a deep understanding of my company’s internal systems and workflows, and connected with my team via lots of questions. The resulting packages are still heavily used today.

Join me to discover how small packages can have a broad impact and what hiring managers can do to help.

YouTube Link


Data Science in Production: The Way to a Centralized Infrastructure
Oliver Bracht, eoda GmbH
Abstract In this talk, the success story of Covestro’s Posit infrastructure is presented. The problem for the leading German material manufacturer was that no common development environment existed. With the help of eoda and Posit, a replicable, centralized development environment for R and Python was created. Although R and Python represent the core of the infrastructure, multiple languages and tools are unified within it. In addition to supporting collaboration among Covestro’s data science teams, compliance guidelines could also be better fulfilled. The staging architecture provides developers with a concept for testing and going live with their products. This project presents a best-practice approach to a data science infrastructure, using Covestro as an example.
YouTube Link


Matching Tools to Titans: Tailoring Posit Workbench for Every Cloud
James Blair, Posit PBC
Abstract In an era of diverse cloud platforms, leveraging tools effectively is paramount. This talk highlights the adaptability of Posit Workbench within leading cloud platforms. Delve into strategic integrations, understand key challenges, and uncover practical solutions. By the end, attendees will be equipped with insights to harness Posit Workbench’s capabilities seamlessly across varied cloud environments.
YouTube Link


Connect on Kubernetes: Content-level Containerization
Kelly O'Briant, Posit Software, PBC
Abstract Running Connect with off-host content execution in Kubernetes is very cool and allows you to enable some powerful and sophisticated workflows. The question is, do you really need it? How do you evaluate and decide? Let’s have a candid conversation about whether Connect content execution on Kubernetes is right for you and your organization.

Moving to Kubernetes will introduce complexity, so it’s important to have a strong motivating reason for making the switch. This talk will introduce new Connect features that are made possible by content-level containerization.

YouTube Link


GitHub Copilot integration with RStudio, it’s finally here!
Tom Mock, Posit - Product Manager for Workbench
Abstract This talk closes issue #10148, “GitHub Copilot integration with RStudio”, the most upvoted feature request in RStudio’s history. Code-generating AI tools like GitHub Copilot promise an “AI pair programmer that offers autocomplete-style suggestions as you code”. For the first time, we’ll show a native integration of Copilot into RStudio, helping to build on that promise by providing AI-generated “ghost text” autocompletion for R and other languages. I’ll also provide a comparison of Copilot’s “ghost text” to a chat-style interface in RStudio via the {chattr} package from the Posit mlverse team.

To make the most of these new features, I’ll walk through some examples of how sharing additional context, comments, code, and other “prompt engineering” can help you go from code-generating AI tools that feel like an annoying backseat driver to an experienced copilot. We’ll close with a robust end-to-end example of how these new RStudio integrations and packages can help you be a more productive developer.

YouTube Link


Using R, Python, and Cloud Infrastructure to Battle Aquatic Invasive Species
Uli Muellner and Nicholas Snellgrove, Epi-interactive
Abstract Invasive species are a huge threat to lake ecosystems in Minnesota. With over 10,000 water bodies across the state, having up-to-date data and decision support is critical. Researchers at the University of Minnesota have created four complex R and Python models to support lake managers, all pulled together and presented with the most recent infestation data available.

Come along with us to see how we connected these models in the AIS Explorer, a decision support application built in Shiny to help prioritize risks and place watercraft inspectors, using tools like OCPU and cloud services like Lambda, EventBridge, and AWS S3.

YouTube Link


FOCAL Point: Utilizing Python, R, and Shiny to Capture, Process, and Visualize Motion
Justin Markel, PING
Abstract One of the fastest movements in modern sports is a golf swing. Capturing this motion using a high-speed camera system creates many unique challenges in processing, analyzing, and visualizing the thousands of data points that are generated. These spatial coordinates can be quickly translated through Python scripts to well-known, industry-specific performance metrics and graphics in Shiny. Down the line, R utilities aid more complicated analyses and optimizations, driving new product innovations.

This talk will cover our company’s process of implementing these tools into our workflow and highlight key program features that have helped successfully combine these applications for users with a variety of technical backgrounds.

YouTube Link


Combining R and Python for the Belgian Justice Department
Thomas Michem, MichaniX
Abstract We build a great case on how to combine R and Python in a production environment.

The justice department’s back office monitors the smooth processing of all traffic fines in Belgium. They gather that data from all police departments and check whether any anomalies occur.

The back office monitors this through a Shiny application where they can see traffic-sign indicators showing the status of the whole operation. That status is built using Python scripts that check whether the daily number of fines is in line with expectations, and the results of those checks are delivered to the front-end Shiny application via a Python Flask API.

YouTube Link


Validating and Testing R Dataframes with Pandera via reticulate - R-Python Interoperability
Niels Bantilan, Union.ai
Abstract Data science and machine learning practitioners work with data every day to analyze and model them for insights and predictions. A major component of any project is data quality, which is a process of cleaning, and protecting against flaws in data that may invalidate the analysis or model. Pandera is an open source data testing toolkit for dataframes in the Python ecosystem: but can it validate R dataframes?

This talk is composed of three parts: first I’ll describe what data testing is and motivate why you need it. Then, I’ll introduce the iterative process of creating and refining dataframe schemas in Pandera. Finally, I’ll demonstrate how to use it in R with the reticulate package using a simple modeling exercise as an example.
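As an illustration of what data testing means in practice, here is a minimal pure-Python sketch of schema validation; this is not pandera’s API, just the underlying idea, with invented column names and rules:

```python
# A toy version of what a dataframe schema check does: each column gets
# a type and a predicate, and validation reports every violation found.
# (Illustrative only -- pandera's real schemas are far richer than this.)
schema = {
    "age":    (int,   lambda v: 0 <= v <= 120),
    "income": (float, lambda v: v >= 0),
}

def validate(rows, schema):
    """Return (row_index, column) pairs for every failed check."""
    errors = []
    for i, row in enumerate(rows):
        for col, (typ, check) in schema.items():
            if not isinstance(row[col], typ) or not check(row[col]):
                errors.append((i, col))
    return errors

rows = [{"age": 34, "income": 51000.0},
        {"age": -2, "income": 48000.0}]   # invalid age in row 1
print(validate(rows, schema))  # [(1, 'age')]
```

The iterative workflow the talk describes amounts to refining a schema like this until it encodes everything your analysis assumes about the data.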

YouTube Link


Towards the Next Generation of Shiny UI
Carson Sievert, Posit Software, PBC
Abstract Create awesome looking and feature rich Shiny dashboards using the bslib R package.

Shiny recently celebrated its 10th birthday, and since its birth has grown tremendously in many areas; however, a hello-world Shiny app still looks roughly like it did 10 years ago. The bslib R package helps solve this problem by making it easy to apply modern, customizable styling to your Shiny apps, R Markdown / Quarto documents, and more. bslib also provides dashboard-focused UI components like expandable cards, value boxes, and sidebar layouts to help you create delightful Shiny dashboards.

Materials:

YouTube Link


The Power of Prototyping in R Shiny: Saving Millions by Building the Right Tool
Maria Grycuk, Appsilon
Abstract The development of software can be costly and time-consuming. If the end users are not involved in the process from the start, the tool we build may not meet their needs. In this presentation, I will discuss how prototyping in Shiny can help you build the right tool and save you from spending millions of dollars on a tool no one will use. I will explore the advantages of using Shiny for prototyping, particularly its ability to rapidly build interactive applications. I will also discuss how to design effective prototypes, including techniques for gathering user feedback and using that feedback to refine your tool. I will emphasize the importance of presenting real-life data, particularly when building data-driven tools.
YouTube Link


ShinyUiEditor: From Alpha to Powerful Shiny App Development Tool
Nick Strayer, Posit Software, PBC
Abstract Since its alpha debut at last year’s conference, the ShinyUiEditor has experienced continuous development, evolving into a powerful tool for crafting Shiny app UIs. Some key enhancements include the integration of new bslib components and the editor’s ability to create or navigate to existing server bindings for inputs and outputs.

In addition to new features, the editor is now available as a VSCode extension enabling it to integrate smoothly into more developers’ workflows. This talk will showcase how these new capabilities empower users to efficiently create visually appealing and production-ready applications with ease.

YouTube Link


How to Help Developers Make Apps Users Love
Michał Parkoła, Appsilon
Abstract There are many resources that can help you design better apps.

But what if your org creates many apps?

Scaling good design to larger groups dials the challenge up to 11.

In this talk, I will share how we approach the problem at Appsilon.

YouTube Link


Sustainable Growth of Global Communities: R-Ladies’ Next Ten Years
Riva Quiroga, Programming Historian
Abstract In this talk we share how good programming practices inspire the way we manage the R-Ladies community in order to make it sustainable.

R-Ladies’ first ten years were about growing the community: from being just one chapter in 2012 to becoming a global organization in 2016, and now fostering more than 230 chapters worldwide. But how can we face the challenges of growing an organization based solely on volunteer work?

In this talk, we discuss how good programming practices –such as modularity, refactoring, and testing– inspire the way we see the sustainable management of an ever-growing community. To that end, we will present our most recent efforts at creating and documenting workflows, distributing the workload, and automating tasks that allow volunteers to focus their time where it is most needed.

After watching this talk, you will get some ideas on how to support volunteers in your own community or project, and on how to use GitHub Actions to automate workflows and tasks.

Learn more and join at: https://rladies.org/

YouTube Link


How to Keep Your Data Science Meetup Sustainable
Ted Laderas, DNAnexus, Inc.
Abstract Many data science meetup organizers struggle with burnout. It can be daunting to plan a meetup schedule, especially with the added burden of work and life.

In this talk, I want to highlight some strategies for keeping your data science meetup sustainable. Specifically, I want to highlight the role of self-care in growing and sustaining your group, as well as low-key activities like a data scavenger hunt, watching videos together, styling plots together, and sharing useful tidyverse functions.

By making it easy for your members to contribute and empowering them, it takes a lot of the burden off you as an organizer. You don’t need to reinvent the wheel for meetups or have famous guests for each one. Let’s start the conversation and make your meetup last.

YouTube Link


Side Effects of a Year of Blogging
Millie Symns, Thinx Inc.
Abstract A big part of being in the R community is sharing your knowledge in different forums, no matter how big or small. So what better way to do that than a blog? And what better way than using R and Posit products to build and maintain that blog and website? This was the route I took to challenge myself in putting myself out there more in the community to find my voice, share my knowledge and learn new things.

In this talk, I will reflect on the lessons learned and gains from spending the past year blogging and sharing my website for all to see. The side effects include professional and personal benefits, from a clear profile of my skills to the progression of the development of my art. You may leave inspired to try the challenge for yourself.

YouTube Link


Black Hair and Data Science Have More in Common Than You Think
Kari Jordan, The Carpentries
Abstract Data science is often difficult to define because of its many intersections, including statistics, programming, analytics, and other domain knowledge. Would you believe it if I told you Black hair and data science have much in common?

This talk is for those considering learning about, studying, or pursuing data science. In it, Dr. Kari L. Jordan draws parallels between approaches to caring for Black hair and approaches to learning data science. We start with the roots and end by picking the right tools and products to maintain our coiffure.

YouTube Link


It’s a Great Time to be an R Package Developer!
Jenny Bryan and Hadley Wickham, Posit Software, PBC
Abstract (Due to unforeseen circumstances, Hadley Wickham presented this talk “slide karaoke” style, from materials prepared by Jenny Bryan.)

In R, the fundamental unit of shareable code is the package. As of March 2023, there were over 19,000 packages available on CRAN. Hadley Wickham and I recently updated the R Packages book for a second edition, which brought home just how much the package development landscape has changed in recent years (for the better!).

In this talk, I highlight recent-ish developments that I think have a great payoff for package maintainers. I’ll talk about the impact of new services like GitHub Actions, new tools like pkgdown, and emerging shared practices, such as principles that are helpful when testing a package.

YouTube Link


Becoming an R Package Author (or How I Got Rich* Responding to GitHub Issues)
Matt Herman, Council of State Governments Justice Center
Abstract The transition from analyzing data in R to making packages in R can feel like a big step. Writing code to clean data or make visualizations seems categorically different from building robust “software” on which other people rely.

In this talk, I’ll show why that distinction is not necessarily true by discussing my personal experience from learning R in graduate school to reporting bugs on GitHub to becoming a co-author of the tidycensus package and a practicing data scientist. The positive and supportive R community on GitHub, Twitter, and elsewhere contributes to why anyone who writes R code can become a package author.

YouTube Link


Commit to Change: How to Increase Accessibility in Your Favorite Open Source Projects
Rose Franzen, Children's Hospital of Philadelphia
Abstract Explore accessibility for data scientists by uncovering some common barriers in open source tools with simple fixes that anyone can implement.

Dive into the often-overlooked world of accessibility for developers and data scientists! Uncover common accessibility barriers in open source tools, and learn simple steps to address them. Whether you’re a seasoned maintainer or a total novice, you’ll walk away with clear action items to implement right away. Join the movement of individuals championing the next frontier of disability inclusion in technology, shaping a more equitable future for all!

Materials: https://github.com/franzenr/commit-to-change

YouTube Link


Field Guide to Writing Your First R Package
Fonti Kar, University of New South Wales
Abstract I recall writing my first package being an intimidating task. In my talk, I will share how a Biologist’s mindset can make R package writing more approachable. This talk is an encouragement and a gentle stroll through the package building process. I will show you ways to be curious when you get stuck and how to prepare for the unexpected. I hope sharing my perspective will help others see package development as wonderful as the natural world and dispel any hesitation to start!
YouTube Link


Data Visualization with Seaborn
Michael Waskom, Flatiron Health
Abstract Seaborn is a Python library for statistical data visualization. After nearly a decade of development, seaborn recently introduced an entirely new API that is more explicitly based on a formal grammar of graphics. My talk will introduce this API and contrast it with the classic seaborn interface, sharing insights about the influence of the grammar of graphics on the ergonomics and maintainability of data visualization software.
YouTube Link


Grammar of Graphics in Python with Plotnine
Hassan Kibirige, A Plus Associates, Posit PBC (Contractor)
Abstract {plotnine} brings the elegance of {ggplot2} to the Python programming language. Learn about the Grammar of Graphics and get a feel for why it is an effective way to create statistical graphics.

ggplot2 is one of the most loved visualisation libraries. It implements a Grammar of Graphics system, which requires one to think about data in terms of columns of variables and how to transform them into geometric objects. It is elegant and powerful. This is a talk about plotnine, which brings the elegance of ggplot2 to the Python programming language. It is an invitation to learn about the Grammar of Graphics system and to appreciate it. It will include some tips on how to avoid common frustrations as you learn the system.

YouTube Link


Diversify Your Career with Shiny for Python
Gordon Shotwell, Posit Software, PBC
Abstract A few years ago my company made a sudden shift from R to Python which was quite bad for my career because I didn’t really know Python. The main issue was that I couldn’t find a niche that allowed me to use my existing knowledge while learning the new language.

Shiny for Python is a great niche for R users because none of the Python web frameworks can do what Shiny can do. Additionally, almost all of your knowledge of the R package is applicable to the Python one.

This talk will provide an overview of the Python web application landscape and articulate what Shiny adds to this landscape, and then go through the five things that R users need to know before developing their first Shiny for Python application.

YouTube Link


Thanks, I Made It with Quartodoc
Isabel Zimmerman, Posit Software, PBC
Abstract When Python package developers create documentation, they typically must choose between mostly auto-generated docs or writing all the docs by hand. This is problematic since effective documentation has a mix of function references, high-level context, examples, and other content.

Quartodoc is a new documentation system that automatically generates Python function references within Quarto websites. This talk will discuss pkgdown’s success in the R ecosystem and how those wins can be replicated in Python with quartodoc examples. Listeners will walk away knowing more about what makes documentation delightful (or painful), when to use quartodoc, and how to use this tool to make docs for a Python package.
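For orientation, quartodoc is configured inside a Quarto website’s `_quarto.yml`; a minimal sketch might look like the fragment below (the package name and function names are placeholders, not from the talk):

```yaml
# Hypothetical _quarto.yml fragment documenting a package named "mypkg"
project:
  type: website

quartodoc:
  package: mypkg            # the installed Python package to document
  sections:
    - title: Core functions
      desc: The main user-facing API.
      contents:
        - load_data         # each entry becomes a rendered reference page
        - summarize
```

Running `quarto render` then generates the function reference pages alongside the hand-written pages of the site.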

YouTube Link


We Converted our Documentation to Quarto
Melissa Van Bussel, Statistics Canada
Abstract 🚀 Elevate your Quarto projects to new heights with these practical tips and tricks! 💡

“Wiki”, “User Guide”, “Handbook” – whatever you call yours, we converted ours to Quarto!

A year ago, my team’s documentation, which had been created using Microsoft Word, was large and lacked version control. Scrolling through the document was slow, and, due to confidentiality reasons, only one person could edit it at a time, which was a significant challenge for our team of multiple developers. After realizing we needed a more flexible solution, we successfully converted our documentation to Quarto.

In this talk, I’ll discuss our journey converting to Quarto, the challenges we faced along the way, and tips and tricks for anyone else who might be looking to adopt Quarto too.

Slides: https://melissavanbussel.quarto.pub/posit-conf-2023; Code for slides: https://github.com/melissavanbussel/posit-conf-2023; My YouTube: https://www.youtube.com/c/ggnot2; My website: https://www.melissavanbussel.com/; My Twitter: https://twitter.com/melvanbussel; My LinkedIn: https://www.linkedin.com/in/melissavanbussel/

YouTube Link


Extending Quarto
Richard Iannone, Posit PBC
Abstract What are Quarto shortcode extensions? Think of them as powerful little programs you can run in your Quarto docs. I won’t show you how to build a shortcode extension during this talk; rather, I’m going to take you on a trip across this new ecosystem of shortcode extensions that people have already written. For example, I’ll introduce you to the fancy-text extension for outputting nicely-formatted versions of fancy strings such as LaTeX and BibTeX; you’ll learn all about the fontawesome, lordicon, academicons, material-icons, and bsicons shortcode extensions that let you add all sorts of icons. This is only a sampling of the shortcode extensions I will present; there will be many other inspiring examples as well.
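As a taste of how these extensions are used, the sketch below shows the typical pattern for the fontawesome extension (assuming it has been added to your project; the icon choices are just examples):

```markdown
Install the extension once per project from a terminal:

    quarto add quarto-ext/fontawesome

Then call the shortcode anywhere in a .qmd document:

This report was built with Quarto {{< fa heart >}} and is on GitHub {{< fa brands github >}}.
```

Each extension defines its own shortcode name and arguments, but the `{{< name args >}}` calling convention is the same across the ecosystem.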
YouTube Link


Never again in outer par mode: making next-generation PDFs with Quarto and typst
Carlos Scheidegger, Posit, PBC
Abstract Quarto 1.4 will introduce support for Typst. Typst is a brand-new open-source typesetting system built from scratch to incorporate the lessons we have learned over almost half a century of high-quality computer typesetting that TeX and LaTeX have enabled. If you’ve ever had to produce a PDF with Quarto and got stuck handling an inscrutable error message from LaTeX, or wanted to create a new template but were too intimidated by LaTeX’s arcane syntax, this talk is for you. I’ll show you why we need an alternative to TeX and LaTeX, and why it will make Quarto even better.
YouTube Link


Unlock the Power of DataViz Animation and Interactivity in Quarto
Deepsha Menghani, Microsoft
Abstract Plot animated and interactive visualizations with Plotly and Crosstalk in Quarto using R. In this intro to Plotly & Crosstalk in R, code examples show how to integrate dashboard elements into Quarto with animated plots, interactive widgets (checkboxes), and linked plots via brushing.

This talk showcases how to use packages, such as Plotly and Crosstalk, to create interactive data visualizations and add dashboard-like elements to Quarto. Using a fun dataset available through the “Richmondway” package, we examine the number of times Roy Kent uses salty language throughout all seasons of “Ted Lasso.” We illustrate this using animated plots, interactive selection widgets such as checkboxes, and by linking two plots with brushing capabilities.

YouTube Link


Using Data to Protect Traditional Lifeways
Angie Reed, Penobscot Indian Nation
Abstract The spirit of Penobscot Nation’s work to protect the health of their relative, the Penobscot River, is embodied in the Penobscot water song which says “Water, we love you, thank you so much water, we respect you.” Because the Penobscot River is not a natural resource - she is a relative, family - this song describes the foundation of our efforts to protect her health and well-being. The identity of Penobscot people cannot be disconnected from the river, and protecting this traditional lifeway is at the heart of our work.

For over a decade we have used R to manage, transform, analyze, and visualize data, and the free, open-source Posit products help us leave a legacy of good data management and the ability to share results with Penobscot Nation citizens. You will learn more about how our use of R has helped us achieve more stringent protections for the Penobscot River and how we engage young people in every step of this work. We are also part of a larger network of tribal environmental professionals, working together to learn R and share data and insights. We will give you information about how you can volunteer to help expand the network of folks providing technical assistance on any R and RStudio related topics.

YouTube Link


Democratizing Access to Education Data
Erika Tyagi, Urban Institute
Abstract Learn how the Urban Institute is making high-quality data more accessible through the Education Data Portal.

Every year, government agencies release large amounts of data on schools and colleges, but this information is scattered across various websites and is often difficult to use. To make these data more accessible, the Urban Institute built the Education Data Portal, a freely available one-stop shop for harmonized data and metadata for nearly all major federal education datasets. In this talk, we’ll demonstrate how the portal works and share lessons we’ve learned about making data accessible to users with varying technical skills and preferred programming languages.

The Urban Institute’s Education Data Portal: https://educationdata.urban.org

YouTube Link


Take it in Bits: Using R to Make Eviction Data Accessible to the Legal Aid Community
Logan Pratico, Legal Services Corporation
Abstract One in five low-income renter households in the US experienced falling behind on rent or being threatened with eviction in 2021. Yet most are unrepresented when facing eviction in court. The complex and fast-paced legal system obscures access to timely information, leaving tenants without assistance.

In this talk, I discuss the Civil Court Data Initiative’s use of R alongside AWS Cloud and SQL to analyze disaggregated eviction records. I focus on the integration of R Markdown with Amazon Athena and EC2 to create weekly eviction reports across 20 states for legal aid groups working to assist tenants. The upshot: accessible eviction data to help legal aid providers better address local legal needs.

YouTube Link


Open Source Property Assessment: Tidymodels to Allocate $16B in Property Taxes
Nicole Jardine and Dan Snow, Cook County Assessor's Office
Abstract How the Cook County Assessor’s Office uses R and tidymodels for its residential property valuation models.

The Cook County Assessor’s Office (CCAO) determines the current market value of properties for the purpose of property taxation. Since 2020, the CCAO has used R, tidymodels, and LightGBM to build predictive models that value Cook County’s 1.5 million residential properties, which are collectively worth over $400B. These predictive models are open-source, easily replicable, and have significantly improved valuation accuracy and equity over time.

Join CCAO Chief Data Officer Nicole Jardine and Director of Data Science Dan Snow as they walk through the CCAO’s modeling process, share lessons learned, and offer a sneak peek at changes planned for the 2024 reassessment of Chicago.

Materials: https://github.com/ccao-data

YouTube Link


Hitting the Target(s) of Data Orchestration
Alexandros Kouretsis, Appsilon
Abstract We are living at a time when the size of datasets can be overwhelming. Add to this that processing them involves linking together different computing systems and software, and integrating dynamically changing reference data, and you surely have a problem. Reproducibility, traceability, and transparency have left the building.

Here is where Posit Connect, along with the vast R ecosystem, comes to save the day, allowing the creation of reproducible pipelines. In this presentation, I will share my first-hand experience: in particular, how we used {targets} on Posit Connect combined with AWS technologies in a bioinformatics pipeline. The result? An effective and secure workflow orchestration that is scalable and advances knowledge.

YouTube Link


Using R to develop production modeling workflows at Mayo Clinic
Brendan Broderick, Mayo Clinic
Abstract Developing workflows that help train models and also help deploy them can be a difficult task. In this talk I will share some tools and workflow tips that I use to build production model pipelines using R. I will use a project of predicting patients who need specialized respiratory care after leaving the ICU as an example. I will show how to use the targets package to create a reproducible and easy to manage modeling and prediction pipeline, how to use the renv package to ensure a consistent environment for development and deployment, and how to use plumber, vetiver, and shiny applications to make the model accessible to care providers.
YouTube Link


Automating the Dutch National Flu Surveillance for Pandemic Preparedness
Patrick van den Berg, RIVM
Abstract The next pandemic may be caused by a flu strain, and with thousands of patients with the flu in Dutch hospitals annually it is important to have accurate and current data. The National Institute for Public Health and the Environment of the Netherlands (RIVM) collects and processes flu data to achieve pandemic preparedness. However, the flu reporting process used to be very laborious, stealing precious time from epidemiologists. In our journey of automating this data pipeline we learned that collaboration was the most important factor in getting to a working system. This talk will be at the cross-section of data science and epidemiology and will provide you with a valuable opportunity to learn from our experiences.
YouTube Link


Running R-Shiny without a Server
Joe Cheng, Posit Software, PBC
Abstract A year ago, Posit announced Shinylive, a deployment mode of Shiny that lets you run interactive applications written in Python without actually running a Python server at runtime. Instead, Shinylive turns Shiny for Python apps into pure client-side apps, running on a pure client-side Python installation.

Now, that same capability has come to Shiny for R, thanks to the webR project.

In this talk, I’ll show you how you can get started with Shinylive for R, and why this is more interesting than just cheaper app hosting. I’ll talk about some of the different use cases we had in mind for Shinylive, and help you decide if Shinylive makes sense for your app.

YouTube Link


Magic with WebAssembly and webR
George Stagg, Posit, PBC
Abstract Earlier this year the initial version of webR was released and users have begun building new interactive experiences with R on the web. In this talk, I’ll discuss webR’s TypeScript library and what it is able to do. The library allows users to interact with the R environment directly from JavaScript, which enables manipulation tricks that seem like magic. I’ll begin by describing how to move objects from R to JS and back again, and discuss the technology that makes this possible. I’ll continue with more advanced manipulation, such as invoking R functions from JS and talk about why you might want to do so. Finally, I’ll describe how messages are sent over webR’s communication channel and explain how this enables webR to work with Shinylive.
YouTube Link


AI and Shiny for Python: Unlocking New Possibilities
Winston Chang, Posit, PBC
Abstract In the past year, people have come to realize that AI can revolutionize the way we work. This talk focuses on using AI tools with Shiny for Python, demonstrating how AI can accelerate Shiny application development and enhance its capabilities. We’ll also explore Shiny’s unique ability to interface with AI models, offering possibilities beyond Python web frameworks like Streamlit and Dash. Learn how Shiny and AI together can empower you to do more, and do it faster.
YouTube Link


Large Language Models in RStudio
James Wade, Dow
Abstract Large language models (LLMs), such as ChatGPT, have shown the potential to transform how we code. As an R package developer, I have contributed to the creation of two packages – gptstudio and gpttools – specifically designed to incorporate LLMs into R workflows within the RStudio environment.

The integration of ChatGPT allows users to efficiently add code comments, debug scripts, and address complex coding challenges directly from RStudio. With text embedding and semantic search, we can teach ChatGPT new tricks, resulting in more precise and context-aware responses. This talk will delve into hands-on examples to showcase the practical application of these models, as well as offer my perspective as a recent entrant into public package development.

YouTube Link


epoxy: Super Glue for Data-driven Reports and Shiny Apps
Garrick Aden-Buie, Posit Software, PBC
Abstract R Markdown, Quarto, and Shiny are powerful frameworks that allow authors to create data-driven reports and apps. But truly excellent reports require a lot of work in the final steps to get numerical and stylistic formatting just right.

{epoxy} is a new package that uses {glue} to give authors templating superpowers. Epoxy works in R Markdown and Quarto, in markdown, LaTeX, and HTML outputs. It also provides easy templating for Shiny apps for dynamic data-driven reporting.

Beyond epoxy’s features, this talk will also touch on tips and approaches for data-driven reporting that will be useful to a wide audience, from R Markdown experts to the Quarto and Shiny curious.
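To ground the idea, epoxy builds on {glue}’s string interpolation. A minimal sketch of the underlying templating pattern (plain {glue} rather than epoxy’s chunk engine, with made-up numbers) looks like:

```r
library(glue)

# Toy summary values for illustration
stats <- list(n = 1284, mean_price = 23.4567)

# R expressions inside { } are evaluated and spliced into the text
glue("We analyzed {stats$n} listings with an average price of ${round(stats$mean_price, 2)}.")
```

Epoxy layers on top of this the knitr chunk engines and formatting transformers that keep report prose and computed values in sync.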

YouTube Link


Can I Have a Word?
Ellis Hughes, GSK
Abstract Since its release, {gt} has won over the hearts of many due to its flexible and powerful table-generating abilities. However, in cases where office products were required by downstream users, {gt}’s potential remained untapped. That all changed in 2022 when Rich Iannone and I collaborated to add Word documents as an official output type. Now, data scientists can engage stakeholders directly, wherever they are.

Join me for an upcoming talk where I’ll share my excitement about the new opportunities this update presents for the R community as well as future developments we can look forward to.

YouTube Link


Motley Crews: Collaborating with Quarto
Susan McMillan, Wyl Schuth, and Michael Zenz, AIM
Abstract Adoption of Quarto for document creation has transformed the collaborative workflow for our small higher-education analytics team. Historically, content experts wrote in Word documents and data analysts used R for statistics and graphics. Specialization in different software tools created challenges for producing collaborative analytic reports, but Quarto has solved this problem. We will describe how we use Quarto for writing and editing text, embedding statistical analysis and graphics, and producing reports with a standard style in multiple formats, including web pages.
YouTube Link


Parameterized Quarto Reports Improve Understanding of Soil Health
Jadey Ryan, Washington State Department of Agriculture
Abstract Learn how to use R and Quarto parameterized reporting in this four-step workflow to automate custom HTML and Word reports that are thoughtfully designed for audience interpretation and accessibility.

Soil health data are notoriously challenging to tidy and effectively communicate to farmers. We used functional programming with the tidyverse to reproducibly streamline data cleaning and summarization. To improve project outreach, we developed a Quarto project to dynamically create interactive HTML reports and printable PDFs. Custom to every farmer, reports include project goals, measured parameter descriptions, summary statistics, maps, tables, and graphs.

Our case study presents a workflow for data preparation and parameterized reporting, with best practices for effective data visualization, interpretation, and accessibility.
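For readers new to parameterization, the general mechanics (not this project’s actual code) boil down to declaring parameters in the report’s YAML header and overriding them at render time; the parameter name below is invented:

```yaml
# Hypothetical YAML header of report.qmd
title: "Soil Health Report"
format: html
params:
  farm_name: "Example Farm"   # default value, overridden per render
```

Inside the document, R chunks read the current value as `params$farm_name`, and a loop such as `quarto::quarto_render("report.qmd", execute_params = list(farm_name = f))` produces one customized report per farmer.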

Talk materials: https://jadeyryan.com/talks/2023-09-25_posit_parameterized-quarto/

YouTube Link


It’s Abstractions All the Way Down …
JD Long, RenaissanceRe
Abstract Abstractions rule everything around us. JD Long talks about abstractions from the board room to the silicon.

Over 20 years ago Joel Spolsky famously wrote, “All non-trivial abstractions, to some degree, are leaky.” Unsurprisingly this has not changed. However, we have introduced more and more layers of abstraction into our workflows: Virtual Machines, AWS services, WASM, Docker, R, Python, data frames, and on and on. But then on top of the computational abstractions we have people abstractions: managers, colleagues, executives, stakeholders, etc.

JD’s presentation will be a wild romp through the mental models of abstractions and discuss how we, as technical analytical types, can gain skill in traversing abstractions and dealing with leaks.

Materials: https://github.com/CerebralMastication/Presentations/tree/master/2023_posit-conf

YouTube Link


dplyr 1.1.0 Features You Can’t Live Without
Davis Vaughan, Posit Software, PBC
Abstract Did you enjoy my clickbait title? Did it work? Either way, welcome!

The dplyr 1.1.0 release included a number of new features, such as:

  • Per-operation grouping with .by
  • An overhaul to joins, including new inequality and rolling joins
  • New consecutive_id() and case_match() helpers
  • Significant performance improvements in arrange()

Join me as we take a tour of this exciting dplyr update, and learn how to use these new features in your own work!
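For a quick taste of the first and third bullets, here is a small sketch using a made-up toy tibble:

```r
library(dplyr)

sales <- tibble(
  region = c("N", "N", "S", "S"),
  amount = c(10, 20, 5, 15)
)

# Per-operation grouping: .by groups for this one mutate() only,
# so no group_by()/ungroup() pair is needed
sales |>
  mutate(share = amount / sum(amount), .by = region)

# case_match(): a switch-like recoding of values
sales |>
  mutate(region_label = case_match(region, "N" ~ "North", "S" ~ "South"))
```

The result of the first pipeline is ungrouped, which avoids the classic bug of forgetting to `ungroup()` before the next step.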

YouTube Link


What’s New in the Torch Ecosystem
Daniel Falbel, Posit, PBC
Abstract torch is an R port of PyTorch, a scientific computing library that enables fast and easy creation and training of deep learning models.

In this talk, you will learn about the latest features and developments in torch, such as luz, a higher-level interface that simplifies your model training code, and vetiver, a new integration that allows you to deploy your torch models with just a few lines of code. You will also see how torch works well with other R packages and tools to enhance your data science workflow. Whether you are new to torch or already an experienced user, this talk will show you how torch can help you tackle your data science challenges and inspire you to build your own models.

YouTube Link


Why You Should Add Logging To Your Code (and make it more helpful)
Daren Eiri, Director of Data Science
Abstract Learn how the log4r package can help you better understand the errors your code may produce, and how to get promptly alerted to severe errors by leveraging cloud monitoring solutions like Azure Monitor or AWS CloudWatch.

When an error happens in your API, Shiny app, or Quarto document, it is not always clear which line of code you need to look at, and the error messages aren’t always helpful. By walking through a simple API example, I show how you can use logging packages like log4r to provide error messages that make sense to you. I also show how you can use cloud-based data collection platforms like Azure Monitor or AWS CloudWatch to set up alerts, so you can get notified by email or text message about those severe errors that you need to be immediately aware of.

Gain more visibility into the health of your code by incorporating logging and pushing your logs to the cloud.
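As a flavor of the pattern described above, a minimal log4r setup might look like the sketch below (the messages are invented for illustration):

```r
library(log4r)

# A logger that ignores anything below WARN and writes to the console
lg <- logger(threshold = "WARN", appenders = console_appender())

debug(lg, "Fetching records...")       # suppressed: below the threshold
warn(lg, "Response took 4.2 seconds")  # printed with level and timestamp
error(lg, "API returned status 500")
```

Swapping `console_appender()` for `file_appender("app.log")`, or an appender that ships logs to a cloud platform, changes where messages land without touching the logging calls themselves.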

Materials: https://dareneiri.github.io/positconf2023/

YouTube Link


The People of Posit: Bringing Personality to R Packages
JP Flores and Sarah Parker, University of North Carolina at Chapel Hill
Abstract The R programming language offers the versatility to perform statistical analyses, create publication-ready plots, and render high-quality reports and presentations. Despite having this environment of indispensable tools, it can be daunting for a beginner-level programmer to get started. Luckily, the Posit community is one of a kind and values inclusivity, collaboration, and empathy. By putting a face to the R packages we use on a daily basis, we hope to make every programmer feel included and capable. We want to inspire attendees to create their own projects or packages, connect with others inside and outside of their field of expertise, and challenge themselves to learn something new, knowing the community is right there to support them.

Materials: http://www.sarmapar.com/people_of_posit/

YouTube Link


CI/CD Pipelines - Oh, the Places You’ll Go!
Trevor Nederlof, Posit Software, PBC
Abstract Data scientists are creating incredibly useful data products at an accelerating rate. These products are consumed by others who expect them to be accurate, reliable, and timely, promises that often go unfulfilled. In this talk, we will explore how to use common CI/CD pipeline tools already within reach of attendees to automatically test and deploy their apps, APIs, and reports.
YouTube Link


Why You Should Stop Networking and Start Making Friends
Libby Heeren, Freelance Data Scientist, Data Humans Podcast
Abstract When we think about making connections, we think about networking. I’d like you to forget about networking and start thinking about making friends. I’ll share my perspective as a community builder and host of the Data Humans podcast on how I cultivated a community of practice for myself and how I became a force multiplier who increases engagement.

You’ll learn how I made genuine human connections, the practical steps to making data friends, the power of vulnerability, and why we all benefit when we show up as our whole selves.

YouTube Link


Coding Tools for Industry R&D – Development Lessons from an Analytical Lab
Camila Saez Cabezas, Dow, Inc.
Abstract Are you considering or curious about developing code-based tools for scientists? Whether you are an experienced developer or a fellow Posit Academy graduate who might be stepping into this role for the first time, the aim of my story is to inspire you and help you navigate this process. While developing custom R functions, packages, and Shiny apps for diverse analytical capabilities and users in R&D, I learned why it’s important to collect certain information at the start before writing any tidying, analysis, visualization, and web application code.

In this talk, I will share the essential technical questions that help me define and plan for success.

YouTube Link


What I Wish I Knew Before Becoming a Data Scientist
Kaitlin Bustos, SharpestMinds
Abstract In this talk, I’m sharing my personal journey as a data scientist and the key lessons learned along the way. I’ll emphasize the importance of finding a positive community of like-minded allies, persevering through setbacks since success is not linear, and exploring by embracing the broad nature of the data science field. By sharing my experiences and acknowledging the challenges I’ve faced, attendees will gain a fresh perspective on what it takes to succeed in a data science career and be inspired to pursue their passions in the field.

Overall, this talk aims to provide a glimpse into the reality of a data science career. Attendees will take away a sense of motivation and empowerment to find their own unique path to success.

YouTube Link


Quickly get your Quarto HTML theme in order
Greg Swinehart, Posit, PBC
Abstract A 5-minute talk on how I’ve used Quarto and Bootstrap variables to quickly make Shiny’s new website look as it should. The Quarto user I have in mind works at an organization with specific brand guidelines to follow. I will discuss how to set up your theme, show some key Quarto settings, and explain why Bootstrap Sass variables are your best friend.
YouTube Link


USGS R Package Development: 10-year Reflections
Laura DeCicco, USGS
Abstract Ten years ago, the first set of git commits was submitted to a new R software package repository, “dataRetrieval,” with the goal of providing an easy way for R users to retrieve U.S. Geological Survey (USGS) water data. At that time, the perception within the USGS was that the use of R was exclusive to an elite group of “very serious scientists.” Fast forward, and we now find many newer USGS hires having a solid grasp of the language from the start, along with the use of R in a wide variety of applications.

In this talk, I’ll discuss my experiences maintaining the dataRetrieval package, how it’s shaped my career, impacted USGS R usage, and why data providers should consider sponsoring their own R packages wrapping their data API services.

YouTube Link


Speeding Up Plots in R/Shiny
Ryszard Szymański, Appsilon
Abstract A slow plot can ruin the user experience of a dashboard. This talk covers techniques for speeding up the rendering process of our visualisations.

Slow dashboards lead to a poor user experience and cause users to lose interest, or even become frustrated. A common culprit of this situation is a slowly rendering plot.

During the talk, we will dive deeper into how plots are rendered in Shiny, identify common bottlenecks that can occur during the rendering process, and learn various techniques for improving the speed of plots in R/Shiny dashboards.

These techniques will range from more efficient data processing to library-specific optimisations at the browser level.

LinkedIn: https://www.linkedin.com/in/ryszard-szyma%C5%84ski-310a7017a/

YouTube Link


Shiny Developer Secrets: Insights From Over 1200 Applicants and What You MUST Know to Shine
Vedha Viyash, Appsilon
Abstract Over 1,200 candidates applied for the R/Shiny developer role at Appsilon in the last year, and I will share some insights gained from the qualitative and quantitative feedback collected at every round of the interview process.

I will share some key takeaways to help you focus on the skills that will make you a better Shiny developer. From reactivity to software testing, multiple skills make up a good Shiny developer, and you will learn where the major gaps are and how to close them.

YouTube Link


Insights in 5-D! (Using magic small-multiples layouts)
Matt Dzugan, Matt Dzugan
Abstract Using Small-Multiples (faceted graphs) is an effective way to compare patterns across many dimensions. In this talk, I’ll walk you through some ways to lay out your individual facets according to the underlying data. For example, maybe each facet represents a city or point on a 2D plane - we’ll explore ways to organize facets in a grid that mimics the data itself - unlocking your ability to explore patterns in 4+ dimensions. Other solutions to this problem rely on manually-curated lists that map common layouts to a grid, but in this talk, we’ll explore solutions that work on EVERYTHING. I’ll show you how to incorporate this technique into your viz and how I built the libraries since there are some interesting data science concepts at play.
YouTube Link


20 Questions with AI Chat Bots
Winston Chang, Posit Software, PBC
Abstract
YouTube Link

Workshop Materials

The first two days of posit::conf(2023) were dedicated to workshops presented by Posit employees and data science experts. All the workshop materials are free to use and share under the CC BY-SA 4.0 license. We didn’t record the workshops, but we hope the materials are useful to you.

You can find all the workshop materials at “Workshops at posit::conf(2023)”.

Workshops ran on Day 1 (Sep 17) and Day 2 (Sep 18).

2-day workshops (Day 1 and Day 2)
1 DevOps for Data Scientists
  Rika Gorn
  https://github.com/posit-conf-2023/devops
2 Causal Inference with R
  Malcolm Barrett & Travis Gerke
  https://r-causal.github.io/causal_workshop_website
3 Introduction to Data Science with R
  Posit Academy
  https://intro-tidyverse-2023.netlify.app
4 Tidy Time Series and Forecasting in R
  Rob Hyndman
  https://posit-conf-2023.github.io/forecasting

1-day workshops
1 Day 1: Advanced tidymodels
  Max Kuhn
  https://workshops.tidymodels.org
  Day 2: Advanced tidymodels
  Max Kuhn
  https://workshops.tidymodels.org
2 Day 1: Introduction to tidymodels
  Hannah Frick & Simon Couch & Emil Hvitfeldt
  https://workshops.tidymodels.org
  Day 2: Deploy and Maintain Models with vetiver
  Julia Silge
  https://posit-conf-2023.github.io/vetiver
3 Day 1: Getting Started with Shiny for Python
  Joe Cheng & Gordon Shotwell
  https://posit-dev.github.io/shiny-python-workshop-2023
  Day 2: Enhancing Communication & Collaboration with Jupyter Notebooks & Quarto
  Hamel Husain
  https://hamelsmu.github.io/posit-python-quarto
4 Day 1: Getting Started with Shiny for R
  Garrick Aden-Buie & Colin Rundel
  https://posit-conf-2023.github.io/shiny-r-intro
  Day 2: Shiny Dashboards
  Garrick Aden-Buie & Colin Rundel
  https://posit-conf-2023.github.io/shiny-r-dashboard
5 Day 1: Web Design for Shiny Developers
  Maya Gans & David Granjon
  https://webdesign4shiny.rinterface.com
  Day 2: Shiny in Production: Tools and Techniques
  Eric Nantz & Mike Thomas
  https://posit-conf-2023.github.io/shiny-r-prod
6 Day 1: Introduction to Quarto with R & RStudio
  Andrew Bray
  https://posit-conf-2023.github.io/quarto-r
  Day 2: Advanced Quarto with R & RStudio
  Andrew Bray
  https://posit-conf-2023.github.io/quarto-r
7 Day 1: Introduction to Data Science with Python
  Posit Academy
  https://github.com/chendaniely/positconf2023-academy_python
  Day 2: Machine Learning and Deep Learning with Python
  Sebastian Raschka
  https://github.com/posit-conf-2023/python-modeling
8 Day 1: Package Development Masterclass
  Hadley Wickham
  https://github.com/posit-conf-2023/pkg-dev-masterclass
  Day 2: Package Development Masterclass
  Hadley Wickham
  https://github.com/posit-conf-2023/pkg-dev-masterclass
9 Day 1: Fundamentals of Package Development
  Andy Teucher
  https://posit-conf-2023.github.io/pkg-dev
  Day 2: Fundamentals of Package Development
  Andy Teucher
  https://posit-conf-2023.github.io/pkg-dev
10 Day 1: What They Forgot to Teach You About R
  Shannon Pileggi & David Aja
  https://pos.it/wtf
  Day 2: From R User to R Programmer
  Emma Rand & Ian Lyttle
  https://posit-conf-2023.github.io/programming-r
11 Day 1: Big Data with Arrow
  Nic Crane & Steph Hazlitt
  https://posit-conf-2023.github.io/arrow
  Day 2: Teaching Data Science Masterclass
  Mine Cetinkaya-Rundel
  https://posit-conf-2023.github.io/teach-ds-masterclass
12 Day 1: Designing Data Visualizations to Successfully Tell a Story
  Cedric Scherer
  https://posit-conf-2023.github.io/dataviz-storytelling
  Day 2: Engaging and Beautiful Data Visualizations with ggplot2
  Cedric Scherer
  https://posit-conf-2023.github.io/dataviz-ggplot2
13 Day 1: It's Not Just Code: Managing an Open Source Project
  Tracy Teal
  https://posit-conf-2023.github.io/managing-os-project
  Day 2: Steal Like an Rtist: Creative Coding in R
  Ijeamaka Anyene & Sharla Gelfand
  https://github.com/posit-conf-2023/creative-coding
14 Day 1: Data Science Workflows with Posit Tools - R Focus
  Ryan Johnson & Katie Masiello
  https://katie.quarto.pub/ds-workflows-r
  Day 2: Data Science Workflows with Posit Tools - Python Focus
  Gagandeep Singh & Sam Edwardes
  https://posit-conf-2023.github.io/ds-workflows-python

Keep the community vibes going

We’ll share news about posit::conf(2024) soon - subscribe to “Events” on our subscription page to stay up to date! In the meantime: