Hadley Wickham on effective R and Python data science workflows

During a recent coffee chat with Wes McKinney and Hadley Wickham, a participant asked a great question:
How can Posit Workbench or Connect help us integrate Python scripts and R models into a cohesive and efficient workflow?
Hadley Wickham, Posit PBC’s Chief Science Officer, highlights that a key goal is to put R and Python on equal footing.
Teams working with both R and Python often run into collaboration roadblocks: different workflows and development environments, trouble sharing models and data, difficulty getting results to stakeholders, and headaches integrating models into other systems. So, let’s unpack how Posit’s professional products, Workbench and Connect, along with our open-source ecosystem, help create a smoother, more collaborative data science experience.
The goal of both Workbench and Connect is to put R and Python on equal footing…Regardless of whether you’re using R or Python, you can develop your scripts in the same editing environment, and you can publish them to…the same environment using the same tools.
― Hadley Wickham
Posit Workbench for Data Science Development
Pre-configured Environments: Posit Workbench provides data scientists with pre-configured environments that are ready to use, eliminating the need for individual environment management. Besides RStudio, this includes support for popular Python editors like Jupyter notebooks, JupyterLab, and VS Code.
Centralized Management: Workbench offers centralized management of these R & Python development environments, making it easier for technology teams to provide and maintain them. This ensures that patches are applied and vulnerabilities are addressed across the whole team.
Scalable Computing Resources: Workbench integrates with orchestration tools like Kubernetes and Slurm, and allows easy access to scalable computing resources, enabling data teams to tackle large problems with just a few clicks.
Collaboration: Workbench enhances collaboration by providing a centralized platform for development. For example, users can easily share projects, run each other’s code, and access the same databases and compute resources.
Posit Connect for Deploying and Sharing Python Models
Easy Sharing: Posit Connect makes it easy to share data science work, including R & Python-based models. This can be in the form of Jupyter notebooks, Plotly dashboards, Streamlit and other interactive applications.
API Deployment: R and Python models can be hosted as APIs through Posit Connect (using frameworks like Plumber for R, or Flask or FastAPI for Python), allowing other applications, whether they are written in R, Python, or something else, to interact with these models.
Automation and Scheduling: Connect allows for the automation of tasks like updating data sources and distributing reports, freeing up data scientists for higher-value activities.
Open Source
How do we make sure the R & Python users can collaborate as effectively as possible? And to me, that’s mostly happening on the open source side … nanoparquet, …pins, …vetiver, …Great Tables, … having these tools where you end up with a shared vocabulary and a shared tool set and shared conventions around storing and using data that hopefully ease some of those boundaries a little.
― Hadley Wickham
Posit actively contributes to open-source tools that bridge the gap between R and Python. Examples include:
- Nanoparquet: An R package for reading and writing Parquet, a file format that allows for efficient sharing of data between R and Python. (parquet for Python, nanoparquet for R)
- Pins Package: Pins enable easy saving and sharing of datasets, models, and other data science objects with your colleagues. (pins for Python, pins for R)
- Model Monitoring Tools: vetiver, available for both R and Python, helps teams version, deploy, and monitor models.
- Great Tables: Packages for creating formatted tables in both languages. (gt for R, and Great Tables for Python)
- Quarto: Quarto, popular with R and Python users, helps developers publish their technical work in dozens of supported formats, like reports, PDFs, dashboards, slides, and much more.
- Shiny: Shiny is among the most popular R frameworks for building web applications, and supports Python users as well.
By sustaining these tools and platforms, Posit aims to create “a shared vocabulary and a shared tool set and shared conventions around storing and using data,” making it easier for teams and individuals using different frameworks to work together effectively.
We’d love to share more
Learn more about Posit Connect, Posit Workbench, and Posit’s support of open-source data science. If you’re interested in seeing how Posit’s professional tools can make your Python and R data science teams more effective, please consider scheduling a call to set up a demo.