
Announcing vetiver for MLOps in R and Python

Written by Julia Silge
Written by Isabel Zimmerman
2022-06-09
The vetiver logo next to a circular diagram of the MLOps cycle. In this cycle, we collect data, understand and clean the data, train and evaluate a model, deploy the model, and monitor the deployed model. Monitoring can then lead back to collecting more data.

We are thrilled to announce the release of vetiver, a framework for MLOps tasks in R and Python. The goal of vetiver is to provide fluent tooling to version, share, deploy, and monitor a trained model.

Data scientists have open source tools that they love using to prepare data for modeling and train models, but there is a lack of fluent open source tooling for MLOps tasks like putting a model in production or monitoring model performance. Using vetiver for MLOps lets you use the tools you are comfortable with for exploratory data analysis and model training/tuning, and provides a flexible framework for the parts of a model lifecycle not served as well by current approaches.

As of today, the vetiver framework supports models trained via scikit-learn, PyTorch, tidymodels, caret, mlr3, XGBoost, ranger, lm(), and glm(). We are interested in what other modeling frameworks to support, so please let us know what you would like to use vetiver with!

Getting started

You can install the released version of vetiver for R from CRAN:

```{r}
install.packages("vetiver")
```

You can install the released version of vetiver for Python from PyPI:

```{bash}
pip install vetiver
```

See our documentation for more on how to version, share, deploy, and monitor a trained model.

Why use vetiver?

The vetiver framework for MLOps tasks is built for data science teams that use R and/or Python, with a native, fluent experience for both. We especially had “bilingual” data science teams in mind as we designed vetiver’s approach, enabling teams that use both languages (or an individual who uses both) to deploy models with consistent and unified practices.

The vetiver framework provides data scientists with a first deployment experience that is as painless as possible, while being flexible and extensible for more advanced users. At RStudio, we have experienced how tools that are built to help beginners succeed and do the “right thing” are also typically good tools for data practitioners as they mature and advance. In vetiver specifically, functions handle both recording and checking the model’s input data prototype, to avoid common failure modes when deploying models. Other functions support predicting from a remote API endpoint so that you can treat a deployed model much the same as a local R or Python model in memory.
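To make the idea of an input data prototype concrete, here is a minimal sketch using only the Python standard library. It is not vetiver's implementation or API: the names `record_prototype` and `check_prototype` are hypothetical, and the sketch assumes each observation is a plain dictionary of column names to values.

```python
from typing import Any, Dict, List

# Hypothetical representation: column name -> expected Python type.
Prototype = Dict[str, type]


def record_prototype(training_rows: List[Dict[str, Any]]) -> Prototype:
    """Record column names and Python types from the first training row."""
    if not training_rows:
        raise ValueError("need at least one training row to record a prototype")
    return {name: type(value) for name, value in training_rows[0].items()}


def check_prototype(prototype: Prototype, row: Dict[str, Any]) -> None:
    """Raise ValueError if a new row does not match the recorded prototype."""
    missing = sorted(set(prototype) - set(row))
    extra = sorted(set(row) - set(prototype))
    if missing or extra:
        raise ValueError(f"column mismatch: missing={missing}, unexpected={extra}")
    for name, expected in prototype.items():
        if not isinstance(row[name], expected):
            raise ValueError(
                f"column {name!r}: expected {expected.__name__}, "
                f"got {type(row[name]).__name__}"
            )


# Record the prototype once, at training time (made-up example data).
training = [{"cyl": 6, "disp": 160.0}, {"cyl": 4, "disp": 108.0}]
prototype = record_prototype(training)

# At prediction time, a well-formed row passes silently.
check_prototype(prototype, {"cyl": 8, "disp": 360.0})

# A row with the wrong type is rejected before it reaches the model.
try:
    check_prototype(prototype, {"cyl": "eight", "disp": 360.0})
except ValueError as err:
    print(err)
```

Checking new data at the boundary like this turns a mismatch between training and production inputs into a clear error message, rather than a confusing failure inside the model.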

Get in touch

We are so happy about releasing vetiver for R and Python, and we want to know how to make it better. Join our discussion on RStudio Community to chat with us about deploying your models, and let us know what you would like to see from vetiver!


Julia Silge

Data Scientist & Software Engineer at Posit, PBC
Julia Silge is a data scientist and software engineer at Posit, PBC, where she works on open source tools for machine learning and MLOps. She holds a PhD in astrophysics, has worked as a data scientist in tech and the nonprofit sector, and has served as a technical advisory committee member for the US Bureau of Labor Statistics. She is a coauthor of Tidy Text Mining with R, Supervised Machine Learning for Text Analysis in R, and Tidy Modeling with R. An international keynote speaker and a real-world practitioner focusing on data analysis and machine learning, Julia loves text analysis, making beautiful charts, and communicating about technical topics with diverse audiences.

Isabel Zimmerman

Software Engineer at Posit
Isabel is a software engineer at Posit, PBC, where she works on the Python experience in the Positron IDE. When not thinking about computers, she enjoys reading and teaching her old dog new tricks.