How to keep data up-to-date with 6 pins workflows (aka avoid data-final.csv & data-final-final.csv)

Ever chased a CSV through a series of emails, or had to decide between data-final.csv and data-final-final.csv?

pins, available for both R and Python, is a package that many people at the Data Science Hangout wish they had known about earlier. It lets you publish and share objects (data, models, etc.) across projects and with your colleagues.

 

Many people find this useful for:

  • Scheduling reports that need to be updated with the newest data each week
  • Reusing data across multiple projects or content (Shiny app, Jupyter Notebook, Quarto doc, etc.)

 

Timestamps:

1:15 – Posit Team Overview

2:18 – Introduction to pins (scenarios where you might want to consider using pins)

4:42 – Installing pins

6:24 – Workflow #1: Pinning an R Object to Posit Connect (from RStudio)

10:23 – Workflow #2: Pinning a Python Object to Posit Connect (from JupyterLab)

15:19 – Workflow #3: Reading in a Python pin in an R Session

16:07 – Workflow #4: Reading an R pin into a Python session

17:50 – Workflow #5: Pin versioning

21:50 – Workflow #6: Automating the pin writing process (through job scheduling on Connect)

 
