Easier data and asset sharing across projects and teams with {pins} and Databricks
Sharing data assets can be challenging for many teams. Some rely on emailed files to keep analyses up to date, which makes it hard to stay current or to know which version of the data was used. {pins} makes it easier to share data and other assets across projects and teams. It lets us publish, or ‘pin’, to a variety of places, such as Amazon S3, Posit Connect and Dropbox.
In response to recent customer feedback, the ability to publish, or ‘pin’, to Databricks Volumes has been added to the R version of {pins}. The same capability is in the works for the Python version of {pins}.
Edgar Ruiz, Software Engineer at Posit PBC, shows how to speed up predictions by distributing a ‘pinned’ model using {pins} and Spark in Databricks. He walks through integrating {pins} with Databricks in your team’s projects and covers novel uses of pins inside the Databricks ecosystem.
GitHub repo: https://github.com/edgararuiz/talks/tree/main/end-to-end
Here are a few additional resources that you might find interesting:
- Pins for R: https://pins.rstudio.com/
- Pins for Python: https://rstudio.github.io/pins-python/
- More information on how Posit and Databricks work together: https://posit.co/use-cases/databricks/
- Customer Spotlight: Standardizing a safety model with tidymodels, Posit Team & Databricks at Suffolk Construction: https://youtu.be/yavHEWpgrCQ
- Q&A Recording: https://youtube.com/live/HDTDmEaK5zQ?feature=share
Do you use both Databricks and Posit, but not together yet? Chat more with our team here.
How to join future events:
We host workflow demos on the last Wednesday of every month. You can add them to your calendar here.