01 Sep 2022

Quantifying the hours saved

Tiger Tang

Manager, Data Science at CARFAX

Episode notes

We were joined by Tiger Tang, Manager, Data Science at CARFAX. Tiger (Chongtai) Tang is dedicated to building the Data Science team specializing in NLP and forecasting. He is a big fan of Shiny and he has a passion for the Data Science community.

We also recommend watching Tiger’s 2022 RStudio Conference talk, “Saving 1000 hours with RStudio”.

 

How do you sell RStudio in your workplace?

Build a work automation process

 

How do you build an automation process?

Look at a typical report & identify all the manual steps

These manual steps usually fall into three portions: getting the data, wrangling/analysis/visualization, and communication.
You can replace each portion with R code, with the help of various R packages.
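
The three portions can be sketched as three small functions chained together. The talk itself is about R; the sketch below uses Python with only the standard library, purely as an illustration of the structure. Every name in it (`get_data`, `summarise`, `build_message`, `run_report`, and the sample CSV) is hypothetical and not from the talk.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data standing in for a database query or file export.
SAMPLE_CSV = """region,sales
East,100
West,250
East,50
"""


def get_data(source):
    """Portion 1: get the data (here, parse CSV text into row dicts)."""
    return list(csv.DictReader(io.StringIO(source)))


def summarise(rows):
    """Portion 2: wrangling/analysis (here, total sales per region)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["region"]] += int(row["sales"])
    return dict(totals)


def build_message(totals):
    """Portion 3: communication (here, the body text of a report email)."""
    lines = [f"{region}: {total}" for region, total in sorted(totals.items())]
    return "Sales by region\n" + "\n".join(lines)


def run_report(source):
    """Chain the three portions into one automated report."""
    return build_message(summarise(get_data(source)))


print(run_report(SAMPLE_CSV))
```

Keeping each portion in its own function means a manual step can be replaced one at a time, and the pieces can be reused by other reports.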

 

What are the three types of automation?

1. Attended automation – reports that still need human involvement: use R code in RStudio
2. Unattended automation – reports that need no human input but must run at a set time: use RStudio + RStudio Connect
3. Hybrid – a combination of the previous two, where the stakeholder provides the input and kicks off the process to get answers: use Shiny + RStudio Connect

 

Ok, time to sell it to decision makers

 

What are the benefits?

 

    • Reproducibility
    • Less human error
    • Cost-benefit (hours saved)

 

Why weren’t they interested when you shared these?

The benefits listed above are great for selling to an R user who is concerned about the day-to-day workflow, but not to decision-makers, who are more concerned about ROI.

 

Update your strategy for highlighting the benefits:
1. Cost-benefit (hours saved) – If we go through with this, we might be able to save 1,000 hours per year
2. Less human error
3. Reproducibility (as a free add-on, like travel insurance)
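
An hours-saved figure like the one in the first point can be estimated with simple arithmetic: the time saved per run, multiplied by the number of runs per year. The sketch below is illustrative only. The report names and all the numbers are invented assumptions, not figures from the talk, and `hours_saved_per_year` is a hypothetical helper.

```python
def hours_saved_per_year(manual_minutes, automated_minutes, runs_per_year):
    """Hours saved per year = minutes saved per run * runs per year / 60."""
    return (manual_minutes - automated_minutes) * runs_per_year / 60


# Hypothetical reports:
# (name, manual minutes per run, automated minutes per run, runs per year)
reports = [
    ("daily sales", 45, 5, 250),
    ("weekly forecast", 120, 10, 52),
]

total = sum(hours_saved_per_year(m, a, n) for _, m, a, n in reports)
print(f"Estimated hours saved per year: {total:.0f}")
```

An estimate like this needs no technical context to understand, which is why it works as the opening benefit for decision-makers.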

 

These are still the same benefits, just in a slightly different order. Start with the one that does not require much context to understand.

 

What do you need to do the actual automation?

 

Understand the current process and document it. This includes understanding:

    • the business reasons
    • how often it runs (daily, weekly, ad hoc)
    • how you will communicate
    • how often to update the process so it does not become obsolete
    • when to stop and call for additional help, since you are not always the original report owner

 

From this you will understand the complexity, impact, and stability of each process, which also helps you decide which project to start with.

 

Top 3 recommendations for automation:
1. Always start with components: For example, if a process involves SQL, Excel, and Outlook, code each one individually, because the same team will need them again and you can reuse the code.

2. Test, Test, Test: Capture all the scenarios possible.

3. Be practical and stay on target: Not everything needs to be fully automated. It’s not about building something cool with R but building something impactful with R.
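
The first two recommendations can be sketched together: one small reusable component, plus tests that capture the scenarios it is likely to meet. This is a minimal illustration in Python rather than R. `percent_change` and its test cases are hypothetical examples, not code from the talk.

```python
from typing import Optional


def percent_change(previous: float, current: float) -> Optional[float]:
    """Reusable component: percent change, or None when there is no baseline."""
    if previous == 0:
        return None
    return (current - previous) / previous * 100


def test_percent_change() -> None:
    """Capture the scenarios a report is likely to hit."""
    assert percent_change(100, 150) == 50.0   # typical increase
    assert percent_change(100, 50) == -50.0   # decrease
    assert percent_change(100, 100) == 0.0    # no change
    assert percent_change(0, 10) is None      # edge case: zero baseline


test_percent_change()
```

Because the component is small and has no side effects, the same function and the same tests can be reused by the next report that needs it.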

 

A structure you can follow:

1. Identify tasks in your workplace
2. Build a proposal with the benefits that matter to decision makers in your workplace
3. Build a requirement doc that identifies the right task to start with
4. Code by component and do plenty of testing, while staying on target
5. Share the progress from time to time
