2022-04-21
Photo by Wesley Pribadi on Unsplash
Have you ever created a Shiny app that works great on your computer, but had issues once you shared it with others? This is a common experience for many data scientists. To successfully put an app into “production” — where others need it to be accessible, speedy, safe, and accurate — there are several challenges that we may have to overcome. With the right environment and approach, our Shiny app can smoothly deliver data-driven insights for all users.
This post leans heavily on great talks by Joe Cheng and Kelly O’Briant and the book Mastering Shiny by Hadley Wickham. If you want to see more on the topic, please check them out.
Drawing from Joe Cheng’s talk, when we say putting a Shiny app in production, we mean that users are accessing, running, and relying on the app “with real consequences if things go wrong”. For a Shiny app to be successful, it must meet certain criteria: it needs to be up, safe, correct, and snappy.
For example, California’s COVID Assessment Tool (CalCAT) is built on Shiny. The CalCAT team created a public-facing app to serve millions of Californian residents. The team also maintains an internal Shiny app that requires authentication to view confidential data. With RStudio Connect and RStudio Package Manager, they can manage any influx of traffic to keep the app running smoothly.
However, even with a proper production environment, various challenges may arise. Let’s explore what these could be.
Shiny apps are created by R users. An R user can quickly and iteratively create a Shiny app with no knowledge of HTML, CSS, or JavaScript.
However, since R users are not necessarily software engineers, we may only realize we’re creating a production app when others need to access it regularly. We may not be aware of best practices for putting things in production. Without this awareness, we may not know to ask for the necessary resources or time to improve the performance of our apps.
IT and management may have questions when we try to put Shiny apps into production. Perhaps they have never heard of Shiny and are unaware of the resources and environment it requires. Related to the challenge above, they may not want data scientists to create production artifacts because of security concerns. Whatever the reason, we need to make the case for R and Shiny when communicating with these teams.
We mentioned that apps need to be up, safe, correct, and snappy. To accomplish this, we need to develop our apps carefully. Data scientists need a process for optimizing their code. This means identifying what needs improvement, understanding what to do next, and testing thoroughly.
These challenges play out differently from organization to organization. However, we do have certain tools that can help address cultural, organizational, and technical barriers to putting Shiny in production.
A “sandbox” should be part of our development infrastructure. The sandbox is a place to stage our work that is identical to our production environment. As opposed to the production environment, this is a spot where we expect (and want!) things to break. This lets us find and work out bugs before putting our app out in the real world.
A sandbox not only provides a low-risk way for developers to see how things would work in production, but it also gives us a way to showcase our skills to management. Once we publish our app to our sandbox, we can demonstrate the Shiny app’s functionality to get approval on the tool.
To highlight an example, the Dutch National Institute for Public Health and the Environment (RIVM) deployed the “Clusterbuster” Shiny app to help hundreds of doctors and epidemiologists battle COVID-19 in the Netherlands.
First, everybody needed to agree on a tool. Using a prototype, the RIVM team showed that Shiny could create an aesthetically pleasing app with a positive user experience. This convinced the IT and management teams to move forward with Shiny.
RStudio Connect, RStudio’s enterprise-level publishing platform, can provide data scientists with a sandbox. Data scientists can create a staged version of their app and decide who has access to test it out. Then, we can publish to a separate Connect environment for production.
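As a rough sketch of the sandbox-then-production workflow, the helper below picks a target server by name and, by default, only reports what it would deploy. The server nicknames, app name, and the `deploy_to()` helper are hypothetical; the only real API assumed is `rsconnect::deployApp()` with its `appDir`, `appName`, and `server` arguments, and the servers would need to be registered with rsconnect beforehand.

```r
# Hypothetical server nicknames; in practice these would be registered once
# with rsconnect (for example via rsconnect::addServer()).
connect_servers <- c(sandbox = "connect-staging", production = "connect-prod")

deploy_to <- function(target = c("sandbox", "production"),
                      app_dir = ".", app_name = "my-shiny-app",
                      dry_run = TRUE) {
  target <- match.arg(target)
  args <- list(
    appDir  = app_dir,
    appName = app_name,
    server  = unname(connect_servers[target])
  )
  if (dry_run) {
    # Only report what would be deployed; nothing is published.
    return(args)
  }
  # Requires the rsconnect package and a configured account on the server.
  do.call(rsconnect::deployApp, args)
}

# Dry run against the sandbox: inspect the arguments before publishing.
plan <- deploy_to("sandbox")
str(plan)
```

Keeping the target explicit makes it harder to publish to production by accident: the same call is rehearsed against the sandbox first, then repeated with `"production"` and `dry_run = FALSE` once the app has been approved.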
RStudio has various R packages that support the development of production-ready Shiny apps. Data scientists can implement a workflow with these packages to draw on best practices from the software engineering world.
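The optimization loop, as described in Joe Cheng’s talk, consists of four steps: benchmark the app under realistic load, analyze where it spends its time, recommend changes based on that analysis, and optimize the code accordingly.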
Once we have completed the loop, we benchmark again to determine if our app is fast enough for our needs.
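A minimal illustration of one pass through the loop, using only base R: time a slow step, apply one optimization (here, caching repeated results), and time it again. The function names and the simulated delay are made up for the example, and the profiling step is skipped because the bottleneck is known in advance.

```r
# Stand-in for an expensive step in a server function (e.g. a database query).
slow_summary <- function(n) {
  Sys.sleep(0.2)   # simulate latency
  sum(seq_len(n))
}

# Benchmark: total elapsed time for a series of requests.
benchmark <- function(f, inputs) {
  system.time(for (x in inputs) f(x))[["elapsed"]]
}

# Users often repeat the same inputs, so several requests are duplicates.
inputs <- c(10, 20, 10, 20, 10)
before <- benchmark(slow_summary, inputs)

# Optimize: wrap the function so each distinct input is computed only once.
make_cached <- function(f) {
  cache <- new.env()
  function(x) {
    key <- as.character(x)
    if (is.null(cache[[key]])) cache[[key]] <- f(x)
    cache[[key]]
  }
}
fast_summary <- make_cached(slow_summary)

# Benchmark again to see whether the change was worth it.
after <- benchmark(fast_summary, inputs)
cat(sprintf("before: %.2fs, after: %.2fs\n", before, after))
```

In a real app, the benchmarking would more likely be done with a load-testing tool and the caching with Shiny's own `bindCache()`, but the shape of the loop is the same: measure, change one thing, measure again.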
Ultimately, our goal as data scientists is to communicate insights that drive value to our organization. Once our Shiny app is out there, how do we show that it’s making an impact? We can put numbers behind our work to demonstrate how effectively we’re reaching others.
The RStudio Connect Server API tracks our app’s user activity. We can access data on visits, session duration, and more. Not only that, but we can specify goals that send us an alert when we aren’t doing as expected. Evaluating our app helps us make decisions on how to improve and tailor our users’ experience.
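To make this concrete, here is a small base R sketch that summarizes session records into visits and average session length per app, then flags apps that fall short of a visit goal. The data frame is invented for illustration and its column names are assumptions, not the actual fields returned by the Connect Server API; the `goal_visits` threshold is likewise hypothetical.

```r
# Made-up usage records, one row per session (column names are assumptions).
usage <- data.frame(
  content = c("sales-app", "sales-app", "covid-app", "sales-app", "covid-app"),
  user    = c("ana", "ben", "ana", "ana", "cy"),
  started = as.POSIXct(c("2022-04-01 09:00:00", "2022-04-01 10:00:00",
                         "2022-04-01 11:00:00", "2022-04-02 09:30:00",
                         "2022-04-02 14:00:00"), tz = "UTC"),
  ended   = as.POSIXct(c("2022-04-01 09:10:00", "2022-04-01 10:05:00",
                         "2022-04-01 11:30:00", "2022-04-02 09:45:00",
                         "2022-04-02 14:10:00"), tz = "UTC")
)

# Session duration in minutes.
usage$minutes <- as.numeric(difftime(usage$ended, usage$started, units = "mins"))

# Visits and mean session length per app, most visited first.
visits       <- table(usage$content)
mean_minutes <- tapply(usage$minutes, usage$content, mean)
by_app <- data.frame(
  content      = names(mean_minutes),
  visits       = as.integer(visits[names(mean_minutes)]),
  mean_minutes = as.numeric(mean_minutes)
)
by_app <- by_app[order(-by_app$visits), ]
print(by_app)

# Hypothetical goal: flag any app with fewer visits than expected.
goal_visits <- 3
below_goal  <- by_app$content[by_app$visits < goal_visits]
```

A summary table like `by_app` is the kind of input a stakeholder dashboard could be built on, and `below_goal` is the kind of check that could trigger an alert.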
We can create a dashboard for stakeholders to explore the API data in real time. Such a dashboard can show, for example, the most popular apps and the most active viewers.
Metrics like these help us demonstrate the impact that our app is making. This helps management understand the value of Shiny apps in production.