Adopting open source tools in production can be a slow and challenging process. The end result, however, is usually an environment where data is more readily available and better decisions can be made. We invite you to join Jeff Hollister and David Smith to learn about two examples from the EPA’s journey in adopting popular open source tools. This webinar will be an honest look at the challenges they faced and the progress they have made along the way, and at how the EPA currently uses open source tools to do better data science.

Part One: Two steps forward, one step back: a reluctant open data science transformation

Jeff Hollister – Research Ecologist, US EPA

Open science relies on improving the accessibility of all aspects of the research process, including code, data, and manuscripts. The tools and concepts that help facilitate this may include open source software, proper licensing, collaboration platforms, and novel modes of publishing. At the US EPA, researchers embrace open science and use many of these tools. Jeff will present a user’s perspective on several of these topics, the pain points of implementation, and how he and his colleagues overcame these challenges.

Part Two: The EPA’s RStudio environment on Amazon Cloud

Dave Smith – Information Access and Analytical Services, EPA Office of Mission Support

Analysis on traditional desktop workstations often runs into computational and storage limitations, which can necessitate costly hardware upgrades. The EPA has been exploring the potential of cloud elasticity to overcome some of these limitations in supporting data science work. This effort has involved Docker containerization, automation, and other approaches on the AWS Cloud to dynamically provision data science environments on demand. Dave will describe this approach in his portion of the presentation.

