2021-07-15
Art earlier shared an in-depth perspective on Open Source Data Science in Investment Management as a guest contributor to this blog, so I was curious to learn more about his experience, both as a CEO encouraging his teams to use open source data science and as an R user himself.
Art shared that he started using R, a major language for open source data science, when he became frustrated with the limitations of Excel. As he describes it,
“One of the things that really bugged me was my current self had no idea what my past self did, when I opened a spreadsheet from a year or two prior,”
and he was forced to puzzle through the obscure formulas and the critical dependencies between spreadsheets. He started using R more and more, because he found he was “getting answers faster, and with reusable code.”
Video: How did you get started with open source data science, and why?
I also asked Art whether open source software is appropriate for enterprise-level data science. From his perspective, it absolutely is, because it is “a great way to boost productivity, by empowering all the interested parties in the organization.” Art related that, because of the reach and availability of open source, many different people at his organization were working on analytic problems. Open source “lets a thousand flowers bloom,” but critically, this can be done in a managed, curated way that addresses IT’s concerns, using platforms like RStudio Team to support the full data science production life cycle.
Video: Is open source software appropriate for enterprise-level data science?
Finally, I asked Art for his advice on how to build support for open source within an organization. While he says this is much easier than it used to be, as open source software has become more accepted, his primary advice was:
Video: How do you build support for open source software within an organization?