Benefit Corporation Annual Report

2021 Annual Report

A Message from our CEO

RStudio endeavors to create free and open-source software for data science, scientific research, and technical communication in a sustainable way, because it benefits everyone when the essential tools to produce and consume knowledge are available to all, regardless of economic means.

We believe corporations should fulfill a purpose beneficial to the public and be run for the benefit of all stakeholders including employees, customers, and the community at large.

As a Delaware Public Benefit Corporation (PBC) and a Certified B Corporation®, RStudio’s open-source mission and commitment to a beneficial public purpose are codified in our charter, requiring our corporate decisions to balance the interests of community, customers, employees, and shareholders.

B Corps™ meet the highest verified standards of social and environmental performance, transparency, and accountability. RStudio measures its public benefit by utilizing the non-profit B Lab®’s “Impact Assessment”, a rigorous assessment of a company’s impact on its workers, customers, community, and environment. In 2019, RStudio met the B Corporation certification requirements set by the B Lab. The Certification process uses credible, comprehensive, transparent, and independent standards of social and environmental performance. Details of this assessment are available at bcorporation.net/directory/rstudio. In accordance with B Lab practices, our next certification will be done in December 2022.

As a PBC, RStudio publishes an annual report that describes the public benefit we have created, along with how we seek to provide public benefits in the future. This is the third of these reports. For the reader’s convenience, it includes information from prior report(s) that has not changed, along with material updates. The first report for 2019 published in 2020 may be found here. The second report published in 2021 may be found here.

To fulfill its beneficial purposes, RStudio intends to remain an independent company over the long term. With the support of our customers, employees, and the community, we remain excited to contribute useful solutions to the important problems of knowledge they face.

J.J. Allaire
CEO, RStudio, PBC


 

Introduction

RStudio’s mission is to create free and open-source software for data science, scientific research, and technical communication. We do this to enhance the production and consumption of knowledge by everyone, regardless of economic means, and to facilitate collaboration and reproducible research, both of which are critical to the integrity and efficacy of work in science, education, government, and industry.

RStudio also produces a modular platform of commercial software products that enable teams to adopt R, Python, and other open-source data science software at scale; along with online services to make it easier to learn and use them over the web.

Together, RStudio’s open-source software and commercial software form a virtuous cycle: The adoption of open-source data science software at scale in organizations creates demand for RStudio’s commercial software; and the revenue from commercial software, in turn, enables deeper investment in open-source software, which benefits everyone.

In 2021, RStudio spent between 40% and 50% of its engineering resources on open-source software, including part-time contributors, and led contributions to over 320 open-source projects. RStudio-led projects targeted a broad range of areas including the RStudio IDE; infrastructure libraries for R; numerous packages and tools to streamline data manipulation, exploration and visualization, modeling, and machine learning; and integration with external data sources. RStudio also sponsors or contributes to more than a dozen open-source and community projects led by others, including NumFOCUS, The Carpentries, the R Consortium, and the cross-language Apache Arrow project led by Ursa Computing (now Voltron Data).

Additional company and product highlights from 2021 can be found in RStudio’s January 2022 blog post: 2021 at RStudio: A Year in Review.

RStudio’s approach is not typical. Traditionally, scientific and technical computing companies created exclusively proprietary software. While it can provide a robust foundation for investing in product development, proprietary software can also create excessive dependency that is not good for data science practitioners and the community. In contrast, RStudio provides core productivity tools, packages, protocols, and file formats as open-source software so that customers aren’t overly dependent on a single software vendor. Additionally, while our commercial products enhance the development and use of our open-source software, they are not fundamentally required for those without the need or the ability to pay for them.

Today, millions of people download and use RStudio open-source products in their daily lives. Additionally, more than 1,500 organizations that have the need and ability to pay for our on-premises commercial software, and thousands of individuals and businesses who pay for our cloud products, help us to sustain this work. It is an inspiration to consider that we are helping many participate in global economies that increasingly reward data literacy and that our tools help produce insights essential to making the modern world a better place.


 

RStudio Open-Source Projects

Some of the significant open-source projects led or substantially supported by RStudio include the following popular software for data science:


Tidyverse

The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures.

The tidyverse consists of 27 R packages including ggplot2, dplyr, tidyr, and readr.

There are approximately 6.5 full-time equivalent (FTE) RStudio employees developing Tidyverse and related open-source products as of December 2021.

Text: Tidyverse Cumulative downloads of Tidyverse projects since 2017-12-31. Line graph starting at 0 in 2018 and increasing to over 600 million in 2022.
Text: Tidyverse Cumulative commits to Tidyverse projects since 2017-12-31. An area graph of cumulative commits going from 0 in 2018 from both RStudio and other committers to around 10,000 total in 2022. RStudio employees make up around two-thirds of those commits.

Tidymodels

Tidymodels is a cohesive collection of packages that perform tasks relevant to statistical modeling and machine learning. Tidymodels packages share a common syntax and design philosophy, and are designed to work seamlessly with Tidyverse packages.

There are currently 35 tidymodels packages, an increase of 8 from 2020. Popular tidymodels packages include parsnip, rsample, recipes, tune, and yardstick.

There are 3.5 full-time equivalent (FTE) RStudio employees developing Tidymodels and related open-source products as of December 2021.

A Tidymodels 2021 update may be found here.

Text: Tidymodels Cumulative downloads of Tidymodels projects since 2017-12-31. A line graph of cumulative downloads starting at 0 in 2018 to 40 million in 2022.
Text: RStudio Cumulative commits to tidyverse projects since 2017-12-31. Area graph starting at 0 in 2018 and increasing to over 10,000 in 2022.
 

Shiny®

Shiny is a popular R package and web application framework that makes it easy to tell data stories in interactive point-and-click web applications. Shiny applications can be shared with others via an open-source Shiny Server, the hosted shinyapps.io service, or with RStudio Connect. Shiny and related packages include shiny, shinytest, shinyloadtest, shinydashboard, leaflet, and crosstalk.

There are 7 full-time equivalent (FTE) employees developing the open-source Shiny and Shiny Server products as of December 2021.

Text: Shiny Cumulative downloads of Shiny projects since 2017-12-31. Line graph starting at 0 in 2018 to over 80 million in 2022.
Text: Shiny Cumulative commits to Shiny projects since 2017-12-31. Area graph starting around 0 in 2018 for both RStudio employees and other committers. In 2022, it is almost at 6000, with RStudio employees making up around three-quarters of the commits.
 

R Markdown

R Markdown is an authoring format for computational documents, which are fully reproducible reports whose analysis can be re-executed on new data with the click of a button. R Markdown documents can be shared as Notebooks, slideshows, web pages, email attachments, print documents, and more.

Popular packages in the R Markdown ecosystem include rmarkdown, knitr, flexdashboard, blogdown, bookdown, distill, rticles, and xaringan.

There are 4 full-time equivalent (FTE) RStudio employees developing R Markdown and related open-source products as of December 2021.

Text: R Markdown Cumulative downloads of R Markdown projects since 2017-12-31. Line graph starting at 0 in 2018 to over 200 million in 2022.
Text: R Markdown Cumulative commits to R Markdown projects since 2017-12-31. Area graph starting at around 0 in 2018 for both RStudio and other committers to around 12,000 in 2022. RStudio employees make up around three-quarters of those commits.
 

Connectivity Packages

RStudio increases the efficiency of R users by making open-source R packages that connect data scientists to spreadsheets, databases, distributed storage frameworks for big data, machine learning platforms, and the programming environments of other languages, like Python.

Connectivity packages include sparklyr, TensorFlow for R, googlesheets4, odbc, and reticulate.

There are 4 full-time equivalent RStudio-funded developers creating connectivity-related open-source packages as of December 2021.

Text: Connectivity Cumulative downloads of Connectivity projects since 2017-12-31. Line graph starting at 0 in 2018 to 60 million in 2022.
Text: Connectivity Cumulative commits to Connectivity projects since 2017-12-31. Area graph starting at 0 in 2018 for both RStudio employees and other committers and increasing to 15,000 in 2022, with RStudio employees making up about five-eighths of the commits.
 

R Infrastructure Tools (r-lib)

R-lib is a large collection of R packages that make it easier to build, find, and use effective tools for data analysis.

There are currently 111 R-lib packages. Popular packages include devtools, testthat, roxygen2, pkgdown, and usethis.

There are 2 full-time equivalent (FTE) RStudio employees developing r-lib and related open-source packages as of December 2021.

Text: R Infrastructure Tools Cumulative downloads of R Infrastructure Tools projects since 2017-12-31. Line graph starting at 0 in 2018 and increasing to over a billion in 2022.
Text: R Infrastructure Tools Cumulative commits to R Infrastructure Tools projects since 2017-12-31. Area graph starting at 0 for both RStudio employees and other committers and increasing to over 30,000 in 2022.
 

RStudio® Integrated Development Environment (IDE)

RStudio is a multi-language IDE designed for Data Science with R and Python. It augments the standard code console with an editor that can display Notebooks, launch apps, highlight code syntax, spot code errors, and directly execute code. Built into the IDE are also tools for debugging, plotting, browsing files, and managing project histories and workspaces. Together these tools make data scientists and developers much more efficient.

There are 5 full-time equivalent (FTE) employees developing the RStudio IDE open-source desktop and server products as of December 2021.

Text: RStudio IDE Cumulative downloads of RStudio IDE projects since 2017-12-31. Line graph starting at 0 in 2018 increasing to over 30 million in 2022.
Text: RStudio Cumulative commits to RStudio IDE projects since 2017-12-31. Area chart starting at 0 in 2018 for both RStudio employees and other committers and increasing to 15,000 in 2022. RStudio employees make up almost all of those commits.

 

Donations to Open-Source Software and Community Initiatives

In addition to the open-source software that we make freely available, and our support for Ursa Computing, RStudio recognizes the importance of contributing financially to other valuable open-source and community initiatives. To date, RStudio has given over $1.2M to projects led by others. Current commitments include contributing to NumFOCUS, the R Consortium, the R Foundation, the Linux Foundation, and to authors and maintainers of fourteen smaller open-source projects.

 


 

B Lab® Impact Assessment

Overview

The B Lab Impact Assessment (see https://bimpactassessment.net/) is measured on a 200-point scale, with a minimum score of 80 required for a company to be eligible for B Lab certification. RStudio completed its first Impact Assessment in the fall of 2019 and received an overall score of 86.1. To put this score in context, the average score of “ordinary” (non-certified B Corp) businesses of our size is 53.4, while the median score for companies on the B Lab’s list of “Best for the World” honorees is 131. [Source: bcorporation.net/en-us/find-a-b-corp/company/rstudio.]

The Impact Assessment is composed of questions in five Impact Areas: Governance, Workers, Community, Environment, and Customers. RStudio’s score in each category can be found in our report from January 2020, available here. In accordance with B Lab practices, we will complete the B Lab Impact Assessment again in the fall of 2022. Results will be published next year, in RStudio’s Benefit Report for 2022.
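As a rough illustration of how these figures relate, the short Python sketch below tallies the five Impact Area scores quoted elsewhere in this report and compares the overall score with the certification minimum and the two reference points given above. All names in the code are hypothetical and chosen only for this example; the numbers are copied from this report. The per-area figures sum to within a tenth of a point of the reported overall score, and treating that gap as rounding is an assumption, not something B Lab states here.

```python
# Figures copied from this report (fall 2019 B Lab Impact Assessment).
# All variable and function names are hypothetical, for illustration only.
AREA_SCORES = {
    "Governance": 16.1,
    "Workers": 30.5,
    "Community": 11.9,
    "Environment": 3.4,
    "Customers": 24.1,
}
REPORTED_OVERALL = 86.1            # overall score stated in the report
CERTIFICATION_MINIMUM = 80.0       # minimum score for certification eligibility
ORDINARY_BUSINESS_AVERAGE = 53.4   # average for non-certified businesses of similar size
BEST_FOR_THE_WORLD_MEDIAN = 131.0  # median for "Best for the World" honorees


def summarize_scores(area_scores, reported_overall):
    """Return simple differences between the quoted scores, rounded to one decimal."""
    area_total = round(sum(area_scores.values()), 1)
    return {
        "area_total": area_total,
        # Assumed to be a rounding gap between per-area and overall figures.
        "gap_to_reported": round(reported_overall - area_total, 1),
        "margin_over_minimum": round(reported_overall - CERTIFICATION_MINIMUM, 1),
        "margin_over_ordinary": round(reported_overall - ORDINARY_BUSINESS_AVERAGE, 1),
        "gap_to_best_for_world": round(BEST_FOR_THE_WORLD_MEDIAN - reported_overall, 1),
    }


summary = summarize_scores(AREA_SCORES, REPORTED_OVERALL)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The sketch only restates arithmetic on published numbers; it does not reproduce how B Lab weights or scores individual questions.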

 

Progress in 2021

RStudio seeks to improve our internal governance, increase our workforce diversity and employee development efforts, expand our stewardship of the environment, deepen our engagement in our communities, and serve customers so that our public benefit will continue to improve each year.

In our initial assessment, we received high marks for incorporating as a benefit corporation; for the health, wellness, safety, and financial security of our employees; and for educating and serving customers. We identified formal goal setting; career development; diversity, equity & inclusion; civic engagement & giving; and air & climate as areas for improvement.

In 2021 we made notable progress in the following areas:

 

Governance

A company’s positive governance impact is measured by the extent to which the company is accountable to stakeholders, and the extent to which its decision-making is transparent to all constituents. As noted last year, RStudio scored 16.1 points out of a possible 21.9+ points in the Governance Impact Area, including 10 points awarded for the specific legal structures we have put in place as a Benefit Corporation that preserve our mission and consider our stakeholders regardless of company ownership.

RStudio continues to share financial and other company performance information transparently with its shareholders and employees. In 2021, additional financial and support metrics provided new insights into customer retention, growth, and satisfaction, and the company added Anti-Bribery and Corruption to its required employee training. The company also added an experienced Chief Product Officer to its executive team in 2021 to position the company for future growth. We continue to have a relatively broad pool of shareholders, including many current and former employees.

To improve our governance impact in 2022, RStudio will further add to the metrics shared with stakeholders, including social/environmental results, and provide additional employee training, including training on the company’s updated code of ethics.

 

Workers

A company’s positive impact on workers is measured by the extent to which it maintains a compensation and benefit structure beneficial to its employees, supports ongoing career development, and fosters a positive work environment. As noted last year, RStudio scored 30.5 out of a possible 43.2 points in the Workers impact area of the B Lab assessment, attributable in large part to our generous benefit offerings, including 12 weeks of paid leave for all new parents, a 401k matching program, and an annual profit-sharing plan open to all regular employees. RStudio’s flexible work practices, particularly our remote model and unlimited PTO policies, were also significant factors in our impact in this area and served both employees and the company well during the Covid-19 pandemic in 2020 and 2021.

Despite missing out again on valuable in-person company gatherings in 2021 because of the pandemic, the company implemented its first full-scale organizational survey to gauge engagement and satisfaction and develop improved career development guidelines, which will strengthen future assessments of RStudio’s impact in this area.

In addition to ongoing surveys, RStudio plans to implement management training and career development guidelines to foster a positive work environment in 2022.

 

Community

Community impact is measured by the extent to which a company creates jobs within local communities; fosters inclusion and diversity within the organization; demonstrates civic engagement through philanthropy and advocacy; and favors suppliers that share B Corp values. As noted last year, RStudio scored 11.9 out of a possible 20+ points in the Community impact area. Our inclusive hiring practices, equitable pay ratios (e.g., between the highest- and lowest-paid workers), charitable giving history, and strong job-growth rates are some of the factors behind this positive impact.

Some elements of the community impact measures, especially those that analyze RStudio’s economic impact on “local” geographies, may be difficult for us to achieve given our remote workforce model. On the other hand, we can significantly strengthen our impact on the community by furthering the diversity within our team – for example, by increasing the percent of women employees and managers, broadening the age distribution of our workers, and continuing to actively source talent from underrepresented or minority social, racial, and ethnic groups.

In 2021, RStudio strengthened its Community impact most significantly by increasing the percentage of managers identifying as women or as members of underrepresented social groups, and by implementing voluntary demographic reporting for all employees to improve accuracy.

 

Environment

A company’s positive environmental impact is measured by the extent to which its products, services, suppliers, and decisions promote positive environmental outcomes. As noted last year, RStudio scored 3.4 out of a possible 8.9+ points across all Environment impact area questions.

As a software company, we do not conduct any physical manufacturing, and our marketing, sales, and support models are almost entirely digital – eliminating many of the most common sources of environmental hazards found in business operations. Beyond this environmentally-neutral base, RStudio’s positive environmental impact is largely based on our remote-first work culture, which drastically reduces the footprint of our physical workspace, as well as the pollution generated by daily commuting.

To improve our environmental impact, we began measuring greenhouse gas (GHG) emissions from company travel and events, and in 2020 we purchased carbon offsets equivalent to the total emissions produced by company air travel since our inception. We further reduced our impact in 2021 by nearly eliminating travel for in-person events, including company meetings, trade shows, and conferences during the pandemic. This included rstudio::conf 2021, which we held as a virtual event instead.

In 2022 we will return to traveling for some business and host rstudio::conf in person as well as online while continuing to work towards our goal of becoming Climate Neutral Certified.

 

Customers

Customer impact is measured by the degree to which a company’s products and/or services deliver social, educational, or environmental value to customers, as well as the extent to which company practices serve customer interests in areas such as quality control, data privacy, and customer satisfaction. As noted last year, RStudio earned 24.1 out of a possible 25+ points in the customer impact area, with 20.6 of these points awarded for the strong orientation toward education, knowledge-sharing, and skill-building in our products and community contributions.

While our scores in the customer impact area remain strong, we continue to provide stakeholders with new ways to assess customer feedback and customer satisfaction by working with industry analysts and the enterprise software-focused review site TrustRadius to capture authentic customer reviews. Our average TrustRadius rating improved from 8.8 to 9.1, with over 100 reviews at the end of 2021. In 2021 we also launched RStudio Enterprise Community Meetups and Data Science Hangouts, which offer open and friendly opportunities to meet others using RStudio open-source or professional products.


 

Conclusion

RStudio endeavors to create public benefits through the hard work of our employees and partners and in collaboration with the open-source data science community we serve. As a public benefit corporation we will continue to pursue and report improvements in internal governance, workforce diversity and employee development efforts, our stewardship of the environment, engagement in our communities, and, of course, substantial contributions to open-source software for science.

Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.
RStudio and Shiny are registered trademarks of RStudio, PBC.