Introducing Portable Linux R Binary Packages

2023-01-13
An illustration with the text "manylinux" on the left and a pattern of interconnected hexagons on the right. Many of the hexagons contain a blue "R" and a penguin icon.

As part of our long-standing mission to enable reproducible data science, Posit has built and maintained Linux binary packages since 2017, covering over 22,000 CRAN packages across multiple Linux distributions and R versions. We make these packages freely available to the community through Posit Public Package Manager.

However, many packages used in data science call into code from system libraries outside of R or Python. In the diverse world of Linux distros, it is often a challenge to identify which system libraries are required and to ensure that compatible versions have been built and installed on your system. This is further complicated in shared compute environments, where IT intervention may be required to install additional system libraries.

Today, we are previewing a potential solution to these problems: Portable Linux R Binary Packages.

What are portable binary packages?

At their core, portable binary packages are R packages where the system library dependencies are distributed and bundled with the package, rather than externally installed via the system package manager (e.g. apt or dnf). These are also universal packages that can be reused on a wide range of Linux distros (RHEL 9, Ubuntu, Debian, etc.), rather than being specific to one.

We took inspiration from solutions already widely used in the Python packaging community, primarily the manylinux and auditwheel projects. These tools define a standard, compatible set of system libraries to build packages against, and bundle those libraries into the Python wheel file itself. Our portable binary packages use auditwheel to bring those same concepts to R packages.

Why should I consider using them?

Portable Linux R binary packages provide several advantages over traditional distro-specific binary packages:

  • Improved user experience: Users no longer have to install runtime system library dependencies for R packages separately. Those working in shared environments with limited permissions no longer have to ask administrators to install system libraries just to use a new R package.
  • Improved admin experience: Administrators no longer have to track down and install system library dependencies manually.
  • Broad Linux support: Portable binary packages should work on Linux distributions beyond those we currently build distro-specific packages for, such as Amazon Linux 2023. In most cases, as long as your system uses a recent version of the GNU C Library (glibc), the packages should function correctly. These packages currently support glibc 2.28 and higher. Note that some popular distros, such as Alpine Linux, use the musl libc library instead of glibc and remain unsupported.
  • Improved reproducibility: Traditional binary packages use whichever system libraries are available at runtime, and those change over time with system updates. Although distros are fairly stable, a system library update or OS upgrade can cause a failure or subtle differences in results, even if you haven’t updated your R package. Bundling specific versions of these libraries ensures long-term reproducibility.
  • Portability across R configurations: Some Linux users build R with different tools and libraries, which can affect compatibility with pre-built binary packages. Posit-built binary packages are compiled against a standardized build of R, which we make available and encourage you to use to help avoid package compatibility issues. The new portable packages should be more broadly compatible with custom R builds.
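
The glibc requirement in the list above can be checked from R before switching repositories. The following is a minimal sketch, assuming the `getconf` utility is on the PATH. The helper names `parse_glibc` and `system_glibc` are hypothetical and introduced here only for illustration.

```r
# Minimum glibc version the post states for the portable packages.
min_glibc <- numeric_version("2.28")

# Hypothetical helper: turn "glibc 2.31" (the format printed by
# `getconf GNU_LIBC_VERSION`) into a comparable version object.
parse_glibc <- function(x) {
  numeric_version(sub("^glibc[[:space:]]+", "", trimws(x)))
}

# Hypothetical helper: ask the running system for its glibc version.
# Returns NULL when the lookup fails, which is expected on musl-based
# distros such as Alpine Linux.
system_glibc <- function() {
  out <- tryCatch(
    suppressWarnings(
      system2("getconf", "GNU_LIBC_VERSION", stdout = TRUE, stderr = FALSE)
    ),
    error = function(e) character(0)
  )
  if (length(out) == 0 || !grepl("^glibc", out[1])) {
    return(NULL)
  }
  parse_glibc(out[1])
}

found <- system_glibc()
if (is.null(found)) {
  message("Could not detect glibc; this system may use another libc.")
} else if (found >= min_glibc) {
  message("glibc ", as.character(found), " meets the 2.28 minimum.")
} else {
  message("glibc ", as.character(found), " is older than 2.28.")
}
```

Comparing with `numeric_version` rather than as plain numbers matters here, because version components are compared as integers: 2.4 is correctly treated as older than 2.28.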

Are these packages secure and safe to use?

Posit takes security seriously, and has made significant investments in maintaining a secure package build infrastructure. We build and publish over 100,000 distro-specific binary packages every month, across five R versions and nearly a dozen Linux distributions. All packages are built with the latest security updates applied to our entire toolchain, and rebuilt whenever any critical updates are needed.

These new portable packages bundle system libraries based on the RHEL 8 distribution, updated daily with the latest OS-provided security updates.

How can I try them out?

Portable Linux R binary packages are now available in public preview to install and test as we continue to resolve any issues. These have been tested internally, but during this preview phase you may still encounter some missing packages or incompatibilities.

Portable packages are currently provided for the majority of CRAN packages as part of a new manylinux_2_28 distribution via Posit Public Package Manager. To try the new packages, go to the Setup page, select “manylinux glibc 2.28+ (preview)” from the Linux Distribution dropdown, and follow the instructions for reconfiguring your repository URL in R.
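
As a rough illustration of what reconfiguring the repository URL can look like, the sketch below sets the CRAN repository for the current R session. The URL is an assumption based on the manylinux_2_28 distribution name and may not match what the Setup page shows, so copy the exact URL from there.

```r
# ASSUMPTION: this URL pattern is illustrative. Copy the exact URL shown on
# the Setup page after selecting "manylinux glibc 2.28+ (preview)".
manylinux_repo <- "https://packagemanager.posit.co/cran/__linux__/manylinux_2_28/latest"

# Point the CRAN entry at the portable-binary repository for this session.
# Other named repositories already configured are left in place.
repos <- getOption("repos")
repos["CRAN"] <- manylinux_repo
options(repos = repos)

print(getOption("repos")[["CRAN"]])

# A subsequent install should then pick up packages from this repository:
# install.packages("dplyr")
```

Placing the same lines in a `.Rprofile` file would make the setting persist across sessions.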

If R is configured successfully, installing a package should produce output like the following:

* installing *binary* package ‘dplyr’ …

If you see “installing *source* package…”, R may not be properly configured to install Linux binary packages. See our configuration instructions and ensure that your R User Agent Header is properly configured.
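
For reference, one common way to set the User Agent header is through R's `HTTPUserAgent` option. The sketch below builds a header from the running session's version and platform details. The exact format is an assumption, so verify it against the configuration instructions.

```r
# Build a User-Agent string that carries the R version, platform,
# architecture, and OS of the running session.
# ASSUMPTION: the "R/<version> R (<details>)" layout is illustrative; check
# the configuration instructions for the exact format the server expects.
r_version <- as.character(getRversion())
user_agent <- sprintf(
  "R/%s R (%s)",
  r_version,
  paste(r_version, R.version$platform, R.version$arch, R.version$os)
)

# Use this header for HTTP requests made by R's built-in download methods.
options(HTTPUserAgent = user_agent)

print(getOption("HTTPUserAgent"))
```

The repository can only decide whether to serve a binary if the request identifies the R version and platform, which is why a missing or altered header can lead to a source install instead.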

Customers running their own Posit Package Manager servers should also see the new “manylinux glibc 2.28+ (preview)” distribution now available. No upgrade of the Package Manager server is required. Users can access the packages using the instructions above.

Questions and Feedback

We encourage you to try these out for yourself and eagerly await your feedback, especially if you encounter any compatibility issues or missing packages. Please post a message on our Posit Community forum with your feedback and any additional questions or ideas.