Working with IT

Talk to your IT Department about Posit

Who should I reach out to?

Who exactly will manage the Posit Pro Products varies by organization. Often the responsible people are part of the internal DevOps or hosting team. Some large organizations have roles specific to managing data science and analytics tools, or to managing Linux servers.

In large organizations, the different functions are split out, and there will often be separate subteams for the actual server hosting and product install, authentication, database access configuration, networking, and security.

Here are a few examples of IT roles that other data scientists have worked with:

  • Linux Admin
  • DevOps Engineer
  • Cloud Engineer
  • Software Engineering/Infrastructure

If you’re still unsure, we can help you identify the best team to work with. Our sales team can also help organize a call with one of our solutions engineers to talk with your IT department about best practices for managing Posit Team. Depending on how you’re managing your server, your team will also need to think about how your data science workflows will integrate into existing IT workflows.

How do I phrase what they will need to do?

Posit Pro Products are Linux-based server software products. In order to successfully install, configure, and manage them, you’ll need to provision servers (on-premises or in the cloud) and do a variety of Linux SysAdmin activities. Here are the Posit Pro Product requirements.

Offering your IT contact an architecture review with one of Posit’s solutions engineers is a helpful way to highlight any requirements and how they will fit into your current environment. Contact our team for an initial discussion and to help set that up.

Depending on the complexity of your configuration, you may need to do additional tasks like:

  • Deploying multiple servers with load balancing
  • Deploying with Docker and Kubernetes
  • Configuring authentication against corporate SSO using LDAP/AD, SAML, or OAuth
  • Configuring database connections
  • Setting up proxies or reverse proxies, e.g. using Nginx or Apache
  • Automated, scripted deployment, e.g. using Chef, Puppet or Ansible

In some organizations these tasks may be spread across several teams.

How do I talk about open-source technologies?

An incredible variety of the world’s computation runs on top of open-source software. Open source means that the code is developed in public and is available for public review. This does not mean that the code is ill-maintained or unloved.

R and Python have been around since the early 1990s and have millions of users every year. Both R and Python are complete programming languages that are able to do a wide array of complex statistical calculations, machine learning tasks, dashboarding and reporting, and more.

They form the foundation of data science practices for many different kinds of organizations that include governmental organizations, major pharmaceutical companies, banks and other financial institutions.

The reality is that most organizations are already supporting open-source software. In the 2021 State of Enterprise Open Source report, 90% of the 1,250 IT leaders surveyed said they are using enterprise open source today.

It can be helpful to highlight other companies in your industry that are using R/Python today and speaking publicly about it.

  • R Consortium Members: Biogen, Genentech, Microsoft, Alteryx, Esri, Google, GSK, Janssen, Mango, Merck, Oracle, ProCogia, ThinkR
  • ThinkR list
  • Posit customer stories
  • Posit conference, webinar, and meetup recordings
  • Posit has 5,200+ active commercial software customers (59% of the Fortune 100 use Posit commercial software)
  • Millions of people use the Posit open-source software every week

Job postings are another great way to see what technologies other companies use.

Is it secure? Do we need to do a security review?

Many organizations have very stringent security requirements that were written with Software-as-a-Service products in mind, where you share your data with the vendor to use their tools. All Posit products (Pro and Open Source) run entirely in your environment. Your data never leaves servers you control and configure. For many organizations, the security review starts and ends with that fact.

On top of that, Posit’s Pro Products support industry-standard security and authentication tooling, including the latest SSL/TLS standards, authentication against LDAP/AD or corporate SSO, and the ability to be deployed in any networking configuration, including completely offline/air-gapped.

If you do have a security review process or form to complete, we’re happy to have a conversation about requirements as well.

How can we trust R packages?

This is a fascinating question that deserves a detailed response. Many in the R community are actively working on this challenging question, just as people in other open-source ecosystems tackle these challenges.

While not exhaustive, we offer these four considerations for users or admins wondering about package security:

  1. Posit Package Manager allows you to control exactly what packages are brought into your organization through curated sources.
  2. Posit provides R packages to Posit Package Manager through an upstream Posit service designed specifically for this task. The connection between this service and Posit Package Manager is encrypted. Daily updates to CRAN are reviewed by our team before they are made available through this service. The review process checks for consistent package metadata and also updates the package checksum file, used by the R client to ensure downloaded package files are correct. We highly recommend that the connection between your R clients and Posit Package Manager be encrypted by hosting your Posit Package Manager instance over HTTPS.
  3. CRAN requires all submitted R packages to pass a series of checks prior to accepting them into the CRAN repository. These checks include installing the package alongside other CRAN packages and running package unit tests. While these tests do not specifically target malicious code, the tests provide a significant hurdle to uploading malicious packages to CRAN.

  4. R code is almost always executed as a non-privileged user. The majority of R code, especially code run in Posit Workbench or Posit Connect, is executed on behalf of restricted service or user accounts. Posit Workbench, for example, runs under an AppArmor profile that is inherited by the R processes it invokes on behalf of non-privileged users. Similarly, Posit Connect provides an extensive sandboxing process to run user code in an isolated environment. Additionally, while Posit Package Manager provides a means for users to download packages originating on the internet, most R code is executed in offline environments, often dedicated analytic sandboxes. These measures not only limit what malicious code can do, but also keep analysts from accidentally interfering with one another.
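
For IT contacts who want to see what the checksum verification in item 2 amounts to, here is a minimal Python sketch of the general technique. It assumes an MD5 digest, as listed in the MD5sum field of a CRAN-style package index; the function names and the sample bytes are hypothetical, and this is not a description of how Posit Package Manager or the R client implements the check.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of the given bytes."""
    return hashlib.md5(data).hexdigest()


def checksum_matches(data: bytes, expected_md5: str) -> bool:
    """Compare the digest of downloaded bytes with the published digest."""
    return md5_hex(data) == expected_md5.strip().lower()


# Hypothetical stand-in for the bytes of a downloaded package file.
payload = b"example package contents"

# In practice the expected digest would come from the repository's
# package index, not be computed locally as it is here.
published = md5_hex(payload)

print(checksum_matches(payload, published))                # unchanged file
print(checksum_matches(payload + b"tampered", published))  # altered file
```

A mismatch indicates that the file was corrupted or altered in transit. Note that a checksum only helps if the index itself is fetched over a trusted channel, which is one reason to serve the repository over HTTPS.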

What about review boards?

If you are part of a large organization, your IT department probably has a review board (for example: Architecture Review Board, Decision Review Board) whose purpose is to review and make decisions about new tools.

The review board is responsible for:

  • Reviewing new software initiatives and approving expenditures. Does this tool increase or decrease costs? What line items will this go under? What is the long-term cost projected to be? What is the cost of support?
  • Supporting the organization’s strategic vision. Does the tool help satisfy a customer’s needs? Does it help us remain competitive? Can it help us attract better talent? Does it make existing systems more efficient and agile?
  • Complying with existing systems architectures. Does the tool integrate with other supported tools? Will it be used in development and/or production? Does it duplicate the capabilities of other supported tools?
  • Managing risk and ensuring security. Does the tool comply with our formal security policies? Do the software licenses meet our legal requirements?
  • Defining roles and responsibilities for support. What groups own the tool? What support is offered with the tool? What internal resources will be required to maintain it? Who will provide training?

If your organization is already friendly toward data science tools but has not made them an official part of the organization, a formal review process is still valuable. The review process gives IT a formal stake in the ground when it comes to supporting R for the long term. It also makes future decisions about growth and investment much easier.

How will Posit fit into our existing architecture?

As part of the enterprise product licensing cycle, our sales team will schedule a call with a member of our Solutions Engineering team. The call is specifically for your IT leads to talk about how to deploy our products in accordance with your IT setup, and how to integrate them with your user authentication and scaling strategies.

Questions that we’ll cover with your team include:

  • Where are the servers, on-premises or in the cloud?
  • Which flavor of Linux will be used on the servers?
  • What is the scaling strategy?
  • What type of user authentication will be configured?
  • What is your package management strategy?

During the call, we will work with you to create an architecture diagram that reflects your environment, along with the relevant admin guide information needed to complete the installation.