Johnson & Johnson x Posit Live Event March 2025 Q&A

Image: promotional banner for the Johnson & Johnson and Posit YouTube live event.

Imagine a pharmaceutical giant like Johnson & Johnson (J&J) charting a bold new course in its clinical trial operations. For the past five to six years, the company has been on a fascinating journey, strategically embracing open source. This isn’t a minor adoption. It’s a fundamental shift that has seen its statistical programming teams and statisticians become proficient in R.

Recently, Sumesh Kalappurakal, Tadeusz Lewandowski, Nicholas Masel, and Mark Bynens shared their remarkable story, highlighting J&J’s significant impact on the open-source drug development community, from the early days of R in Pharma to their ongoing support of numerous related initiatives.

The event was followed by a lively Q&A session. Below, we share responses to the questions that went unanswered during the session.

Building an R infrastructure for clinical trials
Embracing the open-source shift
Upskilling statistical teams for open source

Webinar Q&A

Building an R infrastructure for clinical trials

Do you plan to share details about your R infrastructure? If not, what is the barrier to discussing your R platform specifications?

Our infrastructure runs on AWS, with containerized Posit Workbench and Posit Connect for deploying Shiny apps. Validated R packages are incorporated into two container releases per year. Users also have the flexibility to explore packages from CRAN in a non-regulated environment. This setup helps balance regulatory compliance with open-source flexibility.

Please also look at the Data Hangout: link.

Containers, as in Docker, or is there other ‘container’ software?

Yes, Docker.

How do you mitigate or manage the risk of using libraries and packages in a large corporation that handles confidential information, especially in the context of R?

We conduct a thorough selection process for R packages based on our specific requirements. Each package is carefully reviewed, and we document the associated risks along with appropriate mitigation strategies. These packages are incorporated within a validated containerized environment, ensuring their availability in our GxP-compliant statistical computing framework, where all activities are fully traceable.

Please share your experience in R package validation from the perspective of QA/authorities’ review.

J&J used guidelines from the R Consortium Validation Hub and partnered with a vendor and an internal IT team to validate R packages using a CI/CD pipeline. Containers with validated packages are released twice a year. Education and collaboration with QA, legal, and IT teams were critical to their understanding and acceptance of open-source practices.

Were your R packages published on CRAN/Bioconductor, or elsewhere? How long did it take to get on CRAN?

The majority of the packages are available on CRAN, while a select few are hosted on GitHub with the aim of being published on CRAN in the future. Typically, the review and acceptance process for packages submitted to CRAN takes several weeks.

How did you overcome the challenges with R packages, considering the security constraints around open-source packages?

We worked closely with our IT colleagues to ensure compliance with our policies.

How do you maintain version control of the open-source software?

Containerization. Packages are installed within a validated container, typically based on a specific snapshot date that determines the version of each package. Each new container release corresponds to an updated snapshot date and updated package versions. When a new version of an open-source package becomes available, the package undergoes re-validation (running the test scripts for compatibility) and is integrated into the subsequent release cycle.
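As a rough illustration of how a snapshot date can determine package versions, the Python sketch below picks, for each package, the newest release published on or before a given date. The package names, release dates, and the resolve_versions helper are all hypothetical and are not taken from J&J’s setup; in practice this kind of pinning is typically handled by pointing the container at a dated package repository snapshot rather than by hand-written code.

```python
from datetime import date

# Hypothetical release history: package -> list of (release_date, version).
# These entries are invented purely for illustration.
RELEASES = {
    "pkgA": [
        (date(2023, 1, 10), "1.0.0"),
        (date(2023, 9, 2), "1.1.0"),
        (date(2024, 3, 15), "2.0.0"),
    ],
    "pkgB": [
        (date(2022, 6, 1), "0.9.0"),
        (date(2024, 1, 20), "0.9.1"),
    ],
}


def resolve_versions(releases, snapshot):
    """Return the newest version of each package released on or before `snapshot`."""
    resolved = {}
    for name, history in releases.items():
        eligible = [(released, version) for released, version in history if released <= snapshot]
        if eligible:
            # Tuples sort by release date first, so max() picks the latest release.
            resolved[name] = max(eligible)[1]
    return resolved


# Two container releases, each tied to its own (hypothetical) snapshot date.
container_2023_h2 = resolve_versions(RELEASES, date(2023, 12, 1))
container_2024_h1 = resolve_versions(RELEASES, date(2024, 6, 1))

print(container_2023_h2)
print(container_2024_h1)
```

Because the snapshot date is the only input, rebuilding a container from the same date resolves to the same package versions, while moving the date forward picks up newer releases for the next cycle.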

How does J&J ensure data integrity and reproducibility using open-source R in clinical trials?

Data integrity and reproducibility are ensured by containerization, validated environments, and thorough documentation. Containers can be spun up at any time, reproducing exact software and package versions. Outputs are traceable to specific packages and input data, and environments remain stable for reproducibility.
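One way to picture the traceability described above is a small provenance record stored alongside each output. The Python sketch below is only an assumption about what such a record might contain (a hash of the input data, a container tag, and the package versions); the field names, the container tag, and the build_manifest and same_provenance helpers are hypothetical and do not describe J&J’s actual system.

```python
import hashlib
import json


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw input bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(output_name, input_bytes, container_tag, package_versions):
    """Collect what is needed to trace an output back to its inputs and environment."""
    return {
        "output": output_name,
        "input_sha256": sha256_of(input_bytes),
        "container": container_tag,
        "packages": dict(sorted(package_versions.items())),
    }


def same_provenance(a, b):
    """Two manifests match only if the input hash, container, and packages all agree."""
    return json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True)


# Hypothetical example: the same table produced twice from identical inputs.
packages = {"pkgA": "1.1.0", "pkgB": "0.9.0"}
first = build_manifest("t_ae_summary", b"subject,aeterm\n001,HEADACHE\n", "stats-env:2023.2", packages)
rerun = build_manifest("t_ae_summary", b"subject,aeterm\n001,HEADACHE\n", "stats-env:2023.2", packages)
changed = build_manifest("t_ae_summary", b"subject,aeterm\n001,NAUSEA\n", "stats-env:2023.2", packages)

print(same_provenance(first, rerun))
print(same_provenance(first, changed))
```

Comparing manifests in this way makes it straightforward to confirm that a rerun used the same data and environment, or to see that something changed.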

How extensive is the support for CDISC standards in your packages?

We support CDISC standards extensively.

Embracing the open-source shift

Can you please highlight your WHY about transitioning to R? You haven’t touched on what benefits you’ve gleaned versus the old SAS models.

1) The transition to R is primarily driven by its flexibility, extensive package ecosystem, and robust support from its vibrant community. These features collectively enhance data analysis, integration, and collaboration, making R a great choice for modern analytical needs. R’s open-source nature facilitates continuous updates and access to the latest analytical techniques.

2) Both R and SAS are excellent tools for data analysis, each excelling in its own right.

Our goal is for statistical programmers to be multilingual and agnostic to any particular analytical software.

What are the main comparative advantages for large industries of adopting an open science policy?

Advantages include reducing silos, leveraging common methodologies across companies, promoting standardization (e.g., safety outputs), and collaboratively improving IT solutions. Ultimately, this collaborative approach can create more robust and universally applicable solutions, significantly changing how the industry operates.

What business outcomes have you been able to achieve in your R journey?

There are significant strategic benefits from adopting open source: community support, standardization, and leveraging collective efforts. Although open source involves enterprise-level costs, the overall business value is high, especially from the community and innovation standpoint.

What tips do you have for me to help convince my organization to embrace R and Shiny instead of Microsoft PowerBI / Power Apps?

Emphasize specific use cases where R and Shiny excel, such as statistical analysis and interactive reports (e.g., DMC, safety, CSR reports). Highlight the integrated nature of R’s analytic capabilities with Shiny, allowing seamless analysis-to-reporting workflows. Clarify that each tool has its strengths, and R/Shiny particularly excels in analytics-driven use cases.

Has the embrace of R and open-source programming made the statisticians on a project more likely to use R now? What effects has it had on statistician-programmer interactions?

1) Yes, the embrace of R and open-source programming has made statisticians on projects more likely to use R. R’s extensive libraries, active community, and flexibility in data analysis and visualization have contributed to its increased adoption among statisticians.

2) The embrace of R and open-source programming has positively affected statistician-programmer interactions by fostering better collaboration and communication. Statisticians and programmers can now work more seamlessly together, sharing code and leveraging community-contributed libraries, which facilitates more efficient problem-solving and innovation in data analysis projects. This collaborative environment also encourages continuous learning and adaptation to new tools and techniques.

Upskilling statistical teams for open source

Is there training on R or open source that you’d recommend?

There are a variety of general training courses available for R programming. While it is hard to recommend a specific course without understanding individual needs, it is important to choose one that aligns with your objectives. At Johnson & Johnson, we initially engaged in general online training; however, we recognized the need for more tailored training specific to the packages we use. As a result, we are currently using Accel2R training along with additional customized modules designed to meet our specific requirements.

Nick mentioned that if the team could not fulfill a TLG in R, they could do so in SAS. Can you elaborate? Was this a training/knowledge issue or a capability issue?

As we embark on this journey, we recognize that there is a learning curve involved. Some teams readily embrace change and adopt the new workflows, while others may prefer to continue with established practices. To ensure continuity and compliance, we are allowing time for all teams to adjust to the new ways of working. Even with effective change management strategies in place, the challenges we face may not stem solely from capability or training gaps; other factors may be at play. Therefore, in C&SP we encourage a hybrid approach to support successful adoption throughout the build of clinical trial operation artifacts.

How important was it to match the formatting of SAS outputs within R? Did you create “basic” versions of what SAS had done in the past?

Matching the formatting of SAS outputs in R was important but challenging. We attempted to match as closely as possible, though some R outputs were new. The CAMIS project (Comparing Analysis Method Implementations in Software: link) is recommended as a reference for handling differences in statistical procedures between SAS and R. We collaborate closely with our statisticians to make the final decisions on formatting.

Please outline the challenges or limitations with R Shiny and its related packages.

At present, we do not observe any significant challenges or limitations in utilizing Shiny and other associated packages. However, it is important to acknowledge that there may be unforeseen circumstances when undertaking such endeavors. In this context, we find it beneficial to engage with the R community, which has consistently demonstrated its capacity to provide valuable support in overcoming potential challenges.

You mention the importance of building a community, but how did you deal with people who were anxious or averse to adopting R?

To address anxiety or aversion to adopting R, it is essential to provide supportive training, resources, and encouragement. Emphasizing hands-on practice and starting with simple, relatable tasks can help ease apprehension. Building a community where individuals can share experiences, ask questions, and offer guidance also fosters a welcoming environment. Encouraging a “just do it” mindset, akin to learning to ride a bike or a horse, helps users build confidence through practice and gradual familiarity with R. Adoption was notably easier among newcomers compared to those accustomed to traditional tools. Successful strategies included gradual exposure, hands-on experience with use cases, collaborative problem-solving, and leveraging strong internal and external communities to support newcomers through the transition.

Can you share more details about the training strategies that worked well? How did you overcome time constraints based on the existing workload?

Successful training strategies combined knowledge acquisition with practical application. It was essential to create opportunities for learners to immediately apply what they had learned in real projects or meaningful tasks. Committing to using new skills regularly was emphasized as critical to lasting skill development despite busy workloads.

If a big organization that hasn’t used R in the past wants to start using R and open source, which smaller areas do you suggest moving to R first?

For a big organization new to R and open-source tools, it’s advisable to start with smaller, low-risk areas such as data visualization, internal reporting, or exploratory data analysis. These areas benefit from R’s powerful visualization packages and allow teams to become familiar with R’s capabilities without affecting critical business processes. Additionally, pilot projects or non-critical research areas can serve as practical learning opportunities for adopting R.

What was the group comparing SAS and R outputs again?

Comparing Analysis Method Implementations in Software (CAMIS): link.

Can you comment on R code submission to the FDA/EMA? Is it similar to SAS code submission? What about internal and external packages? How are they shared?

Check out the R Consortium’s R Submission Working Group and the PHUSE project for Communication of Version Metadata for Open-Source Languages. We learned from those before us and are active in these groups, sharing our lessons learned and helping to refine an industry-recommended approach.

Learn more about J&J’s open-source journey

The engaging Q&A session that followed J&J’s presentation further underscored the community’s interest in and adoption of open-source solutions. The answers to the previously unanswered questions, shared above, offer additional perspectives and practical considerations for those embarking on similar journeys.
