Ning Leng

Hinal Patel

Jingyuan Chen

James Black

Company News and Events

Roche x Posit Live Event Sept 2024 Q&A

2024-10-18

Roche has reached a significant milestone in using open-source R for regulatory submissions to the FDA, EMA, and NMPA. In our September live webinar, Ning Leng (ad-interim Global Head of the Data Science Acceleration Enabling Platform), Hinal Patel (Principal Data Scientist), and Jingyuan Chen (Principal Data Scientist) shared their journey toward that filing goal. The event highlighted the practicalities of processing, analyzing, and visualizing clinical trial data in R, resulting in a comprehensive eSubmission package.

An engaging question-and-answer session followed the presentation. Due to high interest, several questions remained unanswered. Along with James Black (Senior Director, Insights Engineering, Manager & Product Family Lead), the speakers address them below.

Webinar Q&A

Can you enable the links highlighted in the presentation? R Validation Hub’s white paper, Roche’s approach to software validation, and others.

R Validation Hub case study
White paper on A Risk-based Approach for Assessing R package Accuracy within a Validated Infrastructure

Slide no.10: “PD Data Sciences” – what does “PD” stand for?

It stands for “Pharma Development.”

Are you using the free RStudio or commercial Posit products? Can you explain your rationale for this decision?

This talk discusses our OCEAN infrastructure. Many of the enterprise features we require are available only in Posit's paid products.

Is the OCEAN infrastructure cloud or on-prem, and can you touch on any decision points for a validated R environment when deciding between the two?

Our platform is predominantly on AWS in a VPC. “On-prem” is a bit of a misnomer in a company the size of Roche, as we run large data centers. Cloud providers are used to working in regulated industries, and AWS operates at a scale larger than any pharmaceutical company.

How much of a reduction in drug development time should we expect?

It’s hard to say. We’re combining the use of open-source R packages with automation and AI-augmented approaches, and we’re confident all are leading to meaningful time savings.

Have you achieved cost savings, time reductions, or other measurable benefits from using R over SAS? Our stats programming leadership is asking us for this.

R has opened up new potential for us, such as Shiny, making the most of collaborative open-source development across the industry, and easier talent recruitment. From a cost-savings perspective, we have conducted pilots that showed benefits compared to our legacy solutions.

Did you have any problems replicating the R programming in SAS because of different methods that the software applies? ‘Rounding’, default options, etc.? How did you overcome this?

Consult the PHUSE CAMIS project for more on this. As long as QC differences can be explained, such differences in software approaches are acceptable.
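To make the rounding point concrete, here is a minimal sketch in Python (not R or SAS code, and not Roche's QC process). It assumes the commonly documented defaults: R's round() sends exact ties to the even digit, while SAS's ROUND function sends ties away from zero. The helper names below are hypothetical, and values are passed as strings so that binary floating-point representation does not blur the ties.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_half_even(value: str, digits: int = 0) -> Decimal:
    """Round ties to the even digit (the convention assumed here for R)."""
    quantum = Decimal(1).scaleb(-digits)  # 1, 0.1, 0.01, ...
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)


def round_half_away(value: str, digits: int = 0) -> Decimal:
    """Round ties away from zero (the convention assumed here for SAS)."""
    quantum = Decimal(1).scaleb(-digits)
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


def differing_values(values: list[str], digits: int = 0) -> list[str]:
    """Return the inputs on which the two conventions disagree."""
    return [
        v for v in values
        if round_half_even(v, digits) != round_half_away(v, digits)
    ]


if __name__ == "__main__":
    for v in ["0.5", "1.5", "2.5", "3.5", "-2.5"]:
        print(v, round_half_even(v), round_half_away(v))
    print("disagree on:", differing_values(["0.5", "1.5", "2.5", "3.5"]))
```

A difference of this kind is fully explained by the documented tie-breaking rule, which is the sort of explanation the answer above refers to.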

Why R? Have you considered Python? What are the pros and cons of using R compared to using Python?

We do use Python for building certain tools, but for clinical reporting, we felt R was the right fit. Some of the biggest advantages are a talent pipeline already knowledgeable in R, cutting-edge statistics and graphics, and the wealth of open-source packages targeted to our needs, such as those in the pharmaverse.

Should R code be executed by a regulatory reviewer? Or is it just to explain the algorithms and derivations?

It’s entirely up to them. One of the benefits of using open source is that it is open for anyone to use, including regulators.

Any license issues with any R packages?

No. Licensing is part of our assessment before using any open-source package for GxP purposes.

Regarding package validation, does the FDA approve certain packages or package versions over others?

No, it is on the sponsor to assess, but the R Validation Hub offers useful resources and materials to help.

Could you please share how you manage integrating new open-source R packages within the validated R container? Also, what is the typical turnaround time (TAT) for this process?

Our R package validation process has been summarised as an R Validation Hub case study. The fastest turnaround was 2 business days from request to availability as a validated package in the environment, but we typically do not validate against short timelines.

It seems you have still used a “proprietary” R package. So, this whole E2E R submission is not 100% open-source. Am I correct?

Even the proprietary package admiralroche is largely based on functions from the open-source admiral packages, with choices made only to match Roche ADaM implementation standards. Read more in this PHUSE paper.

Are the pharmaverse packages used for TLG preparation validated? Or did you have to validate them? When performing validation, did Roche utilize a Shiny application to compare results (datasets, TLFs) between SAS and R?

Our R package validation process has been summarised as an R Validation Hub case study.

Since Pharmaverse is not validating packages, does that mean that each company using a Pharmaverse package is responsible for executing validation?

Yes, but look out for the Regulatory Repo working group (WG) from the R Validation Hub.

Has your submission included Japanese PMDA? I have a feeling they are more conservative than other agencies when it comes to using new technology.

No. As noted in the slide deck, our submissions so far have not included the PMDA, so we have no experience to share there.

I assume that the packages you develop rely on already existing external R packages. How do you handle package dependencies and versions? Using renv or conda?

We use renv.

Did you submit the datasets as JSON or SAS transport files?

We submitted SAS transport (XPT) files created in R via xportr.

Do you use a workflow orchestration tool with R (e.g., Airflow)? If so, can you give some feedback on your experience?

We use Snakemake – it is too early to comment on the experience.

What are your thoughts on using Docker images with the RStudio server to share the analysis? I think it would give more control and stability in reproducing and evaluating the analysis.

One of the submissions working group’s next objectives is to transfer a reproducible Docker container for the R analyses to the FDA. Stay tuned!

Learn more about the use of open source in pharmaceuticals

We thank Ning Leng, Hinal Patel, Jingyuan Chen, James Black, and everybody at Roche for sharing their story and answering the community’s questions.

If you missed the event or would like to rewatch it, please find the recording on YouTube.
If you are interested in the use of open source in clinical trials, schedule a call to speak with our pharma experts.

Tags: admiral clinical trials fda open source pharma pharmaverse renv roche