openstatsware: From the mmrm R package to a lively community

2025-07-29

A dark blue background with white dots. Two hexagons are on the left and right sides. The left hexagon is yellow with the text "open stats ware" written on it in dark blue. The right hexagon is white with the words "mixed model for repeated measures" written on it in black, red, blue, and green.

You might have read about openstatsware or heard a presentation on it before – but maybe you don’t know who is behind it, what it is about, and where it comes from. Today, we sit down with the co-chairs Alessandro Gasparini, Daniel Sabanés Bové, and Ya Wang of the openstatsware working group to learn more about the initiative and its future plans.

How would you describe openstatsware?

Daniel: openstatsware is a scientific working group affiliated with both the American and European associations for statisticians in the pharmaceutical industry, i.e., ASA BIOP and EFSPI/PSI. We are more than 60 members from over 30 organizations, and have regular meetings to discuss topics related to open-source statistical software. We also have workstreams where we build specific R packages together.

Ya: openstatsware promotes best practices in statistical software development, especially in R, and supports collaboration across industry, academia, and regulatory bodies. It maintains resources like guides, presentations, and R packages – such as mmrm for mixed models for repeated measures (MMRM) – which are shared through our website openstatsware.org and GitHub repositories.

Alessandro: All of the above and more! Besides developing software packages collaboratively and discussing current topics related to open-source statistical software, openstatsware has developed a wealth of training material to promote good practices in software engineering, with a special focus on statistical software. This includes presentations and posters that working group members presented at conferences around the world, our own openstatsguide (an opinionated set of minimum viable good practices for high-quality statistical software packages), and several workshops on good software engineering practices for R packages.

How did you get here?

Daniel: I will start at the very beginning – in the spring of 2022, when I was leading the Statistical Engineering team in Roche Product Development Data Sciences. I heard from a colleague that, unfortunately, our previous solution combining lme4 and lmerTest did not work at all for larger data sets with many time points. So I started to work with Ben Bolker from the glmmTMB team to try to add the required Satterthwaite degrees of freedom approximation to glmmTMB. But it did not work out, and at the beginning of June, I took a few COVID-related home quarantine days and worked on a prototype from scratch directly on top of the TMB package, which provides the C++ framework. Within a long weekend, I had a working prototype, and I felt it was a breakthrough: it was fast, and gave the same results as SAS PROC MIXED. So I started building the first package version of mmrm on top. But I did not want to do it alone. Yilong Zhang had the initial idea to create a working group for MMRM. I broadened the idea and, together with 10 other initial members from 6 other pharma companies, proposed a “Biostatistical Software Engineering” working group to the American Statistical Association Biopharmaceutical Section (ASA BIOP). This was accepted in mid-August 2022, and we had our very first meeting soon after.

Alessandro: In the summer of 2022, I was attending the annual conference of the International Society for Clinical Biostatistics (ISCB). Daniel organized a session on research software engineering (recording), which I contributed to as a panelist, and that’s where we got to meet each other. I was starting a new job at Red Door Analytics in the fall, and with the support of the company, I joined the working group soon after the conference. At the end of 2023, the name openstatsware was born (I still remember the few meetings where we brainstormed a new name for the working group – that was really fun!), and we rebranded the working group and website soon thereafter. And in case you are wondering, openstatsware stands for open-source statistical software, and was picked to reflect the objectives of the working group (engineering high-quality open-source statistical software, and developing and disseminating best practices for the process). In April 2024, I took on the role of co-chair alongside Daniel and Ya. With me based in the EU region, Daniel in the APAC region, and Ya in the US region, we are truly a global organization.

Ya: In the summer of 2022, I joined the development team for the mmrm R package. With prior experience in R package development and comparative analysis between SAS and R, I was eager to contribute to the development in any way I could. Through this involvement, I discovered that the working group focuses on statistical software engineering and cross-industry collaboration – areas that strongly align with my interests. This naturally led me to join the broader working group as well.

Building on the success of mmrm, several new workstreams have been launched: One focuses on developing a Bayesian MMRM R package to support robust analysis of longitudinal clinical data. Another aims to create high-quality, open-source R tools (packages, apps, and user guides) designed to address key analytic needs in HTA dossier submissions across countries. Expanding beyond R, an additional workstream has developed a Julia package for Bayesian Safety Signal Detection.

What is special about the mmrm R package?

Daniel: With mmrm on CRAN for almost 3 years now, I was surprised to see that very cool testimonials are still being posted on LinkedIn. One user wrote:

“I use R. If I ever open SAS, that is to use PROC MIXED with type=un and ddfm=KR for MMRM. Recently, I realized that there existed a new R package mmrm. I tested it with two datasets at work and three additional datasets from the book Common Statistical Methods for Clinical Research with SAS Examples, and the mmrm works beautifully.”

Another user wrote:

“I’ve been using SAS for 27 years, and after finding out about the mmrm package, I decided to switch to R”

These statements show that MMRMs were one of the key gaps in the R ecosystem for statistical software in pharma for a long time. Filling this gap has enabled statisticians to use R more confidently. Furthermore, I am convinced that we could deliver a great product here because of the highly collaborative nature of the development team – to put this into a few numbers, on GitHub, we have 29 contributors and five pharma companies have supported the project with developer time. This led to a rapid adoption of the package, with a total of over 100,000 downloads from CRAN by now.

Alessandro: From my perspective, mmrm clearly shows how members of different organizations can come together with a common goal and develop a modern, powerful, and thoroughly validated product that is quickly becoming the gold standard across the industry. This must be one of the best examples out there to showcase the (still ongoing) open-source revolution in the pharmaceutical industry, and I can’t wait to see what comes next.

Ya: The development of the mmrm R package highlights how open-source collaboration can accelerate innovation, eliminate redundant efforts, and cultivate a culture of ongoing improvement across organizational boundaries.

What is your favorite impact of openstatsware so far?

Ya: My favorite impact of openstatsware is how it’s reshaping the culture of statistical software engineering in pharma. Instead of every company working in silos, people are coming together to build tools they actually want to use in an open and transparent way. Projects like mmrm prove that when people share ideas and code, things move faster, the quality goes up, and everyone benefits. It’s not just about better software – it’s about changing how we work together.

Daniel: My favorite is that we have been able to maintain a lively working group for 3 years now: we still meet every 2 weeks for an hour, and our topics have expanded from the initial single mmrm R package to additional packages, organizing workshops and conference sessions, guideline development, validation, CRAN task view refurbishment, and more. New people are still joining openstatsware regularly, and we now have two time slots to accommodate different time zones. I would not have expected that we would still be so active after 3 years, and I am very happy that we have been able to build such a strong community, which can serve as a multiplier for the visibility of the statistical software engineering profession in the pharmaceutical industry.

Alessandro: Since its inception, openstatsware has developed and delivered training courses and presentations around the world. My favorite impact of the working group’s activities is certainly our contribution to improving the research software engineering skills of statisticians in the life sciences. Besides that, it’s been fun to see how openstatsware has been a catalyst for the open-source transition in the pharmaceutical industry, together with other great initiatives such as openpharma, pharmaverse, R/Pharma, and the R Validation Hub.

What are your plans for the future?

Alessandro: We are teaching the Good Software Engineering Practices for R Packages workshop in Basel this August as part of the 2025 ISCB conference, and I look forward to interacting with the participants. The workshop has been on a worldwide tour over the past couple of years, and we are actively working behind the scenes to bring it to more locations. Beyond that, we have been discussing potential new software products internally, and I hope we get to start working on some of those soon. Keep an eye on our GitHub organization and website for all the latest news.

Ya: We currently have three active package development workstreams: mmrm, brms.mmrm for Bayesian inference in MMRMs, and maicplus for matching-adjusted indirect comparison analyses. Looking ahead, we’re actively exploring additional gaps in the statistical software landscape and launching new workstreams to address them.

Daniel: As mentioned by Alessandro, the workshop has been going really well – it would be awesome if we had additional workshops on other topics, such as Shiny development or validation of R packages. Similarly, the openstatsguide, our short, practical checklist for high-quality R package development, could be expanded into additional guides. I am optimistic that with the continued enthusiasm of our openstatsware community, we can build these kinds of resources and opportunities in the future.

Continue the openstatsware journey

Join us at an upcoming workshop on “Good Software Engineering Practice for R Packages”:

Philadelphia, US, 12 August @ MCP25 – open for registration
Basel, CH, 24 August @ ISCB25 – fully booked
Paris, FR, 10 October @ SnB25 – open for registration

We’ll also be at R/Pharma events:

In-person R/Pharma Summit at posit::conf(2025), Atlanta, US, 15 September
GenAI Day, Virtual, 19 August
R/Pharma Conference, Virtual, 3-7 November

Tags: mmrm openstatsware pharma

Alessandro Gasparini

Alessandro Gasparini studied statistics at the University of Padua and biostatistics at the University of Milano-Bicocca, in Italy. After a couple of years working as a biostatistician at Karolinska Institutet in Stockholm, Sweden, he moved to the University of Leicester in the United Kingdom to pursue a PhD in biostatistics, with a thesis on hierarchical modeling in the setting of electronic health records. After obtaining his PhD in 2019, he moved back to Karolinska Institutet as a post-doctoral researcher to further develop natural history models for breast cancer dynamics. In 2022, he joined Red Door Analytics, where he currently focuses on statistical consulting, methods and software development, and advanced training in biostatistics.

Daniel Sabanés Bové

Daniel Sabanés Bové studied statistics at LMU Munich, Germany, and obtained his PhD at the University of Zurich, Switzerland, in 2013 for his research work on Bayesian model selection. He started his career at Roche as a biostatistician for 5 years, then continued at Google as a data scientist for 2 years, before rejoining Roche as a statistical software engineering lead for 4 years. In 2024, Daniel co-founded RCONIS (Research Consulting and Innovative Solutions). He is (co-)author of multiple R packages published on CRAN and Bioconductor, as well as the book "Likelihood and Bayesian Inference: With Applications in Biology and Medicine".

Ya Wang

Ya Wang obtained her doctoral degree in Biostatistics from Columbia University, where her research focused on developing statistical methodologies for analyzing epigenetic data. Since joining Gilead in 2018, she has been actively involved in a range of initiatives at the intersection of statistical innovation and software development. Her current work includes the development of R packages, creation of interactive Shiny applications, methodological research, and statistical consultation across various therapeutic areas. She is passionate about open-source software, user-friendly analytics tools, and fostering collaboration in the scientific and healthcare communities.