A life-saving mission: how data science helps NMDP understand a registry of 42 Million to save more lives

Organization

NMDP℠ believes each of us holds the key to curing blood cancers and disorders. As a global nonprofit leader in cell therapy, NMDP creates essential connections between researchers and supporters to inspire action and accelerate innovation to find life-saving cures. With the help of blood stem cell donors from the world’s most diverse registry and our extensive network of transplant partners, physicians, and caregivers, we’re expanding access to treatment so that every patient can receive their life-saving cell therapy. NMDP. Find cures. Save lives.

Size

1,600+ employees

Industry

Healthcare / Non-Profit

Region

North America

Partner Integrations

Snowflake

Products Used

Posit Connect, Posit Package Manager

How you can make a difference

  • Join the registry: If you’re between the ages of 18 and 35 and willing to donate blood stem cells if called upon, all it takes is a simple cheek swab to join the registry. You could be someone’s cure!
  • Give a gift: Your gift can add new life-saving donors to the registry, assist patients with out-of-pocket transplant costs, and fund groundbreaking research that helps more patients survive and live longer, healthier lives.

140,000+

People NMDP has impacted through cell therapy since 1987

42 million+

Potential donors worldwide that patients have access to through every donor search

185+

Ongoing studies and clinical trials through NMDP’s collaborative research program

“At its core, NMDP is a supply and demand company. Our demand is coming from patients that need life saving cells. And our supply is coming from amazing volunteer donors or these volunteer cord blood units that we have a partnership with across the nation. When a patient places an order and is determining their need for these cells, we then look for the best match of the donor, contact that donor, make sure that they are consented and well to go through the process, and then begin the work of collection.”

Erica Jensen

Senior Vice President of Innovation, Strategy, and Marketing at NMDP

Every year, 18,500 people in the U.S. are diagnosed with life-threatening blood cancers for which a blood stem cell transplant may be their best or only hope of a cure. Since 70% of these patients do not have a fully matched donor in their family, they must rely on an unrelated donor. NMDP plays a vital role in saving lives by facilitating the donation of bone marrow and blood stem cells for cell therapy.

The Challenge: Understanding Donor Availability

One of the significant challenges NMDP faces is donor availability. When a patient needs a matching donor, NMDP reaches out to potential matches, but fewer than 50% of those donors say "yes," they are willing and able to donate. In some cases, a donor who declines or cannot be reached may have been a patient’s only match.

Adam Olson, Senior Data Scientist at NMDP, points out that individuals often join the registry as young adults, around 18 or 19. However, they might not be asked to donate until much later in life, when they could be in a different life stage or dealing with other external challenges. 

To achieve NMDP’s strategic goal of impacting 10,000 lives annually, it’s crucial to gain a deeper understanding of the 42 million individuals on the registry without repeatedly contacting every potential donor to ask, “If called upon, would you be willing and ready to donate?” 

Introducing the Donor Readiness Score (DRS)

To address the critical donor availability challenge, NMDP developed the Donor Readiness Score (DRS) — a machine learning algorithm that predicts the likelihood a donor will be willing and able to donate their blood stem cells when called upon.

How the DRS Works:

  • Inputs: The DRS is built and trained on historical data, specifically focusing on past donor responses and various donor factors and characteristics, such as NMDP registry engagement. 
  • Output: The DRS provides a numerical score between 0 and 1 for every donor on the 42-million-plus registry. A higher score indicates a greater likelihood that the donor will be willing and able to donate. 
  • Application: When a patient search narrows down the pool of suitable genetic matches, the DRS becomes highly valuable, especially for urgent cases. If multiple potential donors are identified, the DRS helps physicians prioritize by contacting those with the highest scores first, speeding up the process for the patient. It can also serve as a “tiebreaker” tool for transplant centers.
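
The prioritization step described above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the feature names (`engagement_events`, `years_on_registry`), the coefficients, and the use of a simple logistic function are hypothetical stand-ins, not NMDP's actual model or inputs. The sketch only shows the general shape of the idea: map donor characteristics to a score between 0 and 1, then contact the highest-scoring genetic matches first.

```python
import math
from dataclasses import dataclass


@dataclass
class Donor:
    donor_id: str
    engagement_events: int    # hypothetical feature: recent registry interactions
    years_on_registry: float  # hypothetical feature: time since joining


# Hypothetical coefficients for illustration only; not NMDP's model.
INTERCEPT = -0.5
W_ENGAGEMENT = 0.6
W_YEARS = -0.1


def readiness_score(donor: Donor) -> float:
    """Map donor features to a score strictly between 0 and 1 (logistic function)."""
    z = (
        INTERCEPT
        + W_ENGAGEMENT * donor.engagement_events
        + W_YEARS * donor.years_on_registry
    )
    return 1.0 / (1.0 + math.exp(-z))


def prioritize(matches: list[Donor]) -> list[Donor]:
    """Order genetically suitable matches so the highest scores are contacted first."""
    return sorted(matches, key=readiness_score, reverse=True)


# Sample data, invented for this sketch.
matches = [
    Donor("D1", engagement_events=0, years_on_registry=12.0),
    Donor("D2", engagement_events=3, years_on_registry=2.0),
    Donor("D3", engagement_events=1, years_on_registry=5.0),
]
ranked = prioritize(matches)
for donor in ranked:
    print(donor.donor_id, round(readiness_score(donor), 3))
```

In this toy setup, a donor with more recent engagement ranks ahead of one who joined long ago and has not interacted since, which mirrors the "tiebreaker" use among otherwise comparable matches.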

Beyond its direct impact on patient searches, the Donor Readiness Score also influences other organizational areas, such as marketing initiatives. For example, the score helps the team decide where tailored outreach could improve donor readiness, so that searching patients can find a matching, available donor more quickly.

Core Infrastructure and Tools

Posit tools are fundamental to NMDP’s data science function, supporting the organization's strategic goal of impacting 10,000 lives annually by 2028 with life-saving cell therapy.

Posit Connect

  • Centralized insights: Posit Connect hosts NMDP’s data science storefront, a centralized hub for all the team’s work, including the DRS and related reporting. Through the storefront, business users across all departments can quickly find the information they’re looking for, such as forecasts for organization-wide metrics.
  • Improved Collaboration & Accessibility: Sharing work is more seamless as data scientists can send a single link to colleagues or non-coding business users, eliminating the need for multiple file exchanges or email threads with multiple versions of data.
  • Automation & Reproducibility: Leveraging Posit Connect, NMDP automated daily updates for 42 million DRS scores, ensuring external partners and internal teams consistently have access to the latest data.

Posit Package Manager

The data science team at NMDP highlights Package Manager as the “unsung hero.” It enables NMDP to host internal packages and allows team members to easily download and use them as if they were public open-source projects. This facilitates seamless internal collaboration, tracks different versions of packages for reproducibility, and keeps proprietary data private.

“If we're scoring the entire registry, 42 million donors is a lot of donors. We can leverage our integrations with Posit Connect and Snowflake to use our Snowflake warehouse as a large source of compute and offload some of that heavy lifting. It’s very easily integrated with Posit Connect, which is really nice, because as data scientists, we're often writing R or Python and not just SQL all day. But this connection between Posit Connect and Snowflake enables us to make sure that we're showing the most accurate and up to date data at all times, no matter what the scale of the problem is.”

Adam Wang
Senior Data Scientist at NMDP

Integration of Posit & Snowflake

NMDP leverages the integration between Posit Connect and Snowflake to manage and process their vast amounts of data.

  • Scalable Compute: For tasks that require significant computational power, NMDP can offload this heavy lifting to their Snowflake warehouse via a connection with Posit Connect.
  • Data Flow: Scores generated through Posit Connect are then sent to Snowflake, which updates all other systems, ensuring consistent and up-to-date data across the organization.
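
The data flow above, in which freshly generated scores are written to the warehouse and read from there by other systems, might look roughly like the following Python sketch. It rests on loud assumptions: Python's built-in `sqlite3` stands in for the Snowflake warehouse so the example runs anywhere, and the `donor_scores` table, its columns, the batch size, and the sample rows are all hypothetical. It is not a description of NMDP's pipeline or of any Posit Connect or Snowflake API.

```python
import sqlite3
from itertools import islice


def chunks(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def refresh_scores(conn, scored_rows, batch_size=2):
    """Write (donor_id, score, scored_on) rows in batches, replacing older scores."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS donor_scores ("
        "donor_id TEXT PRIMARY KEY, score REAL NOT NULL, scored_on TEXT NOT NULL)"
    )
    for chunk in chunks(scored_rows, batch_size):
        # The primary key on donor_id makes each refresh overwrite the prior score.
        conn.executemany(
            "INSERT OR REPLACE INTO donor_scores (donor_id, score, scored_on) "
            "VALUES (?, ?, ?)",
            chunk,
        )
    conn.commit()


# In-memory SQLite database as a stand-in for the warehouse (assumption).
conn = sqlite3.connect(":memory:")

# First scheduled run: sample rows invented for this sketch.
refresh_scores(conn, [
    ("D1", 0.15, "2024-01-01"),
    ("D2", 0.75, "2024-01-01"),
    ("D3", 0.40, "2024-01-01"),
])

# Next run: one donor's score changes and replaces the earlier value.
refresh_scores(conn, [("D1", 0.22, "2024-01-02")])

# Downstream systems read the latest scores from the single shared table.
latest = conn.execute(
    "SELECT donor_id, score, scored_on FROM donor_scores ORDER BY score DESC"
).fetchall()
print(latest)
```

Keeping one row per donor, keyed on the donor identifier, is one simple way to ensure every consumer sees the same, most recent score after each scheduled run.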
