The Stanford Blood Center collects and distributes blood products to Stanford Hospital. One of these is platelets, a vital clot-forming blood component with a shelf life of only a few days. Previous work (Guan et al., 2017) formulated an optimization problem, using features aggregated from the available data, to reduce platelet waste. An R package was created for a three-day ordering strategy, but it has not been put into production because users lack trust in the model's accuracy. In summer 2019, the Stanford Data Science for Social Good team decided to use additional patient-level data and models to predict platelet consumption rather than relying solely on aggregated data. Modeling transfusion recipients as distinct subpopulations allows for finer-grained predictions at the patient level. We make extensive use of R packages, such as the Tidyverse and R Shiny, to conduct exploratory data analysis, build models, and create an intuitive dashboard. The Shiny dashboard displays consumption predictions aggregated across all models, consumption predictions for each subpopulation, and the historical performance of the model, and so serves as a valuable tool for building the trust needed to adopt the algorithmic ordering strategies.

Reference: Guan, L., Tian, X., et al. (2017). "Big data modeling to predict platelet usage and minimize wastage in a tertiary care system." PNAS 114(43): 11368–11373. Retrieved from: www.pnas.org/cgi/doi/10.1073/pnas.1714097114
I am a fourth-year statistics Ph.D. student at Stanford University, advised by Professor Emmanuel Candes. Prior to the Ph.D. program, I was an undergraduate student in physics at Fudan University and a master's student in statistics at the University of Chicago. During my master's program I participated in several statistics consulting projects, and I remain interested in applying statistics to real-world problems and answering scientific questions. I was a Stanford Data Science for Social Good fellow in the summer of 2019. My current research interest is high-dimensional statistics, in particular estimation and hypothesis testing in high-dimensional generalized linear models.