R has extremely rich and diverse modeling capabilities. However, its many modeling packages expose a variety of interfaces and differing syntactical conventions, and using them alongside Hadley’s tidy data conventions can be difficult. This talk will discuss the process of making modeling code more modular and programming-friendly. Using the `caret` package as an example, it will lay out a broad roadmap for the transition to more focused packages that use tidy ideas. The concept of writing a modeling _specification_ that can be used with different compute engines will also be discussed.
Max Kuhn is a software engineer at Posit (née RStudio). He works on improving R's modeling capabilities and maintains about 30 packages, including `caret`. He was a Senior Director of Nonclinical Statistics at Pfizer and had been applying models in the pharmaceutical and diagnostic industries for over 18 years. Max has a Ph.D. in Biostatistics. He and Kjell Johnson wrote the book Applied Predictive Modeling, which won the Ziegel award from the American Statistical Association. Their second book, Feature Engineering and Selection, was published in 2019, and his book Tidy Modeling with R was published in 2022.