Teaching data to tens of thousands of high school students around the world

Introduction to Data Science (IDS) is an interactive high school statistics and probability curriculum created by UCLA and the Los Angeles Unified School District (LAUSD) with a grant from the National Science Foundation. We spoke with Suyen Machado, Director, UCLA K-12 Data Science Education, and Monica Casillas, Associate Director of Professional Development, UCLA K-12 Data Science Education, about how the program has scaled to support hundreds of high schools and tens of thousands of students from all around the world in their pursuit of data literacy.

The Introduction to Data Science (IDS) Project

UCLA and the Los Angeles Unified School District received a National Science Foundation grant in 2010 to create more innovative teaching and learning experiences through technology. This led to the creation of the Introduction to Data Science (IDS) project, which provides high school teachers with dynamic data education resources that help position high school students for college readiness and career success. 

The IDS curriculum today is used by 151 high schools across 74 school districts in the United States, Australia, and Kuwait, reaching more than 42,000 students worldwide since 2014. 

But success wasn’t immediate. 

The original intent was to integrate data skills into existing algebra, biology, and computer science courses. However, teachers found it challenging to fit new material into their already packed curricula, so the initiative turned to the emerging Common Core Standards, which increased the call for statistics education in high school. An entirely new course was written to the statistics and probability standards, with a focus on technology-based learning that could support modern data work.

“The ultimate goal was to bring 21st-century data science teaching and learning to K-12 education,” says Suyen Machado, Director, UCLA K-12 Data Science Education.

Designing a new approach to data education

IDS demystifies data for students by making them active participants in data collection and analysis. Students use mobile devices to collect practical information about their lives and community. They track what they eat, how they spend their time, and how they feel. The result is data that is both understandable and meaningful for the students.

Individual data is then aggregated so that students can benchmark themselves against both their classmates and, through comparison with public datasets, the general population. Later in the course, students design their own data projects, deciding as a class which topics to explore further and which questions to ask to fully understand the problem.

This hands-on approach is known as participatory sensing and was championed early on in the project by Dr. Deborah Estrin, a professor of computer science at UCLA who is now at Cornell Tech.

The resulting data is then explored and analyzed with the R programming language in RStudio within Posit Cloud, bringing real-world data science tooling to the classroom.

Helping teachers succeed

Many IDS teachers don’t have significant prior programming experience. Thankfully, IDS provides formal professional development for new teachers to help them get started. 

In the first year, teachers receive nine days of training that focuses primarily on specific R code and learning content from the IDS curriculum, as well as an introduction to the data collection app that students use to generate new data. During the second year, teachers are able to dive more deeply into the tools and build comfort and confidence customizing the material. 

“The value of coding with R quickly becomes apparent to most teachers. They see what a powerful tool that RStudio is compared with using a TI-83 calculator, for example,” says Monica Casillas, Associate Director of Professional Development. “They love it and they run with it.”

Teachers also receive ongoing support from IDS and are themselves becoming a vital cohort of data science educators with opportunities for even greater engagement from the community.

Using Posit Cloud to reach more schools and students

Where are students going to analyze all this data?

This was an early question in the IDS journey. “We needed to have something that was widely used in industry. So we said R. And then we said, okay, so how are the students going to work in R? We started to look at a lot of different options, and we decided to use RStudio,” recalls Suyen.

For the first several years, IDS was used primarily by LAUSD, and the district maintained its own server edition of RStudio. This proved challenging to maintain and became a barrier to other school districts adopting the IDS curriculum.

Moving to Posit Cloud solved the infrastructure challenges, and the number of participating high schools skyrocketed as the need for schools to build and maintain their own data systems disappeared.

Teachers log in through their web browser, create spaces for specific classes, invite students, and share projects and learning materials that make the curriculum come alive. Posit Cloud provides access to both RStudio for coding in R and Jupyter Notebooks for coding in Python.

Every teacher can customize their environment and, with easy controls to add more memory and computing power, students are able to run more advanced analyses and build confidence with professional data science tools.

Suyen credits the platform with helping IDS continue to scale its program nationally and internationally.

Are you a teacher or administrator interested in joining the IDS community? Learn more at https://www.datascienceeducationcenter.org/

"Thanks to Posit Cloud we can sleep at night because we're not worried about servers breaking even as the number of students keeps growing."

Suyen Machado
Director, UCLA K-12 Data Science Education
