Bigger Data With Ease Using Apache Arrow

The Apache Arrow project enables data scientists using R, Python, and other languages to work with large datasets efficiently and at interactive speeds. Arrow is so fast at some workflows that it seems to defy reality, or at least the limits of R's capabilities. This talk examines the characteristics of the Arrow project that let it redefine what is possible in R. It also highlights some of the latest developments in the arrow R package, including how to query and manipulate multi-file datasets, and presents strategies for speeding up workflows by up to 100x.

Neal Richardson

Currently Director of Engineering at Ursa Labs / RStudio. Previously led product and engineering at Crunch.io. Ph.D. in Political Science from the University of California, Berkeley.