Talks
Best Practices for Unit Testing PySpark
June 2024
Pandas on Spark: Simplicity of Pandas with Efficiency of Spark
June 2024
Best Features of Delta Lake: Love Your Open Tables
June 2024
Building Lakehouse on Delta Lake
September 2023
Why Delta Lake is the best storage format for pandas analyses
June 2023
5 Reasons Parquet Files Are Better Than CSV for Data Analyses
October 2021
Optimizing Delta / Parquet Data Lakes
October 2019
Optimizing Delta Parquet Data Lakes for Apache Spark
April 2019