Open Source Projects
PySpark
Scala Spark
Style guides
Ruby / Rails
Blogs
Talks
Building Lakehouse on Delta Lake
September 2023
Why Delta Lake is the best storage format for pandas analyses
June 2023
5 Reasons Parquet Files Are Better Than CSV for Data Analyses
October 2021
Optimizing Delta / Parquet Data Lakes
October 2019
Optimizing Delta Parquet Data Lakes for Apache Spark
April 2019