Open Source Projects

PySpark

quinn
chispa
mack
ceja
beavis
farsante
unicron
eren

Scala Spark

Style guides

Ruby / Rails

Blogs

Talks

Building Lakehouse on Delta Lake

September 2023

Why Delta Lake is the best storage format for pandas analyses

June 2023

5 Reasons Parquet Files Are Better Than CSV for Data Analyses

October 2021

Optimizing Delta / Parquet Data Lakes

October 2019

Optimizing Delta Parquet Data Lakes for Apache Spark

April 2019