Optimize Spark with DISTRIBUTE BY & CLUSTER BY
The DISTRIBUTE BY and CLUSTER BY clauses are really cool features in Spark SQL. Unfortunately, they remain relatively unknown to most users, and this post aims to change that.
By Witold Jędrzejewski