Below are the notes taken while attending the Databricks training on Performance Tuning on Apache Spark by Charles Harding.
The most egregious problems are:
Spill, Skew, Shuffle, Storage, Serialization
Spill The writing of temp files to disk due to lack of memory
Skew An imbalance in the size of the partitions
Shuffle The act of moving data between executors
Storage A set of problems directly related to how the data is stored on disk
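To make the skew definition above concrete, here is a minimal sketch in plain Python (no Spark required) that measures partition imbalance. It assumes simple hash partitioning by key; Spark's own partitioner uses a different hash function, and the names partition_sizes and skew_ratio and the sample data are hypothetical, chosen only for illustration.

```python
from collections import Counter
from statistics import median


def partition_sizes(keys, num_partitions):
    """Count how many records land in each partition under hash partitioning."""
    counts = Counter(hash(k) % num_partitions for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def skew_ratio(sizes):
    """Largest partition divided by the median partition (1.0 means balanced)."""
    mid = median(sizes)
    return max(sizes) / mid if mid else float("inf")


# Hypothetical data: integer keys, evenly spread versus one "hot" key (7)
balanced = list(range(8000))
skewed = list(range(4000)) + [7] * 4000

print(partition_sizes(balanced, 8))              # eight partitions of 1000 records
print(skew_ratio(partition_sizes(skewed, 8)))    # 9.0
```

In the skewed case one partition holds 4500 records while the others hold 500 each, so the task that processes it runs far longer than the rest and is the one most likely to spill.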