Explain the differences between Repartition and Coalesce. When would you use each?
SQL (medium)
2
Convert complex SQL (CTEs, window functions, subqueries) to production-grade PySpark. Discuss when to use spark.sql() vs. DataFrame API, and the implications for testability, partitioning, and execution predictability.
Spark/Big Data (medium)
3
How do you drop columns with null values in PySpark?
Spark/Big Data (medium)
4
Discuss Primary, Foreign, and Composite Keys.
General/Other (medium)
5
How do you optimize a join between a large table and a small table in Spark?
SQL (medium)