Explain the differences between repartition() and coalesce() in Spark. When would you use each?
SQL, medium
2
Explain Fact and Dimension Tables with examples.
SQL, hard
3
Convert complex SQL (CTEs, window functions, subqueries) to production-grade PySpark. Discuss when to use spark.sql() versus the DataFrame API, and the implications of each for testability, partitioning, and execution predictability.
Spark/Big Data, medium
4
How do you drop columns with null values in PySpark?
Spark/Big Data, medium
5
Discuss Primary, Foreign, and Composite Keys.
General/Other, medium
6
How do you optimize a join between a large table and a small table in Spark?
SQL, medium
7
Discuss common transformations used in Spark code.
Spark/Big Data, hard
8
Explain two Delta table features: Z-ordering and Time Travel.
Spark/Big Data, hard