Interview questions · medium
Explain the differences between repartition() and coalesce() in Spark. When would you use each?
Convert complex SQL (CTEs, window functions, subqueries) to production-grade PySpark. Discuss when to use spark.sql() vs. the DataFrame API, and the implications for testability, partitioning, and execution predictability.
How do you drop columns with null values in PySpark?
Discuss Primary, Foreign, and Composite Keys.
How do you optimize a join between a large table and a small table in Spark?