Write a SQL query to find top 3 earners in each department.
SQLmedium
3
Write a query to find the top three highest-paid employees in each department using window functions.
SQLmedium
4
Write complex SQL queries involving multiple joins, subqueries, and data aggregation logic.
SQLmedium
5
Architecturally, how would you justify or challenge Hadoop vs. a cloud-native data lake (S3 + EMR/Databricks) for a greenfield enterprise data platform? Discuss scalability ceilings, cost model trade-offs, and operational complexity.
Spark/Big Datahard
6
When would you architecturally choose Dataset[T] over DataFrame in a Scala Spark pipeline, and what are the scalability and portability trade-offs? Include type-safety benefits vs. operational constraints.
Spark/Big Dataeasy
7
Convert complex SQL (CTEs, window functions, subqueries) to production-grade PySpark. Discuss when to use spark.sql() vs. DataFrame API, and the implications for testability, partitioning, and execution predictability.
Spark/Big Datamedium
8
Design a cost-aware resource strategy for a Databricks workload with spiky and batch jobs. Explain Dynamic Resource Allocation, when to disable it, and how min/max executors and spot instances affect cost and SLAs.
Spark/Big Datahard
+20 More Questions with Expert Answers
Get the complete 1,800+ question library with detailed, expert-level answers covering SQL, Spark, System Design, Python, Cloud, and Behavioral topics.