Design a Delta table layout for a mixed workload: point lookups by user_id, range scans by date, and full partition scans. Compare partitioning vs. Z-ordering: when to use each, and the rewrite-cost trade-off.
Spark/Big Data · Hard
2
What are the pros and cons of using a data lake on AWS, GCP, or Azure?
Cloud/Tools · Hard
3
How would you model customer transaction data for both analytical and operational use cases?
General/Other · Hard
4
What are the key design principles for a cloud-based data warehouse?
SQL · Hard
5
What considerations are important when designing a dimensional model for a ridesharing app?
SQL · Hard
6
Compare Hadoop and Spark. Which one would you choose for a real-time application, and why?
Spark/Big Data · Hard
7
Explain how HDFS (Hadoop Distributed File System) stores data across nodes.
Spark/Big Data · Hard
8
Explain how to schedule an automated task using Apache Airflow.
Spark/Big Data · Hard