Design a star schema for retail analytics (e.g., Adidas). Explain the dimensional modeling choices, SCD strategy, and how you would scale this schema for global multi-currency, multi-region deployments. What are the refresh and storage cost implications?
SQL | hard
2
Explain how partitioning and bucketing in Hive/Spark optimize queries. What are the trade-offs among bucket count, partition cardinality, and the small-file problem? When does over-partitioning or over-bucketing become counterproductive?
SQL | medium
3
Explain how you would implement partitioning and bucketing for data stored in S3 to improve query performance.
SQL | medium
4
Explain how you would optimize Redshift query performance for a reporting system with large fact tables.
SQL | medium
5
Explain indexing and its impact on database performance.
SQL | medium
6
Explain normalization and its disadvantages.
SQL | medium
7
Explain the Medallion Architecture (Bronze, Silver, Gold).
SQL | hard
8
Explain the difference between a clustered and non-clustered index.
SQL | medium