Compare Glue partition discovery with Hive MSCK/ADD PARTITION. Explain the operational and cost implications of crawler-based vs. partition-projection approaches. When does partition projection become necessary, and what are its limitations?
SQL, medium
242
Explain how partitioning and bucketing in Hive/Spark optimize queries. What are the trade-offs among bucket count, partition cardinality, and the small-file problem? When does over-partitioning or over-bucketing become counterproductive?
SQL, medium
243
Explain how to implement a cumulative sum in SQL.
SQL, medium
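As a minimal sketch for the cumulative-sum question above: one common approach is a window function, `SUM(...) OVER (ORDER BY ...)`. The `sales` table, its columns, and the sample rows are hypothetical, and the query is run through Python's built-in `sqlite3` module only to keep the example self-contained (window functions need SQLite 3.25 or newer).

```python
import sqlite3

# In-memory database with a hypothetical sales table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2024-01-01", 10), ("2024-01-02", 20), ("2024-01-03", 5)],
)

# The explicit ROWS frame sums every row from the start up to the current one.
running_total_sql = """
SELECT day,
       amount,
       SUM(amount) OVER (
           ORDER BY day
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total
FROM sales
ORDER BY day
"""

rows = conn.execute(running_total_sql).fetchall()
for row in rows:
    print(row)
```

Stating the `ROWS` frame explicitly avoids the default `RANGE` frame, which would give tied `ORDER BY` values the same running total.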
244
Explain how you would implement partitioning and bucketing for data stored in S3 to improve query performance.
SQL, medium
245
Explain how you would use repartition or coalesce effectively to optimize processing when analyzing data only for a specific region.
SQL, medium
246
Explain indexing and its impact on database performance.
SQL, medium
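One way to illustrate the indexing question above is to compare a query plan before and after creating an index. This is a sketch under stated assumptions: the `users` table, the `idx_users_email` index name, and the data are hypothetical, and SQLite (via Python's `sqlite3`) stands in for whatever database is actually in use. The exact plan wording differs between engines and versions.

```python
import sqlite3

# Hypothetical users table with 1,000 rows of made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

query = "SELECT name FROM users WHERE email = 'user500@example.com'"


def plan(sql):
    """Return SQLite's query-plan description for sql as a single string."""
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(query)  # no index on email: full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(query)   # the planner can now search via the index

print("before:", plan_before)
print("after: ", plan_after)
```

The index turns a full scan into a keyed lookup for this predicate, at the cost of extra storage and extra work on every insert, update, or delete that touches the indexed column.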
247
Explain offset management, sync vs. async commits, partition assignment strategies, consumer groups, and backpressure handling in Kafka streams.
SQL, medium
248
Explain row_number, rank, and dense_rank with examples.
SQL, medium
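For the ranking question above, a small sketch may help show how the three functions differ on ties. The `scores` table and its rows are hypothetical, and the SQL is run through Python's built-in `sqlite3` module (SQLite 3.25 or newer is assumed for window functions). `ROW_NUMBER` is given an extra `name` tiebreaker here only so its output is deterministic.

```python
import sqlite3

# Hypothetical scores table; "a" and "b" tie on score.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 90), ("b", 90), ("c", 80), ("d", 70)],
)

ranking_sql = """
SELECT name,
       score,
       ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,
       RANK()       OVER (ORDER BY score DESC)       AS rnk,
       DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk
FROM scores
ORDER BY score DESC, name
"""

rows = conn.execute(ranking_sql).fetchall()
for row in rows:
    print(row)
# ('a', 90, 1, 1, 1)
# ('b', 90, 2, 1, 1)
# ('c', 80, 3, 3, 2)
# ('d', 70, 4, 4, 3)
```

`ROW_NUMBER` always produces distinct consecutive numbers, `RANK` gives ties the same value and then skips (1, 1, 3), and `DENSE_RANK` gives ties the same value without leaving a gap (1, 1, 2).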