SQL questions from Dunnhumby data engineering interviews.
These SQL questions are sourced from Dunnhumby data engineering interviews. Each includes an expert-level answer.
Explain Common Table Expressions (CTEs) and their benefits.
Explain SQL Window Functions with examples.
Explain the use of the MERGE statement in SQL.
How do you handle NULL values in SQL? Mention functions like COALESCE and ISNULL.
How do you optimize a long-running SQL query?
How would you handle duplicate records in an SQL table?
Explain how you would use repartition or coalesce in Spark to optimize processing when you only need to analyze data for a specific region.
How can you delete partitions from a table in Hive using a command?
If manual partitions are created in a Hive data-warehouse table directory, and you query records from those partitions, will you see the data? If not, how can this be fixed?
What is the difference between static and dynamic partitioning in Hive?
Write a SQL query to find the distinct IDs in a table that appear more than once and whose ID value is greater than 200.
You need to create a workflow where Task B runs only if Task A is successful, and Task C should always run regardless of Task A or B's status. How would you define this dependency using Airflow?
You need to design a Kafka topic for a logging service. How would you decide the number of partitions and the key for partitioning to balance throughput and ordering requirements?
Your Kafka consumer shows significant lag during peak hours. What strategies would you employ to reduce lag and ensure timely data processing?
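As an illustration of the CTE, window function, and duplicate-handling questions above, here is a minimal sketch of one common deduplication pattern: rank the rows for each key with ROW_NUMBER() inside a CTE and keep only the first. It is one possible approach, not a reference answer. The customers table, its columns, and the sample rows are hypothetical and invented for this example. The SQL runs through Python's built-in sqlite3 module, which assumes the bundled SQLite is version 3.25 or newer (the first release with window functions).

```python
import sqlite3

# Hypothetical table and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT, updated_at TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "a@example.com", "2024-01-01"),
        (1, "a@example.com", "2024-02-01"),  # duplicate id, newer row
        (2, "b@example.com", "2024-01-15"),
        (3, "c@example.com", "2024-01-20"),
        (3, "c@example.com", "2024-01-10"),  # duplicate id, older row
    ],
)

# The CTE names the intermediate result; ROW_NUMBER() numbers the rows
# within each id so that rn = 1 is the most recently updated row.
DEDUP_SQL = """
WITH ranked AS (
    SELECT id, email, updated_at,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) AS rn
    FROM customers
)
SELECT id, email, updated_at
FROM ranked
WHERE rn = 1
ORDER BY id
"""

deduped = conn.execute(DEDUP_SQL).fetchall()
for row in deduped:
    print(row)
```

Ordering by updated_at DESC is an assumption about which duplicate should survive; a different tie-breaking column would change which row gets rn = 1.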