Interview questions
Preparing for a data engineering interview at Dunnhumby? This page contains 48 real interview questions sourced from verified Dunnhumby interview experiences. Questions are sorted by frequency — the ones asked most often appear first.
Dunnhumby data engineering interviews typically focus on Spark/Big Data, SQL, and General/Other. The interview bar skews toward harder problems (17 hard vs. 14 easy), suggesting emphasis on depth and system-level thinking.
Use the difficulty filters above to focus your preparation. For each question, attempt your own answer first, then compare with our expert solution. You can also practice these questions in our AI Mock Interview Coach for real-time feedback.
What is the difference between repartition and coalesce in Apache Spark?
What is the difference between narrow and wide transformations in Apache Spark? Explain with examples.
Explain the difference between Spark's map() and flatMap() transformations.
How does Spark's Catalyst Optimizer work? Explain its stages.
What is the difference between Managed and External tables in Hive/Spark?
Explain the concept of Broadcast Join in Spark. When should it be used?
Explain the difference between shallow copy and deep copy in Python.
Write a Python function to find the first non-repeating character in a string.
Have you worked on Data Warehousing projects?
What is the difference between OLTP and OLAP?
What is the difference between SQL and NoSQL databases?
Explain Common Table Expressions (CTEs) and their benefits.
Explain SQL Window Functions with examples.
Explain the use of the MERGE statement in SQL.
How do you handle NULL values in SQL? Mention functions like COALESCE and ISNULL.
How do you optimize a long-running SQL query?
How would you handle duplicate records in an SQL table?
Unlock all 1,800+ expert answers, AI mock interviews, resume analyzer, SQL playground, and personalized progress tracking.