Real interview questions asked at KPMG. Practice the most frequently asked questions and land your next role.
KPMG data engineering interviews test your ability across multiple domains. These questions are sourced from real KPMG interview experiences and sorted by frequency, so you can practice the ones that matter most. This set leans toward senior-level depth (10 of 21 are tagged hard). The recurring themes are Spark, partitioning, and SQL; these appear most often in real interviews and reward the deepest preparation. Many of these questions also surface at Capco and Impetus, so the preparation transfers across companies. The average answer takes about 1 minute to read; allowing time to attempt each question yourself first, plan roughly 1 hour for the full set.
This collection contains 21 curated questions: 7 easy, 4 medium, and 10 hard. The distribution skews toward harder problems, reflecting the depth expected in senior-level interviews.
The most frequently tested areas in this set are Spark (13 questions), partitioning (10), SQL (7), Python (7), joins (5), and optimization (5). Focusing on these topics will give you the highest return on your preparation time.
Start with the easy questions to warm up and solidify fundamentals. Medium-difficulty questions form the bulk of real interviews — spend the most time here and practice explaining your reasoning out loud. Hard questions often appear in senior and staff-level rounds; attempt them after you're comfortable with the basics. For each question, try answering before revealing the solution. Use our AI Mock Interview to simulate real interview conditions and get instant feedback on your responses.
Demonstrate the difference between DENSE_RANK() and RANK()
Explain the differences between Data Warehouse, Data Lake, and Delta Lake
Joins and window functions - INNER, LEFT, RIGHT, FULL OUTER, ROW_NUMBER(), RANK(), DENSE_RANK()
If you already have an offer, why are you exploring other roles?
Introduce yourself, highlighting key projects and tech stacks
Why did you leave your previous job?
Are you willing to relocate to Bangalore?
Count occurrences of a specific word in a file
Discuss Logical Plan vs Physical Plan
Discuss the nature and volume of data you manage daily
Explain your day-to-day responsibilities as a Data Engineer
Match countries in a pairwise format
Find the minimum and maximum values in an array
Count occurrences of each character in a string
Alternatives to the Medallion Architecture
Compare ORC and Parquet
Create a DataFrame with default column types
Explain job execution in Spark: stages, tasks, Catalyst Optimizer
Justify the choice of your current tech stack. Why Spark, Hadoop, or cloud platforms?
Split a DataFrame such that even numbers appear in one column and odd numbers in another
Walk through Spark's architecture, focusing on driver, executors, and DAGs
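Three of the coding prompts above (RANK() versus DENSE_RANK(), counting characters in a string, and finding the minimum and maximum of an array) can be sketched in plain Python. This is a minimal illustration, not a model answer. The function names and sample data are assumptions made here, and the two ranking helpers only mimic the SQL semantics for a descending ORDER BY with no PARTITION BY.

```python
from collections import Counter


def rank(scores):
    """Mimic SQL RANK() over ORDER BY score DESC: ties share a rank, then a gap follows."""
    ordered = sorted(scores, reverse=True)
    # A value's rank is 1 plus the number of values strictly greater than it.
    return [(s, 1 + sum(1 for other in ordered if other > s)) for s in ordered]


def dense_rank(scores):
    """Mimic SQL DENSE_RANK() over ORDER BY score DESC: ties share a rank, no gap follows."""
    ordered = sorted(scores, reverse=True)
    # Number the distinct values 1, 2, 3, ... from highest to lowest.
    distinct = sorted(set(scores), reverse=True)
    position = {value: i + 1 for i, value in enumerate(distinct)}
    return [(s, position[s]) for s in ordered]


def char_counts(text):
    """Count occurrences of each character in a string."""
    return dict(Counter(text))


def min_max(values):
    """Find the minimum and maximum of a non-empty list in a single pass."""
    lo = hi = values[0]
    for v in values[1:]:
        if v < lo:
            lo = v
        elif v > hi:
            hi = v
    return lo, hi


scores = [90, 85, 85, 70]  # hypothetical sample data
print(rank(scores))        # [(90, 1), (85, 2), (85, 2), (70, 4)]
print(dense_rank(scores))  # [(90, 1), (85, 2), (85, 2), (70, 3)]
print(char_counts("hello"))
print(min_max([3, 1, 4, 1, 5]))  # (1, 5)
```

The point to state out loud for the first pair is that RANK() leaves a gap after ties (1, 2, 2, 4) while DENSE_RANK() does not (1, 2, 2, 3).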