Spark & Big Data questions from Accenture data engineering interviews.
These Spark & Big Data questions are sourced from Accenture data engineering interviews. Each includes an expert-level answer. This set leans toward the medium-difficulty band where most real interviews sit (5 of 8). Recurring themes are partitioning, Spark, and optimization; these patterns appear most often in real interviews and reward the deepest preparation. Many of these questions also surface at Yash Technologies and Coforge, so the preparation transfers across companies. The average answer takes around 1 minute to read, but plan roughly 1 hour to work through the full set thoughtfully.
This collection contains 8 curated questions: 0 easy, 5 medium, and 3 hard. With no easy questions and most of the set at medium difficulty, the distribution reflects the depth expected in senior-level interviews.
The most frequently tested areas in this set are partition (8), spark (8), optimization (3), join (1), sql (1), and python (1). Focusing on these topics will give you the highest return on your preparation time.
Medium-difficulty questions form the bulk of real interviews, so spend the most time here and practice explaining your reasoning out loud. Hard questions often appear in senior and staff-level rounds; attempt them after you're comfortable with the basics. For each question, try answering before revealing the solution.
What is the difference between cache() and persist() in Spark? When would you use each?
What is the difference between groupByKey and reduceByKey in Spark?
Describe the difference between Spark RDDs, DataFrames, and Datasets.
Explain strategies for managing schema changes in PySpark over time.
How do you handle data skewness in Spark?
What is the difference between Spark RDDs, DataFrames, and Datasets?
What is the difference between repartition and coalesce in Spark?
How do you manage schema changes in PySpark when processing data over time?
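As a warm-up for the groupByKey versus reduceByKey question above, here is a minimal pure-Python sketch of why combining values inside each partition before the shuffle reduces the number of records that move. It is a toy model, not Spark itself: the function names, the two-partition layout, and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical input: two "partitions" of (key, value) records.
partitions = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("a", 4), ("b", 5), ("b", 6)],
]


def group_by_key_model(parts):
    """Toy model of groupByKey: every record crosses the shuffle unchanged."""
    shuffled = [record for part in parts for record in part]
    grouped = defaultdict(list)
    for key, value in shuffled:
        grouped[key].append(value)
    # Aggregation happens only after all values have been shuffled.
    totals = {key: sum(values) for key, values in grouped.items()}
    return totals, len(shuffled)


def reduce_by_key_model(parts):
    """Toy model of reduceByKey: sum within each partition, then shuffle."""
    shuffled = []
    for part in parts:
        local = defaultdict(int)
        for key, value in part:
            local[key] += value  # partition-local (map-side) combine
        shuffled.extend(local.items())
    # Merge the partial sums that arrived from each partition.
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value
    return dict(totals), len(shuffled)


group_totals, group_shuffled = group_by_key_model(partitions)
reduce_totals, reduce_shuffled = reduce_by_key_model(partitions)

print("groupByKey model: ", group_totals, "records shuffled:", group_shuffled)
print("reduceByKey model:", reduce_totals, "records shuffled:", reduce_shuffled)
```

Both models produce the same per-key sums, but the second sends at most one record per key per partition across the shuffle. In this toy data that is 4 records instead of 6; the gap grows as keys repeat more often within a partition.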