Interview questions
Preparing for a data engineering interview at Citi? This page contains 39 real interview questions sourced from verified Citi interview experiences. Questions are sorted by frequency, with the ones asked most often appearing first.
Citi data engineering interviews typically focus on Spark/Big Data, General/Other, and SQL. The questions mix fundamental and advanced topics, so they are useful for candidates at multiple experience levels.
For each question, attempt your own answer first, then compare it with a reference solution.
What is the difference between repartition and coalesce in Apache Spark?
What is the difference between SparkSession and SparkContext in Spark?
What is the difference between partitioning and bucketing in Spark, and when would you use bucketing?
What strategies can you use to handle skewed data in Spark?
What is the difference between Managed and External tables in Hive/Spark?
What is a window function? Explain with an example.
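For the window function question above, a small worked example can help when practicing an answer. The sketch below is an illustration only, not an answer taken from this page: the `emp` table, its columns, the sample rows, and the `rank_salaries` helper are all hypothetical, and it uses Python's built-in `sqlite3` module (assuming the bundled SQLite is version 3.25 or newer, which is when window functions were added). It ranks employees by salary within each department using `RANK() OVER (PARTITION BY ... ORDER BY ...)`, keeping every input row rather than collapsing rows as `GROUP BY` would.

```python
import sqlite3


def rank_salaries(rows):
    """Rank employees by salary within their department.

    `rows` is a list of (name, dept, salary) tuples. The table and
    column names are hypothetical and chosen only for this example.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", rows)

    # RANK() is evaluated per partition (department). Unlike GROUP BY,
    # the window function returns one output row per input row.
    query = """
        SELECT name, dept, salary,
               RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS dept_rank
        FROM emp
        ORDER BY dept, dept_rank, name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Hypothetical sample data: two departments, one salary tie in "risk".
rows = [
    ("Asha", "risk", 120),
    ("Ben", "risk", 100),
    ("Chen", "risk", 120),
    ("Dee", "ops", 90),
    ("Eli", "ops", 95),
]
ranked = rank_salaries(rows)
for row in ranked:
    print(row)
```

In this data, the two employees tied at the top of "risk" both receive rank 1 and the next employee receives rank 3, because `RANK()` leaves gaps after ties; `DENSE_RANK()` would assign 2 instead. The same `PARTITION BY` / `ORDER BY` structure carries over to Spark SQL and Hive.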