DataEngPrep.tech

Interview Questions

Real questions from top companies

700+ Easy | 450+ Medium | 650+ Hard

This is the most comprehensive collection of real data engineering interview questions available online: 1,863 questions sourced from actual interviews at Amazon, Google, Databricks, Snowflake, Meta, Microsoft, and 90+ other companies. Every question includes a detailed expert answer, approach guidance, and a difficulty rating.

Questions span seven core categories: SQL, Spark & Big Data, Python, System Design & Architecture, Cloud & Tools, General & Behavioral, and Data Modeling. Use the filters above to focus on a specific topic or difficulty level, or browse the full collection sorted by how frequently each question appears in real interviews.

For targeted practice, try our AI Mock Interview Coach, which simulates a live interview panel and evaluates your answers against FAANG hiring standards.

1. Tell me about yourself and your experience.

Category: Behavioral | Difficulty: hard | Tags: join, partition | 0.7 min read
Companies: Altimetrik, Chryselys, Fossil Group, Globant, +5 more
2. What is the difference between repartition and coalesce in Apache Spark?

Category: Spark/Big Data | Difficulty: medium | Tags: partition, python, spark | 1 min read
Companies: BCG, Citi, Dunnhumby, Fragma Data Systems, +3 more
3. What is the difference between SparkSession and SparkContext in Spark?

Category: Spark/Big Data | Difficulty: hard | Tags: optimization, python, spark | 0.7 min read
Companies: Altimetrik, American Express, Citi, Hexaware, +3 more
4. Write an SQL query to find the second-highest salary from an employee table.

Category: SQL | Difficulty: medium | Tags: partition, sql, window | 0.8 min read
Companies: Accenture, BCG, Cognizant, Incedo, +2 more
5. What are traits in Scala, and how are they different from classes?

Category: Python/Coding | Difficulty: easy | Tags: spark | 0.8 min read
Companies: Altimetrik, Capgemini, Coforge, Infosys, +1 more
6. What is the difference between cache() and persist() in Spark? When would you use each?

Category: Spark/Big Data | Difficulty: medium | Tags: partition, spark | 0.7 min read
Companies: Accenture, Coforge, Freecharge, Impetus, +1 more
7. What is the difference between groupByKey and reduceByKey in Spark?

Category: Spark/Big Data | Difficulty: medium | Tags: partition, spark | 0.8 min read
Companies: Accenture, Capco, Coforge, Nagarro, +1 more
8. What is the difference between narrow and wide transformations in Apache Spark? Explain with examples.

Category: Spark/Big Data | Difficulty: medium | Tags: join, partition, python | 0.9 min read
Companies: Coforge, Delivery Hero, Dunnhumby, Fragma Data Systems, +1 more
9. What architecture are you following in your current project, and why?

Category: System Design/Architecture | Difficulty: hard | Tags: airflow, etl, join | 3.5 min read
Companies: Cognizant, HCL, Nagarro, Thoughtworks, +1 more
10. Tell me about your family background.

Category: Behavioral | Difficulty: easy | 0.7 min read
Companies: Fossil Group, Meesho
11. What are your salary expectations for this role?

Category: Behavioral | Difficulty: easy | 0.6 min read
Companies: EPAM, Fragma Data Systems, Thoughtworks, Wipro
12. Where do you see yourself in your career five years from now?

Category: Behavioral | Difficulty: easy | 0.6 min read
Companies: EPAM, Fossil Group, Puma, Wipro
13. What are Airflow Operators? Give examples.

Category: Cloud/Tools | Difficulty: easy | Tags: airflow, python, sql | 0.5 min read
Companies: Altimetrik, EY, Fossil Group, Tech Mahindra
14. CDC during migration: explain approaches for real-time Change Data Capture (CDC).

Category: System Design/Architecture | Difficulty: easy | 0.5 min read
Companies: Moonfare, Snowflake
15. Demonstrate the difference between DENSE_RANK() and RANK().

Category: SQL | Difficulty: medium | Tags: partition, window | 0.5 min read
Companies: Capco, Impetus, KPMG, Wipro
16. Discuss differences between ROW_NUMBER(), RANK(), and DENSE_RANK(), and provide examples from your projects.

Category: SQL | Difficulty: medium | Tags: window | 0.5 min read
Companies: Aarete, Accenture, Fossil Group, Yash Technologies
17. Explain the differences between Data Warehouse, Data Lake, and Delta Lake.

Category: SQL | Difficulty: medium | Tags: bigquery, partition, snowflake | 0.5 min read
Companies: Fractal, KPMG, Matrix, Meesho
18. Explain the differences between Repartition and Coalesce. When would you use each?

Category: SQL | Difficulty: medium | Tags: join, partition | 0.5 min read
Companies: Datametica, FedEx Dataworks, Nihilent, Presidio
19. Explain the differences between a Data Lake and a Data Warehouse.

Category: SQL | Difficulty: easy | Tags: lakehouse, snowflake, sql | 0.5 min read
Companies: Chryselys, FedEx Dataworks, Lumiq, NAB
20. What is the difference between partitioning and bucketing in Spark, and when would you use bucketing?

Category: SQL | Difficulty: medium | Tags: join, partition, spark | 0.5 min read
Companies: Citi, Coforge, HCL, LTIMindtree
© 2026 DataEngPrep.tech. All rights reserved.