DataEngPrep.tech

Interview Questions

Real questions from top companies · medium

700+ Easy · 450+ Medium · 650+ Hard
41

What is a self-join, and when would you use it?

SQL · medium · join · 0.4 min read
Presidio · Swiggy
42

What is normalization and denormalization? When would you use each?

SQL · medium · etl · join · 0.4 min read
Presidio · Swiggy
43

What is the difference between a view and a materialized view?

SQL · medium · 0.4 min read
Presidio · Swiggy
44

Write an SQL query to find duplicate emails in a users table.

SQL · medium · partition · sql · window · 0.5 min read
Daniel Wellington · Goldman Sachs · Swiggy
45

Explain triggers in ADF (Azure Data Factory), especially tumbling window triggers.

SQL · medium · partition · window · 0.5 min read
Accenture · Yash Technologies
46

What is a window function? Explain with an example.

SQL · medium · join · partition · window · 0.5 min read
Citi · Freecharge
7

What is the difference between OLTP and OLAP?

SQL · medium
8

Write a SQL query to find top 3 earners in each department.

SQL · medium
9

Write a query to find the top three highest-paid employees in each department using window functions.

SQL · medium
10

Write complex SQL queries involving multiple joins, subqueries, and data aggregation logic.

SQL · medium
11

Convert complex SQL (CTEs, window functions, subqueries) to production-grade PySpark. Discuss when to use spark.sql() vs. DataFrame API, and the implications for testability, partitioning, and execution predictability.

Spark/Big Data · medium
12

Explain how Adaptive Query Execution changes the economics of Spark tuning. What problems does it solve at runtime, and when might you still need manual intervention (e.g., salting, broadcast hints)?

Spark/Big Data · medium
13

Architect incremental load in ADF + Databricks with idempotency, late-arrival handling, and cost/scalability implications of watermark vs. change data capture.

Spark/Big Data · medium
14

Explain strategies for managing schema changes in PySpark over time.

Spark/Big Data · medium
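
Two of the SQL questions listed above (finding duplicate emails in a users table, and the top three earners per department) lend themselves to a short worked sketch. The example below is only one possible approach, not the site's expert answer. It runs the SQL through Python's built-in `sqlite3` module against an in-memory database. The table layouts (`users.email`, `employees.name/department/salary`), the sample rows, and the helper function names are all assumptions made for illustration. The ranking query uses `DENSE_RANK()`, which keeps ties, so a department can return more than three rows; `ROW_NUMBER()` would be the alternative if exactly three rows are wanted.

```python
import sqlite3


def find_duplicate_emails(conn):
    """Return (email, occurrences) for every email that appears more than once."""
    query = """
        SELECT email, COUNT(*) AS occurrences
        FROM users
        GROUP BY email
        HAVING COUNT(*) > 1
        ORDER BY email
    """
    return conn.execute(query).fetchall()


def top_three_earners(conn):
    """Return (department, name, salary) for the top three salary ranks per department."""
    query = """
        SELECT department, name, salary
        FROM (
            SELECT department, name, salary,
                   DENSE_RANK() OVER (
                       PARTITION BY department
                       ORDER BY salary DESC
                   ) AS salary_rank
            FROM employees
        )
        WHERE salary_rank <= 3
        ORDER BY department, salary DESC, name
    """
    return conn.execute(query).fetchall()


def build_demo_db():
    """Create an in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [
            ("a@example.com",), ("b@example.com",), ("a@example.com",),
            ("c@example.com",), ("b@example.com",), ("a@example.com",),
        ],
    )
    conn.execute(
        "CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)"
    )
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?, ?)",
        [
            ("Ann", "Eng", 150), ("Bob", "Eng", 140), ("Cy", "Eng", 140),
            ("Dee", "Eng", 120), ("Eli", "Eng", 100),
            ("Fay", "Ops", 90), ("Gus", "Ops", 80),
        ],
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print(find_duplicate_emails(demo))
    print(top_three_earners(demo))
```

With this sample data, Bob and Cy tie on salary, so the Eng department yields four rows under `DENSE_RANK()`. Window functions require SQLite 3.25 or newer, which is an assumption about the local Python build.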

