DataEngPrep.tech

Interview Questions

Real questions from top companies

700+ Easy · 450+ Medium · 650+ Hard
21

What strategies can you use to handle skewed data in Spark?

Spark/Big Data · medium · join, partition, spark · 0.5 min read
BCG, Bitwise, Citi, HashedIn
22

Briefly introduce yourself and walk us through your journey as a Data Engineer so far.

Behavioral · hard · etl, join, partition · 0.5 min read
Accenture, EPAM, Yash Technologies
23

Can you explain the difference between OLTP and OLAP?

SQL · medium · bigquery, snowflake, sql · 0.4 min read
Accenture, Cognizant, EPAM, Yash Technologies
24

Describe a time when you had to optimize a slow SQL query. What steps did you take?

SQL · medium · join, sql · 0.5 min read
Aarete, Accenture, Fossil Group, Yash Technologies
25

Explain the concept of ACID properties in the context of databases.

SQL · easy · 0.4 min read
Accenture, Cognizant, EPAM, Yash Technologies
26

Explain the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN.

SQL · medium · join · 0.5 min read
Accenture, Cognizant, EPAM, Yash Technologies
27

How do you handle NULL values in SQL? Mention functions like COALESCE and NULLIF.

SQL · medium · join, sql · 0.4 min read
Accenture, Cognizant, EPAM, Yash Technologies
28

What is a Common Table Expression (CTE), and when would you use it?

SQL · hard · bigquery, optimization, snowflake · 0.4 min read
Accenture, Cognizant, EPAM, Yash Technologies
29

What is the difference between a primary key and a unique key?

SQL · hard · spark, sql · 0.4 min read
Accenture, Cognizant, EPAM, Yash Technologies
30

What is the difference between WHERE and HAVING clauses in SQL?

SQL · medium · sql · 0.3 min read
Accenture, Cognizant, EPAM, Yash Technologies
31

Write a Python function to check if a string is a palindrome.

Python/Coding · medium · join, python · 0.4 min read
Capco, HashedIn, LTIMindtree
32

Describe a scenario where partitioning and bucketing would improve query performance.

SQL · medium · join, partition · 0.7 min read
Daniel Wellington, Goldman Sachs, Swiggy
33

Explain Fact and Dimension Tables with examples.

SQL · hard · join · 0.6 min read
Datametica, Deloitte, Incedo
34

Explain the types of triggers in ADF, including schedule, tumbling window, and event-based triggers.

SQL · medium · partition, window · 0.5 min read
FedEx Dataworks, Nihilent, Virtusa
35

How do you remove duplicate rows in BigQuery?

SQL · medium · bigquery, partition · 0.6 min read
EY, Incedo, Tech Mahindra
36

Joins and window functions - INNER, LEFT, RIGHT, FULL OUTER, ROW_NUMBER(), RANK(), DENSE_RANK()

SQL · hard · join, partition, window · 0.7 min read
Ford, KPMG, Nihilent
37

When would you choose a Snowflake schema over a Star schema?

SQL · medium · join, snowflake · 0.6 min read
Goldman Sachs, Microsoft, ZS Associates
38

Can you explain the architecture of Apache Spark and its components?

Spark/Big Data · hard · join, optimization, partition · 3.2 min read
Coforge, Freecharge, Nihilent
39

Describe the difference between Spark RDDs, DataFrames, and Datasets.

Spark/Big Data · hard · optimization, partition, spark · 0.5 min read
Accenture, Fragma Data Systems
40

Explain the difference between Spark's map() and flatMap() transformations.

Spark/Big Data · medium · partition, spark · 0.4 min read
Delivery Hero, Dunnhumby, Fragma Data Systems