DataEngPrep.tech

Interview Questions

Real questions from top companies · medium

700+ Easy · 450+ Medium · 650+ Hard
1

What is the difference between repartition and coalesce in Apache Spark?

Spark/Big Data · medium · partition, python, spark · 1 min read
BCG, Citi, Dunnhumby, Fragma Data Systems, +3
2

Write an SQL query to find the second-highest salary from an employee table.

SQL · medium · partition, sql, window · 0.8 min read
Accenture, BCG, Cognizant, Incedo, +2
3

What is the difference between cache() and persist() in Spark? When would you use each?

Spark/Big Data · medium · partition, spark · 0.7 min read
Accenture, Coforge, Freecharge, Impetus, +1
4

What is the difference between groupByKey and reduceByKey in Spark?

Spark/Big Data · medium · partition, spark · 0.8 min read
Accenture, Capco, Coforge, Nagarro, +1
5

What is the difference between narrow and wide transformations in Apache Spark? Explain with examples.

Spark/Big Data · medium · join, partition, python · 0.9 min read
Coforge, Delivery Hero, Dunnhumby, Fragma Data Systems, +1
6

Demonstrate the difference between DENSE_RANK() and RANK()

SQL · medium · partition, window · 0.5 min read
Capco, Impetus, KPMG, Wipro
7

Discuss differences between ROW_NUMBER(), RANK(), and DENSE_RANK(), and provide examples from your projects.

SQL · medium · window · 0.5 min read
Aarete, Accenture, Fossil Group, Yash Technologies
8

Explain the differences between Data Warehouse, Data Lake, and Delta Lake

SQL · medium · bigquery, partition, snowflake · 0.5 min read
Fractal, KPMG, Matrix, Meesho
9

Explain the differences between Repartition and Coalesce. When would you use each?

SQL · medium · join, partition · 0.5 min read
Datametica, FedEx Dataworks, Nihilent, Presidio
10

What is the difference between partitioning and bucketing in Spark, and when would you use bucketing?

SQL · medium · join, partition, spark · 0.5 min read
Citi, Coforge, HCL, LTIMindtree
11

What strategies can you use to handle skewed data in Spark?

Spark/Big Data · medium · join, partition, spark · 0.5 min read
BCG, Bitwise, Citi, HashedIn
12

Can you explain the difference between OLTP and OLAP?

SQL · medium · bigquery, snowflake, sql · 0.4 min read
Accenture, Cognizant, EPAM, Yash Technologies
13

Describe a time when you had to optimize a slow SQL query. What steps did you take?

SQL · medium · join, sql · 0.5 min read
Aarete, Accenture, Fossil Group, Yash Technologies
14

Explain the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN.

SQL · medium · join · 0.5 min read
Accenture, Cognizant, EPAM, Yash Technologies
15

How do you handle NULL values in SQL? Mention functions like COALESCE and NULLIF.

SQL · medium · join, sql · 0.4 min read
Accenture, Cognizant, EPAM, Yash Technologies
16

What is the difference between WHERE and HAVING clauses in SQL?

SQL · medium · sql · 0.3 min read
Accenture, Cognizant, EPAM, Yash Technologies
17

Write a Python function to check if a string is a palindrome.

Python/Coding · medium · join, python · 0.4 min read
Capco, HashedIn, LTIMindtree
18

Describe a scenario where partitioning and bucketing would improve query performance.

SQL · medium · join, partition · 0.7 min read
Daniel Wellington, Goldman Sachs, Swiggy
19

Explain the types of triggers in ADF, including schedule, tumbling window, and event-based triggers.

SQL · medium · partition, window · 0.5 min read
FedEx Dataworks, Nihilent, Virtusa
20

How do you remove duplicate rows in BigQuery?

SQL · medium · bigquery, partition · 0.6 min read
EY, Incedo, Tech Mahindra