DataEngPrep.tech

Interview Questions

Real questions from top companies

700+ Easy | 450+ Medium | 650+ Hard
1301

Write a query to get the latest rule_id and rule_status.

SQL | medium | partition, snowflake, sql | 0.3 min read
Gartner
1302

Write a query to get the names of all employees who are managers with five or more direct reports.

SQL | medium | join | 0.3 min read
Chryselys
1303

Write a query to identify duplicate customer entries based on email and phone number.

SQL | medium | join | 0.4 min read
Swiggy
1304

Write a query to identify unique user sessions.

SQL | medium | etl, partition | 0.2 min read
Wayfair
1305

Write a query to remove duplicate records from a table while retaining the earliest entry.

SQL | medium | partition | 0.3 min read
BCG
1306

Write a query to retain only the latest record and delete others in case of duplicates.

SQL | medium | partition | 0.3 min read
Fossil Group
1307

Write a query to select the latest record based on a time_of_insertion column.

SQL | medium | partition, snowflake, sql | 0.3 min read
Fossil Group
1308

Write a query to switch values in the Gender column (M to F and F to M).

SQL | easy | 0.3 min read
Fossil Group
1309

Write a self join query to get the manager's name for each employee.

SQL | medium | join | 0.2 min read
Gartner
1310

Write an SQL query to find the top 3 performing products in each category.

SQL | medium | partition, sql, window | 0.3 min read
Kagina
1311

Write code to find the third-highest salary in a dataset using Pandas.

SQL | medium | spark | 0.2 min read
Chryselys
1312

Write optimized SQL queries involving window functions, CTEs, and joins.

SQL | medium | join, partition, sql | 0.3 min read
Apple
1313

Write queries combining Joins and Group By operations.

SQL | medium | join | 0.3 min read
Expedia
1314

You need to create a workflow where Task B runs only if Task A is successful, and Task C should always run regardless of Task A or B's status. How would you define this dependency using Airflow?

SQL | hard | airflow | 0.3 min read
Dunnhumby
1315

You need to design a Kafka topic for a logging service. How would you decide the number of partitions and the key for partitioning to balance throughput and ordering requirements?

SQL | hard | join, optimization, partition | 3.6 min read
Dunnhumby
1316

Your Kafka consumer shows significant lag during peak hours. What strategies would you employ to reduce lag and ensure timely data processing?

SQL | medium | partition | 0.4 min read
Dunnhumby
1317

map() vs mapPartitions(): Highlight the difference between map (row-level transformation) and mapPartitions (partition-level transformation).

SQL | medium | partition | 0.3 min read
Capgemini
1318

repartition() vs coalesce(): Explain when to use repartition() (full shuffle; can increase or decrease the number of partitions) vs coalesce() (reduces partitions while avoiding a full shuffle).

SQL | medium | partition | 0.3 min read
Capgemini
1319

A JSON file with evolving schema needs to be ingested into a DataFrame. How would you handle new fields dynamically in PySpark without breaking the job for previous structures?

Spark/Big Data | easy | spark | 0.3 min read
Dunnhumby
1320

A data pipeline processes files for different clients stored in separate directories. Explain how you would use dynamic DAG creation to handle client-specific workflows in Airflow.

Spark/Big Data | hard | airflow | 0.3 min read
Dunnhumby

© 2026 DataEngPrep.tech. All rights reserved.