DataEngPrep.tech

Interview Questions

Real questions from top companies

700+ Easy | 450+ Medium | 650+ Hard
1601

Spark Tungsten & Catalyst Optimizer

Spark/Big Data | hard | join, optimization, partition | 0.8 min read
Walmart
1602

Split a DataFrame such that even numbers appear in one column and odd numbers in another

Spark/Big Data | medium | partition, python, spark | 0.5 min read
KPMG
1603

Sqoop Incremental Import?

Spark/Big Data | easy | sql | 0.6 min read
Altimetrik
1604

Sqoop command for importing multiple tables

Spark/Big Data | easy | airflow, sql | 0.5 min read
Meesho
1605

Steps to link a Databricks notebook to an ADF pipeline

Spark/Big Data | hard | spark | 0.6 min read
Kaseya
1606

Steps to mount storage in Databricks.

Spark/Big Data | medium | spark, window | 0.5 min read
Chubb
1607

Suppose you have a DAG that ingests data from multiple databases. How would you increase task parallelism in Airflow to improve performance without overloading the system?

Spark/Big Data | easy | airflow, sql | 0.6 min read
Dunnhumby
1608

Suppose you need to import 5 tables from an external RDBMS (like MySQL) into Hadoop HDFS. Write the Sqoop command

Spark/Big Data | easy | airflow, sql | 0.6 min read
Meesho
1609

Task Dependencies in DAG

Spark/Big Data | easy | airflow | 0.5 min read
Verizon
1610

Trade-offs between batch processing (Spark) vs. real-time streams (Kafka)

Spark/Big Data | hard | partition, spark | 0.7 min read
PayPal
1611

Transformation vs. Action in PySpark?

Spark/Big Data | medium | join, partition, spark | 0.6 min read
Comcast
1612

Usage of UDFs?

Spark/Big Data | hard | optimization, python, sql | 0.6 min read
Citi
1613

Walk through how you would debug the data ingestion process to identify slow stages.

Spark/Big Data | hard | partition, spark | 0.6 min read
Swiggy
1614

Walk through Spark's architecture, focusing on the driver, executors, and DAGs

Spark/Big Data | hard | optimization, partition, spark | 2.5 min read
KPMG
1615

What Hadoop command would you use to merge multiple files into one?

Spark/Big Data | medium | partition, spark | 0.5 min read
Infosys
1616

What are Hadoop commands for Get and Merge?

Spark/Big Data | easy | spark | 0.4 min read
Altimetrik
1617

What are Spark Submit properties?

Spark/Big Data | medium | partition, python, spark | 0.4 min read
HCL
1618

What are Spark optimizations, and can you explain them?

Spark/Big Data | hard | join, optimization, partition | 0.6 min read
Cognizant
1619

What are the advantages of using Dataproc over a traditional Hadoop setup?

Spark/Big Data | easy | spark | 0.5 min read
Aarete
1620

What are the advantages of using Delta Lake over Parquet?

Spark/Big Data | easy | 0.5 min read
Puma

© 2026 DataEngPrep.tech. All rights reserved.