DataEngPrep.tech

Interview Questions

Real questions from top companies

1701

Write a PySpark script to check for missing values and duplicate rows in a DataFrame. How would you ensure data quality before saving it to a storage system?

Spark/Big Data · hard · partition, spark · 0.9 min read
Dunnhumby
1702

Write a PySpark script to filter out invalid records from a dataset and calculate the average for a specific column, ensuring the schema is strictly defined at runtime.

Spark/Big Data · medium · partition, spark · 0.7 min read
Bristol Myers Squibb
1703

Write a PySpark script to process data stored in Delta format and transform it into Parquet.

Spark/Big Data · medium · partition, spark · 0.7 min read
Tredence
1704

Write a PySpark script to read a CSV file, filter rows where the age column is less than 18, and write the result to a new CSV file.

Spark/Big Data · medium · partition, spark · 0.6 min read
Freight Tiger
1705

Write a Spark job to count word occurrences from an S3 dataset.

Spark/Big Data · hard · optimization, partition, spark · 0.6 min read
Daniel Wellington
1706

Write a complete PySpark program from import statements to the stop statement, covering transformations and actions.

Spark/Big Data · medium · join, partition, python · 0.6 min read
Carelon
1707

Write a transformation in PySpark to join and clean multiple raw input sources.

Spark/Big Data · medium · join, partition, python · 0.7 min read
Netflix
1708

Write code to read data from Delta Lake in S3 and perform an upsert based on the primary key.

Spark/Big Data · medium · partition, spark · 0.6 min read
Walmart
1709

Write maintainable, efficient Pandas or PySpark code.

Spark/Big Data · medium · join, partition, python · 0.6 min read
Apple
1710

Write the Spark command to rename an existing column in a DataFrame.

Spark/Big Data · easy · spark · 0.5 min read
Dunnhumby
1711

Writing Excel sheets to Delta tables in Databricks

Spark/Big Data · easy · spark · 0.5 min read
Nihilent
1712

You are given 10 worker machines with 100 GB RAM and 25 CPU cores. How would you determine the number of executors and the size of each executor?

Spark/Big Data · easy · spark · 0.7 min read
Meesho
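
The executor-sizing question above comes down to a short piece of arithmetic. Below is a minimal sketch of one common rule-of-thumb approach, under explicit assumptions: the 100 GB and 25 cores are per machine (the question does not say), one core and 1 GB per node are reserved for the OS and daemons, each executor gets about 5 cores, roughly 10% of each container goes to memory overhead, and one executor slot is left for the driver or application master. The function name `size_executors` and its parameters are hypothetical and not part of any Spark API.

```python
def size_executors(
    nodes,
    cores_per_node,
    mem_gb_per_node,
    cores_per_executor=5,    # assumption: common rule of thumb
    reserved_cores=1,        # assumption: left for OS and node daemons
    reserved_mem_gb=1,       # assumption: left for OS and node daemons
    overhead_fraction=0.10,  # assumption: off-heap overhead per executor
):
    """Estimate executor count and size for a cluster of identical workers."""
    usable_cores = cores_per_node - reserved_cores
    executors_per_node = usable_cores // cores_per_executor

    usable_mem_gb = mem_gb_per_node - reserved_mem_gb
    container_mem_gb = usable_mem_gb / executors_per_node

    # Heap plus overhead must fit in the container, so divide out the overhead.
    executor_memory_gb = int(container_mem_gb / (1 + overhead_fraction))

    # Leave one slot across the cluster for the driver / application master.
    num_executors = nodes * executors_per_node - 1

    return {
        "num_executors": num_executors,
        "executor_cores": cores_per_executor,
        "executor_memory_gb": executor_memory_gb,
    }


result = size_executors(nodes=10, cores_per_node=25, mem_gb_per_node=100)
print(result)
```

Under these assumptions the numbers correspond roughly to `--num-executors 39 --executor-cores 5 --executor-memory 22g` on `spark-submit`. Four cores per node are left unused with this split, so a different cores-per-executor value may suit the workload better.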
1713

Your Kafka producer schema has changed, and the new data includes additional fields. How would you ensure backward compatibility using Schema Registry while consuming data from the same topic?

Spark/Big Data · medium · partition · 0.6 min read
Dunnhumby
1714

Z-Ordering - use cases for partitioned Delta tables

Spark/Big Data · medium · join, partition · 0.7 min read
Myntra
1715

Architect a solution to handle notifications for millions of users with varying preferences.

System Design/Architecture · hard · optimization, partition, spark · 4.1 min read
Disney+ Hotstar
1716

Build a banking system architecture from scratch, highlighting critical workflows, scalability, and data management strategies.

System Design/Architecture · hard · optimization, partition, spark · 4.1 min read
Expedia
1717

Business Role of Data Pipeline

System Design/Architecture · hard · bigquery, optimization, partition · 4 min read
Verizon
1718

CAP Theorem

System Design/Architecture · hard · optimization, partition, spark · 4 min read
ZS Associates
1719

CI/CD implementation across environments (DEV, QA, UAT, PreProd, PROD)

System Design/Architecture · hard · optimization, partition, spark · 4.1 min read
Zen Data Shastra
1720

Can Schema Evolution lead to data inconsistencies? If so, how do you manage them?

System Design/Architecture · hard · optimization, partition, spark · 4.1 min read
PWC

© 2026 DataEngPrep.tech. All rights reserved.