DataEngPrep.tech

Interview Questions

Real questions from top companies · medium

700+ Easy · 450+ Medium · 650+ Hard
1

Write a PySpark script to process data stored in Delta format and transform it into Parquet.

Spark/Big Data · medium
2

Write a PySpark script to read a CSV file, filter rows where the age column is less than 18, and write the result to a new CSV file.

Spark/Big Data · medium
3

Write a complete PySpark program from import statements to the stop statement, covering transformations and actions.

Spark/Big Data · medium
4

Write a transformation in PySpark to join and clean multiple raw input sources.

Spark/Big Data · medium
5

Write code to read data from Delta Lake in S3 and perform an upsert based on the primary key.

Spark/Big Data · medium
6

Write maintainable, efficient Pandas or PySpark code.

Spark/Big Data · medium
7

Your Kafka producer schema has changed, and the new data includes additional fields. How would you ensure backward compatibility using Schema Registry while consuming data from the same topic?

Spark/Big Data · medium
8

Z-Ordering: what are the use cases for partitioned Delta tables?

Spark/Big Data · medium
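
For question 7, the idea behind schema compatibility can be illustrated without any Kafka dependency. The sketch below is a simplified, Avro-style resolution rule written in plain Python: the consumer reads with its own (reader) schema, fills in declared defaults for fields the producer did not send, and ignores fields it does not know about. The schema contents, the field names, and the `resolve` helper are all hypothetical and stand in for what an Avro deserializer backed by a Schema Registry would do; this is not the Schema Registry client API.

```python
from typing import Any, Dict

# Hypothetical reader schema (version 2): the "email" field was added
# with a default, which is what lets old records still be read.
READER_SCHEMA_V2: Dict[str, Any] = {
    "name": "User",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "name", "type": "string"},
        {"name": "email", "type": ["null", "string"], "default": None},
    ],
}


def resolve(record: Dict[str, Any], reader_schema: Dict[str, Any]) -> Dict[str, Any]:
    """Project a decoded record onto the reader schema (simplified rule).

    - A field present in the record is copied as-is.
    - A field missing from the record takes the reader's default.
    - A field missing with no default is an incompatibility.
    - Fields the reader schema does not declare are dropped.
    """
    resolved: Dict[str, Any] = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            resolved[name] = record[name]
        elif "default" in field:
            resolved[name] = field["default"]
        else:
            raise ValueError(f"field {name!r} is missing and has no default")
    return resolved


if __name__ == "__main__":
    old_record = {"id": 1, "name": "Ada"}  # written before "email" existed
    new_record = {"id": 2, "name": "Bob", "email": "bob@example.com", "plan": "pro"}
    print(resolve(old_record, READER_SCHEMA_V2))
    print(resolve(new_record, READER_SCHEMA_V2))
```

In a real deployment the same rule is usually enforced up front: the subject's compatibility mode is set on the registry so that a new schema version adding a field without a default is rejected at registration time rather than failing in the consumer.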
