Write a PySpark script to read a CSV file, filter rows where the age column is less than 18, and write the result to a new CSV file.
Spark/Big Data · medium
2
Write a complete PySpark program from import statements to the stop statement, covering transformations and actions.
Spark/Big Data · medium
3
Write a transformation in PySpark to join and clean multiple raw input sources.
Spark/Big Data · medium
4
Write code to read data from a Delta Lake table in S3 and perform an upsert based on the primary key.
Spark/Big Data · medium
5
Write maintainable, efficient Pandas or PySpark code.
Spark/Big Data · medium
6
Your Kafka producer schema has changed, and the new data includes additional fields. How would you ensure backward compatibility using Schema Registry while consuming data from the same topic?
Spark/Big Data · medium
7
Z-Ordering: use cases for partitioned Delta tables
Spark/Big Data · medium
8
How do you ensure the scalability of a data pipeline handling rapidly growing data volumes?
System Design/Architecture · medium