Write a PySpark job that calculates the number of unique users who logged in per day, but exclude any logins from inactive users listed in a separate file.
Spark/Big Data (medium)
2
Write a complete PySpark program from import statements to the stop statement, covering transformations and actions.
Spark/Big Data (medium)
3
Write a transformation in PySpark to join and clean multiple raw input sources.
Spark/Big Data (medium)
4
Write maintainable, efficient Pandas or PySpark code.
Spark/Big Data (medium)
5
Explain Z-Ordering and its use cases for partitioned Delta tables.
Spark/Big Data (medium)
6
Describe how data is ingested, transformed, and served in a data pipeline.
System Design/Architecture (hard)
7
Design a data pipeline for streaming analytics.
System Design/Architecture (hard)
8
Design an end-to-end data pipeline: describe how data would be ingested, processed, stored, and queried.
System Design/Architecture (hard)