Convert complex SQL (CTEs, window functions, subqueries) to production-grade PySpark. Discuss when to use spark.sql() versus the DataFrame API, and what each choice implies for testability, partitioning, and execution predictability.
Spark/Big Data (medium)
2
What size teams have you worked with, and how did you handle sprints during the project?
Behavioral (easy)
3
Why are you considering leaving your current company?
Behavioral (easy)
4
Given the input string "AAABBBCCCDDDAAA", compress it to the output "A3B3C3D3A3".
Python/Coding (easy)
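One possible answer is run-length encoding, sketched below in Python. The function name `compress` is hypothetical, and the sketch assumes a run of length one is still written with its count (for example "A1"), since the question does not say how single characters should be handled.

```python
from itertools import groupby


def compress(text: str) -> str:
    """Run-length encode text: each run of identical characters becomes <char><count>."""
    # groupby yields one group per run of consecutive equal characters,
    # so the two separate runs of "A" are encoded separately.
    return "".join(f"{char}{len(list(run))}" for char, run in groupby(text))


result = compress("AAABBBCCCDDDAAA")
print(result)  # A3B3C3D3A3
```

Because the grouping is over consecutive characters only, the trailing "AAA" is not merged with the leading "AAA", which matches the expected output in the question.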
5
Explain the differences between Redshift and Snowflake, and how you have used them in previous projects.
SQL (easy)
6
Compare the scalability, performance, and cost-efficiency of Redshift and Snowflake across different use cases.
SQL (easy)
7
Write a query to find the 5th highest salary in an employee table and calculate the number of employees whose salary is greater than that of their manager.
SQL (medium)
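A minimal sketch of both queries, run against an in-memory SQLite database from Python so it is self-contained. The table layout (`employee` with `id`, `name`, `salary`, `manager_id`) and the sample rows are assumptions, since the question does not give a schema. "5th highest" is interpreted here as the 5th highest distinct salary; other engines would more commonly use `DENSE_RANK()` for the same result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY, name TEXT, salary INTEGER, manager_id INTEGER)"
)
# Hypothetical sample data: manager_id points at another employee's id.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1, "Ava", 300, None),
        (2, "Ben", 250, 1),
        (3, "Cy", 320, 1),   # earns more than manager Ava
        (4, "Dee", 200, 2),
        (5, "Eli", 260, 2),  # earns more than manager Ben
        (6, "Fay", 150, 3),
        (7, "Gus", 200, 3),
    ],
)

# 5th highest distinct salary: sort distinct values descending, skip four rows.
FIFTH_HIGHEST_SQL = """
SELECT DISTINCT salary
FROM employee
ORDER BY salary DESC
LIMIT 1 OFFSET 4
"""

# Self-join each employee to their manager and count those who earn more.
OUTEARNERS_SQL = """
SELECT COUNT(*)
FROM employee AS e
JOIN employee AS m ON e.manager_id = m.id
WHERE e.salary > m.salary
"""

row = conn.execute(FIFTH_HIGHEST_SQL).fetchone()
fifth = row[0] if row else None  # None when fewer than five distinct salaries exist
outearners = conn.execute(OUTEARNERS_SQL).fetchone()[0]

print(fifth, outearners)  # 200 2
```

The inner join drops employees without a manager, so they are never counted in the second query.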
8
Explain how you handle performance optimization, task scheduling, and DAG monitoring in Airflow.
Spark/Big Data (hard)