Real questions from top companies
Tell me about yourself and your experience.
What is the difference between repartition and coalesce in Apache Spark?
What is the difference between SparkSession and SparkContext in Spark?
Write an SQL query to find the second-highest salary from an employee table.
What are traits in Scala, and how are they different from classes?
What is the difference between cache() and persist() in Spark? When would you use each?
What is the difference between groupByKey and reduceByKey in Spark?
What is the difference between narrow and wide transformations in Apache Spark? Explain with examples.
What architecture are you following in your current project, and why?
Tell me about your family background.
What are your salary expectations for this role?
Where do you see yourself in your career five years from now?
What are Airflow Operators? Give examples.
Explain approaches for real-time Change Data Capture (CDC) during a migration.
Demonstrate the difference between DENSE_RANK() and RANK().
Discuss differences between ROW_NUMBER(), RANK(), and DENSE_RANK(), and provide examples from your projects.
Explain the differences between Data Warehouse, Data Lake, and Delta Lake
Explain the differences between repartition and coalesce. When would you use each?
Explain the differences between a Data Lake and a Data Warehouse.
What is the difference between partitioning and bucketing in Spark, and when would you use bucketing?
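
For practising the SQL questions in this list (second-highest salary, and ROW_NUMBER() vs RANK() vs DENSE_RANK()), here is a minimal sketch rather than a model answer. It assumes a hypothetical `employee` table with made-up names and salaries, and it runs the queries through Python's built-in `sqlite3` module, which needs SQLite 3.25 or newer for window functions. Other databases may need small syntax changes.

```python
import sqlite3

# Hypothetical data: the table name, columns, and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Asha", 300), ("Bilal", 300), ("Chen", 200), ("Dana", 100)],
)

# Compare the three ranking functions on the same ordering.
# ROW_NUMBER() gets a tie-breaker (name) so its output is deterministic.
ranked = conn.execute(
    """
    SELECT name, salary,
           ROW_NUMBER() OVER (ORDER BY salary DESC, name) AS row_num,
           RANK()       OVER (ORDER BY salary DESC)       AS rnk,
           DENSE_RANK() OVER (ORDER BY salary DESC)       AS dense_rnk
    FROM employee
    ORDER BY salary DESC, name
    """
).fetchall()

# One common way to get the second-highest distinct salary:
# the largest salary that is strictly below the overall maximum.
second_highest = conn.execute(
    """
    SELECT MAX(salary)
    FROM employee
    WHERE salary < (SELECT MAX(salary) FROM employee)
    """
).fetchone()[0]

for row in ranked:
    print(row)
print("second highest salary:", second_highest)
```

With the two tied top salaries in this sample, RANK() skips from 1 to 3 while DENSE_RANK() continues with 2, and ROW_NUMBER() numbers every row uniquely. The subquery approach returns NULL (None in Python) when there is no second distinct salary, which is usually the behaviour interviewers ask about as a follow-up.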