Interview questions · medium
What is the difference between repartition and coalesce in Apache Spark?
Write an SQL query to find the second-highest salary from an employee table.
What strategies can you use to handle skewed data in Spark?
Describe strategies for optimizing a slow-running query on a massive dataset.
Explain the difference between star and snowflake schemas. When would you choose one over the other?
Explain the use of surrogate keys vs. natural keys in data modeling.
Given an unoptimized query execution plan, how would you diagnose and improve performance?
How would you ensure even load distribution across Kafka partitions in a high-volume system?
Write a query to remove duplicate records from a table while retaining the earliest entry.
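As a rough illustration of the two query-writing questions above (second-highest salary and duplicate removal), here is a minimal sketch that runs against an in-memory SQLite database. It is one possible approach, not a model answer. The column names (id, name, salary, email, created_at), the events table, and the sample rows are all assumptions made up for the example; only the employee table name comes from the question. Other engines may need different syntax for the delete.

```python
import sqlite3

# In-memory database with hypothetical schemas and sample rows (assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
INSERT INTO employee (name, salary) VALUES
  ('Ana', 120), ('Ben', 150), ('Cy', 150), ('Dee', 90);

CREATE TABLE events (id INTEGER PRIMARY KEY, email TEXT, created_at TEXT);
INSERT INTO events (email, created_at) VALUES
  ('a@example.com', '2024-01-02'),
  ('a@example.com', '2024-01-01'),
  ('b@example.com', '2024-01-03');
""")

# Second-highest distinct salary: the largest value strictly below the maximum.
# Ties at the top (two people earning 150) do not count as "second".
second_highest = conn.execute("""
    SELECT MAX(salary)
    FROM employee
    WHERE salary < (SELECT MAX(salary) FROM employee)
""").fetchone()[0]

# Remove duplicates by email, keeping the earliest entry: delete any row for
# which an earlier row with the same email exists (id breaks timestamp ties).
conn.execute("""
    DELETE FROM events
    WHERE EXISTS (
        SELECT 1
        FROM events AS e2
        WHERE e2.email = events.email
          AND (e2.created_at < events.created_at
               OR (e2.created_at = events.created_at AND e2.id < events.id))
    )
""")

remaining = conn.execute(
    "SELECT email, created_at FROM events ORDER BY email"
).fetchall()

print(second_highest)
print(remaining)
```

The salary query returns NULL when there is no second distinct value, which is usually the behavior interviewers ask about as a follow-up. On engines with window functions, DENSE_RANK for the salary and ROW_NUMBER for the duplicates are common alternatives.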