Real questions from top companies · medium
When would you choose a Snowflake schema over a Star schema?
Explain the difference between Spark's map() and flatMap() transformations.
Explain the concept of a broadcast join in Spark. When should it be used?
Tell me about a time when you faced a challenging situation at work and how you handled it.
What challenges did you face, and how did you tackle them?
What would you do if a pipeline failed and you couldn't find the reason?
Why do you want to join this company?
What is the role of AWS Lambda in a data engineering pipeline?
How would you read data from a web API? What steps would you follow after reading the data?
What is the difference between SQL and NoSQL databases?
Give detailed examples of inner, left, right, and full outer joins.
What is the difference between ROW_NUMBER(), RANK(), and DENSE_RANK()? Give examples.
What is the difference between the WHERE and HAVING clauses? Give examples.
Explain SQL Window Functions with examples.
Explain the use of the MERGE statement in SQL.
How do you handle NULL values in SQL? Mention functions like COALESCE and ISNULL.
How would you handle duplicate records in an SQL table?
Implement a query to find the top 5 customers by total sales amount.
Write a SQL query to find the second-highest salary in each department.
What are primary keys and foreign keys? Why are they important?