Interview questions · medium
Describe a scenario where partitioning and bucketing would improve query performance.
Explain the difference between ROW_NUMBER(), RANK(), and DENSE_RANK(), with examples.
Explain the difference between the WHERE and HAVING clauses, with examples.
Implement a query to find the top 5 customers by total sales amount.
What are primary keys and foreign keys? Why are they important?
What is a self-join, and when would you use it?
What is normalization and denormalization? When would you use each?
What is the difference between a view and a materialized view?
Write a SQL query to find duplicate emails in a users table.
Describe a time when you went above and beyond for a project or a customer.
Give an example of a time you failed and what you learned from it.
Calculate a 7-day moving average of orders for each city in the Swiggy database.
Describe a scenario where you had to optimize a slow-running data pipeline.
Compare the star schema and snowflake schema. Which one would you use for reporting at Swiggy, and why?
How do you handle NULL values in a SQL query to avoid incorrect results?
Optimize a slow SQL query for a large orders table containing billions of rows.
What are Slowly Changing Dimensions (SCD), and how would you implement them for tracking customer data changes?
Write a SQL query to find the top 5 most ordered dishes in the last 30 days.
Write a query to identify duplicate customer entries based on email and phone number.
Describe how you would use PySpark to aggregate and summarize large transaction datasets.