The most frequently asked SQL questions in real data engineering interviews at top tech companies, sorted by how often they appear.
SQL remains the most critical skill tested in data engineering interviews. These questions cover window functions (ROW_NUMBER, RANK, DENSE_RANK), complex joins, CTEs, recursive queries, query optimization, indexing strategies, and real-world data transformation patterns. Each question includes a detailed answer and the companies that have asked it.
Write an SQL query to find the second-highest salary from an employee table.
Demonstrate the difference between DENSE_RANK() and RANK().
Discuss differences between ROW_NUMBER(), RANK(), and DENSE_RANK(), and provide examples from your projects.
Explain the differences between a Data Lake and a Data Warehouse.
Explain the differences between Data Warehouse, Data Lake, and Delta Lake.
Explain the differences between Repartition and Coalesce. When would you use each?
What is the difference between partitioning and bucketing in Spark, and when would you use bucketing?
Can you explain the difference between OLTP and OLAP?
Describe a scenario where partitioning and bucketing would improve query performance.
Describe a time when you had to optimize a slow SQL query. What steps did you take?
Explain Fact and Dimension Tables with examples.
Explain the concept of ACID properties in the context of databases.
Explain the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN.
Explain the types of triggers in ADF, including schedule, tumbling window, and event-based triggers.
How do you handle NULL values in SQL? Mention functions like COALESCE and NULLIF.
How do you remove duplicate rows in BigQuery?
Joins and window functions - INNER, LEFT, RIGHT, FULL OUTER, ROW_NUMBER(), RANK(), DENSE_RANK()
What is a Common Table Expression (CTE), and when would you use it?
What is the difference between a primary key and a unique key?
What is the difference between WHERE and HAVING clauses in SQL?
When would you choose a Snowflake schema over a Star schema?
Detail examples of inner, outer, left, and right joins.
Difference between internal and external tables in BigQuery.
Difference between ROW_NUMBER(), RANK(), and DENSE_RANK() with examples.
Difference between WHERE and HAVING clauses, with examples.
Explain Common Table Expressions (CTEs) and their benefits.
Explain SQL Window Functions with examples.
Explain the difference between UNION and UNION ALL.
Explain the use of the MERGE statement in SQL.
How do you handle NULL values in SQL? Mention functions like COALESCE and ISNULL.
How do you optimize a long-running SQL query?
How would you handle duplicate records in an SQL table?
Implement a query to find the top 5 customers by total sales amount.
SQL query to find the second highest salary from each department.
Triggers in ADF, especially tumbling window triggers.
What are primary keys and foreign keys? Why are they important?
What is a CTE (Common Table Expression)? What are its uses?
What is a self-join, and when would you use it?
What is a window function? Explain with an example.
What is normalization and denormalization? When would you use each?
What is the difference between a clustered and non-clustered index?
What is the difference between a view and a materialized view?
What is the difference between DELETE and TRUNCATE?
What is the difference between OLTP and OLAP?
Write a query to find the top three highest-paid employees in each department using window functions.
Write a SQL query to find the top 3 earners in each department.
Write an SQL query to find duplicate emails in a users table.
Write complex SQL queries involving multiple joins, subqueries, and data aggregation logic.
Add a column to the Employees table that shows the name of the employee with the next higher employee_id.
Add a new column with manager names for each employee using a self-join.
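Several of the questions above ask how ROW_NUMBER(), RANK(), and DENSE_RANK() differ, and how to find the second-highest salary per department. The sketch below is one possible illustration, not a reference answer. It assumes a hypothetical employees table (name, department, salary) with made-up rows, and runs the SQL through Python's built-in sqlite3 module, which needs an underlying SQLite of version 3.25 or later for window functions.

```python
import sqlite3

# Hypothetical sample data: table name, columns, and rows are assumptions
# made for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ana", "eng", 300),
        ("Bo", "eng", 300),   # ties with Ana on salary
        ("Cy", "eng", 200),
        ("Di", "eng", 100),
        ("Ed", "ops", 150),
        ("Flo", "ops", 120),
    ],
)

# ROW_NUMBER() always counts 1, 2, 3, ... (ties broken here by name).
# RANK() gives ties the same rank and then skips ahead (1, 1, 3, 4).
# DENSE_RANK() gives ties the same rank without gaps (1, 1, 2, 3).
ranking_sql = """
SELECT name,
       ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC, name) AS row_num,
       RANK()       OVER (PARTITION BY department ORDER BY salary DESC)       AS rnk,
       DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC)       AS dense_rnk
FROM employees
WHERE department = 'eng'
ORDER BY row_num
"""
ranking = conn.execute(ranking_sql).fetchall()

# Second-highest distinct salary per department: DENSE_RANK() = 2 still
# works when several employees share the top salary.
second_highest_sql = """
SELECT DISTINCT department, salary
FROM (
    SELECT department, salary,
           DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rnk
    FROM employees
)
WHERE dense_rnk = 2
ORDER BY department
"""
second_highest = conn.execute(second_highest_sql).fetchall()

for row in ranking:
    print(row)
print(second_highest)
```

DENSE_RANK() is used for the second-highest salary because RANK() would assign no row the rank 2 when two employees tie for first place, as Ana and Bo do in this sample data.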