Data engineering interview questions · medium
Find average salary for each manager – Assume a table with manager_id and employee_salary
Find each student's previous year's scores using the LAG function.
Find employees who earn the third-highest salary.
Find the 2nd highest salary for each department using the DENSE_RANK() function.
Find the error in the provided SQL query.
Find the second-highest salary in the employees table using three different methods.
Find the third-highest salary for each department.
Finding nth Highest Salary within each department
Given a complex nested query, how would you refactor it for better readability and efficiency?
Given a table of sales data, use window functions to calculate a running total.
Given a table with 10 records and another with 4 records, how many records result from a cross join?
Given a table with sales data, write a query to find consecutive days with decreasing revenue.
Given an unoptimized query execution plan, how would you diagnose and improve performance?
Given the data below, explain the results of different types of joins: Inner Join, Left Join, Right Join. Will a schema be created?
Given two dataframes (df1: id, name and df2: id, country, address, city, count), join them, filter for rows where country = 'Singapore', and pivot the output. Sort cities in descending order of population count
Given two tables, calculate the row count for different types of joins (inner, left, right, and full outer)
Handle nulls, duplicates, and inconsistent timestamp formats in data.
Hierarchical Employee-Manager Query - navigate hierarchy with head of company as employee_id = 1
How can you delete partitions from a table in Hive using a command?
How do partitioning strategies differ between source and sink?
Type or paste your answer to any of these questions and our AI Coach scores it, highlights gaps, and rewrites it at FAANG quality. Free to try.
SQL is the most tested topic in data engineering interviews. Most companies dedicate an entire round to SQL, typically asking 3-5 questions covering window functions, CTEs, joins, optimization, and platform-specific features.
Focus on: window functions (RANK, ROW_NUMBER, LAG/LEAD), CTEs and recursive queries, query optimization and execution plans, indexing strategies, and platform-specific features for BigQuery, Redshift, or Snowflake depending on the company.
Yes. Data engineering SQL rounds emphasize analytical queries (window functions, aggregations), large-scale optimization (partitioning, indexing), and data warehouse concepts (star schema, slowly changing dimensions). Software engineering SQL tends to focus on CRUD operations and basic joins.
For a mid-level data engineering role, plan 2-4 weeks of focused SQL practice. Cover window functions, CTEs, optimization, and practice writing queries under time pressure. Use real interview questions from companies you're targeting.