Data engineering interview questions
Explain normalization and its disadvantages.
Explain normalization in databases and its importance. Write an SQL query to handle SCD-1 or SCD-3.
Explain offset management, sync vs. async commits, partition assignment strategies, consumer groups, and backpressure handling in Kafka streams.
Explain peer code review and team lead review.
Explain row_number, rank, and dense_rank with examples.
Explain the Medallion Architecture (Bronze, Silver, Gold).
Explain the difference between a clustered and non-clustered index.
Explain the difference between a fact table and a dimension table.
Explain the difference between a primary key and a unique key.
Explain the concept of window functions in SQL and provide an example.
Explain the difference between Star and Snowflake schemas. When would you choose one over the other?
Explain the relationship between partition count and query performance in Spark.
Explain the differences between OLTP and OLAP databases and their relevance in Adidas's operations.
Explain the differences between Redshift and Snowflake, and how you've used them in previous projects.
Explain the differences between table re-creation and ALTER TABLE operations.
Explain the order in which SQL clauses are executed.
Explain the process you would follow for optimizing a database query that is running slow.
Explain the purpose of windowing and triggering in streaming data pipelines.
Explain the scalability, performance, and cost-efficiency of both Redshift and Snowflake in different use cases.
Explain the use of Amazon Athena for serverless querying.
SQL is the most tested topic in data engineering interviews. Most companies dedicate an entire round to SQL, typically asking 3-5 questions covering window functions, CTEs, joins, optimization, and platform-specific features.
Focus on: window functions (RANK, ROW_NUMBER, LAG/LEAD), CTEs and recursive queries, query optimization and execution plans, indexing strategies, and platform-specific features for BigQuery, Redshift, or Snowflake depending on the company.
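To make the first two focus areas concrete, here is a minimal sketch of a CTE combined with ranking and offset window functions. It runs the SQL through Python's built-in sqlite3 module (assuming a bundled SQLite of 3.25 or later, which is when window functions were added); the `salaries` table and its rows are hypothetical sample data, not taken from any real interview question.

```python
import sqlite3

# Hypothetical sample data: (name, dept, salary). Ana and Ben tie on salary.
rows = [
    ("Ana", "eng", 120),
    ("Ben", "eng", 120),
    ("Cy", "eng", 100),
    ("Dee", "ops", 90),
    ("Eli", "ops", 80),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO salaries VALUES (?, ?, ?)", rows)

query = """
WITH ranked AS (
    SELECT
        name, dept, salary,
        -- name is added as a tie-breaker so row numbering is deterministic
        ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC, name) AS row_num,
        -- RANK leaves a gap after ties; DENSE_RANK does not
        RANK()       OVER (PARTITION BY dept ORDER BY salary DESC) AS rnk,
        DENSE_RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS dense_rnk,
        -- LAG reads the previous row's salary within the same partition
        LAG(salary)  OVER (PARTITION BY dept ORDER BY salary DESC, name) AS prev_salary
    FROM salaries
)
SELECT name, dept, salary, row_num, rnk, dense_rnk, prev_salary
FROM ranked
ORDER BY dept, row_num
"""

results = conn.execute(query).fetchall()
for row in results:
    print(row)
```

In the `eng` partition the two tied salaries share rank 1 under both RANK and DENSE_RANK, while the next row gets rank 3 from RANK and rank 2 from DENSE_RANK; ROW_NUMBER always counts 1, 2, 3. Being able to explain that difference is what most ranking questions are probing.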
Data engineering SQL rounds differ from software engineering SQL rounds. They emphasize analytical queries (window functions, aggregations), large-scale optimization (partitioning, indexing), and data warehouse concepts (star schema, slowly changing dimensions). Software engineering SQL tends to focus on CRUD operations and basic joins.
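As one illustration of the slowly changing dimension concept mentioned above, the sketch below shows a Type 1 update, where an incoming value simply overwrites the existing one and no history is kept. It uses SQLite's `INSERT ... ON CONFLICT DO UPDATE` upsert through Python's sqlite3 module (assuming SQLite 3.24 or later); warehouses such as Snowflake or BigQuery would more typically use a `MERGE` statement for the same job. The `dim_customer` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical customer dimension keyed by a business key.
conn.execute(
    "CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, city TEXT)"
)
conn.execute("INSERT INTO dim_customer VALUES (1, 'Berlin'), (2, 'Paris')")

# Incoming batch: customer 2 moved, customer 3 is new.
incoming = [(2, "Lyon"), (3, "Madrid")]

# SCD Type 1: overwrite the attribute in place on a key match,
# insert a new row otherwise. The old value ('Paris') is lost.
conn.executemany(
    """
    INSERT INTO dim_customer (customer_id, city) VALUES (?, ?)
    ON CONFLICT(customer_id) DO UPDATE SET city = excluded.city
    """,
    incoming,
)

dim_rows = conn.execute(
    "SELECT customer_id, city FROM dim_customer ORDER BY customer_id"
).fetchall()
print(dim_rows)
```

A Type 3 variant would instead add a column such as `previous_city` (a hypothetical name) and copy the old value into it before overwriting, which keeps exactly one prior version per attribute.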
For a mid-level data engineering role, plan 2-4 weeks of focused SQL practice. Cover window functions, CTEs, optimization, and practice writing queries under time pressure. Use real interview questions from companies you're targeting.