DataEngPrep.tech


Interview Questions

Real questions from top companies · medium

700+ Easy · 450+ Medium · 650+ Hard


Describe strategies for optimizing a slow-running query on a massive dataset.

SQL · medium · partition · 0.4 min read

Discuss strategies for handling schema evolution in data warehouses.

SQL · medium · window · 0.3 min read

Disney+ Hotstar

Duplicate characters in a string (e.g., '123a!' to '112233aa!!').

SQL · medium · join · python · spark · 0.3 min read
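
One way to read the example in this question ('123a!' becomes '112233aa!!') is that every character is repeated once, in its original order. A minimal Python sketch under that assumption follows; the function name `duplicate_chars` is hypothetical, and the question does not say whether a SQL or Spark answer is expected instead.

```python
def duplicate_chars(text: str) -> str:
    """Return text with each character repeated twice, order preserved.

    Assumption: "duplicate" means doubling every character, as the
    question's example suggests.
    """
    return "".join(ch * 2 for ch in text)


if __name__ == "__main__":
    # The example from the question.
    print(duplicate_chars("123a!"))
```

The generator expression keeps the work to a single pass over the string and a single join, which avoids repeated string concatenation.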

ER Modeling vs. Dimensional Modeling?

SQL · medium · join · snowflake · 0.4 min read

Explain CTE vs Temp Table. What are the differences and use cases?

SQL · medium · join · 0.5 min read

Explain Data Modeling SCD Types (Type 1, 2, 3).

SQL · medium · join · 0.5 min read

Explain Dynamic Partition Pruning error and how to fix it.

SQL · medium · join · partition · spark · 0.5 min read

Explain Fact Table and Star Schema.

SQL · medium · join · partition · 0.4 min read

Explain Redshift Data Distribution (EVEN, KEY, ALL).

SQL · medium · join · 0.4 min read

Explain Union vs Union All in SQL.

SQL · medium · join · sql · 0.4 min read

Implement a recursive query for hierarchy (employee-manager). Explain the termination guarantees, depth limits, and when a recursive CTE becomes a scalability bottleneck. What alternatives exist for graph-scale hierarchies in Spark or a data lake?

SQL · medium · join · spark · 0.6 min read

American Express

Compare Glue partition discovery with Hive MSCK/ADD PARTITION. Explain the operational and cost implications of crawler-based vs. partition-projection approaches. When does partition projection become necessary, and what are its limitations?

SQL · medium · partition · 0.5 min read

Explain how partitioning and bucketing in Hive/Spark optimize queries. What are the trade-offs among bucket count, partition cardinality, and the small-file problem? When does over-partitioning or over-bucketing become counterproductive?

SQL · medium · join · partition · spark · 0.6 min read

Explain how to implement cumulative sum in SQL.

SQL · medium · partition · spark · sql · 0.3 min read

Explain how you would implement partitioning and bucketing for data stored in S3 to improve query performance.

SQL · medium · join · partition · spark · 0.3 min read

Explain how you would optimize Redshift query performance for a reporting system with large fact tables.

SQL · medium · join · 0.4 min read

Explain how you would use repartition or coalesce effectively to optimize processing when analyzing data only for a specific region.

SQL · medium · partition · 0.4 min read

Explain indexing and its impact on database performance.

SQL · medium · bigquery · join · partition · 0.3 min read

Explain normalization and its disadvantages.

SQL · medium · join · 0.3 min read

Explain offset management, sync vs. async commits, partition assignment strategies, consumer groups, and backpressure handling in Kafka streams.

SQL · medium · partition · 0.4 min read



© 2026 DataEngPrep.tech. All rights reserved.
