DataEngPrep.tech

Interview Questions

Real questions from top companies · medium

700+ Easy · 450+ Medium · 650+ Hard
201

Describe strategies for optimizing a slow-running query on a massive dataset.

SQL · medium · partition · 0.4 min read
BCG
202

Discuss strategies for handling schema evolution in data warehouses.

SQL · medium · window · 0.3 min read
Disney+ Hotstar
203

Duplicate characters in a string (e.g., '123a!' to '112233aa!!').

SQL · medium · join · python · spark · 0.3 min read
HashedIn
204

ER Modeling vs. Dimensional Modeling?

SQL · medium · join · snowflake · 0.4 min read
Comcast
205

Explain CTE vs Temp Table. What are the differences and use cases?

SQL · medium · join · 0.5 min read
Fractal
206

Explain Data Modeling SCD Types (Type 1, 2, 3).

SQL · medium · join · 0.5 min read
EY
207

Explain Dynamic Partition Pruning error and how to fix it.

SQL · medium · join · partition · spark · 0.5 min read
Globant
208

Explain Fact Table and Star Schema.

SQL · medium · join · partition · 0.4 min read
HCL
209

Explain Redshift Data Distribution (EVEN, KEY, ALL).

SQL · medium · join · 0.4 min read
Impetus
210

Explain Union vs Union All in SQL.

SQL · medium · join · sql · 0.4 min read
Gartner
211

Implement a recursive query for hierarchy (employee-manager). Explain the termination guarantees, depth limits, and when a recursive CTE becomes a scalability bottleneck. What alternatives exist for graph-scale hierarchies in Spark or a data lake?

SQL · medium · join · spark · 0.6 min read
American Express
212

Compare Glue partition discovery with Hive MSCK/ADD PARTITION. Explain the operational and cost implications of crawler-based vs. partition-projection approaches. When does partition projection become necessary, and what are its limitations?

SQL · medium · partition · 0.5 min read
Capco
213

Explain how partitioning and bucketing in Hive/Spark optimize queries. What are the trade-offs among bucket count, partition cardinality, and the small-file problem? When does over-partitioning or over-bucketing become counterproductive?

SQL · medium · join · partition · spark · 0.6 min read
Adidas
214

Explain how to implement cumulative sum in SQL.

SQL · medium · partition · spark · sql · 0.3 min read
Hexaware
215

Explain how you would implement partitioning and bucketing for data stored in S3 to improve query performance.

SQL · medium · join · partition · spark · 0.3 min read
EPAM
216

Explain how you would optimize Redshift query performance for a reporting system with large fact tables.

SQL · medium · join · 0.4 min read
Capco
217

Explain how you would use repartition or coalesce effectively to optimize processing when analyzing data only for a specific region.

SQL · medium · partition · 0.4 min read
Dunnhumby
218

Explain indexing and its impact on database performance.

SQL · medium · bigquery · join · partition · 0.3 min read
Goldman Sachs
219

Explain normalization and its disadvantages.

SQL · medium · join · 0.3 min read
Gartner
220

Explain offset management, sync vs. async commits, partition assignment strategies, consumer groups, and backpressure handling in Kafka streams.

SQL · medium · partition · 0.4 min read
Expedia

© 2026 DataEngPrep.tech. All rights reserved.