Interview questions · medium
What is the difference between repartition and coalesce in Apache Spark?
What is the difference between partitioning and bucketing in Spark, and when would you use bucketing?
What strategies can you use to handle skewed data in Spark?
What is a window function? Explain with an example.
An existing job has suddenly started running much longer than usual. How would you analyze the issue?
Which shell commands would you use to rename a file?
What is the join condition in an Oozie workflow?
How would you partition a table that holds card details and transactions?
How would you migrate from Teradata to Hadoop, and how would you handle SCD Type 2 data?
What is a Kafka topic, and how do you choose the number of partitions for it?
What is the role of a partition in Kafka, and how does it impact scalability?