Real questions from top companies · easy
Write a SQL query to replace specific patterns in a string column.
Write a MERGE statement for SCD Type 2.
Write a SQL query to find distinct IDs from a table where the count is more than 1 and greater than 200.
Write a query to find the second-highest salary using LIMIT, OFFSET, or ROW_NUMBER().
Write a query that identifies numbers appearing at least three times consecutively without interruption.
Write a query to find minimum age.
Write a query to find the first number repeating consecutively three times in a sequence.
Write a query to find the median salary of employees in a table.
Write a query to switch values in the Gender column (M to F and F to M).
A JSON file with evolving schema needs to be ingested into a DataFrame. How would you handle new fields dynamically in PySpark without breaking the job for previous structures?
A task intermittently fails due to external API limitations. How would you configure Airflow retries and alerts to manage this situation efficiently?
Explain accumulators and broadcast variables in Spark.
What approaches do you use to handle multiple tasks within a sprint?
cache() vs persist(): Explain the difference between caching and persisting data in Spark, their use cases, and the available storage levels.
Can you explain dynamic resource allocation in Spark? How does it help optimize job performance?
Can you explain the concept of incremental loading in Sqoop and how to use it for job processing?
Can you give a use case where Delta Live Tables would be ideal?
Can you share a time when you had to shift focus due to urgent tasks?
Explain cluster resource allocation in Spark.
Compare HDFS and cloud-based storage systems in terms of scalability and performance.