Interview questions · easy
What is the difference between Managed and External tables in Hive/Spark?
Explain the difference between shallow copy and deep copy in Python.
Write a Python function to find the first non-repeating character in a string.
Explain Common Table Expressions (CTEs) and their benefits.
Describe the projects you have worked on so far and your role in each.
What would you do if you were assigned a task with a technology you've never used before?
Write a SQL query to find the distinct IDs in a table that appear more than once and whose ID value is greater than 200.
A JSON file with an evolving schema needs to be ingested into a DataFrame. How would you handle new fields dynamically in PySpark without breaking the job for earlier versions of the structure?
A task intermittently fails due to external API limitations. How would you configure Airflow retries and alerts to manage this situation efficiently?
Suppose you have a DAG that ingests data from multiple databases. How would you increase task parallelism in Airflow to improve performance without overloading the system?
What are the different modes in which you can submit Spark jobs? Explain each.
What is the difference between external and internal tables in Hive?
Write the Spark command to rename an existing column in a DataFrame.
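For the Python string question in the list above, one possible answer can be sketched as follows. This is a minimal illustration rather than a model answer: the function name `first_non_repeating_char` is hypothetical, and it assumes that comparison is case-sensitive and that `None` is an acceptable result when every character repeats.

```python
from collections import Counter
from typing import Optional


def first_non_repeating_char(text: str) -> Optional[str]:
    """Return the first character that occurs exactly once in text, or None.

    Hypothetical helper for illustration; comparison is case-sensitive.
    """
    # First pass: count how often each character occurs.
    counts = Counter(text)
    # Second pass: scan in original order and return the first unique character.
    for ch in text:
        if counts[ch] == 1:
            return ch
    # Every character repeats, or the string is empty.
    return None


if __name__ == "__main__":
    print(first_non_repeating_char("swiss"))
    print(first_non_repeating_char("aabb"))
```

The two-pass approach runs in time linear in the length of the string, which is usually the point an interviewer is looking for compared with a nested-loop solution.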