DataEngPrep.tech

Interview Questions

Real questions from top companies in Spark/Big Data · easy

1

Why does Hive use Derby by default, and what alternatives are used in production?

Spark/Big Data · easy
2

Have you worked with UDFs? Share examples.

Spark/Big Data · easy
3

Write PySpark code to filter and count records.

Spark/Big Data · easy
4

Write PySpark code to filter records based on specific conditions and add a calculated column.

Spark/Big Data · easy
5

Write a PySpark code snippet to filter rows with a specific condition.

Spark/Big Data · easy
6

Write the Spark command to rename an existing column in a DataFrame.

Spark/Big Data · easy
7

How do you write Excel sheets to Delta tables in Databricks?

Spark/Big Data · easy
8

You are given 10 worker machines with 100 GB RAM and 25 CPU cores. How would you determine the number of executors and the size of each executor?

Spark/Big Data · easy
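
One way to reason about question 8 is with a widely used rule of thumb rather than a fixed formula. The sketch below assumes the 100 GB and 25 cores are per machine, reserves 1 core and 1 GB per node for the OS and daemons, targets 5 cores per executor, sets aside one executor slot for the driver or YARN application master, and leaves about 10% of each container for memory overhead. The function name `size_executors` and all of its parameters are hypothetical, chosen only for illustration; real clusters may call for different values.

```python
import math


def size_executors(nodes, cores_per_node, mem_gb_per_node,
                   cores_per_executor=5, reserved_cores=1,
                   reserved_mem_gb=1, overhead_fraction=0.10):
    """Rule-of-thumb executor sizing (an assumption, not an official formula)."""
    # Leave some cores and memory on each node for the OS and cluster daemons.
    usable_cores = cores_per_node - reserved_cores
    usable_mem_gb = mem_gb_per_node - reserved_mem_gb

    # How many executors of the target width fit on one node.
    executors_per_node = usable_cores // cores_per_executor

    # Keep one slot free cluster-wide for the driver / application master.
    total_executors = nodes * executors_per_node - 1

    # Each container holds heap plus overhead, so heap = container / (1 + overhead).
    container_gb = usable_mem_gb / executors_per_node
    executor_memory_gb = math.floor(container_gb / (1 + overhead_fraction))

    return {
        "executors_per_node": executors_per_node,
        "total_executors": total_executors,
        "cores_per_executor": cores_per_executor,
        "executor_memory_gb": executor_memory_gb,
    }


if __name__ == "__main__":
    # The numbers from the question: 10 workers, 25 cores and 100 GB each (assumed per machine).
    print(size_executors(nodes=10, cores_per_node=25, mem_gb_per_node=100))
```

Under these assumptions the result is 4 executors per node, 39 executors in total, 5 cores each, and roughly 22 GB of executor memory. In an interview, stating the assumptions (reserved resources, cores per executor, overhead) matters as much as the final numbers.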
