Interview Questions

A JSON file with evolving schema needs to be ingested into a DataFrame. How would you handle new fields dynamically in PySpark without breaking the job for previous structures?

Spark/Big Data | easy | spark | 0.3 min read

Dunnhumby

631

A task intermittently fails due to external API limitations. How would you configure Airflow retries and alerts to manage this situation efficiently?

Spark/Big Data | easy | airflow | 0.2 min read

Dunnhumby

632

Explain accumulators and broadcast variables in Spark.

Spark/Big Data | easy | 0.2 min read

LTIMindtree

633

What approaches do you use to handle multiple tasks within a sprint?

Spark/Big Data | easy | 0.6 min read

Snowflake

634

cache() vs persist(): Explain the difference and the use cases for caching and persisting data in Spark, including storage levels.

Spark/Big Data | easy | spark | 0.5 min read

Capgemini

635

Can you explain dynamic resource allocation in Spark? How does it help optimize job performance?

Spark/Big Data | easy | spark | 0.5 min read

Coforge

636

Can you explain the concept of incremental loading in Sqoop and how to use it for job processing?

Spark/Big Data | easy | 0.5 min read

Infosys

637

Can you give a use case where Delta Live Tables would be ideal?

Spark/Big Data | easy | etl | lakehouse | spark | 0.5 min read

TCS

638

Can you share a time when you had to shift focus due to urgent tasks?

Spark/Big Data | easy | 0.5 min read

Moonfare

639

Explain cluster resource allocation in Spark.

Spark/Big Data | easy | spark | 0.4 min read

Walmart

640

Compare HDFS and cloud-based storage systems in terms of scalability and performance.

Spark/Big Data | easy | 0.5 min read

Swiggy
