DataEngPrep.tech

Interview Questions

Real questions from top companies · easy

700+ Easy · 450+ Medium · 650+ Hard
621

Write a SQL query to replace specific patterns in a string column.

SQL · easy · bigquery · sql · 0.2 min read
Incedo
622

Write a MERGE statement for SCD Type 2.

SQL · easy · snowflake · sql · 0.4 min read
EY
623

Write a SQL query to find distinct IDs from a table where the count is more than 1 and greater than 200.

SQL · easy · sql · 0.3 min read
Dunnhumby
624

Write a query for the second-highest salary using LIMIT, OFFSET, or ROW_NUMBER().

SQL · easy · 0.3 min read
Nihilent
625

Write a query that identifies numbers appearing at least three times consecutively without interruption.

SQL · easy · 0.3 min read
JP Morgan
626

Write a query to find minimum age.

SQL · easy · 0.1 min read
Altimetrik
627

Write a query to find the first number repeating consecutively three times in a sequence.

SQL · easy · 0.3 min read
American Express
628

Write a query to find the median salary of employees in a table.

SQL · easy · bigquery · 0.2 min read
Goldman Sachs
629

Write a query to switch values in the Gender column (M to F and F to M).

SQL · easy · 0.3 min read
Fossil Group
630

A JSON file with evolving schema needs to be ingested into a DataFrame. How would you handle new fields dynamically in PySpark without breaking the job for previous structures?

Spark/Big Data · easy · spark · 0.3 min read
Dunnhumby
631

A task intermittently fails due to external API limitations. How would you configure Airflow retries and alerts to manage this situation efficiently?

Spark/Big Data · easy · airflow · 0.2 min read
Dunnhumby
632

Explain accumulators and broadcast variables.

Spark/Big Data · easy · 0.2 min read
LTIMindtree
633

Approaches to handling multiple tasks within a sprint?

Spark/Big Data · easy · 0.6 min read
Snowflake
634

cache() vs persist(): Explain the difference and the use cases for caching and persisting data in Spark, including storage levels.

Spark/Big Data · easy · spark · 0.5 min read
Capgemini
635

Can you explain dynamic resource allocation in Spark? How does it help optimize job performance?

Spark/Big Data · easy · spark · 0.5 min read
Coforge
636

Can you explain the concept of incremental loading in Sqoop and how to use it for job processing?

Spark/Big Data · easy · 0.5 min read
Infosys
637

Can you give a use case where Delta Live Tables would be ideal?

Spark/Big Data · easy · etl · lakehouse · spark · 0.5 min read
TCS
638

Can you share a time when you had to shift focus due to urgent tasks?

Spark/Big Data · easy · 0.5 min read
Moonfare
639

Cluster Resource Allocation in Spark

Spark/Big Data · easy · spark · 0.4 min read
Walmart
640

Compare HDFS and cloud-based storage systems in terms of scalability and performance.

Spark/Big Data · easy · 0.5 min read
Swiggy

© 2026 DataEngPrep.tech. All rights reserved.