Real interview questions asked at Meesho. Practice the most frequently asked questions and land your next role.
Meesho data engineering interviews test your ability across multiple domains. These questions are sourced from real Meesho interview experiences and sorted by frequency. Practice the ones that matter most.
Tell me about yourself and your experience.
Tell me about your family background
Explain the differences between Data Warehouse, Data Lake, and Delta Lake
Design a fault-tolerant Spark Streaming checkpoint strategy: what to persist, recovery semantics, and cost/scalability trade-offs with checkpoint frequency.
Given a streaming dataset from Kafka, how would you ingest the data in real-time using Spark?
What are your hobbies or activities you enjoy outside of work?
What are your key achievements in your career so far?
What database would you choose for handling transactional and non-transactional data? Why?
What is your ultimate career goal or life goal?
Why do you think Meesho is the right fit for you?
Can you elaborate on your Big Data project experience?
Describe the ZS projects you worked on
Find All Numbers that Appear at Least Three Times Consecutively
The Stock Span Problem
What would you do if a job misses its SLA? How would you handle such situations?
Can you explain the concept of polymorphism and inheritance in Java with examples?
Discuss the tech stacks and responsibilities at Morgan Stanley
Trapping Rain Water: calculate the amount of water trapped between the bars of an elevation array
What are the key differences between interfaces and abstract classes in Java?
Write Java code to read a file using FileInputStream
Write a Java program using FileInputStream and BufferedReader to read data from a local file and print the output to the console
Zigzag Order Traversal of a Binary Tree
Design a custom API that queries a backend server and returns customer data, such as the number of orders a user has placed, based on the user ID
Design the data model for an ETL pipeline that ingests data from a database and loads it into Snowflake
Explain normalization in databases and its importance. Write an SQL query to handle SCD-1 or SCD-3
How soon could you join Meesho if you are selected?
How would you clean the data by filtering out records with null values in user_id?
Managed vs Unmanaged Tables
SQL Query to Find Average Sessions per User within a 30-Day Period
What is the difference between Data Lakehouse, Delta Lake, and a Data Warehouse?
After cleaning, how would you store the transformed data into Delta Lake?
Compare Kafka Streams and Spark Structured Streaming for real-time processing
Databricks Cluster Management - standalone vs YARN mode
Design an ETL pipeline using Kafka and Spark Streaming
Explain how spark.read.format("delta").load() works
Explain the architecture and role of the Hive Metastore in a data pipeline
Explain the architecture of Kafka
Explain the architecture of Spark Streaming
Handling data skew: salting, broadcast joins
Have you worked with data compaction in Delta Lake?
How do you store streaming data in Delta Lake and handle schema evolution?
How does Databricks create clusters for running Spark jobs?
How does Delta Lake store the transaction history in S3 buckets?
How to optimize mappers using properties in MapReduce?
How would you ensure exactly-once processing for Kafka consumers in your Spark job?
How would you handle memory management in Spark?
How would you manage the streaming data schema and handle schema evolution in Delta Lake?
How would you optimize your Spark Streaming ETL pipeline for high throughput and low latency?
Spark executor management: given 10 workers with 100 GB RAM and 25 cores, determine the number and size of executors, and explain how to handle an OOM in the driver
Sqoop command for importing multiple tables
Suppose you need to import 5 tables from an external RDBMS (like MySQL) into Hadoop HDFS. Write the Sqoop command
What role would Kafka or similar event-driven platforms play in your architecture?
What strategies would you use to optimize Spark jobs for both performance and cost on AWS?
You are given 10 worker machines with 100 GB RAM and 25 CPU cores. How would you determine the number of executors and the size of each executor?
Design an e-commerce platform like Flipkart
How does Presto fetch data from a data catalog?
How would you design the architecture to handle high availability and scalability?
How would you ensure the system can handle millions of concurrent users?
How would you set up an alert system to monitor your ETL pipeline for failures or performance issues?
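For the executor-sizing question above (10 worker machines, 100 GB RAM, 25 CPU cores), the arithmetic behind one common rule of thumb can be sketched as follows. This is only an illustration, not a definitive answer. It assumes the RAM and core figures are per machine, that 1 core and 1 GB per node are reserved for the OS and daemons, that each executor gets about 5 cores, that one executor slot is given up for the YARN ApplicationMaster, and that memory overhead is 10% of the executor heap. The function and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ExecutorPlan:
    executors_per_node: int
    total_executors: int
    cores_per_executor: int
    heap_gb_per_executor: int


def plan_executors(nodes, ram_gb_per_node, cores_per_node,
                   cores_per_executor=5,     # assumed rule of thumb
                   reserved_cores=1,         # assumed: left for OS/daemons
                   reserved_ram_gb=1,        # assumed: left for OS/daemons
                   overhead_fraction=0.10):  # assumed: overhead = 10% of heap
    """Rule-of-thumb executor sizing; every default here is an assumption."""
    usable_cores = cores_per_node - reserved_cores
    executors_per_node = usable_cores // cores_per_executor
    # Give up one executor slot cluster-wide for the YARN ApplicationMaster.
    total_executors = nodes * executors_per_node - 1
    # Memory available to each executor container on a node.
    container_gb = (ram_gb_per_node - reserved_ram_gb) / executors_per_node
    # Container = heap + overhead, so heap = container / (1 + overhead).
    heap_gb = int(container_gb / (1 + overhead_fraction))
    return ExecutorPlan(executors_per_node, total_executors,
                        cores_per_executor, heap_gb)


plan = plan_executors(nodes=10, ram_gb_per_node=100, cores_per_node=25)
print(plan)
```

Under these assumptions the sketch gives 4 executors per node, 39 executors in total, 5 cores each, and a heap of about 22 GB per executor. A different reading of the question (for example, 100 GB and 25 cores in total across the cluster) would change the numbers, so stating the assumptions aloud is part of the answer.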