Real questions from top companies · medium
How do you manage data storage in AWS?
How do you merge data from different sources in ADF while maintaining data quality?
How would you optimize an ADF pipeline for high performance?
How would you migrate 1TB of data using ADF?
How would you optimize cost when using AWS for large-scale data processing?
Lambda vs. Glue: Discuss use cases for both services.
What alternatives to Kinesis would you consider for real-time data ingestion?
What integration challenges might you face with Glue Catalog in non-AWS environments?
APPLY Operator - CROSS APPLY and OUTER APPLY
An existing job suddenly starts running longer than usual: how would you analyze the issue?
Calculate a 7-day moving average of clicks for each user_id
Calculate a 7-day moving average of orders for each city in the Swiggy database.
Calculate cumulative sales for each product in each store, ordered by sale_date
Calculate the total number of transactions (units sold) for each product.
Calculate the total sales amount for customers born between 1998-01-15 and 2000-01-15.
Compute the moving average of daily transactions over a 7-day window.
Data Shuffling Causes and Techniques
Describe a scenario where you had to optimize a slow-running data pipeline.
Describe a time when you had to deal with a major data quality issue. How did you handle it?
Describe the concept of data sharding and when to use it.