Data engineering interview questions · easy
Describe step scaling policies vs. target tracking policies in AWS Auto Scaling.
Describe the process and use cases of implementing Azure Data Factory pipelines.
Describe using Step Functions to handle retries and error notifications.
Difference between linked services and datasets in ADF.
Differentiate between global and local variables in ADF.
Discuss how versioning works in S3 and its use cases, such as data recovery and auditing.
Discuss the key differences between AWS Glue, Lambda, and Data Pipeline for orchestrating data workflows.
Discuss versioning in S3.
Docker - purpose and handling dependencies
Error Handling in ADF?
Explain AWS Lake Formation and its benefits.
Explain GetMetadata, ForEach, and Copy Data in Azure Data Factory.
Explain Microsoft Fabric and its use in data integration.
Explain Step Functions for orchestration of workflows.
Explain a linked service and how to create one.
Explain how Access Control Lists (ACLs) can affect IAM role permissions.
Explain how Infrastructure as Code (IaC) works in AWS and its advantages.
Explain how Step Functions integrate with other AWS services.
Explain how you would configure an S3 bucket policy to allow access only from a specific EC2 instance.
Explain linked services and how they are created.
Learn the platform used by your target companies. AWS is most common overall (Glue, Redshift, S3, Kinesis). GCP is preferred by Google and startups (BigQuery, Dataflow, Pub/Sub). Azure is dominant in enterprise (Synapse, Data Factory). Learn one deeply and understand the equivalents on others.
Core tools: SQL, Python, Spark, Airflow (or equivalent orchestrator), one cloud platform. Increasingly important: dbt, Kafka, Terraform, Docker/Kubernetes, Delta Lake or Apache Iceberg, a data observability tool. The specific stack varies by company.
Yes. Apache Airflow is the most widely used orchestration tool, and questions about DAG design, task dependencies, XComs, operators, and failure handling are common. If the company uses a different orchestrator, expect similar questions adapted to that tool.
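The orchestration concepts named above (task dependencies, passing results between tasks, and retrying on failure) can be illustrated with a minimal sketch in plain Python. This is not Airflow's API: the task names, the `run_with_retries` helper, and the `results` dictionary standing in for XComs are all hypothetical, chosen only to show the ideas an interviewer is likely to probe.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each key runs only after the tasks in its set.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

# Counts calls so the sketch can simulate one transient failure.
attempts = {"transform": 0}


def extract():
    return [1, 2, 3]


def transform(rows):
    attempts["transform"] += 1
    if attempts["transform"] == 1:
        # Simulated transient error on the first attempt only.
        raise RuntimeError("transient failure")
    return [r * 10 for r in rows]


def load(rows):
    return len(rows)


def run_with_retries(task, retries=2):
    """Call task(); on an exception, retry up to `retries` more times."""
    for attempt in range(retries + 1):
        try:
            return task()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the failure


def run_pipeline():
    results = {}  # stands in for XCom-style result passing between tasks
    tasks = {
        "extract": lambda: extract(),
        "transform": lambda: transform(results["extract"]),
        "load": lambda: load(results["transform"]),
    }
    # static_order() yields every task after all of its dependencies.
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        results[name] = run_with_retries(tasks[name])
    return order, results


if __name__ == "__main__":
    order, results = run_pipeline()
    print(order, results)
```

In a real orchestrator the scheduler, not a loop, decides when each task runs, and retries are usually configured per task rather than hand-written. The sketch only shows why a dependency graph and a retry policy are separate concerns.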