Amazon Data Scientist Interview Preparation Guide (Mid-Level)

Data Scientist | Amazon | Mid Level | 8 rounds | Updated 6/16/2026

Amazon's Data Scientist interview process consists of an initial recruiter screen followed by two technical phone screens and five onsite rounds. The process evaluates candidates across SQL, Machine Learning, Python coding, Statistics, Algorithms, and Behavioral/Cultural fit. Interviewers assess both technical depth and ability to translate business problems into data-driven solutions. The entire process typically spans 4-6 weeks.

Interview Rounds

1. Recruiter Screening
2. Technical Phone Screen 1: SQL & Data Analysis
3. Technical Phone Screen 2: Machine Learning & Modeling
4. Onsite Round 1: Machine Learning & Modeling Deep Dive
5. Onsite Round 2: Data Analysis & A/B Testing
6. Onsite Round 3: SQL & Database Optimization
7. Onsite Round 4: Algorithms & Problem Solving
8. Onsite Round 5: Amazon Leadership Principles & Behavioral

Frequently Asked Data Scientist Interview Questions

Applying Data Science Techniques to Business Problems | Medium | Technical
73 practiced
Given these tables:
orders(order_id bigint, user_id bigint, order_date date, revenue numeric)
users(user_id bigint, signup_date date)
Write a PostgreSQL query that produces cohort_monthly_ltv with columns: cohort_month (date), month_number (int; 0 = signup month), users_in_cohort, month_revenue, cumulative_revenue, avg_ltv_per_user (cumulative) for the first 12 months after signup. Explain assumptions and performance tuning tips for large datasets.
Advanced Querying with Structured Query Language | Easy | Technical
18 practiced
Given a table events(user_id, event_time, event_type), write a SQL query (Postgres/ANSI) that returns the latest event per user (user_id, event_time, event_type). Use window functions (row_number) and briefly explain why window functions may be preferred over a correlated subquery here.
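One way to sketch an answer to the question above is to run the window-function query against an in-memory SQLite database. This is a minimal sketch, not a reference solution: the sample rows are made up, `event_time` is stored as ISO-8601 text so that string order matches time order, and SQLite 3.25 or newer is assumed for `ROW_NUMBER()`. The query text itself should also work in PostgreSQL.

```python
import sqlite3

# Hypothetical sample data; event_time is ISO-8601 text so it sorts chronologically.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, event_time TEXT, event_type TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "2024-01-01 09:00:00", "login"),
        (1, "2024-01-02 10:30:00", "purchase"),
        (2, "2024-01-01 08:15:00", "login"),
        (2, "2024-01-01 08:45:00", "view"),
        (3, "2024-01-03 12:00:00", "login"),
    ],
)

# Rank each user's events from newest to oldest, then keep rank 1.
LATEST_EVENT_SQL = """
SELECT user_id, event_time, event_type
FROM (
    SELECT user_id, event_time, event_type,
           ROW_NUMBER() OVER (
               PARTITION BY user_id
               ORDER BY event_time DESC
           ) AS rn
    FROM events
) AS ranked
WHERE rn = 1
ORDER BY user_id
"""

latest_events = conn.execute(LATEST_EVENT_SQL).fetchall()
for row in latest_events:
    print(row)
```

The window version reads the table once and ranks rows within each partition, whereas a correlated subquery is typically evaluated per outer row unless the planner rewrites it. If two events for a user share the same `event_time`, `ROW_NUMBER()` picks one arbitrarily, so a tiebreaker column in the `ORDER BY` is worth mentioning in an interview.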
Model Evaluation and Validation | Easy | Technical
69 practiced
Explain what stratified sampling achieves in cross-validation. Give an example using a 10-fold stratified CV for a binary classification task with 1% positives. Why is stratification important for rare classes?
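The 1% positives scenario in the question above can be illustrated with a small hand-rolled fold assignment. This is a sketch under stated assumptions: the function name `stratified_folds` and the 1,000-row dataset are hypothetical, and in practice a library splitter such as scikit-learn's `StratifiedKFold` would normally be used instead.

```python
import random
from collections import defaultdict

def stratified_folds(labels, n_folds, seed=0):
    """Assign sample indices to folds so each fold keeps the class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class's indices round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds

# Hypothetical dataset: 1,000 rows with 1% positives.
labels = [1] * 10 + [0] * 990
folds = stratified_folds(labels, n_folds=10)
positives_per_fold = [sum(labels[i] for i in fold) for fold in folds]
print(positives_per_fold)
```

With only 10 positives, plain random 10-fold splitting can easily leave a fold with zero positives, which makes recall or AUC undefined or very noisy on that fold. Dealing each class separately guarantees one positive per fold here.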
Hypothesis Testing and Inference | Hard | Technical
29 practiced
Write Python code that implements the Benjamini-Hochberg procedure to control the false discovery rate at level q given an array of p-values. Your implementation should return the indices of hypotheses declared significant and adjusted p-values. Discuss time complexity and how to handle tied p-values or grouped hypotheses.
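A minimal sketch of one possible answer to the question above, using only the standard library. The function name `benjamini_hochberg` and the example p-values are illustrative assumptions. Adjusted p-values are computed by walking the sorted p-values from largest to smallest and taking a running minimum of `p * m / rank`; declaring a hypothesis significant when its adjusted p-value is at most `q` is equivalent to the usual step-up rule.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return (significant_indices, adjusted_p_values) for FDR level q."""
    m = len(p_values)
    if m == 0:
        return [], []

    # Indices of hypotheses ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the running minimum
    # enforces monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min

    significant = sorted(i for i in range(m) if adjusted[i] <= q)
    return significant, adjusted

# Hypothetical p-values, deliberately unsorted.
p_values = [0.039, 0.001, 0.6, 0.008, 0.041]
significant, adjusted = benjamini_hochberg(p_values, q=0.05)
print(significant)
print(adjusted)
```

The sort dominates, so the cost is O(m log m) time and O(m) extra space. Tied p-values receive the same adjusted value because the running minimum carries the smaller `p * m / rank` of the higher-ranked tie down to the lower-ranked one. Grouped or dependent hypotheses are not handled by this sketch and would need a different procedure.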
Cross Functional Collaboration and Coordination | Medium | Technical
44 practiced
You notice repeated misunderstandings about data lineage are causing duplicated work across teams. How would you create sustainable documentation and processes to reduce handoffs and ensure a single source of truth? Include tooling and governance ideas.
A and B Test Design | Easy | Technical
63 practiced
Define type I error (false positive), type II error (false negative), statistical power, significance level (alpha), and Minimum Detectable Effect (MDE). For each concept provide a practical interpretation in the context of a conversion-rate A/B test and a short note on how product trade-offs influence acceptable values.
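The way alpha, power, and MDE trade off against each other in the question above can be made concrete with a sample-size calculation. This sketch assumes a two-sided z-test on two independent proportions with the common unpooled-variance approximation; the function name `sample_size_per_arm` and the 10% baseline conversion rate are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_per_arm(baseline_rate, mde_abs, alpha=0.05, power=0.8):
    """Approximate users needed per arm to detect an absolute lift of mde_abs."""
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # controls type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - type II error
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / mde_abs ** 2)

# Hypothetical test: 10% baseline conversion, 1 percentage point MDE.
n_per_arm = sample_size_per_arm(baseline_rate=0.10, mde_abs=0.01)
# Halving the MDE roughly quadruples the required sample.
n_per_arm_half_mde = sample_size_per_arm(baseline_rate=0.10, mde_abs=0.005)
print(n_per_arm, n_per_arm_half_mde)
```

Because sample size scales with the inverse square of the MDE, a product team that insists on detecting very small lifts is implicitly committing to a much longer test, or to accepting a higher alpha or lower power.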
Applying Data Science Techniques to Business Problems | Hard | System Design
68 practiced
Design an analytics pipeline that computes near real-time experiment metrics (e.g., conversion rate) with 1M events/sec ingestion and target dashboard latency < 30 seconds. Discuss streaming ingestion, stateful windowed aggregation, exactly-once processing semantics, storage choices for materialized views, consistency trade-offs, backfills, and cost optimizations. Name concrete technologies you would consider.
Advanced Querying with Structured Query Language | Medium | Technical
21 practiced
Explain partitioning strategies for a large table events(event_date DATE, user_id, event_type, payload). Which partition key and method (range, list, hash) would you choose? Show a sample query that benefits from partition pruning and explain how pruning reduces scanned data.
Model Evaluation and Validation | Easy | Technical
93 practiced
You built a multiclass classifier (5 classes). Explain the difference between macro, micro, and weighted averaging when computing F1 scores. Provide an example scenario where macro F1 is preferable to weighted F1.
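The three averaging schemes in the question above can be compared with a small from-scratch computation. This is an illustrative sketch: the helper name `f1_averages` is hypothetical, and the toy example uses two classes rather than five to keep the arithmetic easy to follow. Library implementations such as scikit-learn's `f1_score` with its `average` parameter would normally be used in practice.

```python
from collections import Counter

def f1_averages(y_true, y_pred):
    """Compute per-class F1 plus macro, weighted, and micro averages."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(tp_count, fp_count, fn_count):
        denom = 2 * tp_count + fp_count + fn_count
        return 2 * tp_count / denom if denom else 0.0

    per_class = {c: f1(tp[c], fp[c], fn[c]) for c in classes}
    support = Counter(y_true)
    return {
        "per_class": per_class,
        # Macro: every class counts equally, regardless of size.
        "macro": sum(per_class.values()) / len(classes),
        # Weighted: each class's F1 is weighted by its true support.
        "weighted": sum(per_class[c] * support[c] for c in classes) / len(y_true),
        # Micro: pool all counts first, then compute a single F1.
        "micro": f1(sum(tp.values()), sum(fp.values()), sum(fn.values())),
    }

# Toy example: the model ignores the rare class "b" entirely.
y_true = ["a"] * 8 + ["b"] * 2
y_pred = ["a"] * 10
scores = f1_averages(y_true, y_pred)
print(scores)
```

In this example the rare class has an F1 of zero, which pulls the macro average well below the weighted one. That gap is why macro F1 tends to be preferred when rare classes matter as much as common ones, for instance when the rare class is the one the business cares about.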
Hypothesis Testing and Inference | Medium | Technical
35 practiced
You're running an A/B/n test with three variants and plan to look at interim results daily. Explain the statistical risks of sequential peeking, how repeated looks inflate Type I error, and describe practical approaches to allow interim monitoring while controlling error rate (alpha-spending functions, group sequential designs, and sequential probability ratio tests).
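The inflation of Type I error from repeated looks, as raised in the question above, can be demonstrated with a short simulation. This sketch simplifies the setting to a single treatment-versus-control comparison with no true effect, models each day's data as one standard-normal increment, and checks a fixed two-sided threshold at every look. The function name, 14 daily looks, and 5,000 simulated tests are all illustrative assumptions.

```python
import random
from statistics import NormalDist

def false_positive_rates(n_looks=14, n_sims=5000, alpha=0.05, seed=0):
    """Simulate A/A tests; return (rate with daily peeking, rate with one final look)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    peeking_hits = 0
    final_hits = 0
    for _ in range(n_sims):
        cumulative = 0.0
        crossed = False
        z = 0.0
        for look in range(1, n_looks + 1):
            # Equal-sized daily batches: the z-statistic after `look` days
            # is the cumulative sum scaled by sqrt(look).
            cumulative += rng.gauss(0.0, 1.0)
            z = cumulative / look ** 0.5
            if abs(z) > z_crit:
                crossed = True
        peeking_hits += crossed            # stopped at any significant look
        final_hits += abs(z) > z_crit      # only the last look counts
    return peeking_hits / n_sims, final_hits / n_sims

peeking_rate, single_look_rate = false_positive_rates()
print(peeking_rate, single_look_rate)
```

Testing only at the final look keeps the false positive rate near the nominal alpha, while stopping at the first significant daily look pushes it well above. Alpha-spending and group sequential designs address this by raising the threshold at each interim look, and an A/B/n test additionally needs a correction for comparing multiple variants.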