InterviewStack.io

Experiment Design and Execution Questions

Covers end-to-end design and execution of experiments and A/B tests: identifying high-value hypotheses, defining treatment variants and a control, ensuring valid randomization, choosing primary and guardrail metrics, calculating sample size and statistical power, instrumenting events, running analyses and interpreting results, and deciding whether to roll out or roll back. Also covers building testing infrastructure, establishing organizational best practices for experimentation, communicating learnings, and discussing both successful and failed tests and their impact on product decisions.

Medium · Technical
During an experiment, you're seeing suspiciously high conversion rates from a cohort of users suspected to be bots. Describe methods for detecting bot traffic in experiment data and for mitigating its effect on randomization and results (e.g., filtering, weighting, instrument-level fixes).
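
A minimal sketch of the filtering step such an answer might describe, assuming a pandas DataFrame of per-user experiment logs with hypothetical columns (events_per_min, ua_is_headless, converted, variant):

    import pandas as pd

    def filter_suspected_bots(df: pd.DataFrame, rate_cap: float = 50.0) -> pd.DataFrame:
        """Drop users whose event rate or user agent is implausible for a human."""
        suspect = (df["events_per_min"] > rate_cap) | df["ua_is_headless"]
        return df.loc[~suspect]

    def conversion_by_variant(df: pd.DataFrame) -> pd.Series:
        """Mean conversion per experiment arm."""
        return df.groupby("variant")["converted"].mean()

    # Comparing lift with and without suspected users shows how much the bot
    # cohort moves the result; a large shift argues for pre-registered
    # filtering rules or instrument-level fixes (e.g., bot challenges).
    # lift_raw = conversion_by_variant(raw_logs)
    # lift_clean = conversion_by_variant(filter_suspected_bots(raw_logs))
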
Hard · Technical
Design and validate an uplift modeling approach for targeting treatment at the users most likely to benefit. Outline data collection (treatment labels, features), model architecture choices, evaluation metrics (Qini coefficient, uplift curves), an offline validation strategy, and how you'd safely roll the model into production experiments.
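
A compact sketch of one common approach, a two-model "T-learner" plus a Qini-style cumulative-gain curve; the model choice, binning, and all names here are illustrative assumptions:

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier

    def t_learner_uplift(X, treated, y):
        """Fit separate outcome models per arm; predicted uplift is their gap."""
        m_t = GradientBoostingClassifier().fit(X[treated == 1], y[treated == 1])
        m_c = GradientBoostingClassifier().fit(X[treated == 0], y[treated == 0])
        return m_t.predict_proba(X)[:, 1] - m_c.predict_proba(X)[:, 1]

    def qini_points(uplift, treated, y, n_bins=10):
        """Cumulative incremental conversions when targeting by uplift score."""
        order = np.argsort(-uplift)
        t, outcomes = treated[order], y[order]
        points = []
        for k in np.linspace(len(y) / n_bins, len(y), n_bins).astype(int):
            n_t = t[:k].sum()
            n_c = k - n_t
            if n_t and n_c:
                gain = outcomes[:k][t[:k] == 1].sum() \
                       - outcomes[:k][t[:k] == 0].sum() * n_t / n_c
                points.append(gain)
        return points  # compare to a random-targeting baseline for the Qini area
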
Medium · Technical
Case study: an experiment shows a statistically significant +1.2% lift in click-through rate (p=0.02) but a negative guardrail movement: -0.8% in 7-day retention (p=0.12). The PM wants to ship. Walk through your analysis plan for recommending rollout or rollback, including additional checks, power considerations for the retention metric, and possible mitigations if you proceed.
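
One concrete check from such a plan: how much power did the experiment have to detect the retention drop at all? A back-of-envelope version with statsmodels, treating -0.8% as an absolute drop from a hypothetical 30% baseline with a hypothetical 50,000 users per arm:

    from statsmodels.stats.power import NormalIndPower
    from statsmodels.stats.proportion import proportion_effectsize

    baseline = 0.30                                    # hypothetical control retention
    effect = proportion_effectsize(baseline, baseline - 0.008)  # -0.8pp drop
    power = NormalIndPower().power(effect_size=effect, nobs1=50_000,
                                   alpha=0.05, ratio=1.0, alternative="two-sided")
    print(f"Power to detect the guardrail drop: {power:.2f}")
    # If power is low, p=0.12 is weak evidence of safety, not evidence of no harm.
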
Easy · Technical
For a recommender-system UI change, define three primary metrics and three guardrail metrics you would set before launching an A/B experiment. Explain why each metric is appropriate and how you would instrument it. The scenario: the change adds a new 'recommended for you' carousel to the homepage for logged-in users.
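
A sketch of what the instrumentation side could look like: a single typed exposure/interaction event carrying the fields an analysis pipeline needs to join exposures to outcomes (all names are hypothetical):

    from dataclasses import dataclass, asdict
    from typing import Optional
    import json, time, uuid

    @dataclass
    class CarouselEvent:
        user_id: str
        event: str               # "carousel_impression" or "carousel_click"
        item_id: Optional[str]   # clicked item; None for pure impressions
        position: Optional[int]  # slot index within the carousel
        variant: str             # experiment arm at exposure time
        ts: float

    def log_event(evt: CarouselEvent) -> None:
        # A real system would emit to an event bus; stdout stands in here.
        print(json.dumps({"event_id": str(uuid.uuid4()), **asdict(evt)}))

    log_event(CarouselEvent("u123", "carousel_click", "item42", 2,
                            "treatment", time.time()))
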
Medium · Technical
How would you design an experiment and analysis to measure long-term retention impact (e.g., 90-day retention) for a feature when you need results faster than 90 days? Describe surrogate metrics, interim analyses, and how you would validate that short-term proxies are predictive of long-term retention.
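
One proxy-validation step such an answer might include: across a corpus of past experiments, correlate each experiment's measured effect on the short-term proxy with its effect on 90-day retention. The effect values below are purely illustrative placeholders:

    import numpy as np
    from scipy.stats import pearsonr

    # Per-experiment treatment effects from a hypothetical meta-analysis table.
    proxy_effects = np.array([0.012, -0.004, 0.021, 0.003, -0.010, 0.008])
    retention_90d_effects = np.array([0.009, -0.002, 0.015, 0.001, -0.008, 0.005])

    r, p = pearsonr(proxy_effects, retention_90d_effects)
    print(f"Proxy vs 90-day effect correlation: r={r:.2f} (p={p:.3f})")
    # A strong, stable correlation across many experiments supports using the
    # proxy as a surrogate; a weak one argues for holdouts or longer reads.
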
