InterviewStack.io

Experiment Design and Execution Questions

Covers end-to-end design and execution of experiments and A/B tests, including identifying high-value hypotheses, defining treatment variants and control, ensuring valid randomization, defining primary and guardrail metrics, calculating sample size and statistical power, instrumenting events, running analyses and interpreting results, and deciding on rollout or rollback. Also includes building testing infrastructure, establishing organizational best practices for experimentation, communicating learnings, and discussing both successful and failed tests and their impact on product decisions.
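
As a minimal sketch of the sample-size step mentioned above, the following uses the standard normal-approximation formula for comparing two proportions. The function name `sample_size_per_arm` and the example rates are illustrative assumptions, not part of the question bank.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate users needed per arm for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    # Sum of the per-arm Bernoulli variances (unpooled approximation).
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical example: detect a lift from 10% to 12% conversion.
n_per_arm = sample_size_per_arm(0.10, 0.12)
```

Halving the detectable effect roughly quadruples the required sample, which is why the minimum detectable effect is usually agreed on before the experiment starts.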

Hard | System Design
55 practiced
Design automated rollback policies and guardrail triggers for experiments that can be activated without human intervention. Specify which statistical signals (significance, effect size) and operational signals (error-rate, billing anomalies) should trigger rollback, describe cooldown or hysteresis to avoid flapping, and define safe human override policies. Discuss failure scenarios and safeguards to prevent false positive rollbacks.
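
One way to start on the hysteresis and cooldown part of this question is sketched below. It assumes a single operational signal (an error rate sampled at fixed intervals); the class name `GuardrailTrigger` and every threshold value are hypothetical placeholders, not recommended settings.

```python
from dataclasses import dataclass


@dataclass
class GuardrailTrigger:
    """Rollback trigger with hysteresis and a cooldown (all values illustrative)."""
    trip_threshold: float = 0.05   # error rate that counts as a breach
    clear_threshold: float = 0.03  # rate must fall below this to reset the count
    breaches_to_trip: int = 3      # breaches (without a clear) needed to roll back
    cooldown_checks: int = 5       # checks ignored after a rollback fires
    _breaches: int = 0
    _cooldown_left: int = 0

    def observe(self, error_rate: float) -> bool:
        """Return True when this observation should trigger a rollback."""
        if self._cooldown_left > 0:
            # Still cooling down: ignore the signal to avoid flapping.
            self._cooldown_left -= 1
            return False
        if error_rate >= self.trip_threshold:
            self._breaches += 1
        elif error_rate < self.clear_threshold:
            self._breaches = 0
        # Rates between the two thresholds leave the breach count unchanged.
        if self._breaches >= self.breaches_to_trip:
            self._breaches = 0
            self._cooldown_left = self.cooldown_checks
            return True
        return False


trigger = GuardrailTrigger()
decisions = [trigger.observe(rate) for rate in [0.06, 0.02, 0.06, 0.06, 0.06]]
```

The gap between `trip_threshold` and `clear_threshold` is what provides the hysteresis: a single noisy reading neither fires the rollback nor resets the accumulated evidence.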
Hard | Technical
60 practiced
You analyze experiments with many correlated metrics. Propose statistical strategies for testing across correlated metrics, including multivariate testing, hierarchical modeling, and constructing composite business metrics. Explain the trade-offs, how to choose weights for composite metrics considering covariance, and how to preserve interpretability for product stakeholders.
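
To illustrate why covariance matters when choosing composite weights, the sketch below computes the variance of a weighted sum of metrics, w^T Σ w. The function name `composite_variance` and the two covariance matrices are made-up examples.

```python
def composite_variance(weights, cov):
    """Variance of sum_i w_i * x_i, given the metrics' covariance matrix."""
    n = len(weights)
    return sum(
        weights[i] * cov[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )


weights = [0.5, 0.5]
independent = [[1.0, 0.0], [0.0, 1.0]]  # two uncorrelated unit-variance metrics
correlated = [[1.0, 0.8], [0.8, 1.0]]   # same metrics with correlation 0.8

var_independent = composite_variance(weights, independent)
var_correlated = composite_variance(weights, correlated)
```

With equal weights, positively correlated metrics yield a noisier composite than independent ones, so averaging them buys less variance reduction than the metric count suggests.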
Medium | Technical
48 practiced
Describe how you would detect novelty effects or time-varying treatment effects in an experiment that runs for multiple weeks. Which visualizations, models, and statistical checks would you run to distinguish a durable effect from a novelty effect that decays over time? Explain how you might adjust experiment duration or analysis to handle these dynamics.
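
A simple first check for this question is to compute the daily lift and fit a least-squares line through it; a clearly negative slope hints at a decaying effect. The sketch below assumes per-day metric means are already available, and the function name `lift_trend` and the data are hypothetical.

```python
def lift_trend(days, treatment_means, control_means):
    """Fit lift = intercept + slope * day by ordinary least squares."""
    lifts = [t - c for t, c in zip(treatment_means, control_means)]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(lifts) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, lifts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: the treatment advantage shrinks by 0.5 per day.
days = [1, 2, 3, 4, 5]
treatment = [12.0, 11.5, 11.0, 10.5, 10.0]
control = [10.0, 10.0, 10.0, 10.0, 10.0]
slope, intercept = lift_trend(days, treatment, control)
```

A point estimate of the slope is only a starting point; a real analysis would also need a confidence interval on it and a comparison between newly exposed and long-exposed users.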
Medium | Technical
49 practiced
Design experimental approaches to handle interference in a social product where users influence each other (SUTVA violations). Compare cluster randomization, graph-cluster randomization, and exposure mapping. For each approach, describe assumptions, analysis methods, and trade-offs in bias and power.
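
As a minimal sketch of cluster randomization, the code below assigns whole clusters to an arm by hashing the cluster ID, so every user inherits the arm of their cluster. It assumes clusters have already been computed (for example by a community-detection step not shown here); the salt `exp-001` and the user-to-cluster map are hypothetical.

```python
import hashlib


def assign_cluster(cluster_id: str, salt: str = "exp-001") -> str:
    """Deterministically map a cluster to an arm using a salted hash."""
    digest = hashlib.sha256(f"{salt}:{cluster_id}".encode()).hexdigest()
    return "treatment" if int(digest, 16) % 2 == 0 else "control"


def assign_users(user_to_cluster: dict) -> dict:
    """Each user inherits the arm of the cluster they belong to."""
    return {user: assign_cluster(cluster) for user, cluster in user_to_cluster.items()}


# Hypothetical membership: alice and bob are connected, carol is not.
membership = {"alice": "c1", "bob": "c1", "carol": "c2"}
arms = assign_users(membership)
```

Because randomization happens at the cluster level, the analysis should treat the cluster as the unit as well (for example with cluster-robust standard errors), which is where the loss of power comes from.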
Medium | System Design
59 practiced
Design an experiment to evaluate a personalized recommendation algorithm. The algorithm tailors ranking per user. Discuss randomization choices: should you randomize model assignment per user, randomize per impression, or randomize ranking perturbations? Address stratification, consistency across sessions, SUTVA concerns, logging needs for unbiased ATE/CATE estimation, and offline simulation considerations.
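
For the logging part of this question, one common approach is to record the assignment probability (propensity) with every exposure so that the average treatment effect can be estimated by inverse propensity weighting. The sketch below is a Horvitz-Thompson style estimator under that assumption; the function name `ips_ate` and the log tuple layout are illustrative.

```python
def ips_ate(logs):
    """Estimate the ATE from (treated, propensity, outcome) records.

    propensity is the logged probability that the unit received treatment.
    """
    total = 0.0
    count = 0
    for treated, propensity, outcome in logs:
        if not 0.0 < propensity < 1.0:
            raise ValueError("propensity must be strictly between 0 and 1")
        if treated:
            total += outcome / propensity          # up-weight treated outcomes
        else:
            total -= outcome / (1.0 - propensity)  # up-weight control outcomes
        count += 1
    if count == 0:
        raise ValueError("logs must contain at least one record")
    return total / count


# Made-up log with a 50/50 split.
example_logs = [(True, 0.5, 3.0), (False, 0.5, 1.0)]
estimate = ips_ate(example_logs)
```

The estimator is only unbiased if the logged propensities are the ones actually used at assignment time, which is why they need to be written at decision time rather than reconstructed later.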
