InterviewStack.io

A/B Test Design Questions

Designing and running A/B tests (split tests) to evaluate product and feature changes. Candidates should be able to form clear null and alternative hypotheses; select appropriate primary and guardrail metrics that reflect both product goals and user safety; choose randomization and assignment strategies; and calculate sample size and test duration using power analysis and minimum-detectable-effect reasoning. They should understand applied statistical analysis concepts including p-values, confidence intervals, one-tailed and two-tailed tests, sequential monitoring and stopping rules, and corrections for multiple comparisons. Practical abilities include diagnosing inconclusive or noisy experiments; detecting and mitigating common biases such as peeking, selection bias, novelty effects, seasonality, instrumentation errors, and network interference; and deciding when experiments are appropriate versus alternative evaluation methods. Senior candidates should reason about trade-offs between speed and statistical rigor, plan safe rollouts and ramping, define rollback plans, and communicate uncertainty and business implications to technical and non-technical stakeholders. For developer-facing products, candidates should also consider constraints such as small populations, cross-team effects, ethical concerns, and special instrumentation needs.

Medium · Technical
Describe how to adjust sample size calculations for a cluster-randomized experiment (e.g., randomizing by household or geographic region) using the intra-class correlation (ICC). Given 1000 clusters, average cluster size 10, ICC 0.02, and desired effective sample size of 2000 users, estimate whether you have sufficient power.
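A worked check of the numbers in this question, using the standard design-effect formula DEFF = 1 + (m − 1) × ICC (a sketch; variable names are illustrative):

```python
# Cluster-randomized power check using the design effect.
# DEFF = 1 + (m - 1) * ICC, where m is the average cluster size.

def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Variance inflation factor caused by within-cluster correlation."""
    return 1 + (avg_cluster_size - 1) * icc

def effective_sample_size(n_clusters: int, avg_cluster_size: float, icc: float) -> float:
    """Raw N divided by the design effect gives the 'independent-user' equivalent."""
    n_raw = n_clusters * avg_cluster_size
    return n_raw / design_effect(avg_cluster_size, icc)

# Numbers from the question: 1000 clusters, average size 10, ICC 0.02.
deff = design_effect(10, 0.02)                      # 1 + 9 * 0.02 = 1.18
n_eff = effective_sample_size(1000, 10, 0.02)       # 10000 / 1.18 ≈ 8475

print(f"design effect: {deff:.2f}")
print(f"effective sample size: {n_eff:.0f}")
# The effective sample size comfortably exceeds the required 2000 users.
```

With these inputs the design effect is modest (1.18), so the 10,000 raw users shrink to roughly 8,475 effective users — well above the 2,000 required.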
Medium · Technical
A product manager wants daily p-value updates and will stop the experiment as soon as p < 0.05. Explain why 'peeking' is problematic and propose a sequential monitoring strategy (frequentist or Bayesian) that allows regular looks while controlling type I error. Outline the operational rules you'd recommend.
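A small Monte Carlo sketch of why peeking is problematic: in an A/A test (no true effect), stopping at the first daily look with p < 0.05 inflates the type I error well above the nominal 5%. The day counts and sample sizes below are illustrative, not from the question:

```python
import math
import random

# A/A simulation: each day adds 100 users per arm; we "peek" daily and stop
# the first time a two-sided z-test shows p < 0.05. With no true effect,
# the stop-at-significance rule should fire ~5% of the time -- it fires far
# more often because each look is another chance at a false positive.

def two_sided_p(z: float) -> float:
    return math.erfc(abs(z) / math.sqrt(2))

def peeking_finds_false_positive(days: int, n_per_day: int, rng: random.Random) -> bool:
    sum_a = sum_b = sumsq_a = sumsq_b = 0.0
    n = 0
    for _ in range(days):
        for _ in range(n_per_day):
            a, b = rng.gauss(0, 1), rng.gauss(0, 1)
            sum_a += a; sumsq_a += a * a
            sum_b += b; sumsq_b += b * b
        n += n_per_day
        var_a = sumsq_a / n - (sum_a / n) ** 2
        var_b = sumsq_b / n - (sum_b / n) ** 2
        z = (sum_a / n - sum_b / n) / math.sqrt(var_a / n + var_b / n)
        if two_sided_p(z) < 0.05:
            return True  # stopped early on a spurious "win"
    return False

rng = random.Random(7)
trials = 2000
fp = sum(peeking_finds_false_positive(14, 100, rng) for _ in range(trials))
print(f"false-positive rate with 14 daily peeks: {fp / trials:.1%}")  # well above 5%
```

Alpha-spending schedules (e.g. O'Brien-Fleming boundaries), always-valid confidence sequences, or Bayesian monitoring with pre-registered decision rules are the usual fixes the question is probing for.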
Hard · System Design
Describe a safe deployment architecture for machine-learned models using feature flags that supports dark launches, shadow mode, canary, and progressive rollout. Specify what telemetry you'd collect at each stage to make reliable decisions.
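One building block of such an architecture is deterministic percentage bucketing, so that raising exposure (canary → progressive rollout) only ever adds users and never reshuffles them. A minimal sketch, assuming a hypothetical flag name and user-ID scheme:

```python
import hashlib

# Deterministic bucketing for a progressive rollout. The flag name
# "new-model" and user IDs are illustrative, not a real API.

def bucket(user_id: str, flag: str, buckets: int = 10_000) -> int:
    """Stable bucket in [0, buckets): the same user always maps to the same
    bucket for a given flag, so ramping up only adds users, never swaps them."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % buckets

def is_enabled(user_id: str, flag: str, rollout_pct: float) -> bool:
    """Enable the flag for the lowest rollout_pct percent of buckets."""
    return bucket(user_id, flag) < rollout_pct * 100  # pct of 10_000 buckets

# Ramp schedule: dark launch (0%) -> canary (1%) -> 10% -> 50% -> 100%.
for pct in (1, 10, 50, 100):
    enabled = sum(is_enabled(f"user-{i}", "new-model", pct) for i in range(10_000))
    print(f"{pct:>3}% rollout -> {enabled} of 10000 sample users enabled")
```

Hashing on `flag:user_id` rather than `user_id` alone decorrelates bucket assignments across flags, so users canaried for one feature are not systematically canaried for all of them.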
Easy · Technical
Explain stratified (blocked) randomization and give three situations where stratification is important for A/B testing. Include how stratification affects variance and sample size calculations.
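A minimal sketch of stratified assignment: users are grouped by a covariate (here a hypothetical "platform" label), shuffled within each stratum, and assigned alternately so the arms stay balanced within every stratum:

```python
import random
from collections import defaultdict

# Stratified (blocked) randomization sketch. The "platform" covariate and
# user IDs are illustrative.

def stratified_assign(users, stratum_of, rng):
    """Return {user: 'treatment' | 'control'}, balanced within each stratum."""
    by_stratum = defaultdict(list)
    for u in users:
        by_stratum[stratum_of(u)].append(u)
    assignment = {}
    for members in by_stratum.values():
        rng.shuffle(members)                      # random order within stratum
        for i, u in enumerate(members):
            assignment[u] = "treatment" if i % 2 == 0 else "control"
    return assignment

rng = random.Random(42)
users = [f"u{i}" for i in range(600)]
platform = {u: ("ios" if i % 3 == 0 else "android") for i, u in enumerate(users)}
assign = stratified_assign(users, platform.get, rng)

for s in ("ios", "android"):
    t = sum(1 for u in users if platform[u] == s and assign[u] == "treatment")
    c = sum(1 for u in users if platform[u] == s and assign[u] == "control")
    print(f"{s}: treatment={t}, control={c}")  # counts differ by at most 1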
Hard · Technical
You run multiple sequential experiments and want to control the false discovery rate (FDR) across the whole sequence of tests. Describe an approach to maintain long-run FDR control when experiments arrive over time and hypotheses are adaptive.
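Online procedures such as alpha-investing or LORD are what this question is after; they generalize the classic batch Benjamini-Hochberg step-up, which is the usual starting point and is sketched below for reference:

```python
# Benjamini-Hochberg step-up: controls FDR at level q for a fixed batch of
# independent (or positively dependent) p-values. Online methods (alpha-
# investing, LORD) extend this idea to hypotheses arriving over time.

def benjamini_hochberg(p_values, q=0.05):
    """Return the set of indices rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose p-value clears its BH threshold q * rank / m
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            k = rank
    return set(order[:k])  # reject the k smallest p-values

pvals = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
print(sorted(benjamini_hochberg(pvals, q=0.05)))  # → [0, 1]
```

Note the step-up structure: p-values above their own threshold can still be rejected if a larger-ranked p-value clears its threshold, which is what distinguishes BH from a simple per-test cutoff.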
