A/B Test Design Questions

This topic covers designing and running A/B tests and split tests to evaluate product and feature changes. Candidates should be able to form clear null and alternative hypotheses; select primary and guardrail metrics that reflect both product goals and user safety; choose randomization and assignment strategies; and calculate sample size and test duration using power analysis and minimum detectable effect (MDE) reasoning. They should understand applied statistical concepts, including p-values, confidence intervals, one-tailed and two-tailed tests, sequential monitoring and stopping rules, and corrections for multiple comparisons. Practical abilities include diagnosing inconclusive or noisy experiments; detecting and mitigating common biases such as peeking, selection bias, novelty effects, seasonality, instrumentation errors, and network interference; and deciding when experiments are appropriate versus alternative evaluation methods. Senior candidates should reason about trade-offs between speed and statistical rigor, plan safe rollouts and ramping, define rollback plans, and communicate uncertainty and business implications to technical and non-technical stakeholders. For developer-facing products, candidates should also consider constraints such as small populations, cross-team effects, ethical concerns, and special instrumentation needs.

Easy / Technical

Explain in plain language the difference between a one-tailed and two-tailed hypothesis test in the context of product experiments. Give two concrete A/B testing examples: one where a one-tailed test is appropriate and one where a two-tailed test must be used, and explain the trade-offs of choosing one over the other.
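
As a rough illustration of the distinction, the sketch below computes both p-values for a pooled two-proportion z-test using only the Python standard library. The function name and the conversion counts are hypothetical, and the one-tailed alternative is assumed to be "variant B converts better than control A".

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_values(conv_a, n_a, conv_b, n_b):
    """Return (z, one_tailed_p, two_tailed_p) for a pooled two-proportion z-test.

    The one-tailed p-value assumes the alternative hypothesis "B > A".
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    norm = NormalDist()
    one_tailed = 1 - norm.cdf(z)             # only an improvement counts as evidence
    two_tailed = 2 * (1 - norm.cdf(abs(z)))  # a change in either direction counts
    return z, one_tailed, two_tailed


# Hypothetical counts: 100/1000 conversions in control, 125/1000 in the variant.
z, p_one, p_two = two_proportion_p_values(100, 1000, 125, 1000)
print(f"z={z:.3f}  one-tailed p={p_one:.4f}  two-tailed p={p_two:.4f}")
```

With these made-up counts, z falls between 1.645 and 1.96, so the same data are significant at alpha = 0.05 under the one-tailed test but not under the two-tailed test. That is the trade-off in miniature: the one-tailed test buys power in one direction by giving up the ability to detect harm in the other.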

Medium / Technical

For a developer-facing SDK change that reduces API latency but could raise error rates, propose a structured set of primary, secondary, and guardrail metrics. Describe how to instrument these metrics in CI, staging, and production and how to detect regressions early.
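
One possible way to turn an error-rate guardrail into an automated check is sketched below. It is an assumption, not a prescribed method: the function name, the tolerated margin, and the counts are hypothetical, and it uses a one-sided normal-approximation confidence bound on the increase in error rate.

```python
from math import sqrt
from statistics import NormalDist


def guardrail_breached(err_control, n_control, err_treat, n_treat,
                       margin, alpha=0.05):
    """Flag a regression when the error-rate increase clearly exceeds `margin`.

    `margin` is the absolute increase in error rate the team is willing to
    tolerate (a hypothetical policy value). The guardrail trips only if the
    one-sided lower confidence bound on (treatment - control) is above it.
    """
    p_c = err_control / n_control
    p_t = err_treat / n_treat
    # Unpooled standard error of the difference in error rates.
    se = sqrt(p_c * (1 - p_c) / n_control + p_t * (1 - p_t) / n_treat)
    z = NormalDist().inv_cdf(1 - alpha)
    lower_bound = (p_t - p_c) - z * se
    return lower_bound > margin


# Hypothetical counts: 0.5% errors in control vs 1.2% with the new SDK build,
# tolerating at most a 0.2 percentage point increase.
print(guardrail_breached(50, 10_000, 120, 10_000, margin=0.002))
```

Requiring the lower bound to exceed the margin keeps false alarms rare, but it reacts slowly at low traffic. A stricter policy could trip on the upper bound instead, trading more false alarms for earlier detection.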

Medium / Technical

Write a Python function to compute required sample size per group for a two-sample proportion test. Inputs: baseline conversion p0, absolute MDE (delta), power (1-beta), alpha, two_sided flag. Use the normal approximation and describe any approximations made.
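
A minimal sketch of one possible answer follows, using only the standard library. It assumes equal group sizes and the unpooled-variance form of the normal approximation, n = (z_alpha + z_beta)^2 * (p0(1-p0) + p1(1-p1)) / delta^2 with p1 = p0 + delta. Other textbook variants use a pooled variance under the null and give slightly different numbers. The function name and the argument validation are illustrative choices.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p0, delta, power=0.8, alpha=0.05, two_sided=True):
    """Approximate sample size per group for a two-sample proportion test.

    p0    : baseline conversion rate
    delta : absolute minimum detectable effect (treatment rate is p0 + delta)
    """
    p1 = p0 + delta
    if not (0 < p0 < 1 and 0 < p1 < 1):
        raise ValueError("p0 and p0 + delta must both lie strictly between 0 and 1")
    if delta == 0:
        raise ValueError("delta must be non-zero")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    norm = NormalDist()
    # A two-sided test splits alpha across both tails.
    z_alpha = norm.inv_cdf(1 - alpha / 2) if two_sided else norm.inv_cdf(1 - alpha)
    z_beta = norm.inv_cdf(power)
    # Unpooled variance: each group contributes p(1 - p).
    variance_sum = p0 * (1 - p0) + p1 * (1 - p1)
    n = (z_alpha + z_beta) ** 2 * variance_sum / delta ** 2
    return ceil(n)


# Hypothetical inputs: 10% baseline, 2 percentage point absolute MDE.
print(sample_size_per_group(0.10, 0.02))
print(sample_size_per_group(0.10, 0.02, two_sided=False))
```

The approximations worth naming in an answer: the binomial is treated as normal (poor for very small rates or small n), no continuity correction is applied, and the result ignores multiple looks, multiple metrics, and unequal allocation.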

Easy / Technical

What is 'peeking' during an online experiment? Describe how peeking inflates false positive rates and name two defensible strategies to allow interim looks without invalidating conclusions. Provide a short example of each strategy.
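
The inflation can be seen in a small A/A simulation. The sketch below is illustrative only: it assumes unit-variance normal outcomes with no true difference, equally spaced looks, and a test that counts as "significant" if any look crosses the threshold. The Bonferroni-style per-look threshold shown is one simple, conservative correction. Group-sequential boundaries (for example O'Brien-Fleming) are a common alternative that this sketch does not implement.

```python
import random
from statistics import NormalDist


def false_positive_rate(n_per_group, looks, z_crit, n_sims=500, seed=0):
    """Share of simulated A/A tests where any interim look has |z| > z_crit."""
    rng = random.Random(seed)
    # Equally spaced checkpoints; the last one is the planned final analysis.
    checkpoints = {n_per_group * (i + 1) // looks for i in range(looks)}
    rejections = 0
    for _ in range(n_sims):
        sum_a = sum_b = 0.0
        rejected = False
        for n in range(1, n_per_group + 1):
            sum_a += rng.gauss(0, 1)  # control outcome, true mean 0
            sum_b += rng.gauss(0, 1)  # treatment outcome, same true mean
            if n in checkpoints:
                # z-statistic for a difference in means with known variance 1.
                z = (sum_b - sum_a) / n / (2 / n) ** 0.5
                if abs(z) > z_crit:
                    rejected = True
        rejections += rejected
    return rejections / n_sims


if __name__ == "__main__":
    alpha, looks = 0.05, 10
    norm = NormalDist()
    z_naive = norm.inv_cdf(1 - alpha / 2)                 # reused at every look
    z_bonferroni = norm.inv_cdf(1 - alpha / (2 * looks))  # alpha split across looks
    print("single final look:", false_positive_rate(200, 1, z_naive))
    print("10 naive peeks   :", false_positive_rate(200, looks, z_naive))
    print("10 looks, alpha/k:", false_positive_rate(200, looks, z_bonferroni))
```

Because every call reuses the same seed and consumes the same random draws, the three rates are computed on identical simulated data, so the gap between them reflects only the number of looks and the threshold.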

Medium / Technical

An A/B test shows a small positive lift on the primary metric that is not statistically significant, while several secondary metrics move in conflicting directions. Provide a structured checklist to diagnose noise and inconclusive results, including data checks, statistical tests, segmentation, and business-context investigations. Propose concrete next steps.
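
When several secondary metrics move in different directions, one item such a checklist might include is a multiple-comparison correction, so that one nominally significant metric out of many is not over-read. The sketch below applies the Holm step-down procedure to a list of p-values. The function name and the example p-values are hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: return a reject/keep flag per hypothesis.

    P-values are visited from smallest to largest. The k-th smallest
    (k = 0, 1, ...) is compared against alpha / (m - k), and testing stops
    at the first one that fails.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # every larger p-value is also not rejected
    return reject


# Hypothetical p-values for four secondary metrics.
print(holm_reject([0.01, 0.04, 0.03, 0.20]))
# → [True, False, False, False]
```

In this made-up example three metrics are below 0.05 individually, but only the first survives the correction, which is a useful signal that the conflicting movements may be noise.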
