Experimentation and Product Validation Questions
Designing and interpreting experiments and validation strategies to test product hypotheses. Includes hypothesis formulation, experimental design, sample sizing considerations, metrics selection, interpreting results and statistical uncertainty, and avoiding common pitfalls such as peeking and multiple hypothesis testing. Also covers qualitative validation methods such as interviews and pilots, and using a mix of methods to validate product ideas before scaling.
Medium · Technical
70 practiced
A PM asks you to 'optimize signups' but the business cares most about long-term retention. Walk through how you would select a primary metric and at least three guardrail metrics for an experiment intended to increase signups. For each proposed metric, justify trade-offs, how you'd measure it, and potential pitfalls.
Hard · Technical
60 practiced
Case study: A freemium product has an onboarding funnel Visit -> Signup -> Activate -> First Purchase with conversion rates Visit->Signup 4%, Signup->Activate 50%, Activate->Purchase 10%. The PM asks to improve purchases. Propose a 3–5 experiment roadmap with hypotheses, primary and guardrail metrics for each experiment, sample sizing considerations, and prioritization rationale.
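As a starting point for the sample sizing part of this case, the funnel arithmetic can be sketched as below. This is a minimal sketch, not a reference answer: the 10% relative lift, the 5% significance level, the 80% power, and the helper name `sample_size_per_arm` are all assumptions chosen for illustration, and the formula is the usual normal approximation for comparing two proportions.

```python
from math import ceil
from statistics import NormalDist

# Funnel step rates taken from the case study.
VISIT_TO_SIGNUP = 0.04
SIGNUP_TO_ACTIVATE = 0.50
ACTIVATE_TO_PURCHASE = 0.10


def sample_size_per_arm(p_base, rel_lift, alpha=0.05, power=0.80):
    """Approximate units per arm for a two-sided two-proportion z-test.

    p_base is the baseline conversion rate and rel_lift the relative
    lift to detect (hypothetical helper, normal approximation).
    """
    p_new = p_base * (1 + rel_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_new * (1 - p_new)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_new - p_base) ** 2)


# End-to-end rate: the product of the three step rates.
visit_to_purchase = VISIT_TO_SIGNUP * SIGNUP_TO_ACTIVATE * ACTIVATE_TO_PURCHASE
print(f"Visit -> Purchase: {visit_to_purchase:.2%}")  # prints 0.20%

# Assumed target: a 10% relative lift in the metric being tested.
n_visit = sample_size_per_arm(visit_to_purchase, 0.10)        # visitors per arm
n_activated = sample_size_per_arm(ACTIVATE_TO_PURCHASE, 0.10)  # activated users per arm
print(f"Visitors per arm (Visit -> Purchase metric): {n_visit:,}")
print(f"Activated users per arm (Activate -> Purchase metric): {n_activated:,}")
```

The two counts are in different units: the first counts visitors, the second counts users who have already activated, and only about 2% of visitors reach that stage. Comparing them is one input to prioritization, alongside how long each experiment would need to run at current traffic.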
Hard · Technical
66 practiced
Given a daily aggregated table daily_results(date, bucket, users, conversions) for an experiment, write a Postgres SQL query that computes cumulative conversion rates per day for control and treatment, the cumulative absolute lift, the z-test p-value for the cumulative difference per day, and flags the first date where p-value < 0.05. Note: this replicates naive sequential peeking; explain its pitfalls in the comments.
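The question asks for SQL, but the arithmetic the query has to reproduce can be sketched in Python first. The sketch below assumes that `bucket` takes the values `'control'` and `'treatment'`, that dates sort chronologically (ISO strings or `date` objects), and that a pooled two-proportion z-test is the intended test; the function name `cumulative_peeking` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def cumulative_peeking(rows, alpha=0.05):
    """Cumulative rates, lift, and z-test p-value per day.

    rows: iterable of (date, bucket, users, conversions) tuples, with
    bucket assumed to be 'control' or 'treatment'.
    Returns (per_day_results, first_date_with_p_below_alpha).

    Pitfall: this checks significance once per day. Each extra look is
    another chance for a false positive, so stopping at the first
    p < alpha inflates the false-positive rate above alpha.
    """
    by_date = {}
    for date, bucket, users, conversions in rows:
        by_date.setdefault(date, {})[bucket] = (users, conversions)

    totals = {"control": [0, 0], "treatment": [0, 0]}  # [users, conversions]
    results, first_sig = [], None
    for date in sorted(by_date):
        for bucket, (users, conversions) in by_date[date].items():
            totals[bucket][0] += users
            totals[bucket][1] += conversions
        n_c, x_c = totals["control"]
        n_t, x_t = totals["treatment"]
        if n_c == 0 or n_t == 0:
            continue  # cannot compare until both arms have users
        p_c, p_t = x_c / n_c, x_t / n_t
        pooled = (x_c + x_t) / (n_c + n_t)
        se = sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
        if se > 0:
            p_value = 2 * (1 - NormalDist().cdf(abs(p_t - p_c) / se))
        else:
            p_value = 1.0  # no variation observed yet
        if first_sig is None and p_value < alpha:
            first_sig = date
        results.append({
            "date": date,
            "rate_control": p_c,
            "rate_treatment": p_t,
            "abs_lift": p_t - p_c,
            "p_value": p_value,
            "first_significant": first_sig == date,
        })
    return results, first_sig
```

In Postgres, the running totals would typically come from `SUM(...) OVER (PARTITION BY bucket ORDER BY date)`, with the normal CDF approximated or supplied by an extension, since core Postgres does not ship one.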
Hard · Technical
64 practiced
Explain uplift modeling (incremental response modeling). How does it differ from standard supervised learning that predicts conversion? Describe what data (randomized experiment data) you need to train uplift models and how a BI analyst could use uplift outputs to prioritize users for treatment.
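To make the distinction concrete, the simplest stand-in for an uplift model is a per-segment difference in conversion rates between treated and control users in a randomized experiment. The sketch below is not a trained model; the record layout `(segment, treated, converted)` and the function name `segment_uplift` are assumptions for illustration.

```python
from collections import defaultdict


def segment_uplift(records):
    """Rank segments by estimated uplift from randomized experiment data.

    records: iterable of (segment, treated, converted) tuples.
    Uplift for a segment = conversion rate of treated users minus
    conversion rate of control users in that segment.
    Returns a list of (segment, uplift) sorted from highest to lowest.
    """
    # counts[segment][treated] = [users, conversions]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, treated, converted in records:
        cell = counts[segment][bool(treated)]
        cell[0] += 1
        cell[1] += int(converted)

    uplift = {}
    for segment, arms in counts.items():
        (n_t, x_t), (n_c, x_c) = arms[True], arms[False]
        if n_t == 0 or n_c == 0:
            continue  # need both arms to estimate an incremental effect
        uplift[segment] = x_t / n_t - x_c / n_c
    return sorted(uplift.items(), key=lambda kv: kv[1], reverse=True)


# Made-up data: "loyal" users convert often with or without treatment,
# while "new" users convert more only when treated.
records = (
    [("new", True, i < 5) for i in range(10)]
    + [("new", False, i < 2) for i in range(10)]
    + [("loyal", True, i < 8) for i in range(10)]
    + [("loyal", False, i < 8) for i in range(10)]
)
ranked = segment_uplift(records)
print(ranked)
```

A conversion-prediction model would score the "loyal" segment highest because it converts most often, whereas the uplift ranking puts "new" first because that is where treatment changes behavior. Real segments are small and noisy, so an analyst would also want interval estimates before prioritizing on these numbers.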
Medium · Technical
60 practiced
Write a Python function that computes a bootstrap 95% confidence interval for mean revenue per user given a list or iterator of aggregated user revenues. The function should allow specifying number of bootstrap samples and be memory efficient by using user-level aggregation as input. Include brief comments on complexity and parallelization options.
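One possible shape for such a function, using only the standard library, is sketched below. It assumes the percentile bootstrap method with a simple nearest-rank index, and the name `bootstrap_mean_ci` and its parameters are illustrative choices rather than a prescribed interface.

```python
import random
from statistics import fmean


def bootstrap_mean_ci(revenues, n_boot=2000, confidence=0.95, seed=None):
    """Percentile-bootstrap confidence interval for mean revenue per user.

    revenues: list or iterator with one aggregated revenue value per user.
    Returns (lower, upper).

    Complexity: O(n_boot * n) time. Memory is O(n) for the user-level
    values, O(n) for one temporary resample, and O(n_boot) for the means,
    which is far smaller than storing raw event-level rows.
    Parallelization: replicates are independent, so they can be split
    across workers (each with its own seed) and the means concatenated.
    """
    data = list(revenues)  # materialize once: one number per user
    if not data:
        raise ValueError("revenues must contain at least one user")
    if n_boot < 1:
        raise ValueError("n_boot must be positive")

    rng = random.Random(seed)
    n = len(data)
    # Each replicate resamples n users with replacement and keeps the mean.
    means = sorted(fmean(rng.choices(data, k=n)) for _ in range(n_boot))

    tail = (1 - confidence) / 2
    lower = means[int(tail * (n_boot - 1))]
    upper = means[int((1 - tail) * (n_boot - 1))]
    return lower, upper


# Made-up, skewed revenue data: most users pay nothing.
sample = [0.0] * 80 + [10.0] * 15 + [100.0] * 5
low, high = bootstrap_mean_ci(sample, n_boot=2000, seed=42)
print(f"mean={fmean(sample):.2f}, 95% CI=({low:.2f}, {high:.2f})")
```

For large user counts, a vectorized resampler such as NumPy's would usually be faster than `random.choices`; the standard library is used here only to keep the sketch dependency-free.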