InterviewStack.io

Applying Data Science Techniques to Business Problems Questions

Recognizing when A/B testing is appropriate vs observational analysis. Suggesting SQL queries or analysis approaches that would answer the business question. Understanding when you'd need advanced modeling vs simpler analysis. Connecting technical approaches to business decisions (e.g., 'This cohort analysis would tell us whether the decline is from existing users or new users').

Medium · Technical
73 practiced
Given these tables:
orders(order_id bigint, user_id bigint, order_date date, revenue numeric)
users(user_id bigint, signup_date date)
Write a PostgreSQL query that produces cohort_monthly_ltv with columns: cohort_month (date), month_number (int; 0 = signup month), users_in_cohort, month_revenue, cumulative_revenue, avg_ltv_per_user (cumulative) for the first 12 months after signup. Explain assumptions and performance tuning tips for large datasets.
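One way to sanity-check a SQL answer to this question is against a small in-memory reference. The sketch below is not the PostgreSQL query itself; it is a plain-Python illustration of the expected output shape under stated assumptions: `month_number` is the calendar-month difference between order and signup, "first 12 months" means month numbers 0 through 11, months with no orders are emitted with zero revenue, and orders from users missing in `users` are ignored. The function names `month_index` and `cohort_monthly_ltv` are hypothetical.

```python
from collections import defaultdict
from datetime import date


def month_index(d):
    """Map a date to a running month counter so month differences are simple subtraction."""
    return d.year * 12 + d.month - 1


def cohort_monthly_ltv(users, orders, horizon=12):
    """users: list of (user_id, signup_date); orders: list of (user_id, order_date, revenue).

    Returns rows of (cohort_month, month_number, users_in_cohort,
    month_revenue, cumulative_revenue, avg_ltv_per_user).
    """
    signup = dict(users)

    # Cohort size = number of users whose signup falls in that calendar month.
    cohort_size = defaultdict(int)
    for _, s in users:
        cohort_size[date(s.year, s.month, 1)] += 1

    # Revenue per (cohort_month, month_number), restricted to the horizon.
    revenue = defaultdict(float)
    for user_id, order_date, amount in orders:
        s = signup.get(user_id)
        if s is None:
            continue  # assumption: orders without a matching user are dropped
        n = month_index(order_date) - month_index(s)
        if 0 <= n < horizon:
            revenue[(date(s.year, s.month, 1), n)] += amount

    rows = []
    for cohort in sorted(cohort_size):
        cumulative = 0.0
        size = cohort_size[cohort]
        for n in range(horizon):
            month_rev = revenue.get((cohort, n), 0.0)
            cumulative += month_rev
            rows.append((cohort, n, size, month_rev, cumulative, cumulative / size))
    return rows
```

In SQL, the same shape is typically reached by joining each cohort to a generated series of month numbers before aggregating, so that empty months still appear and the cumulative window sum has no gaps.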
Hard · System Design
85 practiced
Design an experimentation platform for a consumer product that must support: (1) randomized experiments across web and mobile where users are identified by email and device IDs, (2) overlapping experiments (factorial and sequential), (3) exposure and event tracking at 1M daily active users, and (4) safe rollouts with ramping and kill-switch. Outline architecture, assignment mechanism, data collection, metrics pipeline, and how you would ensure consistent randomization across platforms while preserving privacy.
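The assignment mechanism in a design like this is often a deterministic hash, so that any client or server computes the same variant without a lookup. The following is a minimal sketch under assumptions not stated in the question: `user_key` is a canonical ID already resolved from email and device IDs by some identity service, the key is hashed with a keyed HMAC so raw identifiers never appear in assignment logs, and the function name `assign_variant`, its parameters, and the placeholder secret are all hypothetical.

```python
import hashlib
import hmac


def assign_variant(user_key, experiment_id, variants=("control", "treatment"),
                   ramp_pct=100, secret=b"hypothetical-secret"):
    """Deterministically map (experiment, user) to a variant, or None if not ramped in.

    Salting with experiment_id decorrelates assignments across overlapping
    experiments; the HMAC key keeps buckets unguessable from the raw ID.
    """
    message = f"{experiment_id}:{user_key}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).digest()

    # First 4 bytes decide exposure (0-99); a user exposed at a low ramp
    # percentage stays exposed as the percentage increases.
    exposure_bucket = int.from_bytes(digest[:4], "big") % 100
    if exposure_bucket >= ramp_pct:
        return None

    # Next 4 bytes decide the variant, independently of the exposure bucket.
    variant_bucket = int.from_bytes(digest[4:8], "big") % len(variants)
    return variants[variant_bucket]
```

Setting `ramp_pct=0` acts as a simple kill-switch in this sketch, since no user passes the exposure check; a production system would also need to log exposures and handle identity merges, which are out of scope here.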
Hard · Technical
75 practiced
Product managers want to peek at experiment results daily. Explain statistical consequences of repeatedly peeking at p-values and propose safe approaches: alpha-spending boundaries (e.g., O'Brien-Fleming), sequential probability ratio test (SPRT), and Bayesian sequential testing. Compare pros/cons and operational considerations (pre-specification, stopping rules, communication).
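The inflation caused by peeking can be illustrated with a small A/A simulation. This sketch assumes unit-variance normal outcomes, a known-variance two-sided z-test, and equally sized daily batches; `false_positive_rate` and its parameters are hypothetical names. It compares stopping at the first look where |z| exceeds 1.96 against testing once at the end, with no true effect in either case.

```python
import math
import random


def false_positive_rate(n_experiments=2000, looks=20, batch=100,
                        z_crit=1.96, peek=True, seed=0):
    """Share of A/A experiments declared significant.

    peek=True: stop at the first look where |z| > z_crit.
    peek=False: test only once, at the final look.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_experiments):
        diff_sum = 0.0  # running sum of (treatment - control) differences
        n = 0           # observations per arm so far
        rejected = False
        for look in range(1, looks + 1):
            # The sum of `batch` differences of two unit-variance normals
            # has variance 2 * batch, so draw it directly.
            diff_sum += rng.gauss(0.0, math.sqrt(2 * batch))
            n += batch
            z = diff_sum / math.sqrt(2 * n)
            if (peek or look == looks) and abs(z) > z_crit:
                rejected = True
                break
        rejections += rejected
    return rejections / n_experiments


if __name__ == "__main__":
    print("single final look:", false_positive_rate(peek=False))
    print("stop at first significant peek:", false_positive_rate(peek=True))
```

With a fixed threshold at every look, the peeking rate comes out well above the nominal 5%, which is the problem that alpha-spending boundaries and sequential tests are designed to control by making early thresholds stricter.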
Hard · Technical
66 practiced
Given a large events table:
events(user_id bigint, event_time timestamptz, touchpoint text, revenue numeric)
Design a Postgres SQL approach to compute three attribution views over a 30-day conversion window per user: first-touch, last-touch, and exponential time-decay attribution (decay half-life = 7 days). The output should aggregate credited_revenue per touchpoint/campaign. Include SQL snippets and describe assumptions, normalization of weights per conversion, and efficiency considerations for scaling.
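The weight normalization for the time-decay view can be checked on a single conversion before writing the SQL. The sketch below is a Python illustration, not the Postgres answer, and assumes a touch's weight is 0.5 raised to (days before conversion / half-life), that only touches inside the 30-day window count, and that weights are normalized to sum to 1 per conversion so credited revenue equals actual revenue. The name `time_decay_credit` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def time_decay_credit(touches, conversion_time, revenue,
                      half_life_days=7.0, window_days=30):
    """touches: list of (touchpoint, event_time) for one user and one conversion.

    Returns {touchpoint: credited_revenue}; empty if no touch is in the window.
    """
    window_start = conversion_time - timedelta(days=window_days)

    weights = []
    for touchpoint, t in touches:
        if window_start <= t <= conversion_time:
            age_days = (conversion_time - t).total_seconds() / 86400
            # A touch exactly one half-life old gets half the weight of a
            # touch at conversion time.
            weights.append((touchpoint, 0.5 ** (age_days / half_life_days)))

    total = sum(w for _, w in weights)
    credit = defaultdict(float)
    for touchpoint, w in weights:
        credit[touchpoint] += revenue * w / total  # normalize per conversion
    return dict(credit)
```

For example, a conversion worth 90 with one touch at conversion time and one touch seven days earlier splits 60 to 30 in favor of the more recent touch. First-touch and last-touch are the degenerate cases that give all credit to the earliest or latest in-window touch.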
Hard · Technical
72 practiced
Several product changes were released within a short window, complicating attribution of a retention uplift. Outline a rigorous analytical approach to disentangle effects using methods such as staggered DiD, synthetic control, hierarchical time-series models, and time-series decomposition. For each method, describe required data, assumptions, and how you'd validate the estimates.
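The methods named here build on the basic two-group, two-period difference-in-differences comparison. As a minimal sketch of that building block only (not of staggered DiD or synthetic control), the hypothetical function `did_estimate` below assumes each row records whether the unit was treated, whether the observation is post-release, and a retention outcome, and that the parallel-trends assumption holds.

```python
from statistics import mean


def did_estimate(rows):
    """rows: iterable of (treated: bool, post: bool, outcome: float).

    Returns (treated post - treated pre) - (control post - control pre),
    using cell means. Raises KeyError if any of the four cells is empty.
    """
    cells = {}
    for treated, post, y in rows:
        cells.setdefault((treated, post), []).append(y)
    m = {key: mean(values) for key, values in cells.items()}
    treated_change = m[(True, True)] - m[(True, False)]
    control_change = m[(False, True)] - m[(False, False)]
    return treated_change - control_change
```

With several releases in a short window, this estimate would be computed per release using units not yet exposed as controls, which is where staggered designs and their additional assumptions come in.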
