InterviewStack.io LogoInterviewStack.io

Data Problem Solving and Business Context Questions

Practical data oriented problem solving that connects business questions to correct, robust analyses. Includes translating business questions into queries and metric definitions, designing SQL or query logic for edge cases, handling data quality issues such as nulls duplicates and inconsistent dates, validating assumptions, and producing metrics like retention and churn. Emphasizes building queries and pipelines that are resilient to real world data issues, thinking through measurement definitions, and linking data findings to business implications and possible next steps.

MediumTechnical
28 practiced
You receive frequent alert flapping and high storage costs due to metrics with very high tag cardinality (user_id, request_id, session_id). As the SRE, describe a plan to diagnose which labels cause worst overhead, and propose concrete fixes (relabeling, aggregation, reducing cardinality, cardinality limits), with steps to implement changes safely and measure impact.
HardTechnical
26 practiced
A new feature rollout coincides with a 15% drop in 7-day retention. As the SRE collaborating with product, outline a concrete investigative plan to determine whether the rollout caused the drop. Include data checks, segmentation strategies, A/B or causal analysis approaches, statistical tests, and minimum data you need to make a confident decision and possible mitigations.
HardTechnical
24 practiced
COUNT(DISTINCT user_id) is slow on your large dataset. Explain approximate distinct counting techniques appropriate for SRE product metrics: HyperLogLog, LogLog, and sample-and-extrapolate. For each method describe accuracy, memory footprint, mergeability, and how you'd validate the approximation against exact counts in production.
MediumTechnical
23 practiced
Design an alerting policy for team-owned services to minimize false positives while ensuring reliability. Describe: what to alert on (SLO burn, pageable vs non-pageable), threshold selection, alert grouping and deduplication, escalation policy, and how to measure alert health (MTTA, false-positive rate).
MediumTechnical
28 practiced
Write SQL to compute rolling retention: for each cohort day 0 (signup date) produce, for days 0..30, the fraction of cohort users who were active on each day N. Output should be a long form table (cohort_date, day_offset, retained_users, cohort_size, retention_rate). Assume users(user_id, signup_time) and events(user_id, event_time).

Unlock Full Question Bank

Get access to hundreds of Data Problem Solving and Business Context interview questions and detailed answers.

Sign in to Continue

Join thousands of developers preparing for their dream job.