InterviewStack.io LogoInterviewStack.io

Data Analysis and Requirements Translation Questions

Focuses on translating ambiguous business questions into concrete data analysis plans. Candidates should identify the data points required, define metrics and key performance indicators, state assumptions to validate, design the analysis steps and queries, and explain how analysis results map back to business decisions. This includes data quality considerations, required instrumentation, and how analytical findings influence product requirements or architectural choices.

EasyTechnical
0 practiced
A PM says 'measure retention' with no further detail. Propose a precise retention metric: choose cohort start (first session, signup), retention windows (day 1, 7, 30), what counts as retained (any event vs specific activation event), how to handle multi-device users and guest accounts, and what assumptions must be documented for the analysis.
MediumTechnical
0 practiced
Tables:
orders(order_id bigint, user_id bigint, amount decimal, currency varchar, created_at timestamp)refunds(refund_id bigint, order_id bigint, refunded_amount decimal, refunded_at timestamp)
Write ANSI SQL to compute daily Average Order Value (AOV) excluding refunded orders. Explain assumptions about partial refunds (do you exclude entire orders or adjust amounts), currency normalization, and how you attribute refunds that occur after the original order date.
MediumTechnical
0 practiced
Explain strategies to handle late-arriving or out-of-order event data in analytics systems. Discuss watermarking and allowed lateness in stream processing, strategies for incremental re-aggregation, ways to show provisional versus final metrics to stakeholders, and how to design backfill procedures when late data arrives after daily reports are published.
MediumTechnical
0 practiced
Describe privacy and compliance trade-offs when instrumenting events that may contain PII (emails, phone numbers, IPs). Explain approaches for minimization, hashing vs tokenization (and their reversibility), access controls, data retention policies, and how to support user deletion requests (GDPR/CCPA) without breaking analytics.
HardTechnical
0 practiced
Analyze trade-offs between deriving sessions via heuristics (e.g., 30-minute inactivity) versus relying on server-issued session IDs for sessionization in analytics. Consider correctness, instrumentation complexity, multi-device behavior, bot traffic, attribution of session metrics (session length), and implications for historical comparability when switching methods.

Unlock Full Question Bank

Get access to hundreds of Data Analysis and Requirements Translation interview questions and detailed answers.

Sign in to Continue

Join thousands of developers preparing for their dream job.