Data Problem Solving and Business Context Questions
Practical, data-oriented problem solving that connects business questions to correct, robust analyses. Includes translating business questions into queries and metric definitions, designing SQL or query logic for edge cases, handling data quality issues such as nulls, duplicates, and inconsistent dates, validating assumptions, and producing metrics like retention and churn. Emphasizes building queries and pipelines that are resilient to real-world data issues, thinking through measurement definitions, and linking data findings to business implications and possible next steps.
Medium | Technical
A production metric (request_count) shows 1,000,000 requests yesterday, but the corresponding database table that records user actions has only 900,000 new rows. As an SRE, describe a prioritized investigation plan: which SQL checks and production queries you would run, which instrumentation and tests you would add, and the likely root causes you would consider (at least five).
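One possible starting point for the SQL checks is a reconciliation pass between the metric source and the table. The sketch below uses an in-memory SQLite database; the tables `request_log` and `user_actions`, the shared `request_id` key, and the sample rows are all assumptions made for illustration, not part of the question.

```python
import sqlite3

# Hypothetical schema: both sources are assumed to share a request_id key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE request_log (request_id TEXT, ts TEXT);
CREATE TABLE user_actions (request_id TEXT, created_at TEXT);
INSERT INTO request_log VALUES
  ('r1', '2024-01-01 10:00:00'),
  ('r1', '2024-01-01 10:00:01'),  -- client retry, counted twice by the metric
  ('r2', '2024-01-01 10:05:00'),
  ('r3', '2024-01-01 11:00:00'),  -- e.g. a health check that never writes a row
  ('r4', '2024-01-01 11:10:00');
INSERT INTO user_actions VALUES
  ('r1', '2024-01-01 10:00:00'),
  ('r2', '2024-01-01 10:05:00'),
  ('r4', '2024-01-01 11:10:00');
""")


def reconcile(conn):
    """Compare the two sources and localize the gap by hour."""
    one = lambda sql: conn.execute(sql).fetchone()[0]
    metric_count = one("SELECT COUNT(*) FROM request_log")
    distinct_requests = one("SELECT COUNT(DISTINCT request_id) FROM request_log")
    table_rows = one("SELECT COUNT(*) FROM user_actions")
    # Anti-join: requests seen by the metric that never produced a row.
    missing_by_hour = conn.execute("""
        SELECT strftime('%H', r.ts) AS hour, COUNT(DISTINCT r.request_id)
        FROM request_log r
        LEFT JOIN user_actions a ON a.request_id = r.request_id
        WHERE a.request_id IS NULL
        GROUP BY hour
        ORDER BY hour
    """).fetchall()
    return {
        "metric_count": metric_count,
        "table_rows": table_rows,
        "retries_double_counted": metric_count - distinct_requests,
        "missing_requests": sum(n for _, n in missing_by_hour),
        "missing_by_hour": missing_by_hour,
    }


report = reconcile(conn)
print(report)
```

Splitting the gap into double-counted retries and requests that never wrote a row helps separate a metric definition problem from a write-path problem, and the hourly breakdown shows whether the loss is constant or clustered around an incident.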
Medium | Technical
Design a metric and SQL-based definition called 'failed_deploy_impact' that quantifies user-facing error rate attributable to a specific deployment. Requirements: it should compare error rate in the 60 minutes before vs the 60 minutes after a deploy, attribute only errors likely caused by the deploy, and tolerate noisy background traffic. Describe the events/metrics needed, the SQL logic (tables: requests, deploys), and edge cases such as rollbacks and partial-traffic rollouts.
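A minimal sketch of one way the before/after comparison could be written, run here against in-memory SQLite. The column names (`ts`, `status`, `version`, `deployed_at`), the definition of an error as `status >= 500`, and the `min_samples` guard are assumptions for illustration. Errors in the post-deploy window are attributed only when the request was served by the deployed version, which is one way to handle partial-traffic rollouts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE deploys (deploy_id TEXT, deployed_at TEXT, version TEXT);
CREATE TABLE requests (ts TEXT, status INTEGER, version TEXT);
INSERT INTO deploys VALUES ('d1', '2024-01-01 12:00:00', 'v2');
INSERT INTO requests VALUES
  ('2024-01-01 10:30:00', 500, 'v1'),  -- outside the 60-minute window
  ('2024-01-01 11:10:00', 200, 'v1'),
  ('2024-01-01 11:20:00', 200, 'v1'),
  ('2024-01-01 11:30:00', 200, 'v1'),
  ('2024-01-01 11:40:00', 500, 'v1'),
  ('2024-01-01 12:05:00', 200, 'v2'),
  ('2024-01-01 12:10:00', 500, 'v2'),
  ('2024-01-01 12:15:00', 500, 'v2'),
  ('2024-01-01 12:20:00', 200, 'v2'),
  ('2024-01-01 12:25:00', 500, 'v1');  -- old version still serving; not attributed
""")

IMPACT_SQL = """
SELECT d.deploy_id,
       COUNT(CASE WHEN r.ts < d.deployed_at THEN 1 END)              AS before_n,
       AVG(CASE WHEN r.ts < d.deployed_at THEN r.status >= 500 END)  AS before_rate,
       COUNT(CASE WHEN r.ts >= d.deployed_at
                   AND r.version = d.version THEN 1 END)             AS after_n,
       AVG(CASE WHEN r.ts >= d.deployed_at
                 AND r.version = d.version THEN r.status >= 500 END) AS after_rate
FROM deploys d
JOIN requests r
  ON r.ts >= datetime(d.deployed_at, '-60 minutes')
 AND r.ts <  datetime(d.deployed_at, '+60 minutes')
GROUP BY d.deploy_id
"""


def failed_deploy_impact(conn, min_samples=100):
    """Return {deploy_id: after_rate - before_rate}, or None when a window is too thin."""
    impact = {}
    for deploy_id, before_n, before_rate, after_n, after_rate in conn.execute(IMPACT_SQL):
        if before_n < min_samples or after_n < min_samples:
            impact[deploy_id] = None  # too little traffic to separate signal from noise
        else:
            impact[deploy_id] = after_rate - before_rate
    return impact


impact = failed_deploy_impact(conn, min_samples=4)
print(impact)
```

This sketch does not handle rollbacks. One option would be to end the post-deploy window at the rollback time when it falls inside the 60 minutes, which would need a rollback timestamp that the question's tables do not specify.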
Hard | Technical
Design a cross-team dashboard and a data contract to track SLOs and error budgets across multiple services, with self-service access for teams. Include: required fields in the contract, versioning strategy, data ownership and how it is enforced, validation checks, and UI considerations so teams can understand their budgets and historical burn. Explain how to handle inconsistent SLO definitions between teams.
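As an illustration of what the validation checks might look like, the sketch below validates a contract record and computes the remaining error budget. The field names in `REQUIRED_FIELDS` and the event-based budget formula are assumptions chosen for this example; a real contract would be agreed between the teams.

```python
# Hypothetical contract fields and their expected types.
REQUIRED_FIELDS = {
    "contract_version": str,
    "service": str,
    "owner_team": str,
    "sli_name": str,
    "slo_target": float,   # e.g. 0.99 for "99% of events are good"
    "window_days": int,
}


def validate_contract(record):
    """Return a list of human-readable problems; an empty list means the record passes."""
    errors = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            errors.append(f"wrong type for {name}")
    target = record.get("slo_target")
    if isinstance(target, float) and not (0.0 < target < 1.0):
        errors.append("slo_target must be strictly between 0 and 1")
    return errors


def budget_remaining(slo_target, good_events, total_events):
    """Fraction of the error budget left in the window (negative when overspent)."""
    if total_events == 0:
        return 1.0  # no traffic, so no budget has been spent
    allowed_bad = (1.0 - slo_target) * total_events
    actual_bad = total_events - good_events
    return 1.0 - actual_bad / allowed_bad


good_record = {
    "contract_version": "1.0.0",
    "service": "checkout",
    "owner_team": "payments",
    "sli_name": "availability",
    "slo_target": 0.99,
    "window_days": 28,
}
bad_record = {k: v for k, v in good_record.items() if k != "owner_team"}

print(validate_contract(good_record))
print(validate_contract(bad_record))
print(budget_remaining(0.99, good_events=9_975, total_events=10_000))
```

Rejecting records without an `owner_team` at ingestion time is one simple way to enforce ownership, and normalizing every SLO to a good-events-over-total-events ratio is one way to make differently defined SLOs comparable on a shared dashboard.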
Medium | Technical
You receive a high-cardinality event stream with occasional duplicate event_id values due to at-least-once delivery. Input table: raw_events(event_id STRING, event_time TIMESTAMP, ingestion_time TIMESTAMP, payload JSON). Design an approach (SQL or streaming framework pseudocode) to deduplicate events and compute unique request counts per minute, while tolerating late-arriving events up to 2 hours. Describe state handling, watermarking, and backfill behavior.
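The interaction between deduplication state and the watermark can be sketched in plain Python. This is a single-process illustration under stated assumptions, not a streaming framework: the class `DedupMinuteCounter` is hypothetical, the watermark is simply the maximum event time seen minus the allowed lateness, and events behind the watermark are set aside for a separate batch backfill.

```python
from collections import defaultdict
from datetime import datetime, timedelta

ALLOWED_LATENESS = timedelta(hours=2)


class DedupMinuteCounter:
    """Counts unique event_ids per event-time minute with bounded dedup state."""

    def __init__(self):
        self.seen = {}                   # event_id -> event_time, kept until behind the watermark
        self.counts = defaultdict(int)   # minute (datetime) -> unique event count
        self.max_event_time = None
        self.too_late = []               # events to reprocess in a batch backfill

    def process(self, event_id, event_time):
        if self.max_event_time is None or event_time > self.max_event_time:
            self.max_event_time = event_time
        watermark = self.max_event_time - ALLOWED_LATENESS
        if event_time < watermark:
            # Minute is already finalized; never counted here, so no double counting.
            self.too_late.append((event_id, event_time))
            return
        if event_id in self.seen:
            return  # duplicate from at-least-once delivery
        self.seen[event_id] = event_time
        self.counts[event_time.replace(second=0, microsecond=0)] += 1
        # Evict state that can no longer receive in-window duplicates.
        # A full scan keeps the sketch short; a real job would use timers or a time-ordered index.
        for eid in [e for e, t in self.seen.items() if t < watermark]:
            del self.seen[eid]


counter = DedupMinuteCounter()
day = datetime(2024, 1, 1)
stream = [
    ("a", day.replace(hour=10, minute=0, second=5)),
    ("b", day.replace(hour=10, minute=0, second=30)),
    ("a", day.replace(hour=10, minute=0, second=5)),    # duplicate delivery
    ("c", day.replace(hour=12, minute=30)),             # advances the watermark to 10:30
    ("d", day.replace(hour=11, minute=15, second=10)),  # late, but within 2 hours
    ("e", day.replace(hour=10, minute=10)),             # behind the watermark
    ("a", day.replace(hour=10, minute=0, second=5)),    # duplicate after its state was evicted
]
for event_id, event_time in stream:
    counter.process(event_id, event_time)

print(dict(counter.counts))
print(counter.too_late)
```

Because an event is evicted only once its event time is behind the watermark, any later duplicate of it is also behind the watermark and goes to the backfill path rather than being counted again. The backfill job would need to deduplicate against already-published counts.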
Medium | Technical
Design an alerting policy for team-owned services to minimize false positives while ensuring reliability. Describe: what to alert on (SLO burn, pageable vs non-pageable), threshold selection, alert grouping and deduplication, escalation policy, and how to measure alert health (MTTA, false-positive rate).
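The two alert health measures named in the question can be computed from a simple alert log. The sketch below assumes a hypothetical `Alert` record with `fired_at`, `acked_at`, and an `actionable` flag set during review; here MTTA is taken as the mean time from firing to acknowledgement over acknowledged alerts, and the false-positive rate as the share of alerts judged non-actionable.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class Alert:
    fired_at: datetime
    acked_at: Optional[datetime]  # None if nobody acknowledged the alert
    actionable: bool              # set during post-incident or weekly alert review


def alert_health(alerts: List[Alert]) -> Tuple[Optional[timedelta], Optional[float]]:
    """Return (MTTA over acknowledged alerts, false-positive rate over all alerts)."""
    acked = [a for a in alerts if a.acked_at is not None]
    mtta = None
    if acked:
        total = sum((a.acked_at - a.fired_at for a in acked), timedelta())
        mtta = total / len(acked)
    fp_rate = None
    if alerts:
        fp_rate = sum(1 for a in alerts if not a.actionable) / len(alerts)
    return mtta, fp_rate


t0 = datetime(2024, 1, 1, 9, 0)
alerts = [
    Alert(t0, t0 + timedelta(minutes=2), actionable=True),
    Alert(t0, t0 + timedelta(minutes=4), actionable=False),
    Alert(t0, t0 + timedelta(minutes=6), actionable=True),
    Alert(t0, None, actionable=False),  # never acknowledged
]

mtta, fp_rate = alert_health(alerts)
print(mtta, fp_rate)
```

Unacknowledged alerts are left out of MTTA in this sketch, so it may be worth tracking their count separately; otherwise ignored pages would make MTTA look better than it is.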