Using metrics and analytics to inform operational and strategic decisions. Topics include defining and interpreting operational measures such as throughput cycle time error rates resource utilization cost per unit quality measures and on time delivery, as well as growth and lifecycle metrics across acquisition activation retention and revenue. Emphasis is on building audience segmented dashboards and reports presenting insights to influence stakeholders diagnosing problems through variance analysis and performance analytics identifying bottlenecks measuring campaign effectiveness and guiding resource allocation and investment decisions. Also covers how metric expectations change with seniority and how to shape organizational metric strategy and scorecards to drive accountability.
MediumTechnical
51 practiced
Given table events(user_id, event_name, event_time, model_version, country) write an ANSI SQL query that returns 7-day retention for users whose first 'activate' occurred in a given ISO week, grouped by model_version and country. Include logic to identify first activation per user and to convert event_time into the user's local day using a country-to-timezone mapping table.
HardTechnical
36 practiced
Formulate a cost-benefit model to decide whether to upgrade to a larger pre-trained language model for customer support generation. Include direct costs (licensing, compute per token, latency impacts), expected benefits (reduced agent time, improved retention, NPS uplift), and uncertainty. Build formulae and show a sample numeric scenario with sensitivity analysis to determine ROI thresholds.
MediumTechnical
29 practiced
Implement (in Python) a streaming-friendly function that reads inference logs with fields (request_id, model_version, start_ts, end_ts nullable, status) and computes approximate p95 and p99 latencies per model_version over the last 24 hours using a single pass and limited memory. Provide code sketch, explain choice of streaming quantile algorithm (e.g., t-digest), and analyze time and memory complexity.
EasyTechnical
28 practiced
You have a jobs table for model training and inference logs with columns: (job_id, job_type, start_time TIMESTAMP, end_time TIMESTAMP NULLABLE, status ENUM('success','failed','running')). Write an ANSI SQL query to compute for the last 7 days: hourly throughput (jobs/hour), median and 95th percentile cycle time (end_time - start_time) per job_type, and error rate (failed/total) per job_type. Explain how you handle running jobs (NULL end_time) and show example output for job_type='inference'.
HardSystem Design
30 practiced
Design a scalable system to detect real-time regressions in model performance caused by upstream data pipeline changes. The system must include canary and shadow deployments, feature-distribution comparisons, automatic label sampling and scoring, rollback actions with human-in-loop approval, and an evidence store for root-cause analysis. Provide architecture, data flows, metrics, decision thresholds, and fail-safes to avoid oscillating rollbacks.
Unlock Full Question Bank
Get access to hundreds of Data Driven Decision Making interview questions and detailed answers.