InterviewStack.io

Metric Definition and Implementation Questions

An end-to-end topic covering the precise definition, computation, transformation, implementation, validation, documentation, and monitoring of business metrics. Candidates should demonstrate how to translate business requirements into reproducible metric definitions and formulas, choose aggregation methods and time windows, set filtering and deduplication rules, convert event-level data into user-level metrics, and compute cohorts, retention, attribution, and incremental impact.

The work includes data-transformation skills such as normalizing and formatting date and identifier fields, handling null values and edge cases, creating calculated fields and measures, combining and grouping tables at appropriate levels of granularity, and choosing between percentages and absolute numbers. Implementation details include writing reliable SQL or scripts, selecting instrumentation and data sources, weighing aggregation strategy, sampling, and margin of error, and ensuring pipelines produce reproducible results. Validation and quality practices include spot checks, comparison against known totals, automated tests, monitoring and alerting, naming conventions and versioning, and clear documentation so that all calculations are auditable and maintainable.

Medium · Technical
Instrumentation design: Propose a minimal event schema to support product metrics and experiments. For each field specify data type and purpose. Include fields such as event_id, user_id, anonymous_id, device_id, event_time, ingestion_time, event_type, properties JSON, experiment_id, and campaign_id.
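One minimal answer can be sketched as a Python dataclass. The field names come from the question itself; the types and purpose comments are one reasonable choice, not the only valid schema:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class Event:
    event_id: str                 # UUID per event; the deduplication/idempotency key
    anonymous_id: str             # pre-login identifier, used for identity stitching
    device_id: str                # stable device identity across sessions
    event_time: datetime          # client-side timestamp of when the action happened
    ingestion_time: datetime      # server receive time; ingestion_time - event_time = lateness
    event_type: str               # e.g. "page_view", "add_to_cart", "purchase"
    user_id: Optional[str] = None        # known-user id; null before login
    experiment_id: Optional[str] = None  # active experiment assignment, if any
    campaign_id: Optional[str] = None    # marketing campaign for attribution, if any
    properties: dict = field(default_factory=dict)  # free-form JSON payload per event_type
```

Keeping `event_time` and `ingestion_time` separate is what later lets a pipeline reason about late-arriving data and choose watermarks.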
Easy · Technical
Explain the difference between event-level and user-level metrics. Describe the steps to convert event-level data into user-level metrics (e.g., DAU, conversions per user) including deduplication, time-window choice, and sessionization pitfalls across devices.
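The conversion can be sketched in plain Python: deduplicate on `event_id` (assuming at-least-once delivery), then roll events up to distinct users per UTC calendar day. The dict shape of the input rows is assumed for illustration:

```python
from collections import defaultdict

def daily_active_users(events):
    """Convert event-level rows into a per-day DAU count.

    events: iterable of dicts with keys event_id, user_id, event_time (datetime).
    Deduplicates on event_id first, then counts distinct user_ids per calendar day.
    """
    seen_event_ids = set()
    users_by_day = defaultdict(set)
    for e in events:
        if e["event_id"] in seen_event_ids:
            continue  # drop duplicate deliveries of the same event
        seen_event_ids.add(e["event_id"])
        users_by_day[e["event_time"].date()].add(e["user_id"])
    return {day: len(users) for day, users in users_by_day.items()}
```

Note the pitfalls the question hints at: the day boundary here is whatever timezone `event_time` carries, and a user active on two devices only counts once if identity stitching has already mapped both to one `user_id`.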
Hard · Technical
Streaming implementation: Provide pseudocode (PySpark/Scala) for deduplicating events and sessionizing user events in Spark Structured Streaming. Discuss watermark choices, state TTL, how to scale state for millions of users, and trade-offs of stateful processing versus micro-batching.
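Before reaching for Spark's stateful operators, the core gap-based sessionization rule can be sketched in plain batch Python; in Structured Streaming the `prev` timestamp below becomes per-user state with a TTL, and the sort is replaced by event-time watermarking. The 30-minute gap is an assumed default, not from the question:

```python
from itertools import groupby

def sessionize(events, gap_seconds=1800):
    """Assign gap-based session indexes per user (batch sketch of the logic
    a streaming job would keep in per-key state).

    events: list of (user_id, event_time) tuples with datetime timestamps.
    A new session starts when the gap since the user's previous event
    exceeds gap_seconds. Returns (user_id, event_time, session_index) rows.
    """
    out = []
    # Sorting by (user, time) stands in for shuffle-by-key + event-time ordering.
    for user, user_events in groupby(sorted(events), key=lambda e: e[0]):
        session, prev = 0, None
        for _, t in user_events:
            if prev is not None and (t - prev).total_seconds() > gap_seconds:
                session += 1  # gap exceeded: close the session, open a new one
            out.append((user, t, session))
            prev = t
    return out
```

The streaming trade-offs the question asks about map directly onto this sketch: the watermark bounds how long `prev` must be retained, state TTL evicts idle users, and millions of users means millions of small per-key states rather than one global sort.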
Medium · Technical
Attribution: Given table touches(user_id STRING, event_time TIMESTAMP, channel STRING, is_conversion BOOLEAN), write SQL to assign each conversion to the last non-null channel touch within 30 days prior to conversion (last-touch). Explain assumptions and limitations.
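The interview answer would be SQL over the `touches` table, but the last-touch rule itself can be mirrored on in-memory rows to make the assumptions explicit: only touches strictly before the conversion count, the lookback is 30 days, and null channels are skipped. The dict row shape is assumed for illustration:

```python
from datetime import timedelta

def last_touch(touches, window_days=30):
    """Assign each conversion the most recent non-null channel touched
    within window_days before the conversion.

    touches: dicts with keys user_id, event_time, channel (may be None),
    is_conversion. Returns (user_id, conversion_time, channel) rows;
    channel is None when no eligible touch exists.
    """
    by_user = {}
    for t in sorted(touches, key=lambda t: (t["user_id"], t["event_time"])):
        by_user.setdefault(t["user_id"], []).append(t)
    results = []
    for user, rows in by_user.items():
        for i, row in enumerate(rows):
            if not row["is_conversion"]:
                continue
            channel = None
            for prior in rows[:i]:  # strictly before the conversion
                if prior["channel"] is not None and \
                        row["event_time"] - prior["event_time"] <= timedelta(days=window_days):
                    channel = prior["channel"]  # keep overwriting -> last touch wins
            results.append((user, row["event_time"], channel))
    return results
```

The limitations the question asks for fall out of the code: ties at the same timestamp are order-dependent, and last-touch credits only one channel regardless of how many assisted.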
Medium · System Design
System design: Design an ETL/ELT pipeline that ingests web and mobile events at 100k events/sec, supports schema evolution, deduplication, near-real-time metrics and batch backfills. List components (ingest, broker, schema registry, stream processor, storage, warehouse), data contracts, idempotency approach, and failure/retry strategies.
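One piece of this design, the idempotency approach, can be illustrated with a toy sink that upserts on `event_id`: with an at-least-once broker, a retried batch simply replays the same keys and lands as a no-op. The class and method names here are illustrative, not part of the question:

```python
class IdempotentSink:
    """Toy warehouse sink that writes idempotently, keyed on event_id.

    With at-least-once delivery upstream, retried or replayed batches
    overwrite the same keys instead of double-counting rows. A real
    implementation would use a MERGE/upsert into the warehouse table.
    """

    def __init__(self):
        self.rows = {}  # event_id -> row

    def write_batch(self, batch):
        for row in batch:
            self.rows[row["event_id"]] = row  # upsert: replay is a no-op

    def count(self):
        return len(self.rows)
```

The same keyed-upsert idea is what makes batch backfills safe: rerunning a day's partition converges to the same row set rather than appending duplicates.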
