Validation and Edge Case Handling Questions

Focuses on validating data correctness and robustness across application and data layers, and on identifying and handling boundary conditions. Topics include input validation and sanitization, server-side validation and schema checks, null and missing-value behavior, duplicate rows and Cartesian join issues, off-by-one and boundary testing, date-range and type-mismatch handling, and test strategies for edge cases. Emphasizes designing systems and queries that fail safely, produce meaningful errors, and include checks that protect aggregations and joins from corrupt or unexpected data.

Hard / Technical

Write SQL and outline a small Python checker that detects unexpected many-to-many expansion in joins. The SQL should compute pre-join distinct counts and an expected join size; the Python checker should compute ratio actual/expected and flag expansions above a configurable factor. Discuss thresholds and handling false positives.
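
One possible shape for an answer is sketched below, using an in-memory SQLite connection so it runs anywhere. The table names `orders` and `customers`, the join key `customer_id`, and the default factor of 1.1 are all hypothetical. The sketch assumes `customers` is meant to be unique on the key, so the expected join size is taken to be the row count of `orders`; a different expectation would be needed for a genuinely many-to-many relationship.

```python
import sqlite3

# Pre-join profile: row counts, distinct key counts, and the actual join size.
# Table and column names are illustrative assumptions, not from any real schema.
PROFILE_SQL = """
SELECT
    (SELECT COUNT(*) FROM orders)                       AS left_rows,
    (SELECT COUNT(DISTINCT customer_id) FROM orders)    AS left_keys,
    (SELECT COUNT(*) FROM customers)                    AS right_rows,
    (SELECT COUNT(DISTINCT customer_id) FROM customers) AS right_keys,
    (SELECT COUNT(*)
       FROM orders o
       JOIN customers c USING (customer_id))            AS actual_rows
"""


def check_join_expansion(conn: sqlite3.Connection, max_factor: float = 1.1) -> dict:
    """Compare the actual join size with the expected one and flag expansion."""
    left_rows, left_keys, right_rows, right_keys, actual = conn.execute(
        PROFILE_SQL
    ).fetchone()
    # Assumption: the right side should be unique on the key, so an inner
    # join should return at most one row per left-side row.
    expected = left_rows
    ratio = actual / expected if expected else 0.0
    return {
        "expected": expected,
        "actual": actual,
        "ratio": ratio,
        "left_keys": left_keys,
        "right_key_is_unique": right_rows == right_keys,
        "flagged": ratio > max_factor,
    }
```

A factor slightly above 1.0 tolerates small, known duplication while still catching a fan-out. To limit false positives, the threshold could be set per join from historical ratios, and a flagged join could be cross-checked against `right_key_is_unique` before it blocks a pipeline.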

Medium / System Design

Multiple sources send timestamps in different timezones and formats. Describe an end-to-end validation and normalization strategy to guarantee consistent event-time processing: detection of timezone-less timestamps, canonical timezone selection, DST handling, parsing errors, and test cases you would add to catch ambiguous times.
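
A minimal sketch of two pieces of such a strategy follows, assuming ISO 8601 input and UTC as the canonical timezone. The function names `parse_to_utc` and `classify_local_time` are hypothetical. The first rejects timezone-less timestamps instead of guessing; the second uses the standard-library `fold` attribute to tell whether a local wall-clock time is ambiguous (occurs twice at a DST fall-back) or nonexistent (skipped at a spring-forward).

```python
from datetime import datetime, timezone, tzinfo
from zoneinfo import ZoneInfo


def parse_to_utc(raw: str) -> datetime:
    """Parse an ISO 8601 timestamp and normalize it to UTC.

    Raises ValueError for unparseable input and for timestamps that carry
    no UTC offset, so callers can route them to a quarantine path.
    """
    text = raw.strip()
    if text.endswith(("Z", "z")):
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None or parsed.utcoffset() is None:
        raise ValueError(f"timestamp has no timezone: {raw!r}")
    return parsed.astimezone(timezone.utc)


def classify_local_time(naive: datetime, tz: tzinfo) -> str:
    """Return "ok", "ambiguous", or "nonexistent" for a wall-clock time in tz."""
    first = naive.replace(tzinfo=tz, fold=0)
    second = naive.replace(tzinfo=tz, fold=1)
    if first.utcoffset() == second.utcoffset():
        return "ok"
    # Offsets differ only around a transition. A real wall-clock time
    # survives a round trip through UTC; a skipped one does not.
    round_trip = first.astimezone(timezone.utc).astimezone(tz).replace(tzinfo=None)
    return "ambiguous" if round_trip == naive else "nonexistent"
```

Test cases worth adding on top of this include a time inside a fall-back hour, a time inside a spring-forward gap, a timestamp with no offset, and a malformed string, for example `classify_local_time(datetime(2021, 11, 7, 1, 30), ZoneInfo("America/New_York"))`.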

Easy / Technical

Write a short checklist and unit test examples for protecting aggregation queries in ETL from corrupt or extreme values (e.g., negative amounts, extreme outliers, nulls). Provide at least three SQL assertions or tests and explain where each should run (pre-aggregation or post-aggregation).
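
As a rough illustration, the sketch below expresses each assertion as a SQL query that counts violating rows, so a check passes when the count is zero. The `payments` table, its `day` and `amount` columns, and the 1,000,000 cap are hypothetical; SQLite is used only so the example is self-contained.

```python
import sqlite3

# Pre-aggregation checks run on the raw rows, before any SUM or AVG.
PRE_AGG_CHECKS = {
    "null_amount": "SELECT COUNT(*) FROM payments WHERE amount IS NULL",
    "negative_amount": "SELECT COUNT(*) FROM payments WHERE amount < 0",
    # The cap is an assumed business limit, not a universal constant.
    "extreme_amount": "SELECT COUNT(*) FROM payments WHERE amount > 1000000",
}

# Post-aggregation checks run on the aggregated result itself.
POST_AGG_CHECKS = {
    "bad_daily_total": """
        SELECT COUNT(*)
        FROM (SELECT day, SUM(amount) AS total
              FROM payments
              GROUP BY day) AS t
        WHERE total IS NULL OR total < 0
    """,
}


def run_checks(conn: sqlite3.Connection, checks: dict) -> dict:
    """Return the number of violating rows for each named check."""
    return {name: conn.execute(sql).fetchone()[0] for name, sql in checks.items()}


def failed(results: dict) -> list:
    """Names of checks with at least one violation, sorted for stable output."""
    return sorted(name for name, count in results.items() if count > 0)
```

Running the pre-aggregation checks first keeps bad rows out of the totals; the post-aggregation check acts as a backstop for problems that only show up after grouping.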

Easy / Technical

Write a pytest unit test (or set of tests) for a Python function normalize_name(name: str) that strips whitespace, lowercases, handles None and empty strings, and preserves apostrophes (e.g., "O'Connor"). Provide test cases for edge inputs: None, "", " Alice ", "O'Connor", and a long whitespace-only string.
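
A possible answer is sketched below. The prompt does not define `normalize_name`, so the reference implementation here is an assumption, including the choice to return an empty string for None. The tests are plain `test_` functions with bare asserts, which pytest collects as-is; `pytest.mark.parametrize` could collapse them into one table-driven test.

```python
from typing import Optional


def normalize_name(name: Optional[str]) -> str:
    """Assumed reference implementation: None becomes "", else strip and lowercase."""
    if name is None:
        return ""
    return name.strip().lower()


def test_none_returns_empty_string():
    assert normalize_name(None) == ""


def test_empty_string_stays_empty():
    assert normalize_name("") == ""


def test_surrounding_whitespace_is_stripped():
    assert normalize_name(" Alice ") == "alice"


def test_apostrophe_is_preserved():
    # The apostrophe survives; only the letter case changes.
    assert normalize_name("O'Connor") == "o'connor"


def test_long_whitespace_only_string_becomes_empty():
    assert normalize_name(" \t\n" * 1000) == ""
```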

Medium / System Design

Design a monitoring and alerting strategy for key data quality metrics: completeness, freshness, uniqueness, distribution drift, and schema changes. Specify where to store metrics, what dashboards and alert thresholds you'd create, and tactics to reduce alert fatigue (aggregations, severity levels, dynamic thresholds).
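
The dynamic-threshold and severity-level tactics can be illustrated with a small sketch. It assumes a metric's recent history is available as a list of numbers (for example, daily row counts) and classifies the current value by how many sample standard deviations it sits from the historical mean. The function name `classify_metric` and the 2-sigma and 4-sigma cutoffs are hypothetical starting points, not recommended values.

```python
from statistics import mean, stdev
from typing import Sequence


def classify_metric(
    history: Sequence[float],
    current: float,
    warn_sigma: float = 2.0,
    crit_sigma: float = 4.0,
) -> str:
    """Return "ok", "warning", "critical", or "unknown" for the current value."""
    if len(history) < 2:
        # Not enough data to estimate spread; avoid alerting on guesses.
        return "unknown"
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is notable.
        return "ok" if current == mu else "critical"
    z = abs(current - mu) / sigma
    if z >= crit_sigma:
        return "critical"
    if z >= warn_sigma:
        return "warning"
    return "ok"
```

Because the threshold tracks each metric's own variability, noisy metrics page less often than a fixed cutoff would, and only the "critical" level would need to interrupt someone.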
