Validation and Edge Case Handling Questions
Focuses on validating data correctness and robustness across the application and data layers, and on identifying and handling boundary conditions. Topics include input validation and sanitization, server-side validation and schema checks, null and missing-value behavior, duplicate rows and accidental Cartesian joins, off-by-one errors and boundary testing, date-range and type-mismatch handling, and test strategies for edge cases. Emphasizes designing systems and queries that fail safely, produce meaningful errors, and include checks that protect aggregations and joins from corrupt or unexpected data.
Hard · System Design
132 practiced
Design a regression testing framework for model behavior after feature changes. Include unit tests for feature transforms, integration tests with a golden dataset, shadow deployments, and approval gates. Explain how to automate checks for performance regressions and to measure test coverage for data changes.
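One piece of such a framework, the golden-dataset check for a feature transform, might be sketched as follows. The transform `log1p_clip`, the `GOLDEN` pairs, and the tolerance are all hypothetical and chosen only for illustration; a real golden dataset would be frozen from reviewed production samples.

```python
import math


def log1p_clip(x, cap=1000.0):
    """Hypothetical feature transform: clip to [0, cap], then log1p."""
    if x is None or math.isnan(x):
        return 0.0  # assumed policy: missing values map to 0.0
    return math.log1p(min(max(x, 0.0), cap))


# Golden pairs (input, expected output) covering the boundaries:
# missing value, negative, zero, exactly at the cap, far above the cap.
GOLDEN = [
    (None, 0.0),
    (-5.0, 0.0),
    (0.0, 0.0),
    (1000.0, math.log1p(1000.0)),
    (1e9, math.log1p(1000.0)),
]


def golden_failures(transform, golden, tol=1e-9):
    """Return (input, expected, actual) for every golden row that drifted."""
    failures = []
    for x, expected in golden:
        actual = transform(x)
        if not math.isclose(actual, expected, abs_tol=tol):
            failures.append((x, expected, actual))
    return failures
```

An approval gate could then block promotion whenever `golden_failures(...)` returns a non-empty list, and attach the list to the review so the drifted rows are visible.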
Hard · Technical
138 practiced
Design a canary deployment and validation plan for a new model and changes in the data pipeline feeding it. Include canary traffic percentage, validation metrics to compute on canary traffic, stopping criteria, rollback mechanism, and how to include edge-case tests in the canary phase.
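The stopping criteria asked for above can be made concrete with a small decision function. This is a minimal sketch under assumed thresholds: `min_requests`, `max_error_delta`, and `max_latency_ratio` are illustrative values, and the `Window` type is a hypothetical summary of metrics collected over the same time window for baseline and canary traffic.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Metrics aggregated over one observation window (hypothetical shape)."""
    requests: int
    errors: int
    p95_latency_ms: float


def canary_decision(baseline, canary, min_requests=1000,
                    max_error_delta=0.005, max_latency_ratio=1.2):
    """Return 'continue', 'rollback', or 'promote' for the current window."""
    # Not enough canary traffic yet to compare rates meaningfully.
    if canary.requests < min_requests:
        return "continue"
    base_rate = baseline.errors / max(baseline.requests, 1)
    canary_rate = canary.errors / canary.requests
    # Stop if the canary error rate exceeds baseline by more than the budget.
    if canary_rate - base_rate > max_error_delta:
        return "rollback"
    # Stop if tail latency regressed beyond the allowed ratio.
    if canary.p95_latency_ms > baseline.p95_latency_ms * max_latency_ratio:
        return "rollback"
    return "promote"
```

Edge-case tests fit into the same loop: replaying a fixed set of known-difficult requests against the canary produces a `Window` of its own, which can be evaluated with stricter thresholds than live traffic.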
Easy · Technical
74 practiced
You need to join `orders` and `users`, but some ETL runs produce unexpectedly large result sets. Describe the step-by-step validation you would add to prevent accidental Cartesian joins: key uniqueness checks, sample joins, expected-row-ratio heuristics, and pre-join cardinality estimation. Provide queries or pseudocode for each check.
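Three of these checks (key uniqueness, pre-join cardinality estimation, and the expected-row ratio) can be sketched with Python's built-in `sqlite3` module. The column names `orders.user_id` and `users.id`, the demo rows, and the `max_ratio` threshold are assumptions made for illustration; the prompt does not specify a schema.

```python
import sqlite3


def demo_connection():
    """In-memory tables with one duplicated user key (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE orders (order_id INTEGER, user_id INTEGER)")
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [(1, "a"), (2, "b"), (2, "b2")])
    conn.executemany("INSERT INTO orders VALUES (?, ?)",
                     [(10, 1), (11, 2), (12, 2)])
    return conn


def duplicate_keys(conn, table, key):
    """Uniqueness check: key values that occur more than once."""
    # Identifiers are interpolated, so pass trusted names only.
    sql = (f"SELECT {key}, COUNT(*) FROM {table} "
           f"GROUP BY {key} HAVING COUNT(*) > 1")
    return conn.execute(sql).fetchall()


def estimated_join_rows(conn):
    """Pre-join cardinality: sum over keys of (orders count * users count)."""
    sql = """
        SELECT COALESCE(SUM(o.n * u.n), 0)
        FROM (SELECT user_id, COUNT(*) AS n FROM orders GROUP BY user_id) o
        JOIN (SELECT id, COUNT(*) AS n FROM users GROUP BY id) u
          ON o.user_id = u.id
    """
    return conn.execute(sql).fetchone()[0]


def check_join(conn, max_ratio=1.0):
    """Combine the checks; a many-to-one join should not exceed ratio 1.0."""
    dups = duplicate_keys(conn, "users", "id")
    orders_n = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    est = estimated_join_rows(conn)
    ratio = est / orders_n if orders_n else 0.0
    return {"duplicate_user_keys": dups, "estimated_rows": est,
            "ratio": ratio, "ok": not dups and ratio <= max_ratio}
```

Because the estimate multiplies per-key counts from two small aggregates, it predicts the output size without running the join itself, so an ETL step can refuse to proceed when `ok` is false.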
Hard · Technical
87 practiced
Design a fuzz-testing approach for the data validation layer: how to generate malformed and edge-case inputs (type violations, huge payloads, invalid enums), how to measure code coverage, and how to define stopping criteria. Give an example using Python fuzzing or Hypothesis for a JSON parser used in ingestion.
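A dependency-free sketch of the idea follows. It uses the standard-library `random` module rather than Hypothesis, and the validator `parse_record`, its fields (`id`, `status`), the allowed enum values, and the size limit are all hypothetical. The property under test is that the validator fails safely: every rejected input raises `ValueError` and nothing else. The stopping criteria here are an iteration cap and a time budget; coverage could be measured separately by running the fuzzer under a tool such as coverage.py.

```python
import json
import random
import time

ALLOWED_STATUS = {"new", "paid", "shipped"}  # assumed enum
MAX_CHARS = 10_000                           # assumed payload limit


def parse_record(raw):
    """Hypothetical ingestion validator; rejects bad input with ValueError."""
    if len(raw) > MAX_CHARS:
        raise ValueError("payload too large")
    try:
        obj = json.loads(raw)
    except (ValueError, RecursionError) as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    rec_id = obj.get("id")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(rec_id, bool) or not isinstance(rec_id, int):
        raise ValueError("id must be an integer")
    status = obj.get("status")
    # Check the type first: testing a list against a set raises TypeError.
    if not isinstance(status, str) or status not in ALLOWED_STATUS:
        raise ValueError("status is not an allowed value")
    return {"id": rec_id, "status": status}


def random_case(rng):
    """Generate one valid, malformed, or edge-case payload."""
    kind = rng.choice(["valid", "type_violation", "bad_enum",
                       "huge", "truncated", "garbage"])
    good = {"id": rng.randint(0, 10**6),
            "status": rng.choice(sorted(ALLOWED_STATUS))}
    if kind == "valid":
        return json.dumps(good)
    if kind == "type_violation":
        good["id"] = rng.choice(["7", None, 1.5, True, [1], {"x": 1}])
        return json.dumps(good)
    if kind == "bad_enum":
        good["status"] = rng.choice(["", "PAID", None, 3, ["paid"]])
        return json.dumps(good)
    if kind == "huge":
        good["pad"] = "x" * (MAX_CHARS + 1)
        return json.dumps(good)
    if kind == "truncated":
        text = json.dumps(good)
        return text[: rng.randint(0, len(text) - 1)]
    alphabet = '{}[]":,0aé\\ \n'
    return "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 40)))


def fuzz(iterations=2000, seconds=5.0, seed=0):
    """Run until the iteration cap or time budget; return (accepted, rejected)."""
    rng = random.Random(seed)  # seeded so failures are reproducible
    deadline = time.monotonic() + seconds
    accepted = rejected = 0
    for _ in range(iterations):
        if time.monotonic() > deadline:
            break
        try:
            parse_record(random_case(rng))
            accepted += 1
        except ValueError:
            rejected += 1
        # Any other exception type propagates and fails the fuzz run.
    return accepted, rejected
```

Seeding the generator means a crashing input can be regenerated from the seed alone, and then added to a regression suite as a fixed test case.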
Hard · System Design
71 practiced
Design a safe backfill system for batch feature computation that supports partial reruns, is idempotent, and protects downstream models from duplicate updates. Include job checkpoints, transactional writes, validation checks before commit (row counts, checksums), and an approach to test backfills in staging before production promotion.
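The combination of checkpoints, transactional writes, and pre-commit validation could be sketched on a single partition as below. This uses the standard-library `sqlite3` module as a stand-in for the real feature store; the table names `features` and `checkpoints`, the column `part_key`, and the row shape `(entity_id, value)` are assumptions for illustration only.

```python
import hashlib
import sqlite3


def make_store():
    """In-memory stand-in for a feature store (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""CREATE TABLE features (
        part_key TEXT, entity_id INTEGER, value REAL,
        PRIMARY KEY (part_key, entity_id))""")
    conn.execute("""CREATE TABLE checkpoints (
        part_key TEXT PRIMARY KEY, checksum TEXT)""")
    return conn


def checksum(rows):
    """Order-independent digest of (entity_id, value) rows."""
    digest = hashlib.sha256()
    for row in sorted(tuple(r) for r in rows):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def backfill_partition(conn, part_key, rows):
    """Write one partition atomically; return False if already checkpointed.

    rows are (int entity_id, float value) tuples, matching the column types
    so the read-back checksum is comparable.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        done = conn.execute(
            "SELECT 1 FROM checkpoints WHERE part_key = ?",
            (part_key,)).fetchone()
        if done:
            return False  # idempotent: a rerun is a no-op
        # Clear leftovers from any earlier partial attempt.
        conn.execute("DELETE FROM features WHERE part_key = ?", (part_key,))
        conn.executemany(
            "INSERT INTO features (part_key, entity_id, value) VALUES (?, ?, ?)",
            [(part_key, entity_id, value) for entity_id, value in rows])
        # Validate before commit: row count and checksum must match the input.
        written = conn.execute(
            "SELECT entity_id, value FROM features WHERE part_key = ?",
            (part_key,)).fetchall()
        if len(written) != len(rows) or checksum(written) != checksum(rows):
            raise ValueError(f"validation failed for partition {part_key}")
        conn.execute(
            "INSERT INTO checkpoints (part_key, checksum) VALUES (?, ?)",
            (part_key, checksum(rows)))
    return True
```

Writing the checkpoint in the same transaction as the data means a partition is either fully written and marked done, or not written at all, which is what makes partial reruns safe. The primary key on `(part_key, entity_id)` additionally rejects duplicate entities within a partition.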