Validation and Edge Case Handling Questions
Focuses on validating data correctness and robustness across the application and data layers, and on identifying and handling boundary conditions. Topics include input validation and sanitization, server-side validation and schema checks, null and missing-value behavior, duplicate rows and accidental cartesian joins, off-by-one and boundary testing, date-range and type-mismatch handling, and test strategies for edge cases. Emphasizes designing systems and queries that fail safely, produce meaningful errors, and include checks that protect aggregations and joins from corrupt or unexpected data.
Easy | Technical
Given an `events(user_id INT, occurred_at TIMESTAMP, source VARCHAR)` table, write a SQL query that flags rows where `occurred_at` falls outside an allowed window (2020-01-01 to 2024-12-31, inclusive), groups the anomalies by `source`, and returns a count per source. Explain how you would incorporate this check into the nightly ETL so that the pipeline fails safely.
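One possible shape for an answer, as a minimal sketch rather than a reference solution: it runs the anomaly query against an in-memory SQLite database and raises when the count exceeds a threshold, which is the kind of gate a nightly ETL step could use. The half-open upper bound, the choice to count NULL timestamps as anomalies, and the names `find_anomalies`, `check_or_fail`, and `max_anomalies` are all assumptions made for illustration.

```python
import sqlite3

WINDOW_START = "2020-01-01 00:00:00"
# Half-open upper bound so every instant on 2024-12-31 is still allowed.
WINDOW_END_EXCLUSIVE = "2025-01-01 00:00:00"

ANOMALY_SQL = """
SELECT source, COUNT(*) AS anomaly_count
FROM events
WHERE occurred_at IS NULL          -- assumption: a missing timestamp is an anomaly
   OR occurred_at < ?
   OR occurred_at >= ?
GROUP BY source
ORDER BY source
"""

def find_anomalies(conn):
    """Return (source, anomaly_count) rows for events outside the window."""
    return conn.execute(ANOMALY_SQL, (WINDOW_START, WINDOW_END_EXCLUSIVE)).fetchall()

def check_or_fail(conn, max_anomalies=0):
    """Hypothetical ETL gate: raise so the nightly run stops before loading."""
    anomalies = find_anomalies(conn)
    total = sum(count for _, count in anomalies)
    if total > max_anomalies:
        raise ValueError(f"{total} events outside allowed window: {anomalies}")
    return anomalies

# Small demo dataset (made up for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INT, occurred_at TIMESTAMP, source VARCHAR)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", [
    (1, "2021-06-01 12:00:00", "web"),
    (2, "2019-12-31 23:59:59", "web"),     # before the window
    (3, "2024-12-31 23:59:59", "mobile"),  # last allowed second
    (4, "2025-01-01 00:00:00", "mobile"),  # after the window
    (5, None, "mobile"),                   # missing timestamp
])
result = find_anomalies(conn)
print(result)
```

In a real pipeline the raise would typically be paired with quarantining the offending rows, so the run fails loudly without discarding data.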
Hard | System Design
Design a regression testing framework for model behavior after feature changes. Include unit tests for feature transforms, integration tests with a golden dataset, shadow deployments, and approval gates. Explain how to automate checks for performance regressions and to measure test coverage for data changes.
Easy | Technical
You need to join `orders` and `users`, but some ETL runs produce unexpectedly large result sets. Describe step-by-step validation you would add to prevent accidental cartesian joins: key uniqueness checks, sample joins, expected-row-ratio heuristics, and pre-join cardinality estimation. Provide queries or pseudocode for each check.
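Two of the checks named above, key uniqueness and the expected-row ratio, could be sketched as follows. This is an illustration under assumptions, not a prescribed answer: the join column `user_id`, the helper names, and the `max_ratio` threshold are hypothetical, and the demo data is made up.

```python
import sqlite3

def duplicate_keys(conn, table, key):
    """Key uniqueness check: return (key_value, row_count) for repeated keys.

    `table` and `key` are interpolated directly, so they must be trusted
    identifiers chosen by the pipeline author, never user input.
    """
    sql = f"SELECT {key}, COUNT(*) FROM {table} GROUP BY {key} HAVING COUNT(*) > 1"
    return conn.execute(sql).fetchall()

def join_row_ratio(conn):
    """Expected-row-ratio check: joined rows divided by rows in `orders`.

    With a unique `users.user_id`, an inner join can never exceed 1.0.
    """
    orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    joined = conn.execute(
        "SELECT COUNT(*) FROM orders o JOIN users u ON o.user_id = u.user_id"
    ).fetchone()[0]
    return joined / orders if orders else 0.0

def validate_join(conn, max_ratio=1.0):
    """Raise before the real join runs if either check fails."""
    dups = duplicate_keys(conn, "users", "user_id")
    if dups:
        raise ValueError(f"users.user_id is not unique: {dups}")
    ratio = join_row_ratio(conn)
    if ratio > max_ratio:
        raise ValueError(f"join fan-out ratio {ratio:.2f} exceeds {max_ratio}")
    return ratio

# Demo: one duplicated user inflates the join from 3 rows to 5.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INT, name TEXT)")
conn.execute("CREATE TABLE orders (order_id INT, user_id INT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "a"), (1, "a-dup"), (2, "b")])
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 1), (12, 2)])

dups = duplicate_keys(conn, "users", "user_id")
ratio = join_row_ratio(conn)
print(dups, ratio)
```

On large tables the ratio check would normally run on a sample or on pre-aggregated key counts, since counting the full join costs as much as the join itself.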
Medium | Technical
Design a model-serving validation layer for prediction requests (up to 1,000 req/sec) that performs schema validation, value range checks, authentication, and fallback behavior. Discuss performance optimizations to keep latency low and how to handle invalid requests safely (reject vs. default prediction).
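The schema and range checks, together with the reject-versus-default decision, might look like the sketch below. It is deliberately minimal: authentication and latency tuning are left out, and the `SCHEMA` fields, bounds, status codes, and function names are hypothetical placeholders rather than a real serving API.

```python
import math
from dataclasses import dataclass, field

# Hypothetical request schema: field name -> (type, min, max), bounds inclusive.
SCHEMA = {
    "age": (int, 0, 120),
    "income": (float, 0.0, 1e7),
}

@dataclass
class ValidationResult:
    ok: bool
    errors: list = field(default_factory=list)

def validate_request(payload):
    """Check presence, type, finiteness, and range for every schema field."""
    if not isinstance(payload, dict):
        return ValidationResult(False, ["payload must be an object"])
    errors = []
    for name, (expected_type, low, high) in SCHEMA.items():
        value = payload.get(name)
        if value is None:
            errors.append(f"{name}: missing")
            continue
        # Float fields also accept ints; bool is rejected even though it subclasses int.
        accepted = (int, float) if expected_type is float else (expected_type,)
        if isinstance(value, bool) or not isinstance(value, accepted):
            errors.append(f"{name}: expected {expected_type.__name__}")
            continue
        if isinstance(value, float) and not math.isfinite(value):
            errors.append(f"{name}: must be finite")
            continue
        if not (low <= value <= high):
            errors.append(f"{name}: {value} outside [{low}, {high}]")
    for name in sorted(set(payload) - set(SCHEMA)):
        errors.append(f"{name}: unexpected field")
    return ValidationResult(not errors, errors)

def handle(payload, predict, default=None):
    """Reject invalid requests unless a default prediction is configured."""
    result = validate_request(payload)
    if result.ok:
        return {"status": 200, "prediction": predict(payload)}
    if default is not None:
        # Fallback path: answer, but mark the response so callers can tell.
        return {"status": 200, "prediction": default, "fallback": True,
                "errors": result.errors}
    return {"status": 422, "errors": result.errors}

good = handle({"age": 30, "income": 50000.0}, lambda p: 0.7)
bad = handle({"age": -1, "income": float("nan")}, lambda p: 0.7)
fallback = handle({"age": -1, "income": 1.0}, lambda p: 0.7, default=0.5)
print(good, bad, fallback)
```

Rejecting by default is the safer choice when a wrong prediction is costly; the flagged fallback suits cases where availability matters more and downstream consumers can discount flagged responses.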
Medium | Technical
You must join two large tables where join keys are non-unique and logs show occasional many-to-many anomalies. Design a robust join-key strategy: possible use of surrogate keys, hashing strategies, pre-join deduplication, and checks to detect many-to-many anomalies. Explain tradeoffs in uniqueness guarantees and storage.
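The detection part of such a strategy can be sketched by counting rows per key on each side and flagging keys that repeat on both, since those are the keys whose join output multiplies. The table names `left_t` and `right_t`, the column `join_key`, and the demo rows are hypothetical; surrogate keys and hashing are not covered here.

```python
import sqlite3

# Per-key row counts on each side; a key repeated on BOTH sides is many-to-many,
# and the join emits left_rows * right_rows rows for it.
MANY_TO_MANY_SQL = """
WITH l AS (SELECT join_key, COUNT(*) AS n FROM left_t GROUP BY join_key),
     r AS (SELECT join_key, COUNT(*) AS n FROM right_t GROUP BY join_key)
SELECT l.join_key, l.n AS left_rows, r.n AS right_rows, l.n * r.n AS joined_rows
FROM l JOIN r ON l.join_key = r.join_key
WHERE l.n > 1 AND r.n > 1
ORDER BY joined_rows DESC, l.join_key
"""

def many_to_many_keys(conn):
    """Return (key, left_rows, right_rows, joined_rows) for offending keys."""
    return conn.execute(MANY_TO_MANY_SQL).fetchall()

# Demo: key 'a' repeats on both sides; key 'b' repeats only on the right.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE left_t (join_key TEXT, payload TEXT)")
conn.execute("CREATE TABLE right_t (join_key TEXT, payload TEXT)")
conn.executemany("INSERT INTO left_t VALUES (?, ?)",
                 [("a", "l1"), ("a", "l2"), ("b", "l3"), ("c", "l4")])
conn.executemany("INSERT INTO right_t VALUES (?, ?)",
                 [("a", "r1"), ("a", "r2"), ("a", "r3"), ("b", "r4"), ("b", "r5")])

offenders = many_to_many_keys(conn)
print(offenders)
```

Because the check only aggregates key counts, it costs two group-bys instead of the full join, and its output can drive either a hard failure or targeted pre-join deduplication of the flagged keys.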