InterviewStack.io

Data Cleaning and Quality Validation in SQL Questions

Handle NULL values, duplicates, and data type issues within queries. Implement data validation checks (row counts, value distributions, date ranges). Practice identifying and documenting data quality issues that impact analysis reliability.

Medium | Technical
Design a SQL-based audit report for a table 'sales' that returns metrics per column: total_rows, null_percentage, distinct_count, min/max (for dates/numbers), and duplicate_count for candidate keys. Provide the SQL pattern you would use to produce a single audit table.
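
One possible pattern is a UNION ALL with one branch per audited column, so every column reports the same metric set in a single result. This is a minimal sketch, not a definitive answer: the question does not list the columns of 'sales', so order_id (treated as the candidate key), sale_date, and amount are hypothetical names, and min/max are cast to text so the branches share one column type.

```sql
-- Hypothetical columns: order_id (candidate key), sale_date, amount.
-- duplicate_count = non-NULL rows beyond the first occurrence of each key value.
SELECT 'order_id' AS column_name,
       COUNT(*) AS total_rows,
       ROUND(100.0 * COUNT(*) FILTER (WHERE order_id IS NULL)
             / NULLIF(COUNT(*), 0), 2) AS null_percentage,
       COUNT(DISTINCT order_id) AS distinct_count,
       MIN(order_id)::text AS min_value,
       MAX(order_id)::text AS max_value,
       COUNT(order_id) - COUNT(DISTINCT order_id) AS duplicate_count
FROM sales
UNION ALL
SELECT 'sale_date',
       COUNT(*),
       ROUND(100.0 * COUNT(*) FILTER (WHERE sale_date IS NULL)
             / NULLIF(COUNT(*), 0), 2),
       COUNT(DISTINCT sale_date),
       MIN(sale_date)::text,
       MAX(sale_date)::text,
       NULL::bigint  -- not a candidate key, so no duplicate check
FROM sales
UNION ALL
SELECT 'amount',
       COUNT(*),
       ROUND(100.0 * COUNT(*) FILTER (WHERE amount IS NULL)
             / NULLIF(COUNT(*), 0), 2),
       COUNT(DISTINCT amount),
       MIN(amount)::text,
       MAX(amount)::text,
       NULL::bigint
FROM sales;
```

For wide tables, the branches could be generated from information_schema.columns rather than written by hand. Wrapping the statement in CREATE TABLE ... AS would persist the audit for comparison across loads.
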
Easy | Technical
You have a staging table 'transactions' (transaction_id INT, user_id INT, amount NUMERIC NULL, currency VARCHAR NULL, created_at TIMESTAMP). Write a SELECT in SQL (PostgreSQL) that returns the data with NULL amounts replaced by 0 and NULL currency replaced by 'USD' without modifying the underlying table. Explain why you might not want to UPDATE the raw staging table immediately.
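
A minimal sketch of one way to answer this, using COALESCE so the defaults exist only in the query result. The two *_was_null flag columns are an addition not requested by the question; they are included on the assumption that downstream users will want to tell imputed values from real ones.

```sql
SELECT transaction_id,
       user_id,
       COALESCE(amount, 0)       AS amount,    -- NULL amount shown as 0
       COALESCE(currency, 'USD') AS currency,  -- NULL currency shown as 'USD'
       (amount IS NULL)          AS amount_was_null,    -- hypothetical audit flag
       (currency IS NULL)        AS currency_was_null,  -- hypothetical audit flag
       created_at
FROM transactions;
```

Reasons to defer an UPDATE on the raw staging table typically include: a NULL amount and a genuine 0 mean different things, and overwriting erases that distinction; the untouched rows remain evidence when tracing an upstream defect; and the default rule (for example 'USD') may change, which is easy to fix in a query or view but hard to undo once written into the data.
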
Medium | Technical
During ETL you want to capture provenance: add columns source_file VARCHAR, ingest_timestamp TIMESTAMP, and row_hash to every row in table 'events'. Provide SQL to compute a deterministic row_hash (e.g., SHA256) of payload columns, show how to query for all rows from a given source_file, and discuss collision concerns.
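
The following sketch assumes PostgreSQL 11 or later, where sha256(bytea) is built in. The question does not name the payload columns, so event_type, user_id, and payload are hypothetical, as are the file name, the '|' delimiter, and the '<NULL>' sentinel.

```sql
-- Provenance columns; row_hash stores 64 hex characters of SHA-256 output.
ALTER TABLE events
    ADD COLUMN source_file      VARCHAR,
    ADD COLUMN ingest_timestamp TIMESTAMP,
    ADD COLUMN row_hash         CHAR(64);

-- Deterministic hash over the payload columns (hypothetical names).
-- concat_ws skips NULL arguments, so NULLs are replaced with a sentinel
-- first; otherwise (NULL, 'a') and ('a', NULL) would hash identically.
UPDATE events
SET row_hash = encode(
        sha256(convert_to(
            concat_ws('|',
                COALESCE(event_type,    '<NULL>'),
                COALESCE(user_id::text, '<NULL>'),
                COALESCE(payload::text, '<NULL>')),
            'UTF8')),
        'hex');

-- All rows that came from one source file (file name is illustrative).
SELECT *
FROM events
WHERE source_file = 'events_2024_01_01.csv';
```

On collisions: an accidental SHA-256 collision between different inputs is negligible at realistic table sizes. The more practical risk is ambiguous serialization, where different rows produce the same concatenated string. For example, ('a|b', 'c') and ('a', 'b|c') both serialize to 'a|b|c' with the delimiter above, so a delimiter that cannot occur in the data, or a length-prefixed encoding, is safer. Text casts of timestamps and floats should also be formatted explicitly if the hash must be stable across session settings.
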
Medium | Technical
Given a 'metrics' table (entity_id, ts, value NUMERIC), write SQL to flag per-entity rows that are statistical outliers using the IQR method: compute Q1 and Q3 per entity and mark rows where value is outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]. Provide the SQL and discuss handling few-sample entities.
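
One way to sketch this in PostgreSQL uses the percentile_cont ordered-set aggregate to get per-entity quartiles, then joins them back to the rows. The minimum sample size of 4 is an assumed threshold, not something the question specifies; below it the flag is left NULL ("not enough data") rather than FALSE.

```sql
WITH bounds AS (
    SELECT entity_id,
           COUNT(value) AS n,  -- non-NULL samples per entity
           percentile_cont(0.25) WITHIN GROUP (ORDER BY value) AS q1,
           percentile_cont(0.75) WITHIN GROUP (ORDER BY value) AS q3
    FROM metrics
    GROUP BY entity_id
)
SELECT m.entity_id,
       m.ts,
       m.value,
       CASE
           WHEN b.n < 4 THEN NULL  -- assumed cutoff: too few samples to judge
           WHEN m.value < b.q1 - 1.5 * (b.q3 - b.q1)
             OR m.value > b.q3 + 1.5 * (b.q3 - b.q1) THEN TRUE
           ELSE FALSE
       END AS is_outlier
FROM metrics m
JOIN bounds b ON b.entity_id = m.entity_id;
```

For entities with few samples, the quartiles are interpolated from very little data and the fences become unreliable. Options include leaving the flag NULL as above, falling back to bounds computed across all entities, or requiring a larger minimum count. Note also that when most values for an entity are identical, the IQR is 0 and any differing value is flagged.
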
Easy | Technical
You receive a raw 'sales_raw' table with column amount_text VARCHAR that should be numeric but includes garbage like 'N/A', '$1,234.56', and '1.234,56'. Write SQL (PostgreSQL) to return rows where amount_text cannot be safely cast to NUMERIC, and explain how you would detect locale and symbol issues.
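
A minimal sketch using a POSIX regular expression as a conservative definition of "safely castable": an optional sign, digits, and an optional '.' decimal part. This is deliberately stricter than the NUMERIC cast itself, which also accepts forms such as exponent notation and 'NaN'. NULL values are not returned, because `!~` yields NULL for them and a NULL casts without error.

```sql
-- Rows whose amount_text does not look like a plain decimal number.
SELECT *
FROM sales_raw
WHERE btrim(amount_text) !~ '^[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)$';

-- Rough classification of the failures to surface symbol and locale issues.
-- The category labels and the symbol list are illustrative assumptions.
SELECT CASE
           WHEN amount_text ~ '[$€£]'
               THEN 'currency symbol'
           WHEN btrim(amount_text) ~ '^[0-9]{1,3}(\.[0-9]{3})*,[0-9]+$'
               THEN 'decimal comma'
           WHEN btrim(amount_text) ~ '^[0-9]{1,3}(,[0-9]{3})+(\.[0-9]+)?$'
               THEN 'thousands comma'
           ELSE 'other'
       END AS issue,
       COUNT(*) AS row_count
FROM sales_raw
WHERE btrim(amount_text) !~ '^[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)$'
GROUP BY 1
ORDER BY row_count DESC;
```

With the examples from the question, '$1,234.56' would fall under 'currency symbol', '1.234,56' under 'decimal comma', and 'N/A' under 'other'. A value such as '1,234' is ambiguous between the two locales, so grouping the counts by source or region, if such a column exists, helps decide which convention a feed uses before normalizing.
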