Query Optimization and Execution Plans Questions

Focuses on diagnosing slow queries and reducing execution cost through analysis of query execution plans and systematic query rewrites. Candidates should be able to read and interpret EXPLAIN output and execution plans, including identifying expensive operators such as sequential table scans, index scans, sorts, nested loop joins, hash joins, and merge joins, and explaining why those operators appear. Core skills include cost and cardinality estimation; understanding join order and predicate placement; predicate pushdown and selectivity reasoning; comparing EXISTS versus IN versus JOIN patterns; and identifying common anti-patterns such as N+1 queries. The topic covers profiling and benchmarking approaches using EXPLAIN ANALYZE and runtime statistics; comparing estimated and actual row counts; proposing and validating query rewrites and configuration or schema changes; and reasoning about trade-offs when using materialized views, caching, denormalization, or partitioning to improve performance. Candidates should present step-by-step approaches to diagnose problems, measure improvements, and assess impact on other workloads.

Easy | Technical

71 practiced

Given the following Postgres EXPLAIN (ANALYZE, BUFFERS) output snippet for a reporting query, identify the single most expensive operator, explain why it appears in this plan, and list two quick mitigation steps you would try as a data analyst.

Hash Join  (cost=1000.00..5000.00 rows=100000 width=64) (actual time=200.00..450.00 rows=95000 loops=1)
  Hash Cond: (orders.customer_id = customers.id)
  -> Seq Scan on orders  (cost=0.00..3000.00 rows=500000 width=48) (actual time=0.02..150.00 rows=500000 loops=1)
  -> Hash  (cost=700.00..700.00 rows=100000 width=24) (actual time=180.00..180.00 rows=100000 loops=1)
        -> Seq Scan on customers  (cost=0.00..700.00 rows=100000 width=24) (actual time=0.01..40.00 rows=100000 loops=1)

Answer should reference cost and actual times and explain practical steps you would take next.
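
One way to approach the "most expensive operator" part is to separate each node's own time from the time it inherits from its children. The sketch below hand-copies the upper `actual time` bound of each node from the snippet above and subtracts child totals. `PlanNode`, `self_ms`, and `walk` are illustrative names, not a Postgres API, and the subtraction is only an approximation that is reasonable here because every node reports `loops=1`.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    name: str
    actual_total_ms: float          # upper bound of "actual time=a..b" (per loop)
    loops: int = 1
    children: list = field(default_factory=list)

    def inclusive_ms(self) -> float:
        # EXPLAIN ANALYZE reports per-loop times, so scale by the loop count.
        return self.actual_total_ms * self.loops

    def self_ms(self) -> float:
        # Approximate exclusive time: this node's total minus its children's totals.
        return self.inclusive_ms() - sum(c.inclusive_ms() for c in self.children)


def walk(node):
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


# Values copied by hand from the plan snippet in the question.
plan = PlanNode("Hash Join", 450.0, children=[
    PlanNode("Seq Scan on orders", 150.0),
    PlanNode("Hash", 180.0, children=[
        PlanNode("Seq Scan on customers", 40.0),
    ]),
])

self_times = {node.name: node.self_ms() for node in walk(plan)}
worst = max(self_times, key=self_times.get)

for name, ms in sorted(self_times.items(), key=lambda kv: -kv[1]):
    print(f"{name:<24} {ms:7.1f} ms")
print("largest self time:", worst)
```

Under this approximation the sequential scan on `orders` accounts for about 150 ms of its own, ahead of building the hash (about 140 ms) and the join itself (about 120 ms). Estimated and actual row counts on the join are close (100000 vs. 95000), which suggests the plan reflects data volume rather than a misestimate.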

Hard | System Design

68 practiced

Design a reproducible automated test that will fail if any PR introduces a query plan regression for a critical analytical SQL used in reporting. Describe where tests live, how they run (CI, nightly), what baseline data to use, how to store baseline plans and metrics, and how you would present failures to the team so they are actionable.
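
As a minimal sketch of the "store baseline plans" idea, one option is to reduce `EXPLAIN (FORMAT JSON)` output to a structural fingerprint that ignores costs and timings, then fail the test when the fingerprint changes. The function names `plan_shape` and `load_shape` are hypothetical, and the two JSON documents are hand-written samples trimmed to the keys the code reads (`Plan`, `Node Type`, `Relation Name`, `Plans`).

```python
import json


def plan_shape(node):
    """Reduce a plan node to (node type, relation, child shapes), dropping costs and timings."""
    children = tuple(plan_shape(child) for child in node.get("Plans", []))
    return (node["Node Type"], node.get("Relation Name"), children)


def load_shape(explain_json_text):
    # EXPLAIN (FORMAT JSON) returns a one-element array whose object has a "Plan" key.
    return plan_shape(json.loads(explain_json_text)[0]["Plan"])


# Hand-written sample: the baseline plan that would be checked into the repository.
baseline_text = json.dumps([{"Plan": {
    "Node Type": "Hash Join",
    "Plans": [
        {"Node Type": "Seq Scan", "Relation Name": "orders"},
        {"Node Type": "Hash", "Plans": [
            {"Node Type": "Seq Scan", "Relation Name": "customers"},
        ]},
    ],
}}])

# Hand-written sample: the plan produced on the PR branch.
candidate_text = json.dumps([{"Plan": {
    "Node Type": "Nested Loop",
    "Plans": [
        {"Node Type": "Seq Scan", "Relation Name": "orders"},
        {"Node Type": "Index Scan", "Relation Name": "customers"},
    ],
}}])

baseline_shape = load_shape(baseline_text)
candidate_shape = load_shape(candidate_text)
shape_changed = baseline_shape != candidate_shape
print("plan shape changed:", shape_changed)
```

A shape change is not always a regression, so a real test would likely pair this check with a timing or buffer-count threshold and print both plans side by side in the failure message.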

Hard | Technical

88 practiced

A dashboard occasionally becomes slow in production even though the underlying query hasn't changed. You suspect plan cache churn or statistics updates cause plan regressions intermittently. Design a monitoring and mitigation strategy that lets you detect, reproduce, and remedy these intermittent performance regressions with minimal user impact.

Hard | Technical

95 practiced

Explain parameter sniffing and how prepared statement plan caching can cause a query to have a bad cached plan for different parameter values. Give examples for both SQL Server (parameter sniffing) and Postgres (prepared plan) and propose at least three mitigation strategies, explaining pros/cons of each.
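
The mechanism can be illustrated with a toy cost model. This is not how either engine computes costs: the table size, the random I/O factor, and the `CachedStatement` class are all hypothetical and exist only to show why a plan chosen for a selective parameter value hurts when reused for a non-selective one.

```python
TOTAL_ROWS = 1_000_000     # hypothetical table size
RANDOM_IO_FACTOR = 4.0     # hypothetical: an index lookup costs 4x a sequential row read


def cost(plan, matching_rows):
    """Toy cost: index scans pay per matching row, sequential scans read everything."""
    if plan == "index_scan":
        return matching_rows * RANDOM_IO_FACTOR
    return float(TOTAL_ROWS)


def best_plan(matching_rows):
    """Pick the cheaper plan for a known number of matching rows."""
    return min(("index_scan", "seq_scan"), key=lambda p: cost(p, matching_rows))


class CachedStatement:
    """Plans once, for the first parameter value seen, then reuses that plan."""

    def __init__(self):
        self.plan = None

    def execute(self, matching_rows):
        if self.plan is None:
            self.plan = best_plan(matching_rows)   # the "sniffed" plan
        return self.plan, cost(self.plan, matching_rows)


stmt = CachedStatement()
rare = stmt.execute(matching_rows=100)          # first call fixes the plan
common = stmt.execute(matching_rows=900_000)    # same plan reused for a skewed value
replanned = best_plan(900_000)                  # what a fresh plan would choose

print("rare value:  ", rare)
print("common value:", common)
print("fresh plan for common value:", replanned)
```

The toy models plan-once-and-reuse caching, which is the behavior usually described as parameter sniffing in SQL Server. Postgres prepared statements behave differently by default: the first several executions use custom plans, and only afterwards may the server switch to a generic plan, a choice that can be influenced with the `plan_cache_mode` setting.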

Medium | Technical

94 practiced

A report uses a WHERE clause like `LOWER(email) = 'alice@example.com'`, which prevents use of a plain B-tree index on `email`. As a data analyst, suggest at least three different ways to regain index-backed performance for lookups by normalized email, and explain how each works and its trade-offs (storage, complexity, backward compatibility).
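
One of the possible approaches, an index on the expression itself, can be tried out quickly. The sketch below uses Python's built-in `sqlite3` module as a stand-in so it runs without a server; the `users` table and the index name are hypothetical. Postgres accepts the same kind of definition, `CREATE INDEX ... ON users (lower(email))`, although its planner output looks different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"User{i}@Example.com",) for i in range(1000)],
)

QUERY = "SELECT id FROM users WHERE lower(email) = ?"


def plan_details(connection):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ("user42@example.com",))
    return " | ".join(row[-1] for row in rows)


plan_before = plan_details(conn)

# Index the normalized expression so the predicate matches the index definition.
conn.execute("CREATE INDEX idx_users_email_lower ON users (lower(email))")

plan_after = plan_details(conn)
matches = conn.execute(QUERY, ("user42@example.com",)).fetchall()

print("before:", plan_before)
print("after: ", plan_after)
print("matching ids:", matches)
```

The trade-off in this variant is extra storage and write overhead for the additional index, and the query must use the same expression as the index definition for the planner to match it.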
