InterviewStack.io

Aggregation and Grouping Questions

Covers SQL grouping and aggregation concepts used to summarize data across rows. Key skills include using GROUP BY with aggregate functions such as COUNT, SUM, AVG, MIN, and MAX; counting distinct values; and filtering grouped results with HAVING, including the difference between WHERE (which filters rows before grouping) and HAVING (which filters groups after aggregation). Candidates should demonstrate correct handling of NULL values in aggregates, grouping by expressions and multiple columns, and writing multi-level aggregations using ROLLUP, CUBE, and GROUPING SETS. Also important are knowing when to use subqueries or common table expressions for intermediate aggregation, the difference between aggregate functions and window functions, and how grouping interacts with joins and data types. Interview questions may test query correctness, edge cases, performance considerations such as appropriate indexes and query plans, and the ability to translate business questions such as "who are the top customers?" or "which categories have declining sales?" into correct aggregated SQL.
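
As a small illustration of two of the points above (NULL handling in aggregates, and WHERE versus HAVING), here is a minimal sketch. It runs the SQL through Python's built-in sqlite3 module only so that it is executable; the table contents are invented for the example, and SQLite stands in for Postgres/MySQL here.

```python
import sqlite3

# Hypothetical sample data: one NULL amount, one duplicated amount.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id INTEGER, amount NUMERIC);
    INSERT INTO orders VALUES (1, 10), (1, NULL), (2, 5), (2, 5), (3, 100);
""")

# COUNT(*) counts rows; COUNT(col), COUNT(DISTINCT col), SUM, and AVG skip NULLs.
null_demo = conn.execute("""
    SELECT COUNT(*), COUNT(amount), COUNT(DISTINCT amount),
           SUM(amount), AVG(amount)
    FROM orders
""").fetchone()

# WHERE filters individual rows before grouping;
# HAVING filters whole groups after aggregation.
big_spenders = conn.execute("""
    SELECT customer_id, SUM(amount) AS total
    FROM orders
    WHERE amount IS NOT NULL      -- row-level filter
    GROUP BY customer_id
    HAVING SUM(amount) > 10       -- group-level filter
    ORDER BY customer_id
""").fetchall()

print(null_demo)
print(big_spenders)
```

Note that AVG divides by the number of non-NULL values (4 here), not by the row count (5), which is a frequent source of wrong answers.
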

Easy / Technical
Given orders(customer_id, amount numeric, order_date timestamp), write a SQL query to return the top 5 customers by total spend over the past 12 months. Break ties deterministically by customer_id ascending. Use standard SQL (works in Postgres/MySQL).
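
One possible shape for an answer, sketched under assumptions: the query is run through Python's sqlite3 so it can be executed, the sample rows are invented, and the hypothetical `as_of` parameter replaces the current time to keep the result reproducible. The date arithmetic is the dialect-specific part: SQLite uses `datetime(x, '-12 months')`, whereas Postgres would use `now() - interval '12 months'` and MySQL `NOW() - INTERVAL 12 MONTH`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id INTEGER, amount NUMERIC, order_date TEXT)"
)
# Hypothetical rows; customers 1 and 2 tie, and one order is older than 12 months.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 100, "2024-01-10 00:00:00"),
        (2, 100, "2024-02-10 00:00:00"),
        (3, 50, "2024-03-01 00:00:00"),
        (3, 500, "2022-01-01 00:00:00"),  # outside the window, must be ignored
        (4, 70, "2024-03-05 00:00:00"),
        (5, 60, "2024-04-01 00:00:00"),
        (6, 40, "2024-04-02 00:00:00"),
        (7, 30, "2024-05-01 00:00:00"),
    ],
)

TOP_CUSTOMERS_SQL = """
    SELECT customer_id, SUM(amount) AS total_spend
    FROM orders
    WHERE order_date >= datetime(:as_of, '-12 months')
    GROUP BY customer_id
    ORDER BY total_spend DESC, customer_id ASC   -- tie-break on customer_id
    LIMIT 5
"""

top5 = conn.execute(TOP_CUSTOMERS_SQL, {"as_of": "2024-06-30 00:00:00"}).fetchall()
print(top5)
```

The secondary sort key is what makes the result deterministic: without it, customers 1 and 2 could appear in either order.
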
Medium / Technical
Given a table orders(order_id, items jsonb), where items is a JSON array of objects {"product_id":..., "price":..., "qty":...}, write PostgreSQL SQL to extract the items, unnest the JSON arrays, and compute total revenue per product_id across all orders. Discuss performance considerations and alternatives.
Medium / Technical
Given sales(product_id, sale_date date, amount numeric), write SQL to pivot monthly sales for Jan, Feb, and Mar (for a given year) into columns product_id, jan, feb, mar using CASE aggregates. Provide a Postgres-compatible query and explain how you would scale to 12 months.
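
A minimal sketch of the CASE-aggregate pivot, with assumptions stated up front: the helper names (`month_bounds`, `pivot_query`) and the sample rows are hypothetical, and the SQL is executed through Python's sqlite3 purely so the example runs. The generated query uses only half-open date-range comparisons, which should also be valid in Postgres. Generating the column list from a list of months is one way to scale from 3 to 12 columns without hand-writing each CASE.

```python
import sqlite3
from datetime import date

MONTH_NAMES = ["jan", "feb", "mar", "apr", "may", "jun",
               "jul", "aug", "sep", "oct", "nov", "dec"]

def month_bounds(year, month):
    """Return (first day of month, first day of next month) as ISO strings."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start.isoformat(), end.isoformat()

def pivot_query(year, months):
    """Build a pivot query; assumes `months` is a sorted, contiguous list of ints.

    Only integers produced here are interpolated, never user-supplied text.
    """
    cols = []
    for m in months:
        start, end = month_bounds(year, m)
        cols.append(
            f"SUM(CASE WHEN sale_date >= '{start}' AND sale_date < '{end}' "
            f"THEN amount ELSE 0 END) AS {MONTH_NAMES[m - 1]}"
        )
    first, _ = month_bounds(year, months[0])
    _, last = month_bounds(year, months[-1])
    return (
        f"SELECT product_id, {', '.join(cols)} FROM sales "
        f"WHERE sale_date >= '{first}' AND sale_date < '{last}' "
        "GROUP BY product_id ORDER BY product_id"
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product_id INTEGER, sale_date TEXT, amount NUMERIC)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (1, "2024-01-15", 10), (1, "2024-01-20", 5), (1, "2024-03-02", 7),
        (2, "2024-02-10", 20),
        (2, "2024-04-01", 99),  # April: outside Jan-Mar
        (2, "2023-02-10", 50),  # previous year
    ],
)

pivot_rows = conn.execute(pivot_query(2024, [1, 2, 3])).fetchall()
print(pivot_rows)
```

Calling `pivot_query(2024, list(range(1, 13)))` would produce all twelve month columns with the same pattern.
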
Medium / Technical
You manage a 500M-row events table: events(user_id, event_type, occurred_at timestamp, properties jsonb). A typical query counts events per day per event_type over a 90-day window. Describe index and partitioning strategies and storage recommendations (columnar vs. row) to speed up aggregation queries. Include concrete index definitions and partition keys.
Hard / System Design
You need to aggregate daily active users per country from 100 billion events stored across a distributed cluster. Describe a distributed aggregation plan: map-side partial aggregation, shuffle/partitioning strategy, reduce-side final aggregation, how to minimize network traffic, and strategies to bound memory usage during group-by.
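
The three phases named in the question can be simulated on a single machine to make the data flow concrete. This is a toy sketch, not a real cluster job: the function names, the event layout `(day, country, user_id)`, and the choice of CRC32 for partitioning are all assumptions made for illustration. It computes exact distinct counts; when exact per-key sets do not fit in memory, approximate sketches such as HyperLogLog are a commonly used alternative.

```python
import zlib
from collections import Counter

def map_side_dedup(partition):
    """Map side: collapse duplicate (day, country, user_id) events locally,
    so repeated events from one user never cross the network."""
    return set(partition)

def shuffle(partials, num_reducers):
    """Route each distinct key to one reducer by hashing the full key.
    Hashing on (day, country, user_id) rather than country alone spreads
    large countries across reducers and limits skew."""
    buckets = [set() for _ in range(num_reducers)]
    for partial in partials:
        for key in partial:
            idx = zlib.crc32(repr(key).encode("utf-8")) % num_reducers
            buckets[idx].add(key)  # global dedup: same key lands in same bucket
    return buckets

def reduce_counts(bucket):
    """Reduce side: count distinct users per (day, country) in this bucket."""
    return Counter((day, country) for day, country, _user in bucket)

def daily_active_users(partitions, num_reducers=4):
    partials = [map_side_dedup(p) for p in partitions]
    total = Counter()
    for bucket in shuffle(partials, num_reducers):
        total.update(reduce_counts(bucket))  # final merge of small partial counts
    return dict(total)

# Hypothetical input: two partitions with overlapping users.
partitions = [
    [("2024-01-01", "US", "u1"), ("2024-01-01", "US", "u1"), ("2024-01-01", "DE", "u2")],
    [("2024-01-01", "US", "u1"), ("2024-01-01", "US", "u3"), ("2024-01-02", "US", "u1")],
]
dau = daily_active_users(partitions)
print(dau)
```

Because every copy of a given key is routed to the same reducer, each user is counted once per day and country, and the final merge only sums small per-reducer counters.
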
