InterviewStack.io

Aggregation Functions and Group By Questions

This topic covers the fundamentals of aggregation in SQL (Structured Query Language): the aggregate functions COUNT, SUM, AVG, MIN, and MAX, and how to use them to calculate row counts, totals, averages, minima, and maxima. It includes the GROUP BY clause, which groups rows by one or more dimensions such as customer, product, region, or time period to produce metrics like total revenue by month, average order value by product, or count of transactions by date. It also covers the HAVING clause, which filters groups after aggregation, and how it differs from WHERE, which filters rows before aggregation. Related topics commonly tested in interviews and practical problems include grouping by multiple columns, grouping on expressions and date truncation, using DISTINCT inside aggregates, handling NULL values, ordering and limiting grouped results, using aggregates in subqueries or derived tables, and basic performance considerations when aggregating large datasets. Practice examples include calculating monthly revenue, finding customers with more than a threshold number of orders, and identifying top products by sales.
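
As a minimal sketch of the WHERE versus HAVING distinction, the example below runs a grouped query against an in-memory SQLite database from Python. The `orders` table, its columns (`customer_id`, `created_at`, `amount`), the helper name `customers_over_threshold`, and the sample rows are all assumptions made for illustration; they do not come from any particular schema.

```python
import sqlite3


def customers_over_threshold(rows, min_orders, since):
    """Return (customer_id, order_count, total_amount) for customers with
    more than `min_orders` orders placed on or after `since`."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, assumed for this sketch only.
    conn.execute(
        "CREATE TABLE orders (customer_id INTEGER, created_at TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    query = """
        SELECT customer_id,
               COUNT(*)    AS order_count,
               SUM(amount) AS total_amount
        FROM orders
        WHERE created_at >= ?      -- row filter: applied before grouping
        GROUP BY customer_id
        HAVING COUNT(*) > ?        -- group filter: applied after aggregation
        ORDER BY total_amount DESC
    """
    result = conn.execute(query, (since, min_orders)).fetchall()
    conn.close()
    return result


sample_orders = [
    (1, "2024-01-05", 10.0),
    (1, "2024-02-10", 20.0),
    (1, "2023-12-30", 99.0),  # removed by WHERE when since = 2024-01-01
    (2, "2024-01-07", 50.0),  # customer 2 has one order, removed by HAVING
    (3, "2024-03-01", 5.0),
    (3, "2024-03-02", 5.0),
    (3, "2024-03-03", 5.0),
]

print(customers_over_threshold(sample_orders, 1, "2024-01-01"))
```

The 2023 order is dropped by WHERE before any counting happens, so it contributes to neither the count nor the sum, while customer 2 survives WHERE but is removed by HAVING because the group has only one row.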

Easy · Technical
57 practiced
Write an SQL query to return, per customer_id, first_purchase_date (MIN(created_at)), last_purchase_date (MAX(created_at)), and last_order_amount (amount of the most recent order). Provide the approach in ANSI SQL and an implementation using window functions if needed.
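
One possible shape for this query, sketched with window functions and run through SQLite from Python (window functions need SQLite 3.25 or later). The `orders` table layout and the `order_id` column used to break ties between orders with the same `created_at` are assumptions for the sake of the example, as is the helper name `purchase_summary`.

```python
import sqlite3

# ROW_NUMBER picks the most recent order per customer; the windowed
# MIN/MAX carry the first and last purchase dates onto that same row.
PURCHASE_SUMMARY_SQL = """
WITH ranked AS (
    SELECT
        customer_id,
        amount,
        MIN(created_at) OVER (PARTITION BY customer_id) AS first_purchase_date,
        MAX(created_at) OVER (PARTITION BY customer_id) AS last_purchase_date,
        ROW_NUMBER() OVER (
            PARTITION BY customer_id
            ORDER BY created_at DESC, order_id DESC  -- order_id: assumed tie-breaker
        ) AS rn
    FROM orders
)
SELECT customer_id, first_purchase_date, last_purchase_date,
       amount AS last_order_amount
FROM ranked
WHERE rn = 1
ORDER BY customer_id
"""


def purchase_summary(rows):
    """rows: (order_id, customer_id, created_at, amount) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER, customer_id INTEGER, created_at TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(PURCHASE_SUMMARY_SQL).fetchall()
    conn.close()
    return result


sample_rows = [
    (1, 1, "2024-01-05", 10.0),
    (2, 1, "2024-02-10", 25.0),
    (3, 2, "2024-01-07", 50.0),
    (4, 2, "2024-01-07", 70.0),  # same timestamp as order 3; higher order_id wins
]

for row in purchase_summary(sample_rows):
    print(row)
```

Without a deterministic tie-breaker in the ORDER BY, two orders sharing the latest `created_at` could yield either amount, which is why the sketch assumes an `order_id` column.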
Hard · Technical
49 practiced
You see a GROUP BY query running very slowly on Redshift that aggregates on product_id and date. Describe a step-by-step plan to profile and optimize it: how to interpret EXPLAIN/ANALYZE output, distribution key and sort key (DISTKEY/SORTKEY) choices, VACUUM and ANALYZE maintenance, and possible rewrites using materialized views or pre-aggregations.
Medium · Technical
41 practiced
Explain strategies to optimize GROUP BY queries on high-cardinality columns (e.g., billions of distinct user_ids). Discuss pre-aggregation, hashing, approximate aggregation, partitioning, and when to use each strategy with examples.
Medium · Technical
55 practiced
Write a SQL query that computes average order value (AOV) per product while excluding customers whose only orders have NULL amounts. Explain how AVG behaves and how to filter customers with only NULL order amounts before computing AOV.
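
A hedged sketch of one reading of this question, run through SQLite from Python. AVG skips NULL inputs, so rows with NULL amounts never change the average itself; the explicit customer filter mainly affects row counts and whether a product ordered only by such customers appears at all. The `orders` columns, the helper name `aov_per_product`, and the sample data are assumptions for illustration.

```python
import sqlite3

# COUNT(amount) counts only non-NULL amounts, so HAVING COUNT(amount) > 0
# keeps customers who have at least one order with a known amount.
AOV_SQL = """
SELECT o.product_id,
       AVG(o.amount)   AS aov,            -- NULL amounts are ignored by AVG
       COUNT(*)        AS order_rows,     -- all remaining rows
       COUNT(o.amount) AS priced_orders   -- rows with a non-NULL amount
FROM orders AS o
WHERE o.customer_id IN (
    SELECT customer_id
    FROM orders
    GROUP BY customer_id
    HAVING COUNT(amount) > 0
)
GROUP BY o.product_id
ORDER BY o.product_id
"""


def aov_per_product(rows):
    """rows: (customer_id, product_id, amount) tuples; amount may be None."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (customer_id INTEGER, product_id TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    result = conn.execute(AOV_SQL).fetchall()
    conn.close()
    return result


sample_rows = [
    (1, "A", 10.0),
    (1, "A", None),  # customer 1 also has a priced order, so this row stays
    (2, "A", 30.0),
    (3, "A", None),  # customer 3 has only NULL amounts: excluded entirely
    (3, "B", None),
    (2, "C", 8.0),
]

for row in aov_per_product(sample_rows):
    print(row)
```

In this sample, product B disappears from the result because its only orders belong to the excluded customer; without the filter it would appear with a NULL average.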
Hard · System Design
44 practiced
You have aggregates stored in multiple shards/databases (shard_id 1..N). Propose a reliable approach to compute global group-by aggregates (e.g., total revenue per product) across shards that minimizes network overhead and remains correct if shards may lag. Include SQL-level partial aggregation examples and architectural considerations.
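
The partial-aggregation part of such a design can be sketched as follows, using one in-memory SQLite database per shard as a stand-in. Each shard returns only one row per product (a sum and a count), and a coordinator merges them; averages are derived from the merged sum and count rather than by averaging per-shard averages. The table layout and the helper names `make_shard` and `global_revenue` are assumptions for illustration, and shard lag handling is not modeled here.

```python
import sqlite3
from collections import defaultdict

# Runs on every shard: ships one small row per product instead of raw orders.
PARTIAL_SQL = """
SELECT product_id,
       SUM(amount)   AS revenue_sum,
       COUNT(amount) AS revenue_count
FROM orders
GROUP BY product_id
"""


def make_shard(rows):
    """Create a stand-in shard holding (product_id, amount) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (product_id TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


def global_revenue(shards):
    """Merge per-shard partial aggregates into global totals per product."""
    totals = defaultdict(lambda: [0.0, 0])
    for conn in shards:
        for product_id, revenue_sum, revenue_count in conn.execute(PARTIAL_SQL):
            # SUM is NULL when every amount in the group is NULL.
            totals[product_id][0] += revenue_sum or 0.0
            totals[product_id][1] += revenue_count
    return {
        pid: {"total": s, "count": c, "avg": (s / c if c else None)}
        for pid, (s, c) in sorted(totals.items())
    }


shards = [
    make_shard([("A", 10.0), ("A", 20.0), ("B", 5.0)]),
    make_shard([("A", 60.0), ("B", 15.0)]),
]
result = global_revenue(shards)
print(result)
```

For product A the merged average is 90 / 3 = 30, whereas averaging the two shard averages (15 and 60) would give 37.5, which is why the sketch ships sums and counts instead of averages.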
