InterviewStack.io

Aggregation Functions and Group By Questions

This topic covers the fundamentals of aggregation in SQL (Structured Query Language): the aggregate functions COUNT, SUM, AVG, MIN, and MAX, and how to use them to calculate row counts, totals, averages, minima, and maxima. It includes the GROUP BY clause, which groups rows by one or more dimensions such as customer, product, region, or time period to produce metrics like total revenue by month, average order value by product, or count of transactions by date. It also covers the HAVING clause, which filters groups after aggregation, and how it differs from WHERE, which filters rows before aggregation. Related topics commonly tested in interviews and practical problems include grouping by multiple columns, grouping on expressions and date truncation, using DISTINCT inside aggregates, handling NULL values, ordering and limiting grouped results, using aggregates in subqueries or derived tables, and basic performance considerations when aggregating large datasets. Practice examples include calculating monthly revenue, finding customers with more than a threshold number of orders, and identifying top products by sales.
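The difference between WHERE and HAVING is easiest to see in a small runnable case. The sketch below uses Python's built-in sqlite3 module with a hypothetical `orders` table; the table name, columns, and sample rows are assumptions made for illustration and are not part of the question bank. It finds customers with at least two paid orders, one of the practice patterns mentioned above.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "alice", 50.0, "paid"),
        (2, "alice", 30.0, "paid"),
        (3, "alice", 20.0, "cancelled"),
        (4, "bob", 100.0, "paid"),
        (5, "carol", 10.0, "paid"),
        (6, "carol", 15.0, "paid"),
        (7, "carol", 5.0, "paid"),
    ],
)

query = """
SELECT customer,
       COUNT(*)    AS order_count,
       SUM(amount) AS total_revenue
FROM orders
WHERE status = 'paid'      -- row filter: runs before grouping
GROUP BY customer
HAVING COUNT(*) >= 2       -- group filter: runs after aggregation
ORDER BY total_revenue DESC
"""
rows = conn.execute(query).fetchall()
for customer, order_count, total_revenue in rows:
    print(customer, order_count, total_revenue)
```

In this sketch the cancelled order is dropped by WHERE before any counting happens, while bob is dropped by HAVING because his group contains only one paid order. Moving the `COUNT(*) >= 2` condition into WHERE would be an error, since aggregates are not available at the row-filtering stage.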

Medium · Technical
Explain GROUPING SETS, ROLLUP, and CUBE in SQL and give an example that computes revenue by region, by product, and by both region and product in a single query. Describe when using these constructs is preferable to running multiple separate GROUP BY queries in BI workflows.
Medium · Technical
Describe how Looker persistent derived tables (PDTs) can be used to precompute expensive GROUP BY aggregates for dashboards. Explain scheduling strategies, trade-offs between PDTs and on-the-fly derived tables, how to control refresh frequency, and provide a simple SQL transform example that computes daily_revenue_by_region to be persisted as a PDT.
Medium · Technical
Write a query that computes daily revenue broken down by region and product_category for the last 90 days. Table schema: sales(order_id, sold_at date, region varchar, product_category varchar, amount decimal). Return columns: sold_date, region, product_category, total_revenue. Explain how grouping by multiple columns helps BI filtering and suggest strategies to reduce cardinality for interactive dashboards.
Hard · System Design
Design a star schema for a retail analytics warehouse focused on efficient aggregations for BI. Define the fact and dimension tables, the fact grain, the surrogate key strategy, and the handling of slowly changing dimensions (SCD Type 2). Explain how this design speeds up GROUP BY queries compared to a highly normalized OLTP schema.
Easy · Technical
Write SQL to return the top 10 products by total sales value for the current year using sales(product_id, order_id, amount, sold_at). Include product_id, total_sales, and rank in the result. Discuss whether you would use ORDER BY + LIMIT or a window function for a dashboard that displays Top-N results, and the trade-offs between the two approaches.
