InterviewStack.io

Advanced SQL: Window Functions & CTEs for Complex Analysis Questions

This topic covers advanced SQL techniques built on window functions (ROW_NUMBER, RANK, DENSE_RANK, etc.) and common table expressions (CTEs), including recursive queries. Typical applications are ranking and analytics patterns, cumulative totals, and multi-step data transformations in relational databases and data warehouses.

Medium · Technical
Write SQL to compute the daily cumulative conversion rate for an A/B test. Table: `events(user_id, variant TEXT, event_ts TIMESTAMP, converted BOOLEAN)`. The output should have one row per day and variant with `cumulative_conversions`, `cumulative_users`, `conversion_rate`, and an approximate 95% confidence interval for the conversion rate. Use window functions for the cumulative sums.
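One possible approach, sketched under stated assumptions rather than offered as a reference answer: count each user once on the first day they appear and once on the first day they convert, then take running sums per variant. The sketch runs the SQL through Python's `sqlite3` (SQLite 3.25 or later for window functions) so it can be executed as-is. It assumes `converted` is stored as 0/1, each user belongs to one variant, and the interval is the normal-approximation (Wald) interval clamped to [0, 1]. The `sqrt` function is registered from Python because SQLite builds do not always include it, and only days with activity produce a row.

```python
import math
import sqlite3

# Sketch only: SQLite stands in for the warehouse; `converted` is stored as 0/1.
QUERY = """
WITH first_seen AS (            -- first day each user appears in a variant
  SELECT variant, user_id, MIN(date(event_ts)) AS day
  FROM events GROUP BY variant, user_id
),
first_conv AS (                 -- first day each user converts
  SELECT variant, user_id, MIN(date(event_ts)) AS day
  FROM events WHERE converted GROUP BY variant, user_id
),
daily AS (                      -- new users and new converters per day
  SELECT variant, day, SUM(new_users) AS new_users, SUM(new_convs) AS new_convs
  FROM (
    SELECT variant, day, 1 AS new_users, 0 AS new_convs FROM first_seen
    UNION ALL
    SELECT variant, day, 0, 1 FROM first_conv
  ) AS u
  GROUP BY variant, day
),
cum AS (                        -- running totals via window functions
  SELECT variant, day,
         SUM(new_convs) OVER (PARTITION BY variant ORDER BY day
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cumulative_conversions,
         SUM(new_users) OVER (PARTITION BY variant ORDER BY day
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cumulative_users
  FROM daily
),
rates AS (
  SELECT *, 1.0 * cumulative_conversions / cumulative_users AS conversion_rate
  FROM cum
)
SELECT variant, day, cumulative_conversions, cumulative_users, conversion_rate,
       MAX(0.0, conversion_rate
           - 1.96 * sqrt(conversion_rate * (1 - conversion_rate) / cumulative_users)) AS ci_low,
       MIN(1.0, conversion_rate
           + 1.96 * sqrt(conversion_rate * (1 - conversion_rate) / cumulative_users)) AS ci_high
FROM rates
ORDER BY variant, day
"""


def cumulative_conversion(rows):
    """Load (user_id, variant, event_ts, converted) rows and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("sqrt", 1, math.sqrt)  # not every SQLite build ships sqrt
    conn.execute(
        "CREATE TABLE events (user_id INTEGER, variant TEXT, event_ts TEXT, converted INTEGER)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
    return conn.execute(QUERY).fetchall()


# Made-up sample data for illustration.
SAMPLE = [
    (1, "A", "2024-01-01 09:00:00", 0),
    (2, "A", "2024-01-01 10:00:00", 1),
    (1, "A", "2024-01-02 11:00:00", 1),
    (3, "B", "2024-01-01 12:00:00", 0),
    (4, "B", "2024-01-02 13:00:00", 0),
]
result = cumulative_conversion(SAMPLE)
```

Counting first appearances keeps `cumulative_users` a count of distinct users without needing `COUNT(DISTINCT ...)` inside a window, which many engines do not support. With small samples the Wald interval is poor; a Wilson interval would be a reasonable substitute.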
Medium · Technical
You're computing daily aggregates aligned to each user's local midnight. Given `events(user_id INT, ts_utc TIMESTAMP)` and `users(user_id INT, tz TEXT)`, where `tz` stores an IANA timezone, write SQL that converts each timestamp to the user's local date and computes daily event counts per user, handling DST transitions correctly.
Easy · Technical
Explain the difference between SQL aggregate functions (GROUP BY) and window/analytic functions (OVER(...)). As a Machine Learning Engineer building feature pipelines, describe three concrete use cases where window functions (ROW_NUMBER, LAG/LEAD, RANK, SUM() OVER ...) are preferable to GROUP BY aggregates. For each use case, explain why window functions avoid row-level collapse and how that helps feature creation.
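To make the contrast concrete, here is a minimal sketch, again using Python's `sqlite3` so it can be run directly. The `purchases` table and its rows are hypothetical and exist only for illustration. The `GROUP BY` query collapses each user to one row, while the window query keeps every row and attaches per-row features: an event rank, the previous amount, and the spend strictly before the current event.

```python
import sqlite3

# Hypothetical table and data, used only to illustrate the contrast.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (user_id INTEGER, ts TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", 10),
        (1, "2024-01-03", 20),
        (1, "2024-01-04", 5),
        (2, "2024-01-02", 7),
    ],
)

# GROUP BY: one output row per user; the individual events are gone.
grouped = conn.execute(
    "SELECT user_id, SUM(amount) FROM purchases GROUP BY user_id ORDER BY user_id"
).fetchall()

# Window functions: one output row per input row, each with its own context.
windowed = conn.execute(
    """
    SELECT user_id, ts, amount,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY ts) AS event_rank,
           LAG(amount)  OVER (PARTITION BY user_id ORDER BY ts) AS prev_amount,
           SUM(amount)  OVER (PARTITION BY user_id ORDER BY ts
             ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prior_spend
    FROM purchases
    ORDER BY user_id, ts
    """
).fetchall()
```

The frame ending at `1 PRECEDING` excludes the current row, so `prior_spend` only sees earlier events. That is one way to keep a feature free of information from the row being predicted.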
Hard · Technical
Write SQL to generate positive training sequences of length 10 for recommendation models plus 4 negative samples per sequence. Inputs: `interactions(user_id, item_id, ts)` and `items(item_id)`. Output columns: `user_id, seq_items ARRAY, pos_item, neg_items ARRAY`. Use window functions, CTEs, and a reproducible negative sampling approach. Comment on performance implications.
Hard · Technical
Provide a SQL derivation and a working implementation that computes an exponentially time-decayed sum per user without using a recursive CTE, by transforming the recurrence into a closed form. Input: `events(user_id, date, amount)`, where decay is applied per day. Explain the numerical pitfalls and how you would implement this efficiently on large datasets.
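A sketch of one way the closed form can be set up, under assumptions of my own. Let d be the number of days since the user's first event and k the per-day decay factor. Unrolling the recurrence S(d) = amount(d) + k^(d - d_prev) * S(d_prev) gives S(d) = sum over s <= d of amount(s) * k^(d - s), which factors as k^d * (running sum of amount(s) * k^(-s)). The inner term is an ordinary cumulative `SUM() OVER`. The example runs through Python's `sqlite3`, registers `pow` from Python because SQLite builds do not always include it, and assumes one pre-aggregated row per user per day. The decay value 0.5 is arbitrary.

```python
import math
import sqlite3

# Sketch only: assumes one row per user per day and a per-day decay factor in (0, 1].
QUERY = """
WITH indexed AS (               -- day offset from each user's first event
  SELECT user_id, date, amount,
         CAST(ROUND(julianday(date)
              - MIN(julianday(date)) OVER (PARTITION BY user_id)) AS INTEGER) AS d
  FROM events
)
SELECT user_id, date,
       pow(:decay, d) *                                    -- k^d
       SUM(amount * pow(:decay, -d)) OVER (                -- running sum of amount * k^(-s)
         PARTITION BY user_id ORDER BY d
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS decayed_sum
FROM indexed
ORDER BY user_id, d
"""


def decayed_sums(rows, decay):
    """Load (user_id, date, amount) rows and return (user_id, date, decayed_sum)."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("pow", 2, math.pow)  # not every SQLite build ships pow
    conn.execute("CREATE TABLE events (user_id INTEGER, date TEXT, amount REAL)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    return conn.execute(QUERY, {"decay": decay}).fetchall()


# Made-up sample: offsets 0, 1 and 3 days for user 1.
SAMPLE = [
    (1, "2024-01-01", 8.0),
    (1, "2024-01-02", 4.0),
    (1, "2024-01-04", 2.0),
    (2, "2024-01-01", 3.0),
]
decayed = decayed_sums(SAMPLE, 0.5)
```

The main numerical pitfall is visible in the query: k^(-d) grows without bound as d increases, so long histories can overflow or lose precision. Measuring d from each user's first event, as done here, delays the problem but does not remove it. Splitting history into bounded windows and carrying a decayed total between them, or working in log space, are possible mitigations.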
