SQL for Data Analysis Questions
Using SQL as a tool for data analysis and reporting. Focuses on writing queries to extract metrics, perform aggregations, join disparate data sources, use subqueries and window functions for trends and rankings, and prepare data for dashboards and reports. Includes best practices for reproducible analytical queries, handling time series and date arithmetic, basic query optimization considerations for analytic workloads, and when to use SQL versus built in reporting tools in analytics platforms.
MediumTechnical
74 practiced
Pivot / crosstab requirement: Produce monthly revenue per region for the last 6 months in a single row per region with columns: region, rev_YYYY_MM (one column per month). You can use conditional aggregation (CASE) or vendor-specific PIVOT. Given schema:Write the SQL (choose any dialect) and explain how you'd keep this crosstab efficient for dashboards.
sql
orders(order_id, region varchar, order_date date, total_amount numeric)MediumTechnical
68 practiced
Given a customers table with lifetime_value computed per customer, write a SQL query to identify the top 5% customers by lifetime_value. Return customer_id, lifetime_value, percentile_rank. Use window functions or analytic functions to compute percentile. Assume the table:Specify the SQL dialect you use.
sql
customers(customer_id bigint, lifetime_value numeric)MediumTechnical
63 practiced
NULLs and inconsistent values plague an ETL. Describe SQL strategies to detect and handle inconsistent status values, missing keys, and NULL timestamps before exposing data to dashboards. Provide example SQL snippets for validation checks and remediation (e.g., imputation or quarantining bad rows).
MediumTechnical
70 practiced
Cohort analysis: You have a users table and events table:Write a SQL query to compute 7-day retention for cohorts defined by signup week. Output should include cohort_week, cohort_size, day_0_retention, day_7_retention (percentage). Explain assumptions needed for this calculation.
sql
users(user_id, signup_date date)
events(user_id, event_date date, event_type varchar)MediumTechnical
56 practiced
Attribution problem: you have three tables:Design a SQL query to attribute orders to the most recent click within 7 days prior to the order (last-touch). Return order_id, user_id, order_time, attributed_ad_id, days_since_click. Discuss how you would handle multiple clicks and missing clicks.
sql
impressions(impression_id, user_id, ad_id, impression_time)
clicks(click_id, user_id, ad_id, click_time)
orders(order_id, user_id, order_time, amount)Unlock Full Question Bank
Get access to hundreds of SQL for Data Analysis interview questions and detailed answers.
Sign in to ContinueJoin thousands of developers preparing for their dream job.