InterviewStack.io LogoInterviewStack.io

Data Modeling and Schema Design Questions

Focuses on designing efficient, maintainable data schemas for transactional and analytical systems. Candidates should demonstrate understanding of normalization principles and normal forms, when and why to denormalize for performance, and schema design patterns for different use cases. Expect dimensional modeling topics including fact and dimension tables, star and snowflake schemas, grain definition, slowly changing dimensions, and strategies for handling historical data. The topic also includes trade offs between online transaction processing and online analytical processing designs, query performance considerations, indexing and partitioning strategies, and the ability to evaluate and improve existing schemas to meet business requirements and scale.

HardTechnical
0 practiced
An analytical query scans a partitioned fact table but isn't benefiting from partition pruning. Given the query and partitioning scheme below, identify why pruning fails and propose fixes.
Partitioning: orders partitioned by RANGE(order_date) monthlyQuery: SELECT product_id, SUM(amount) FROM orders WHERE order_date >= '2023-01-15' AND order_date < '2023-02-10' GROUP BY product_id;
Assume order_date is stored as a string in 'YYYY-MM-DD' format.
MediumTechnical
0 practiced
Explain how bitmap and B-tree indexes differ. For a large dimension table with low-cardinality columns (e.g., gender, boolean flags), which index type would you choose and why? Mention concurrency considerations.
HardTechnical
0 practiced
You inherit a reporting schema where many dimension tables have nullable surrogate foreign keys to a central 'master' table. Query performance suffers from many LEFT JOINs producing repeated nulls. Propose schema and query-level optimizations to reduce join cost and simplify reporting queries.
MediumSystem Design
0 practiced
Given a denormalized reporting table that is periodically rebuilt from source systems, how would you design incremental rebuilds to minimize downtime and keep the table consistent for readers? Include transactional considerations and schema-level techniques.
EasyTechnical
0 practiced
Explain the difference between 1NF, 2NF, and 3NF. Give a concrete example table (columns and sample rows) that violates 2NF and show how to transform it into 2NF.

Unlock Full Question Bank

Get access to hundreds of Data Modeling and Schema Design interview questions and detailed answers.

Sign in to Continue

Join thousands of developers preparing for their dream job.