InterviewStack.io

Data Modeling and Architecture Questions

Design and modeling principles for transactional and analytical data systems. Topics include entity-relationship modeling; normalization and denormalization trade-offs; dimensional modeling with fact and dimension tables in star and snowflake schemas; indexing strategies; partitioning and sharding; and schema design for performance and maintainability. Also covered are data pipelines and integration patterns, including extract-transform-load (ETL) and extract-load-transform (ELT) approaches, data warehousing and data lake concepts, ETL orchestration, and how sources feed reporting and business intelligence (BI) systems, along with data quality, governance, and the differences between online transaction processing (OLTP) and online analytical processing (OLAP) workloads.

Hard · Technical
Compare columnar storage formats (Parquet, ORC) and row-based formats (Avro, JSON) for data lakes that feed BI. Discuss compression, predicate pushdown, schema evolution, support for nested types, and implications for downstream query engines like Spark, Presto, or BigQuery.
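As a starting point for the columnar versus row-based comparison, the toy sketch below contrasts the two layouts in plain Python. It is not how Parquet or ORC are implemented; the sample records, `to_columns`, and `run_length_encode` are hypothetical names introduced only for illustration. It shows two ideas an answer might touch on: an aggregate over one field needs only that column in a columnar layout, and a low-cardinality column compresses well with run-length encoding.

```python
from itertools import groupby

# Hypothetical sample data: three order records in row layout.
rows = [
    {"order_id": 1, "country": "DE", "amount": 10.0},
    {"order_id": 2, "country": "DE", "amount": 25.5},
    {"order_id": 3, "country": "US", "amount": 7.25},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    return {name: [r[name] for r in records] for name in records[0]}


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows)

# Row layout: every full record is visited to total a single field.
total_from_rows = sum(r["amount"] for r in rows)

# Column layout: only the "amount" column is read (column pruning).
total_from_columns = sum(columns["amount"])

# Adjacent repeated values in a column collapse into short runs.
encoded_country = run_length_encode(columns["country"])
```

Real columnar formats layer dictionary encoding, per-chunk statistics, and general-purpose compression on top of ideas like these, which is what lets a query engine skip data it does not need.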
Medium · Technical
A critical executive dashboard is slow. Queries show heavy joins across several denormalized views and federated sources. As the BI analyst, outline step-by-step how you'd diagnose the root cause (what metrics and plans to collect) and provide at least three optimization strategies (database and BI-side) to reduce latency.
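One diagnostic step an answer could describe is capturing the query plan before and after a change. The sketch below uses SQLite from the Python standard library purely as a stand-in; a production warehouse would use its own EXPLAIN facility and plan format. The `sales` table, the `idx_sales_region` index, and the `plan_details` helper are hypothetical.

```python
import sqlite3


def plan_details(conn, sql, params=()):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("EMEA", 10.0), ("APAC", 20.0), ("EMEA", 5.0)],
)

query = "SELECT SUM(amount) FROM sales WHERE region = ?"

# Without an index on the filter column, the plan is a full table scan.
plan_before = plan_details(conn, query, ("EMEA",))

# After adding an index, the plan switches to an index search.
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")
plan_after = plan_details(conn, query, ("EMEA",))
```

Comparing `plan_before` and `plan_after` shows whether an optimization actually changed the access path, which is more reliable than comparing wall-clock timings alone.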
Easy · Technical
Explain the difference between OLTP and OLAP workloads. Describe typical characteristics (read/write patterns, schema design, query latency, concurrency), give two example systems you'd choose for each (one open-source, one cloud service), and explain which workload is the primary source for enterprise reporting and why.
Hard · System Design
Design a metadata and data catalog system to help BI analysts discover datasets, view lineage, check SLA/freshness, find dataset owners, and see automated profiling (row counts, null rates, basic histograms). Describe how you'd integrate the catalog with ETL orchestration (Airflow/dbt) to keep metadata up to date.
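The automated profiling mentioned in the question (row counts, null rates, basic histograms) can be sketched in a few lines. This is a minimal illustration for a single numeric column held in memory; `profile_column` and its `bins` parameter are hypothetical, and a real catalog would more likely push this computation down to the warehouse and run it from an orchestration task after each load.

```python
def profile_column(values, bins=4):
    """Return row count, null rate, and an equal-width histogram.

    Assumes `values` is a list of numbers where None marks a null.
    """
    row_count = len(values)
    non_null = [v for v in values if v is not None]
    null_rate = (row_count - len(non_null)) / row_count if row_count else 0.0

    histogram = [0] * bins
    if non_null:
        low, high = min(non_null), max(non_null)
        # Fall back to a width of 1 when all values are identical.
        width = (high - low) / bins or 1
        for v in non_null:
            # Clamp so the maximum value lands in the last bin.
            index = min(int((v - low) / width), bins - 1)
            histogram[index] += 1

    return {
        "row_count": row_count,
        "null_rate": null_rate,
        "histogram": histogram,
    }


# Hypothetical column with two nulls out of eight rows.
profile = profile_column([0, 1, 2, None, 4, 8, None, 8])
```

Storing a result like `profile` alongside a load timestamp gives the catalog both the profiling view and a freshness signal from the same job.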
Easy · Technical
Explain the trade-offs between normalization and denormalization for reporting systems. Discuss effects on update complexity, query speed, storage footprint, and how different BI tools cope with highly normalized vs denormalized data sources.
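The update-complexity and query-shape trade-off can be made concrete with a small example. The sketch below uses SQLite as a stand-in for a reporting database; the `customers`, `orders`, and `orders_wide` tables are hypothetical. Both models return the same revenue by region, but the normalized model needs a join to answer the question, while the denormalized table needs more rows rewritten when a customer attribute changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized model: region is stored once per customer.
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL
    );
    -- Denormalized model: region is repeated on every order row.
    CREATE TABLE orders_wide (
        order_id INTEGER PRIMARY KEY, customer_id INTEGER,
        region TEXT, amount REAL
    );
    INSERT INTO customers VALUES (1, 'EMEA'), (2, 'APAC');
    INSERT INTO orders VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);
    INSERT INTO orders_wide
        SELECT o.order_id, o.customer_id, c.region, o.amount
        FROM orders o JOIN customers c ON c.customer_id = o.customer_id;
""")

# Revenue by region: a join in the normalized model...
normalized = conn.execute("""
    SELECT c.region, SUM(o.amount)
    FROM orders o JOIN customers c ON c.customer_id = o.customer_id
    GROUP BY c.region ORDER BY c.region
""").fetchall()

# ...and a single-table scan in the denormalized model.
denormalized = conn.execute("""
    SELECT region, SUM(amount) FROM orders_wide
    GROUP BY region ORDER BY region
""").fetchall()

# Moving customer 1 to a new region touches one row when normalized
# and one row per order when denormalized.
rows_normalized = conn.execute(
    "UPDATE customers SET region = 'AMER' WHERE customer_id = 1"
).rowcount
rows_denormalized = conn.execute(
    "UPDATE orders_wide SET region = 'AMER' WHERE customer_id = 1"
).rowcount
```

At three rows the difference is trivial, but the same ratio holds at scale: the denormalized update cost grows with the number of orders per customer, while the normalized query cost grows with the size of the join.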
