Data Architecture and Pipelines Questions
Designing data storage, integration, and processing architectures. Topics include relational and NoSQL database design, indexing and query optimization, replication and sharding strategies, data warehousing and dimensional modeling, ETL and ELT patterns, batch and streaming ingestion, processing frameworks, feature stores, archival and retention strategies, and trade-offs for scale and latency in large data systems.
Hard, Technical
44 practiced
Propose a robust strategy to manage schema evolution across producers and consumers without breaking dashboards. Include use of schema registries, versioning, backward/forward compatibility rules, migration playbooks, and a plan for rolling out breaking changes.
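One piece of such a strategy, the backward-compatibility rule, can be sketched as a pre-deployment check. This is a simplified illustration, not the rule set of any particular schema registry: the dictionary schema representation and the function name `backward_compatible` are assumptions made for the example, and real registries also handle cases such as type promotion and nested records.

```python
def backward_compatible(old, new):
    """Return a list of violations; an empty list means a reader using
    `new` can still decode records written with `old`.

    Schemas are hypothetical dicts: field name -> {"type": ..., "default": ...}.
    """
    problems = []
    for name, spec in new.items():
        if name not in old:
            # A field the old writer never produced needs a default to fall back on.
            if "default" not in spec:
                problems.append(f"added field '{name}' has no default")
        elif old[name]["type"] != spec["type"]:
            # Simplification: any type change is treated as breaking.
            problems.append(
                f"field '{name}' changed type {old[name]['type']} -> {spec['type']}"
            )
    # Fields removed in `new` are simply ignored by the new reader.
    return problems


old_schema = {"id": {"type": "string"}, "amount": {"type": "double"}}
new_ok = {**old_schema, "currency": {"type": "string", "default": "USD"}}
new_bad = {
    "id": {"type": "string"},
    "amount": {"type": "string"},
    "region": {"type": "string"},
}
```

A check like this could run in CI before a producer publishes a new schema version, so that a breaking change is caught and routed to a planned rollout (for example a new topic or table version) instead of reaching dashboards.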
Medium, System Design
47 practiced
Design an end-to-end near-real-time dashboard pipeline that needs to reflect transactions within 1–2 minutes. Include data capture (CDC/event bus), stream processing, low-latency storage or materialized views, and how BI tools will query the results. State trade-offs (consistency, cost, complexity).
Medium, Technical
58 practiced
For a fact_sales table with 50 billion rows, propose a partitioning and clustering strategy (e.g., daily partitions, clustering on product_id) to optimize common BI queries like time-based reporting and top-N product lists. Explain trade-offs and maintenance tasks required.
Medium, Technical
41 practiced
You need to design an incremental ETL process to load a daily sales fact table where late-arriving records are common (orders may be updated days later). Outline the approach for capture, deduplication, updating existing rows, and backfilling. Include considerations for idempotency and performance.
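The idempotency requirement can be illustrated with a keyed upsert that only overwrites a row when the incoming version is newer. The sketch below uses Python's built-in `sqlite3` and SQLite's `INSERT ... ON CONFLICT` syntax purely for demonstration; a warehouse would more likely use `MERGE`. The table layout, the `updated_at` column, and the helper names are assumptions made for the example.

```python
import sqlite3

# Insert new orders; update an existing order only if the incoming row is newer.
UPSERT_SQL = """
INSERT INTO fact_sales (order_id, order_date, amount, updated_at)
VALUES (?, ?, ?, ?)
ON CONFLICT(order_id) DO UPDATE SET
    order_date = excluded.order_date,
    amount     = excluded.amount,
    updated_at = excluded.updated_at
WHERE excluded.updated_at > fact_sales.updated_at
"""


def make_fact_table():
    """Create an in-memory fact table keyed by order_id (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fact_sales ("
        "order_id TEXT PRIMARY KEY, order_date TEXT, amount REAL, updated_at TEXT)"
    )
    return conn


def apply_batch(conn, batch):
    """Apply a batch of (order_id, order_date, amount, updated_at) rows."""
    conn.executemany(UPSERT_SQL, batch)
    conn.commit()


def snapshot(conn):
    """Return the current table contents in a stable order."""
    return conn.execute(
        "SELECT order_id, order_date, amount, updated_at "
        "FROM fact_sales ORDER BY order_id"
    ).fetchall()


conn = make_fact_table()
day1 = [
    ("o1", "2024-03-01", 100.0, "2024-03-01T08:00:00"),
    ("o2", "2024-03-01", 50.0, "2024-03-01T09:00:00"),
]
late_update = [("o1", "2024-03-01", 120.0, "2024-03-04T12:00:00")]

apply_batch(conn, day1)
apply_batch(conn, late_update)  # late-arriving correction to o1
apply_batch(conn, day1)         # replay or backfill of the old batch
```

Because older versions never overwrite newer ones, replaying a batch or re-running a backfill leaves the table in the same state, which is the property that makes retries safe. Timestamps are stored as ISO-8601 strings here so that string comparison matches chronological order.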
Medium, Technical
86 practiced
Given a transactional events table that may contain duplicates, write an SQL pattern (ANSI SQL) to deduplicate events, keeping the latest event_time per event_id.
Schema: events(event_id STRING, user_id STRING, event_time TIMESTAMP, payload VARIANT)
Provide a query that selects deduplicated rows.
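One common pattern for this is a `ROW_NUMBER()` window partitioned by event_id and ordered by event_time descending, keeping only rank 1. The sketch below runs that query through Python's built-in `sqlite3` so it can be tried locally; the in-memory table uses TEXT columns as a stand-in for the STRING, TIMESTAMP, and VARIANT types, and the helper name `dedupe_events` is an assumption for the example.

```python
import sqlite3

# Rank rows within each event_id, newest event_time first, and keep rank 1.
DEDUP_SQL = """
SELECT event_id, user_id, event_time, payload
FROM (
    SELECT e.*,
           ROW_NUMBER() OVER (
               PARTITION BY event_id
               ORDER BY event_time DESC
           ) AS rn
    FROM events e
) ranked
WHERE rn = 1
ORDER BY event_id
"""


def dedupe_events(rows):
    """Load rows into an in-memory events table and return the deduplicated set."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        "event_id TEXT, user_id TEXT, event_time TEXT, payload TEXT)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(DEDUP_SQL).fetchall()
    conn.close()
    return result


sample = [
    ("e1", "u1", "2024-01-01T10:00:00", "a"),
    ("e1", "u1", "2024-01-01T11:00:00", "b"),  # later duplicate of e1
    ("e2", "u2", "2024-01-01T09:00:00", "c"),
]
deduped = dedupe_events(sample)
```

If two rows for the same event_id share an identical event_time, the row that survives is arbitrary; adding a tiebreaker column (for example an ingestion timestamp, if one exists) to the `ORDER BY` makes the result deterministic.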