InterviewStack.io

Data Consistency and Idempotency Questions

Understand how to maintain correct data in distributed and asynchronous systems, and how to design idempotent operations so that retries do not produce duplicate effects. Cover the relationship between consistency models and idempotency, transactional guarantees across components, patterns for idempotent request handling, unique request identifiers, deduplication, compensating transactions, and when to use eventual reconciliation versus strong transactional boundaries. Discuss how idempotency affects API design, retry strategies, and user-visible correctness.

Hard | Technical
100 practiced
Implement an idempotent writer in Python that is safe under high concurrency and writes business records to an SQL database. Your approach should use a dedupe table or a unique constraint on an idempotency_key and perform the business write inside a transaction. Provide the core function that attempts the insert, detects duplicate-key exceptions, and returns whether the operation was applied. Explain how you avoid deadlocks and ensure correctness under concurrent retries.
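
One possible shape for an answer is sketched below. It assumes SQLite (via the standard-library sqlite3 module) as a stand-in for a production SQL database, and the table names idempotency_keys and payments, the amount_cents column, and the function name write_once are all hypothetical. The key is claimed with a plain INSERT against a primary key, so the database's unique constraint decides the winner rather than a SELECT-then-INSERT check.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS idempotency_keys (
    idempotency_key TEXT PRIMARY KEY          -- unique constraint does the dedupe
);
CREATE TABLE IF NOT EXISTS payments (
    id INTEGER PRIMARY KEY,
    idempotency_key TEXT NOT NULL,
    amount_cents INTEGER NOT NULL
);
"""

def write_once(conn, idempotency_key, amount_cents):
    """Return True if the write was applied, False if the key was already used."""
    try:
        with conn:  # one transaction: commit on success, roll back on exception
            # Claim the key first. Every writer touches the tables in the same
            # order, which keeps lock acquisition order consistent.
            conn.execute(
                "INSERT INTO idempotency_keys (idempotency_key) VALUES (?)",
                (idempotency_key,),
            )
            # The business write shares the transaction with the key claim,
            # so either both rows exist or neither does.
            conn.execute(
                "INSERT INTO payments (idempotency_key, amount_cents) VALUES (?, ?)",
                (idempotency_key, amount_cents),
            )
        return True
    except sqlite3.IntegrityError:
        # Duplicate key: an earlier attempt already applied this operation.
        # Note: this sketch treats any integrity error as a duplicate.
        return False
```

In this sketch, deadlock risk is kept low by holding the transaction open only for the two inserts and by always writing the key table before the business table. On a server database the same idea applies, though the exception class and any lock-timeout handling depend on the driver in use.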
Easy | Technical
60 practiced
What is a unique request identifier (e.g., correlation-id or idempotency-key) and how should it be generated, propagated, and logged in a distributed data pipeline? Include considerations for identifier format, entropy, collision probability, and privacy (PII) implications.
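
A minimal sketch of the generation and propagation parts, under stated assumptions: the header name Idempotency-Key and the helper names new_request_id, derived_key, propagate, and collision_probability are illustrative, not a standard API. Random identifiers use UUID version 4 (122 random bits); deterministic identifiers are derived with HMAC-SHA256 so that raw business values, which may be PII, never appear in the identifier or in logs.

```python
import hashlib
import hmac
import uuid

HEADER = "Idempotency-Key"  # assumed header name, pick one and use it everywhere

def new_request_id() -> str:
    """Random identifier: a version 4 UUID carries 122 random bits."""
    return str(uuid.uuid4())

def derived_key(secret: bytes, *parts: str) -> str:
    """Deterministic identifier for retries of the same logical operation.

    The parts (for example a source name and a natural key) are keyed-hashed,
    so the output does not reveal them to anyone without the secret.
    """
    message = "\x1f".join(parts).encode("utf-8")  # unit separator between parts
    return hmac.new(secret, message, hashlib.sha256).hexdigest()

def propagate(incoming_headers: dict) -> dict:
    """Reuse the caller's identifier if present, otherwise mint a new one."""
    outgoing = dict(incoming_headers)
    outgoing.setdefault(HEADER, new_request_id())
    return outgoing

def collision_probability(n_ids: int, random_bits: int = 122) -> float:
    """Birthday approximation p ~ n^2 / 2^(bits + 1), valid while p is small."""
    return n_ids * n_ids / float(2 ** (random_bits + 1))
```

Each hop would log the identifier alongside its own records and pass it on unchanged; only the first hop that sees no identifier should create one.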
Medium | Technical
60 practiced
How do you handle schema evolution in event-driven pipelines while preserving idempotency and backwards compatibility? Describe strategies using Avro/Protobuf schema compatibility rules, versioned topics, contract testing, and transformation layers to ensure older consumers remain idempotent.
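
The transformation-layer part of this question can be illustrated with a small upcaster. This is a sketch only: the event fields (event_id, schema_version, amount_cents, currency), the default currency, and the Consumer class are hypothetical, and plain dictionaries stand in for Avro or Protobuf records. The point it shows is that the deduplication key (event_id) is left untouched by version upgrades, so a retry that arrives in a newer schema is still recognised as a duplicate.

```python
def upcast(event: dict) -> dict:
    """Normalize any known schema version to the v2 shape.

    Assumption: v2 added an optional `currency` field with a default, which is
    the kind of change backward-compatibility rules normally allow.
    """
    version = event.get("schema_version", 1)
    if version == 1:
        return {**event, "schema_version": 2, "currency": "USD"}
    if version == 2:
        return dict(event)
    raise ValueError(f"unsupported schema_version: {version}")

class Consumer:
    """Applies each event_id at most once, regardless of schema version."""

    def __init__(self):
        self.seen = set()    # in production this would be a durable store
        self.totals = {}     # currency -> summed amount_cents

    def handle(self, event: dict) -> bool:
        normalized = upcast(event)
        event_id = normalized["event_id"]  # identity is version-independent
        if event_id in self.seen:
            return False                   # duplicate delivery, no effect
        self.seen.add(event_id)
        currency = normalized["currency"]
        self.totals[currency] = self.totals.get(currency, 0) + normalized["amount_cents"]
        return True
```

Keeping the identity field out of every schema change is the design choice that matters here; contract tests could then assert that upcast never alters event_id.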
Medium | System Design
98 practiced
Explain the saga pattern and how it supports eventual consistency across multiple services. Provide a concrete ETL-related example orchestration (e.g., enrichment service, write to analytics store, notify downstream systems) using either choreography or orchestration, and show where compensating transactions are required.
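
A minimal orchestration-style sketch, assuming an in-process orchestrator and three hypothetical steps (enrich, write_analytics, notify). The function name run_saga and the log format are illustrative. Each step is paired with a compensating action, and when a step fails the completed steps are compensated in reverse order.

```python
def run_saga(steps):
    """Run (name, action, compensate) steps in order.

    Returns (ok, log). On the first failure, previously completed steps are
    compensated in reverse order and the saga stops.
    """
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            for done_name, undo in reversed(completed):
                undo()  # compensations should themselves be idempotent
                log.append(f"compensated:{done_name}")
            return False, log
        log.append(f"done:{name}")
        completed.append((name, compensate))
    return True, log


# Hypothetical ETL example: state is a dict standing in for real services.
state = {"enriched": False, "rows_written": 0}

def enrich():
    state["enriched"] = True

def undo_enrich():
    state["enriched"] = False

def write_analytics():
    state["rows_written"] += 1

def undo_write_analytics():
    state["rows_written"] -= 1

def notify():
    raise RuntimeError("downstream unavailable")  # simulated failure

etl_steps = [
    ("enrich", enrich, undo_enrich),
    ("write_analytics", write_analytics, undo_write_analytics),
    ("notify", notify, lambda: None),
]
```

Running run_saga(etl_steps) in this setup fails at notify and then undoes the analytics write and the enrichment. A real orchestrator would also persist the saga log so that a crash mid-compensation can be resumed.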
Medium | Technical
51 practiced
You operate a nightly ETL job that sometimes produces duplicate rows because upstream producers retry. You cannot change producers. Design a deduplication strategy purely in the warehouse layer. Describe an efficient approach (SQL or merge-based), indexing/partitioning choices, and how you would run this as an incremental job with minimal cost.
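
One way to sketch the incremental dedupe is shown below, with several assumptions: SQLite (3.25 or later, for window functions) stands in for the warehouse, the tables orders_raw and orders_clean and their columns are hypothetical, and loaded_at is an increasing integer load marker. SQLite has no MERGE statement, so an INSERT ... SELECT with a NOT EXISTS anti-join is swapped in for it. Only rows past the watermark are scanned, ROW_NUMBER keeps the earliest copy per key within the batch, and keys already in the clean table are skipped.

```python
import sqlite3

SCHEMA = """
CREATE TABLE orders_raw (order_id INTEGER, amount_cents INTEGER, loaded_at INTEGER);
CREATE TABLE orders_clean (
    order_id INTEGER PRIMARY KEY,   -- one row per business key
    amount_cents INTEGER,
    loaded_at INTEGER
);
"""

DEDUPE_SQL = """
INSERT INTO orders_clean (order_id, amount_cents, loaded_at)
SELECT order_id, amount_cents, loaded_at
FROM (
    SELECT order_id, amount_cents, loaded_at,
           ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY loaded_at ASC) AS rn
    FROM orders_raw
    WHERE loaded_at > ?             -- incremental: scan only new rows
) AS ranked
WHERE rn = 1                        -- first arrival wins within the batch
  AND NOT EXISTS (                  -- skip keys loaded by earlier runs
      SELECT 1 FROM orders_clean c WHERE c.order_id = ranked.order_id
  )
"""

def dedupe_increment(conn, watermark):
    """Load new, de-duplicated rows and return the next watermark."""
    with conn:  # insert and watermark read share one transaction
        conn.execute(DEDUPE_SQL, (watermark,))
        row = conn.execute(
            "SELECT COALESCE(MAX(loaded_at), ?) FROM orders_raw", (watermark,)
        ).fetchone()
    return row[0]
```

Because the statement only inserts keys that are absent, re-running the job with the same watermark has no further effect. In a real warehouse, partitioning the raw table on the load marker would let the watermark filter prune old partitions, which is where most of the cost saving would come from.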
