InterviewStack.io

Data Consistency and Distributed Transactions Questions

An in-depth focus on data consistency models and practical approaches to maintaining correctness across distributed components. Covers strong consistency models including linearizability and serializability, causal consistency, eventual consistency, and the implications of each for replication, latency, and user experience. Discusses CAP theorem implications for consistency choices, idempotency, exactly-once and at-least-once semantics, concurrency control and isolation levels, handling race conditions and conflict resolution, and concrete patterns for coordinating updates across services such as two-phase commit, three-phase commit, and the saga pattern with compensating transactions. Also includes operational challenges such as retries, timeouts, ordering, clocks and monotonic timestamps, trade-offs between throughput and consistency, and when eventual consistency is acceptable versus when strong consistency is required for correctness (for example, financial systems versus social feeds).

Easy · Technical
32 practiced
Explain how vector clocks detect concurrent updates in an eventually-consistent key-value store. Give a small example where two nodes concurrently update the same key and produce divergent versions, and contrast that with last-write-wins (LWW) conflict resolution—describe situations where LWW is acceptable and where it causes data loss.
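
A minimal sketch of the mechanism this question asks about, assuming vector clocks are plain dictionaries mapping a node ID to a counter. The helper names (`increment`, `compare`, `merge`, `lww_resolve`) and the node IDs "A" and "B" are illustrative, not taken from any particular store.

```python
from typing import Dict, Tuple

VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def happened_before(a: VectorClock, b: VectorClock) -> bool:
    """True if a <= b in every component and a < b in at least one."""
    keys = set(a) | set(b)
    no_greater = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    strictly_less = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_greater and strictly_less


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify two versions as before, after, equal, or concurrent."""
    if happened_before(a, b):
        return "before"
    if happened_before(b, a):
        return "after"
    keys = set(a) | set(b)
    if all(a.get(k, 0) == b.get(k, 0) for k in keys):
        return "equal"
    return "concurrent"  # neither dominates: both versions must be kept


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Component-wise maximum, used after the conflict has been resolved."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b)}


def lww_resolve(a: Tuple[str, float], b: Tuple[str, float]) -> Tuple[str, float]:
    """Last-write-wins on (value, timestamp): the older write is silently dropped."""
    return a if a[1] >= b[1] else b


# Both nodes start from the same version, then update without seeing each other.
base = increment({}, "A")      # {"A": 1}
v_a = increment(base, "A")     # {"A": 2}
v_b = increment(base, "B")     # {"A": 1, "B": 1}
status = compare(v_a, v_b)     # "concurrent"
```

Because neither clock dominates the other, the store has to keep both siblings and let the application (or a merge function) reconcile them. `lww_resolve` would instead keep only the write with the larger timestamp, which may be fine for values such as a "last seen" time but discards one of two concurrent shopping-cart edits.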
Medium · Technical
36 practiced
Write Python-like pseudocode for a message consumer that achieves exactly-once processing semantics using a persistent deduplication table in PostgreSQL. Show how to atomically check-and-mark message IDs, process the message, handle crashes and retries, and implement garbage collection for old message IDs.
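
One possible shape of an answer, sketched under some loud assumptions: the standard-library `sqlite3` module stands in for PostgreSQL so the example runs anywhere, the `balances` table and the `consume`, `collect_garbage`, and `crash_before_commit` names are hypothetical, and the "side effect" is a database write that can share a transaction with the deduplication mark. In PostgreSQL the atomic check-and-mark would typically be written as `INSERT ... ON CONFLICT (id) DO NOTHING`. Strictly speaking this gives effectively-once processing on top of at-least-once delivery.

```python
import sqlite3
import time
from typing import Optional


def setup_dedup(conn: sqlite3.Connection) -> None:
    """Create the deduplication table and a hypothetical business table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_messages ("
        "id TEXT PRIMARY KEY, processed_at REAL NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS balances ("
        "account TEXT PRIMARY KEY, amount INTEGER NOT NULL)"
    )


def consume(conn: sqlite3.Connection, msg_id: str, account: str, delta: int,
            now: Optional[float] = None, crash_before_commit: bool = False) -> bool:
    """Apply the message once. Returns False if it was already processed."""
    now = time.time() if now is None else now
    with conn:  # one transaction: commit on success, roll back on any exception
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_messages (id, processed_at) VALUES (?, ?)",
            (msg_id, now),
        )
        if cur.rowcount == 0:
            return False  # duplicate delivery: the primary key rejected the mark
        conn.execute(
            "INSERT OR IGNORE INTO balances (account, amount) VALUES (?, 0)", (account,)
        )
        conn.execute(
            "UPDATE balances SET amount = amount + ? WHERE account = ?", (delta, account)
        )
        if crash_before_commit:  # test hook simulating a crash mid-transaction
            raise RuntimeError("simulated crash")
    return True  # acknowledge to the broker only after this point


def balance(conn: sqlite3.Connection, account: str) -> int:
    row = conn.execute(
        "SELECT amount FROM balances WHERE account = ?", (account,)
    ).fetchone()
    return row[0] if row else 0


def collect_garbage(conn: sqlite3.Connection, older_than: float) -> int:
    """Delete marks older than the cutoff; returns the number of rows removed."""
    with conn:
        cur = conn.execute(
            "DELETE FROM processed_messages WHERE processed_at < ?", (older_than,)
        )
    return cur.rowcount
```

Because the mark and the balance update commit together, a crash before commit rolls back both and the broker's redelivery is processed normally. The garbage-collection cutoff has to be older than the longest period over which the broker might still redeliver a message, otherwise a late duplicate would be applied again.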
Hard · System Design
28 practiced
Design a globally-distributed sequencing service that assigns strictly increasing sequence numbers to events across regions (e.g., for globally-ordered logs) while keeping latency low. Evaluate approaches such as a single global leader using consensus, per-region sequencers with gap-filling, batching, and Hybrid Logical Clocks. Discuss failover, throughput bottlenecks, and availability under partitions.
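
Of the approaches listed, Hybrid Logical Clocks are the easiest to show in a few lines. The sketch below follows the commonly described HLC update rules; the class name, the `tick` and `observe` method names, and the injectable `physical_ms` callable are assumptions made for illustration. Note that HLC timestamps respect causality but are not gapless, and a total order across nodes still needs a tiebreaker such as a node ID.

```python
import time
from typing import Callable, Optional, Tuple

Timestamp = Tuple[int, int]  # (logical wall-clock component, counter)


class HybridLogicalClock:
    def __init__(self, physical_ms: Optional[Callable[[], int]] = None) -> None:
        # The physical clock is injectable so the behaviour can be tested.
        self._now = physical_ms or (lambda: int(time.time() * 1000))
        self.l = 0  # highest physical time seen so far
        self.c = 0  # counter that orders events sharing the same l

    def tick(self) -> Timestamp:
        """Timestamp a local or send event."""
        pt = self._now()
        if pt > self.l:
            self.l, self.c = pt, 0
        else:
            self.c += 1  # physical clock has not advanced (or went backwards)
        return (self.l, self.c)

    def observe(self, remote: Timestamp) -> Timestamp:
        """Timestamp a receive event, folding in the sender's timestamp."""
        remote_l, remote_c = remote
        pt = self._now()
        new_l = max(self.l, remote_l, pt)
        if new_l == self.l and new_l == remote_l:
            self.c = max(self.c, remote_c) + 1
        elif new_l == self.l:
            self.c += 1
        elif new_l == remote_l:
            self.c = remote_c + 1
        else:
            self.c = 0  # local physical time is ahead of everything seen
        self.l = new_l
        return (self.l, self.c)


# Region A's physical clock reads 100 ms; region B's lags at 90 ms.
clock_a = HybridLogicalClock(lambda: 100)
clock_b = HybridLogicalClock(lambda: 90)
a1 = clock_a.tick()          # (100, 0)
a2 = clock_a.tick()          # (100, 1)
b1 = clock_b.observe(a2)     # (100, 2): ordered after the event it received
b2 = clock_b.tick()          # (100, 3): stays monotonic despite the lagging clock
```

This trades the strict, gap-free numbering of a single consensus-backed leader for local, low-latency timestamping, which is the central trade-off the question asks candidates to evaluate.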
Medium · System Design
35 practiced
Design a scalable approach to support atomic increments for counters across sharded keys (e.g., global 'likes' counts). Compare per-shard counters with periodic aggregation, CRDT counters (G-counter/PN-counter), a central counter service, and optimistic CAS-based increments. Discuss accuracy, throughput, read latency, and reconciliation strategies for each.
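
For the CRDT option, a minimal in-memory sketch of a G-counter and a PN-counter follows. The class and method names are illustrative, and replication transport, persistence, and node membership are deliberately left out.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: each node increments only its own slot."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a G-counter cannot decrease")
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Per-node maximum is commutative, associative, and idempotent,
        # so replicas converge regardless of delivery order or duplication.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


class PNCounter:
    """Counter supporting decrements, built from two G-counters."""

    def __init__(self, node_id: str) -> None:
        self.positive = GCounter(node_id)
        self.negative = GCounter(node_id)

    def increment(self, amount: int = 1) -> None:
        self.positive.increment(amount)

    def decrement(self, amount: int = 1) -> None:
        self.negative.increment(amount)

    def value(self) -> int:
        return self.positive.value() - self.negative.value()

    def merge(self, other: "PNCounter") -> None:
        self.positive.merge(other.positive)
        self.negative.merge(other.negative)


# Two replicas accept writes independently, then exchange state.
likes_a = PNCounter("shard-a")
likes_b = PNCounter("shard-b")
likes_a.increment(3)
likes_b.increment(2)
likes_b.decrement(1)   # an "unlike"
likes_a.merge(likes_b)
likes_b.merge(likes_a)
```

After the exchange both replicas report the same total, and merging the same state again changes nothing. The cost is that a read on any one replica may be stale until the next merge, and state grows with the number of nodes that have ever written.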
Hard · Technical
37 practiced
Consider this naive message handler pseudocode:

    def handle_message(msg):
        try:
            db.execute("INSERT INTO processed_messages(id) VALUES (%s)", msg.id)
        except UniqueViolation:
            return process_already_done()
        process_side_effects(msg)

Identify race conditions or failure modes that could still cause duplicate side effects or lost processing. Propose corrected code-level fixes using SQL constraints, explicit transactions, status fields, or the transactional outbox pattern, and provide corrected Python/SQL pseudocode.
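
One of the fixes the question names, the transactional outbox, can be sketched as follows. The core weakness of the naive handler is that the deduplication mark and the side effects are not atomic: a crash after the insert but before the side effects loses the message for good. In this sketch `sqlite3` again stands in for PostgreSQL, and the `outbox` table, its `sent` status column, the `relay_outbox` function, and the `publish` callback are all hypothetical names chosen for illustration.

```python
import json
import sqlite3
from typing import Any, Callable, Dict


def setup_outbox(conn: sqlite3.Connection) -> None:
    conn.execute("CREATE TABLE IF NOT EXISTS processed_messages (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "message_id TEXT NOT NULL UNIQUE, "
        "payload TEXT NOT NULL, "
        "sent INTEGER NOT NULL DEFAULT 0)"
    )


def handle_message(conn: sqlite3.Connection, msg_id: str, payload: Dict[str, Any]) -> bool:
    """Record the message and its pending side effect in ONE transaction."""
    with conn:  # both rows commit together or not at all
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_messages (id) VALUES (?)", (msg_id,)
        )
        if cur.rowcount == 0:
            return False  # already handled; the outbox row already exists
        conn.execute(
            "INSERT INTO outbox (message_id, payload) VALUES (?, ?)",
            (msg_id, json.dumps(payload)),
        )
    return True


def relay_outbox(conn: sqlite3.Connection,
                 publish: Callable[[str, Dict[str, Any]], None]) -> int:
    """Publish unsent rows in order; returns how many were marked sent."""
    rows = conn.execute(
        "SELECT id, message_id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    delivered = 0
    for row_id, message_id, payload in rows:
        # If publish raises, the row stays unsent and is retried on the next run.
        publish(message_id, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
        delivered += 1
    return delivered
```

The relay can still publish a row twice if it crashes between `publish` and the status update, so delivery to the downstream system is at-least-once; passing `message_id` along lets the receiver deduplicate. With several relay workers on PostgreSQL, the unsent rows would additionally need to be claimed under a row lock so two workers do not pick up the same row.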
