InterviewStack.io

Data Consistency and Distributed Transactions Questions

An in-depth focus on data consistency models and practical approaches to maintaining correctness across distributed components. Covers strong consistency models, including linearizability and serializability, as well as causal consistency and eventual consistency, and the implications of each for replication, latency, and user experience. Discusses CAP theorem implications for consistency choices, idempotency, exactly-once and at-least-once semantics, concurrency control and isolation levels, handling race conditions and conflict resolution, and concrete patterns for coordinating updates across services, such as two-phase commit, three-phase commit, and the saga pattern with compensating transactions. Also includes operational challenges like retries, timeouts, ordering, clocks and monotonic timestamps, trade-offs between throughput and consistency, and when eventual consistency is acceptable versus when strong consistency is required for correctness (for example, financial systems versus social feeds).

Medium · Technical
Given a Kafka topic with 10 partitions ingesting user events where per-user ordering is required, design a partitioning and consumer topology to guarantee ordering per user while maximizing throughput. Discuss partitioner key design, consumer group behavior, rebalancing implications, and approaches to mitigate hot keys (e.g., high-activity users).
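As a starting point for the partitioner-key discussion, the sketch below routes events by a stable hash of the user ID so that every event for one user lands in the same partition and keeps its arrival order there. It is a minimal illustration, not Kafka's implementation: `partition_for` and `route` are hypothetical helpers, and MD5 stands in for the murmur2 hash used by Kafka's default partitioner.

```python
import hashlib

NUM_PARTITIONS = 10  # partition count taken from the question


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a stable hash.

    Assumption: MD5 is only a stand-in; Kafka's default partitioner
    hashes the serialized key with murmur2.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def route(events):
    """Append (user_id, payload) events to per-partition lists in arrival order."""
    partitions = {p: [] for p in range(NUM_PARTITIONS)}
    for user_id, payload in events:
        partitions[partition_for(user_id)].append((user_id, payload))
    return partitions


events = [("alice", 1), ("bob", 1), ("alice", 2), ("carol", 1), ("alice", 3)]
routed = route(events)

# All of alice's events sit in one partition, in the order they were produced.
alice_payloads = [p for (u, p) in routed[partition_for("alice")] if u == "alice"]
```

Because a partition is consumed by at most one member of a consumer group at a time, keying by user ID gives per-user ordering with up to 10 parallel consumers. Salting a hot user's key (for example, appending a small bucket suffix) would spread that user's load across partitions, but it gives up single-partition ordering for that user unless a downstream step re-sequences the events.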
Medium · Technical
Explain how Kafka's transactional API and idempotent producers work together to achieve exactly-once semantics for consume-transform-produce (Kafka→application→Kafka) pipelines. Cover transactional.id configuration, producer fencing, transaction begin/commit/abort, the consumer isolation.level=read_committed setting, and limitations when integrating transactional writes with external datastores.
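The two broker-side ideas behind this question, sequence-number deduplication and epoch-based fencing, can be sketched with a toy in-memory log. This is a simplified model under stated assumptions, not the Kafka client or broker API: `ToyPartitionLog`, `FencedError`, and the `append` signature are hypothetical, and real Kafka assigns producer IDs and epochs through the transaction coordinator.

```python
class FencedError(Exception):
    """Raised when a producer with a stale epoch tries to write."""


class ToyPartitionLog:
    """Toy model of one partition's dedup and fencing state (hypothetical)."""

    def __init__(self):
        self.records = []
        self.state = {}  # producer_id -> (current_epoch, last_sequence)

    def append(self, producer_id, epoch, seq, value):
        cur_epoch, last_seq = self.state.get(producer_id, (epoch, -1))
        if epoch < cur_epoch:
            # An older incarnation of the same logical producer: reject it.
            raise FencedError(f"producer {producer_id} epoch {epoch} is stale")
        if epoch > cur_epoch:
            last_seq = -1  # a newer epoch starts a fresh sequence
        if seq <= last_seq:
            return False  # duplicate of an already-appended record (a retry)
        if seq != last_seq + 1:
            raise ValueError("sequence gap: records must arrive in order")
        self.records.append(value)
        self.state[producer_id] = (epoch, seq)
        return True


log = ToyPartitionLog()
first = log.append(producer_id=1, epoch=0, seq=0, value="a")
retry = log.append(producer_id=1, epoch=0, seq=0, value="a")  # network retry
restarted = log.append(producer_id=1, epoch=1, seq=0, value="b")  # new instance

try:
    log.append(producer_id=1, epoch=0, seq=1, value="zombie")
    zombie_fenced = False
except FencedError:
    zombie_fenced = True
```

The retry is acknowledged without being appended twice, and the old instance is rejected once a newer epoch has written. Neither mechanism extends to an external datastore, which is why writes outside Kafka typically need their own idempotency keys or an outbox-style pattern.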
Hard · Technical
Compare Two-Phase Commit (2PC) and Three-Phase Commit (3PC) regarding blocking behavior, coordinator failure handling, assumptions about reliable failure detection, and practical applicability in microservices. Explain why 3PC is rarely used in production and list modern alternatives (sagas, consensus protocols) for cross-service coordination.
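To make the blocking behavior concrete, here is a minimal in-process sketch of 2PC. All names (`Participant`, `two_phase_commit`, the `crash_after_prepare` flag) are hypothetical, and the sketch ignores networking, durable logging, and timeouts, all of which a real protocol needs.

```python
from enum import Enum


class State(Enum):
    INIT = "init"
    PREPARED = "prepared"
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    """Hypothetical resource manager taking part in the transaction."""

    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.state = State.INIT

    def prepare(self):
        # Phase 1: vote. A "yes" vote is a promise to commit if asked.
        self.state = State.PREPARED if self.will_vote_yes else State.ABORTED
        return self.will_vote_yes

    def commit(self):
        self.state = State.COMMITTED

    def abort(self):
        self.state = State.ABORTED


def two_phase_commit(participants, crash_after_prepare=False):
    """Return True on commit, False on abort, None if the coordinator crashed."""
    votes = [p.prepare() for p in participants]
    if crash_after_prepare:
        return None  # decision never delivered; prepared participants are stuck
    decision = all(votes)
    for p in participants:
        if decision:
            p.commit()
        else:
            p.abort()
    return decision


ok = [Participant("orders"), Participant("payments")]
committed = two_phase_commit(ok)

mixed = [Participant("orders"), Participant("payments", will_vote_yes=False)]
aborted = two_phase_commit(mixed)

stuck = [Participant("orders"), Participant("payments")]
outcome = two_phase_commit(stuck, crash_after_prepare=True)
```

In the third run every participant stays in `PREPARED`: each has promised to commit and cannot safely decide alone, so any locks it holds remain held until the coordinator recovers. That window is the blocking problem 3PC tries to remove, at the cost of assuming bounded message delays and reliable failure detection.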
Medium · Technical
What is the write-skew anomaly? Provide a concrete example in SQL or pseudocode for a booking-like scenario where two concurrent transactions each read the same overlapping data, write to different rows without seeing the other's write, and both commit, violating a business invariant. Explain how serializable isolation prevents this anomaly and why snapshot isolation still allows it.
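One way to see the anomaly is with a toy in-memory model of snapshot isolation. The sketch below assumes a shift roster with the invariant "at least one person stays booked on the shift"; `ToyDB`, `Txn`, `leave_shift`, and the `validate_reads` flag are hypothetical. With `validate_reads=False` the commit check covers only written keys (first committer wins), as in snapshot isolation. With `validate_reads=True` it also validates the read set, which is one optimistic way, among several, to get serializable behavior.

```python
class ConflictError(Exception):
    """Raised when a transaction must abort at commit time."""


class ToyDB:
    def __init__(self, rows):
        self.rows = dict(rows)
        self.versions = {key: 0 for key in rows}


class Txn:
    def __init__(self, db, validate_reads=False):
        self.db = db
        self.snapshot = dict(db.rows)  # all reads see the state at start
        self.start_versions = dict(db.versions)
        self.read_keys = set()
        self.writes = {}
        self.validate_reads = validate_reads

    def read(self, key):
        self.read_keys.add(key)
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        checked = set(self.writes)  # snapshot isolation: write-write check only
        if self.validate_reads:
            checked |= self.read_keys  # also abort if anything read has changed
        for key in checked:
            if self.db.versions[key] != self.start_versions[key]:
                raise ConflictError(key)
        for key, value in self.writes.items():
            self.db.rows[key] = value
            self.db.versions[key] += 1


def leave_shift(txn, me):
    """Drop my booking only if at least two people are currently booked."""
    booked = [name for name in txn.snapshot if txn.read(name)]
    if len(booked) >= 2:
        txn.write(me, False)


def run(validate_reads):
    db = ToyDB({"alice": True, "bob": True})
    t1, t2 = Txn(db, validate_reads), Txn(db, validate_reads)
    leave_shift(t1, "alice")  # sees 2 booked, removes alice
    leave_shift(t2, "bob")    # also sees 2 booked, removes bob
    t1.commit()
    try:
        t2.commit()
    except ConflictError:
        pass  # second transaction aborted
    return sum(db.rows.values())  # number of people still booked


booked_under_snapshot = run(validate_reads=False)
booked_with_read_validation = run(validate_reads=True)
```

Under the snapshot-only check both transactions commit because they wrote different rows, leaving nobody booked. With read validation the second transaction aborts because a row it read changed after its snapshot, so one booking remains.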
Medium · Technical
As a senior data engineer, you must lead the team decision between adopting sagas or distributed transactions (e.g., 2PC) for cross-service updates. Describe the evaluation criteria you would use (SLOs, failure modes, operational complexity, throughput, recovery time), experiments or proof-of-concepts you'd run, and how you'd present a recommendation and migration plan to engineering and product stakeholders.
