InterviewStack.io

Stream Processing and Event Streaming Questions

This topic covers designing and operating systems that ingest, process, and serve continuous event streams with low latency and high throughput. Core areas include architecture patterns for stream-native and event-driven systems, trade-offs between batch and streaming models, and event sourcing concepts. Candidates should demonstrate knowledge of messaging and ingestion layers, message brokers and commit log systems, partitioning and consumer group patterns, partition key selection, ordering guarantees, retention and compaction strategies, and deduplication techniques. Processing concerns include stream processing engines, state stores, stateful processing, checkpointing and fault recovery, processing guarantees such as at-least-once and exactly-once semantics, idempotence, and time semantics, including event time versus processing time, watermarks, windowing strategies, handling of late and out-of-order events, and stream-to-stream and stream-to-table joins and aggregations over windows. Performance and operational topics cover partitioning and scaling strategies, backpressure and flow control, latency versus throughput trade-offs, resource isolation, monitoring and alerting, testing strategies for streaming pipelines, schema evolution and compatibility, idempotent sinks, persistent storage choices for state and checkpoints, and operational metrics such as stream lag. Familiarity with concrete technologies and frameworks is expected when discussing designs and trade-offs, for example Apache Kafka, Kafka Streams, Apache Flink, Spark Structured Streaming, Amazon Kinesis, and common serialization formats such as Avro, Protocol Buffers, and JSON.

Hard | Technical
41 practiced
You must join two high-volume streams on a non-key attribute (e.g., location name) where the keyspace is large and skewed. Design an approach including repartitioning, possible broadcasting, state-size mitigation, appropriate windowing, and strategies to maintain correctness when events arrive late or out of order. Discuss costs and operational constraints.
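
One repartitioning idea relevant to this question is key salting. The sketch below is a minimal in-memory illustration, not a streaming implementation: it ignores windowing and late events, and the names `NUM_SALTS`, `HOT_KEYS`, `salt_left`, `replicate_right`, and `salted_join` are hypothetical. It assumes hot keys have already been identified by some separate skew detector.

```python
import random
from collections import defaultdict

NUM_SALTS = 4            # assumed fan-out for hot keys
HOT_KEYS = {"new york"}  # assumed output of a separate skew detector


def salt_left(key, rng=random):
    """Route a left-side event; hot keys are spread over NUM_SALTS sub-keys."""
    if key in HOT_KEYS:
        return f"{key}#{rng.randrange(NUM_SALTS)}"
    return key


def replicate_right(key):
    """Copy right-side events for hot keys to every sub-key so that each
    salted partition can still find its match."""
    if key in HOT_KEYS:
        return [f"{key}#{i}" for i in range(NUM_SALTS)]
    return [key]


def salted_join(left_events, right_events, rng=random):
    """Join (key, value) pairs from both sides on the routed (salted) key."""
    right_state = defaultdict(list)
    for key, value in right_events:
        for routed in replicate_right(key):
            right_state[routed].append(value)

    joined = []
    for key, value in left_events:
        routed = salt_left(key, rng)
        for other in right_state.get(routed, []):
            joined.append((key, value, other))
    return joined
```

The trade-off is that the replicated side costs `NUM_SALTS` times more state for hot keys, in exchange for spreading the skewed side's load across more partitions.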
Hard | Technical
36 practiced
Design a deduplication and ordering strategy for e-commerce order events where retries can create duplicates and events may arrive out of order. Requirements: ensure a single fulfillment action per logical order, maintain auditability, and support reconciliation. Explain use of event IDs, stateful dedup stores, idempotent sinks, and downstream reconciliation jobs, including recovery scenarios.
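
As a starting point for discussion, the following sketch combines event-ID deduplication, per-order sequence buffering, and an idempotent fulfillment guard. It is a single-process illustration under stated assumptions: every event carries a unique `event_id` and a per-order `seq` starting at 0, and all state lives in memory. `OrderEvent` and `OrderProcessor` are hypothetical names; a real system would keep this state in a durable store.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OrderEvent:
    event_id: str   # unique per logical event; retries reuse the same ID
    order_id: str
    seq: int        # assumed per-order sequence number starting at 0
    kind: str       # e.g. "created", "paid"


@dataclass
class OrderProcessor:
    seen_event_ids: set = field(default_factory=set)
    next_seq: dict = field(default_factory=dict)   # order_id -> next expected seq
    pending: dict = field(default_factory=dict)    # order_id -> {seq: event}
    fulfilled: set = field(default_factory=set)
    audit_log: list = field(default_factory=list)  # every decision is recorded

    def handle(self, event):
        # Drop retries that carry an event ID we have already seen.
        if event.event_id in self.seen_event_ids:
            self.audit_log.append(("duplicate", event.event_id))
            return
        self.seen_event_ids.add(event.event_id)

        expected = self.next_seq.get(event.order_id, 0)
        if event.seq < expected:
            # Sequence slot already applied under a different event ID.
            self.audit_log.append(("stale", event.event_id))
            return

        # Buffer out-of-order events, then apply every contiguous one.
        buffer = self.pending.setdefault(event.order_id, {})
        buffer[event.seq] = event
        while expected in buffer:
            self._apply(buffer.pop(expected))
            expected += 1
        self.next_seq[event.order_id] = expected

    def _apply(self, event):
        self.audit_log.append(("applied", event.event_id))
        # Idempotent guard: fulfill at most once per order.
        if event.kind == "paid" and event.order_id not in self.fulfilled:
            self.fulfilled.add(event.order_id)
            self.audit_log.append(("fulfill", event.order_id))
```

Because every decision is appended to `audit_log`, a downstream reconciliation job could compare the log against the fulfillment system to detect gaps or double actions.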
Easy | Technical
35 practiced
Compare Apache Flink, Kafka Streams, and Spark Structured Streaming from the perspective of a Solutions Architect advising a client with low-latency, stateful processing requirements. Cover strengths and weaknesses, state management approaches, event-time handling, latency vs throughput trade-offs, deployment and operational complexity, and typical cost implications.
Medium | Technical
46 practiced
Describe mechanisms to handle backpressure when sinks are slower than upstream producers. Cover buffer limits, rate limiting, flow-control protocols (e.g., Reactive Streams), disk-spilling, adaptive batching, dead-letter queues, and retry/backoff strategies. Compare how Kafka producers, Flink, and Spark handle backpressure and which controls a Solutions Architect can tune.
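
The simplest of the mechanisms listed above, a bounded buffer whose blocking `put` slows the producer to the sink's pace, can be sketched with Python's standard `queue.Queue`. This is a toy, single-process illustration, not how Kafka, Flink, or Spark implement backpressure; `run_pipeline`, `capacity`, and `sink_delay` are hypothetical names.

```python
import queue
import threading
import time


def run_pipeline(num_items, capacity, sink_delay=0.001):
    """Run a fast producer against a slow sink through a bounded buffer.

    Returns the delivered items and the largest buffer depth observed
    by the consumer.
    """
    buffer = queue.Queue(maxsize=capacity)
    delivered = []
    max_depth = 0
    sentinel = object()  # marks end of stream

    def producer():
        for i in range(num_items):
            buffer.put(i)  # blocks while the buffer is full: backpressure
        buffer.put(sentinel)

    def consumer():
        nonlocal max_depth
        while True:
            max_depth = max(max_depth, buffer.qsize())
            item = buffer.get()
            if item is sentinel:
                break
            time.sleep(sink_delay)  # simulate a slow sink
            delivered.append(item)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return delivered, max_depth
```

Calling `run_pipeline(50, 5)` delivers all 50 items in order while the buffer never holds more than 5 of them. Replacing the blocking `put` with `put(item, timeout=...)` and routing failures elsewhere would be one way to model a dead-letter path.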
Medium | Technical
33 practiced
Compare managed streaming offerings (Confluent Cloud, Amazon MSK, Amazon Kinesis) with a self-managed Kafka for a startup planning global expansion. Evaluate cost drivers including storage retention, replication factor, network egress, HA, operational staffing, and developer productivity. Provide decision criteria and a cost-sensitive recommendation.
