InterviewStack.io

Stream Processing and Event Streaming Questions

This topic covers designing and operating systems that ingest, process, and serve continuous event streams with low latency and high throughput. Core areas include architecture patterns for stream-native and event-driven systems, trade-offs between batch and streaming models, and event sourcing concepts.

Candidates should demonstrate knowledge of messaging and ingestion layers, message brokers and commit log systems, partitioning and consumer group patterns, partition key selection, ordering guarantees, retention and compaction strategies, and deduplication techniques.

Processing concerns include stream processing engines, state stores, stateful processing, checkpointing and fault recovery, processing guarantees such as at-least-once and exactly-once semantics, idempotence, and time semantics: event time versus processing time, watermarks, windowing strategies, handling of late and out-of-order events, and stream-to-stream and stream-to-table joins and aggregations over windows.

Performance and operational topics cover partitioning and scaling strategies, backpressure and flow control, latency versus throughput trade-offs, resource isolation, monitoring and alerting, testing strategies for streaming pipelines, schema evolution and compatibility, idempotent sinks, persistent storage choices for state and checkpoints, and operational metrics such as stream lag.

Familiarity with concrete technologies and frameworks is expected when discussing designs and trade-offs, for example Apache Kafka, Kafka Streams, Apache Flink, Spark Structured Streaming, and Amazon Kinesis, as well as common serialization formats such as Avro, Protocol Buffers, and JSON.

Hard · Technical
A team wants to remove a required field and rename another in an Avro schema used across multiple services and topics. Create a migration plan that avoids consumer breakage, supports rollbacks, and details schema registry steps, use of default values, compatibility settings, and consumer upgrade sequencing.
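
One way to reason about the upgrade sequencing this question asks for is to model reader/writer resolution directly. The sketch below is a deliberately simplified stand-in for Avro's schema resolution rules (a reader field must match a writer field by name or alias, or carry a default); it is not the Avro library or a schema registry client, and the `Field` type, `can_read` helper, and field names are all hypothetical.

```python
from dataclasses import dataclass

# Sentinel so that None (Avro null) can still be used as a real default.
_NO_DEFAULT = object()


@dataclass(frozen=True)
class Field:
    name: str
    default: object = _NO_DEFAULT
    aliases: tuple = ()


def can_read(reader, writer):
    """Simplified check: can data written with `writer` be resolved by `reader`?

    Each reader field must match a writer field by name or alias,
    or have a default to fall back on.
    """
    writer_names = {f.name for f in writer}
    for f in reader:
        if {f.name, *f.aliases} & writer_names:
            continue
        if f.default is _NO_DEFAULT:
            return False
    return True


# v1: original schema, every field required.
v1 = [Field("id"), Field("region"), Field("fullname")]
# v2: interim schema, the field slated for removal gains a default.
v2 = [Field("id"), Field("region", default="unknown"), Field("fullname")]
# v3: target schema, "region" removed and "fullname" renamed via an alias.
v3 = [Field("id"), Field("name", aliases=("fullname",))]

backward_ok = can_read(v3, v1)   # upgraded consumers reading old data
forward_ok = can_read(v1, v3)    # old consumers reading new data
interim_ok = can_read(v2, v3)    # interim consumers reading new data
```

Under this model the target schema reads old data, but old and interim readers cannot read data written with the target schema, because the renamed field is missing and has no default. That is the argument for upgrading consumers before producers, or for adding the new field alongside the old one with a default rather than renaming in place.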
Easy · Technical
Define producer idempotence and idempotent sinks in the context of streaming pipelines. Explain how Kafka's idempotent producer and transactions work and describe practical alternatives when external sinks do not support transactional writes.
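
As an illustration of one practical alternative when a sink cannot join a broker transaction, the sketch below makes the sink itself idempotent: a dedup marker keyed by event ID and the side effect are committed in a single local transaction, so a redelivered event under at-least-once delivery is a no-op. It uses an in-memory SQLite database as a stand-in for the external sink; the table names, `make_sink`, and `apply_event` are assumptions made for this example.

```python
import sqlite3


def make_sink():
    """Create an in-memory stand-in for an external sink."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed (event_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE balances (account TEXT PRIMARY KEY, total INTEGER NOT NULL)"
    )
    return conn


def apply_event(conn, event_id, account, amount):
    """Apply an event at most once. Returns True if applied, False if duplicate."""
    with conn:  # one transaction: dedup marker and side effect commit together
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed (event_id) VALUES (?)", (event_id,)
        )
        if cur.rowcount == 0:
            return False  # event_id already recorded, so skip the side effect
        conn.execute(
            "INSERT OR IGNORE INTO balances (account, total) VALUES (?, 0)",
            (account,),
        )
        conn.execute(
            "UPDATE balances SET total = total + ? WHERE account = ?",
            (amount, account),
        )
    return True
```

Calling `apply_event` twice with the same `event_id` changes the balance only once. The trade-off is that the `processed` table grows with the stream, so a real deployment would need a retention policy for dedup markers.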
Medium · Technical
You need to pick a partition key for user-activity events where downstream pipelines perform per-user aggregation and also join with a product catalog. How would you select a partition key to balance write/read parallelism, avoid hotspots, and enable efficient joins? Discuss composite keys, hashing strategies, trade-offs with repartitioning, and the impact on ordering guarantees.
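
The trade-off between plain hashing and salted composite keys can be sketched in a few lines. The example below is illustrative only: it uses SHA-256 as a stand-in hash (Kafka's Java client hashes key bytes with murmur2 by default), and `partition_for`, `salted_key`, and the `#` separator are hypothetical names chosen for this sketch.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (stand-in for the broker client's hash)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def salted_key(user_id: str, salt: int, buckets: int) -> str:
    """Composite key that spreads one hot user over up to `buckets` distinct keys."""
    return f"{user_id}#{salt % buckets}"


# Plain key: every event for a user lands on one partition, so per-user order holds.
plain = partition_for("user-42", 12)

# Salted key: a hot user is spread over several keys, trading away per-user ordering
# and requiring a second aggregation step to merge the partial results.
hot_keys = {salted_key("user-42", seq, 4) for seq in range(100)}
```

With a plain user ID key, per-user aggregation needs no shuffle but a hot user can overload one partition. With salting, load spreads out, but downstream jobs must merge partial aggregates and ordering holds only within each salt bucket.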
Hard · System Design
Design a secure streaming pipeline that encrypts data at rest and in transit, masks or tokenizes PII in-flight, restricts access via topic ACLs, maintains audit logs for access and schema changes, and integrates with a KMS for key rotation. Detail choices for TLS, encryption at broker and storage, key management, schema enforcement, and how to operationalize audits for compliance.
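
For the in-flight tokenization part of this design, one possible approach is deterministic keyed hashing, which keeps tokens joinable across events without exposing the raw value. The sketch below assumes HMAC-SHA256 with a key ID prefixed to each token so that rotated keys can be told apart; the hard-coded key is a placeholder for material that would come from a KMS, and `tokenize` and `mask_event` are hypothetical helpers.

```python
import hashlib
import hmac


def tokenize(value: str, key: bytes, key_id: str) -> str:
    """Deterministic token: same value and key give the same token."""
    mac = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{key_id}:{mac}"  # key_id records which key version produced the token


def mask_event(event: dict, pii_fields: set, key: bytes, key_id: str) -> dict:
    """Return a copy of the event with the listed PII fields replaced by tokens."""
    return {
        name: tokenize(str(value), key, key_id) if name in pii_fields else value
        for name, value in event.items()
    }


# Placeholder key for illustration only; a real pipeline would fetch it from a KMS.
demo_key = b"not-a-real-key"
masked = mask_event(
    {"user_email": "a@example.com", "amount": 12},
    pii_fields={"user_email"},
    key=demo_key,
    key_id="k1",
)
```

Keyed hashing is one-way, so this suits cases where downstream consumers only need to group or join on the field. If authorized consumers must recover the original value, a reversible scheme such as a token vault or envelope encryption would be needed instead.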
Medium · Technical
You're asked to join a high-volume transaction stream with a slowly changing in-memory 'customer-profile' table for enrichment in real time. Describe architecture options: stream-table join using a changelog topic, asynchronous lookups with caching, side-inputs, or pre-materialized views. Discuss implications for consistency, latency, state size, and how updates to the profile propagate to the join.
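
The changelog-backed stream-table join option can be sketched as a single loop over an interleaved sequence of profile updates and transactions. This is a minimal in-process model, not a Kafka Streams or Flink API: the `enrich` function, the `(kind, key, value)` tuple layout, and the record fields are assumptions for illustration.

```python
def enrich(records):
    """Join transactions against the latest known profile per customer.

    `records` is an iterable of (kind, key, value) tuples in arrival order,
    where kind is "profile" (changelog update) or "txn" (stream event).
    """
    profiles = {}  # materialized table, rebuilt by replaying the changelog
    for kind, key, value in records:
        if kind == "profile":
            if value is None:
                profiles.pop(key, None)  # tombstone removes the row
            else:
                profiles[key] = value
        else:
            # Left join: emit the transaction even when no profile is known yet.
            yield {**value, "customer_id": key, "profile": profiles.get(key)}


records = [
    ("profile", "c1", {"tier": "silver"}),
    ("txn", "c1", {"amount": 5}),
    ("profile", "c1", {"tier": "gold"}),
    ("txn", "c1", {"amount": 7}),
    ("txn", "c2", {"amount": 1}),
]
enriched = list(enrich(records))
```

In this model a profile update affects only transactions that arrive after it, and a transaction for an unknown customer is emitted with a missing profile. Those two behaviors are the consistency and propagation questions the prompt asks candidates to address.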
