
Load Balancing and Traffic Distribution Questions

Covers why load balancers are used and how traffic is distributed across backend servers to avoid single-server bottlenecks, enable horizontal scaling, and provide fault tolerance. Candidates should know common distribution algorithms such as round robin, least connections, weighted balancing, and consistent hashing, and understand the trade-offs among them. Explain the difference between layer 4 and layer 7 load balancing and the implications for routing, request inspection, and protocol awareness. Discuss stateless design versus stateful services, the impact of session affinity and sticky sessions, and alternatives such as external session stores or token-based sessions that preserve scalability. Describe high-availability and resilience patterns that mitigate a single point of failure, including active-active and active-passive configurations, health checks, connection draining, and global routing options such as DNS-based and geo-aware routing.

At senior and staff levels, cover advanced capabilities such as request routing based on metadata or headers, weighted traffic shifting for canary and blue-green deployments, traffic mirroring, rate limiting and throttling, integration with autoscaling, and strategies for graceful degradation and backpressure. Also include operational concerns such as TLS termination, connection pooling, caching and consistent hashing for caches, monitoring and observability, capacity planning, and common debugging and failure modes.
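As a concrete illustration of weighted balancing, the sketch below implements the "smooth" weighted round-robin idea popularized by NGINX; the Backend type and next function are illustrative names, not any particular proxy's API.

```go
package main

import "fmt"

// Backend is a weighted target in a smooth weighted round-robin picker.
type Backend struct {
	Name          string
	Weight        int
	currentWeight int
}

// next: on each pick every backend's currentWeight grows by its static Weight,
// the largest currentWeight wins, and the winner is penalized by the total
// weight. Over time each backend is chosen in proportion to its weight,
// without long runs on the heaviest backend.
func next(backends []*Backend) *Backend {
	total := 0
	var best *Backend
	for _, b := range backends {
		b.currentWeight += b.Weight
		total += b.Weight
		if best == nil || b.currentWeight > best.currentWeight {
			best = b
		}
	}
	if best != nil {
		best.currentWeight -= total
	}
	return best
}

func main() {
	pool := []*Backend{{Name: "a", Weight: 5}, {Name: "b", Weight: 1}, {Name: "c", Weight: 1}}
	for i := 0; i < 7; i++ {
		fmt.Print(next(pool).Name, " ") // expect "a" on roughly 5 of every 7 picks
	}
	fmt.Println()
}
```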

Hard · Technical
Provide pseudocode or actual code to perform weighted traffic shifting at runtime while minimizing cache miss spikes. Use a consistent-hashing approach that preserves cache locality and show how you would add weight to a canary without flushing caches globally. Explain rollback steps.
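A sketch of one possible answer follows, using a consistent-hash ring with virtual nodes: a backend's weight is simply its virtual-node count, so raising the canary's weight adds a few virtual nodes and only the keys that land on them change owner, while rollback removes those same nodes and returns the keys to their previous owners. The Ring type, the FNV-1a hash, and the replica counts are illustrative assumptions, and a production version would also guard the ring with a lock.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"strconv"
)

// Ring is a consistent-hash ring where each backend owns weight*replicaBase
// virtual nodes. Increasing a canary's weight only adds new virtual nodes, so
// only keys landing on those nodes move; all other keys keep hitting the same
// backend and its warm cache.
type Ring struct {
	replicaBase int
	keys        []uint32          // sorted virtual-node hashes
	owners      map[uint32]string // virtual-node hash -> backend id
	weights     map[string]int
}

func NewRing(replicaBase int) *Ring {
	return &Ring{replicaBase: replicaBase, owners: map[uint32]string{}, weights: map[string]int{}}
}

func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// SetWeight adds or removes virtual nodes so that backend id owns exactly
// weight*replicaBase of them. Existing nodes are never rehashed.
func (r *Ring) SetWeight(id string, weight int) {
	old := r.weights[id]
	if weight > old { // grow: add only the new virtual nodes
		for i := old * r.replicaBase; i < weight*r.replicaBase; i++ {
			k := hash32(id + "#" + strconv.Itoa(i))
			r.owners[k] = id
			r.keys = append(r.keys, k)
		}
	} else { // shrink or rollback: drop only this backend's highest-numbered nodes
		for i := weight * r.replicaBase; i < old*r.replicaBase; i++ {
			delete(r.owners, hash32(id+"#"+strconv.Itoa(i)))
		}
		r.keys = r.keys[:0]
		for k := range r.owners {
			r.keys = append(r.keys, k)
		}
	}
	sort.Slice(r.keys, func(i, j int) bool { return r.keys[i] < r.keys[j] })
	r.weights[id] = weight
}

// Pick returns the backend owning the first virtual node clockwise from the key.
func (r *Ring) Pick(key string) string {
	if len(r.keys) == 0 {
		return ""
	}
	h := hash32(key)
	i := sort.Search(len(r.keys), func(i int) bool { return r.keys[i] >= h })
	if i == len(r.keys) {
		i = 0
	}
	return r.owners[r.keys[i]]
}

func main() {
	ring := NewRing(100)
	ring.SetWeight("stable-1", 10)
	ring.SetWeight("stable-2", 10)
	ring.SetWeight("canary", 1) // ~5% of keys shift to the canary
	fmt.Println(ring.Pick("user:42"))
	ring.SetWeight("canary", 0) // rollback: only the canary's keys move back
	fmt.Println(ring.Pick("user:42"))
}
```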
Easy · Technical
What is connection draining (graceful shutdown) for load balancers and backend instances? Explain why it's important during deployments and scaling events, how it interacts with HTTP keep-alive and long-lived connections, and practical guidance on choosing drain timeouts.
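A minimal backend-side sketch of draining in Go is shown below; the port, the /readyz path, and the sleep and drain timeouts are placeholder values that should be tuned to the load balancer's health-check interval and the service's longest expected request.

```go
package main

import (
	"context"
	"net/http"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"
	"time"
)

func main() {
	var draining atomic.Bool

	mux := http.NewServeMux()
	// The load balancer's health check points at /readyz; once draining starts
	// we fail it so no new connections are routed to this instance.
	mux.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		if draining.Load() {
			http.Error(w, "draining", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("hello"))
	})

	srv := &http.Server{Addr: ":8080", Handler: mux}
	done := make(chan struct{})

	go func() {
		stop := make(chan os.Signal, 1)
		signal.Notify(stop, syscall.SIGTERM, os.Interrupt)
		<-stop

		draining.Store(true)
		// Give the LB a couple of health-check intervals to notice before we
		// stop accepting, then drain in-flight and keep-alive connections.
		time.Sleep(10 * time.Second)
		ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second) // drain timeout
		defer cancel()
		srv.Shutdown(ctx) // disables keep-alive reuse and waits for active requests
		close(done)
	}()

	srv.ListenAndServe()
	<-done // keep the process alive until draining finishes
}
```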
Medium · Technical
Implement a thread-safe least-connections scheduler in Go (or Python). Provide functions: addBackend(id), removeBackend(id), incConn(id), decConn(id), selectBackend(). Ensure selectBackend returns the backend with the fewest active connections and supports tie-breaking. Describe concurrency strategy and complexity.
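One possible reference sketch in Go follows, using a single mutex around a map and an O(n) scan with deterministic tie-breaking; a heap or sorted structure could replace the scan if the backend set is large.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// LeastConnScheduler tracks active connections per backend and selects the
// backend with the fewest. All methods are safe for concurrent use.
type LeastConnScheduler struct {
	mu    sync.Mutex
	conns map[string]int
}

func NewLeastConnScheduler() *LeastConnScheduler {
	return &LeastConnScheduler{conns: make(map[string]int)}
}

func (s *LeastConnScheduler) AddBackend(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.conns[id]; !ok {
		s.conns[id] = 0
	}
}

func (s *LeastConnScheduler) RemoveBackend(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.conns, id)
}

func (s *LeastConnScheduler) IncConn(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.conns[id]; ok {
		s.conns[id]++
	}
}

func (s *LeastConnScheduler) DecConn(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if n, ok := s.conns[id]; ok && n > 0 {
		s.conns[id]--
	}
}

// SelectBackend returns the backend with the fewest active connections,
// breaking ties by smallest id so the choice is deterministic. O(n) scan.
func (s *LeastConnScheduler) SelectBackend() (string, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	best, bestN := "", -1
	for id, n := range s.conns {
		if bestN == -1 || n < bestN || (n == bestN && id < best) {
			best, bestN = id, n
		}
	}
	if bestN == -1 {
		return "", errors.New("no backends registered")
	}
	return best, nil
}

func main() {
	s := NewLeastConnScheduler()
	s.AddBackend("a")
	s.AddBackend("b")
	s.IncConn("a")
	id, _ := s.SelectBackend()
	fmt.Println(id) // "b": it has fewer active connections
}
```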
Hard · Technical
Describe a capacity planning methodology for the load balancer tier. Include modeling traffic growth, handling peak-to-average ratios, allowing headroom for failover and autoscaling lag, and calculating the required instance count given connection and bandwidth limits; provide a worked numerical example (e.g., the number of instances needed for 1M RPS given per-instance limits).
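A worked numerical sketch is below; every figure (per-instance RPS, response size, NIC bandwidth, headroom) is an assumed input chosen for illustration, not a vendor specification.

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	// All numbers below are illustrative assumptions.
	peakRPS := 1_000_000.0     // target peak traffic
	perInstanceRPS := 50_000.0 // measured safe RPS per LB instance
	avgRespBytes := 20_000.0   // average response size in bytes
	perInstanceGbps := 10.0    // usable NIC bandwidth per instance
	headroom := 0.40           // 40% spare for zone failure and autoscaling lag

	// Instances needed by request rate and by bandwidth; the binding constraint wins.
	byRPS := peakRPS / perInstanceRPS
	neededGbps := peakRPS * avgRespBytes * 8 / 1e9 // total egress in Gbit/s
	byBandwidth := neededGbps / perInstanceGbps

	raw := math.Max(byRPS, byBandwidth)
	withHeadroom := math.Ceil(raw * (1 + headroom))

	fmt.Printf("by RPS: %.1f, by bandwidth: %.1f (%.0f Gbps), with headroom: %.0f instances\n",
		byRPS, byBandwidth, neededGbps, withHeadroom)
	// by RPS: 20.0, by bandwidth: 16.0 (160 Gbps), with headroom: 28 instances
}
```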
Easy · Technical
Describe common health check types used by load balancers (TCP connect, HTTP GET/HEAD, dedicated HTTP health endpoints, gRPC health checks). For each type, explain what it detects, its common failure modes (false positives and negatives), and how to design health endpoints that reflect true service readiness and liveness.
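By way of illustration, a sketch of separate liveness and readiness endpoints in Go follows; checkDependency is a hypothetical placeholder for whatever the request path actually depends on, and the paths and timeout are assumed values.

```go
package main

import (
	"context"
	"net/http"
	"time"
)

// checkDependency stands in for whatever the service needs to serve traffic
// (database ping, downstream RPC, warm cache); replace it as appropriate.
func checkDependency(ctx context.Context) error {
	return nil // placeholder: pretend the dependency is healthy
}

func main() {
	mux := http.NewServeMux()

	// Liveness: "can the process make progress?" Keep it dependency-free so a
	// broken database does not cause the LB or orchestrator to restart-loop us.
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})

	// Readiness: "can this instance serve real requests right now?" Check the
	// dependencies the request path needs, with a timeout well below the LB's
	// health-check interval so slow checks fail fast instead of piling up.
	mux.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), 500*time.Millisecond)
		defer cancel()
		if err := checkDependency(ctx); err != nil {
			http.Error(w, "not ready: "+err.Error(), http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})

	http.ListenAndServe(":8080", mux)
}
```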
