Load Balancing and Horizontal Scaling Questions

Covers principles and mechanisms for distributing traffic and scaling services horizontally. Includes load balancing algorithms such as round robin, least connections, and consistent hashing; health checks, connection draining, and sticky sessions; and session management strategies for stateless and stateful services. Explains when to scale horizontally versus vertically, how to plan capacity, and the trade-offs of each approach. Also includes infrastructure-level autoscaling concepts such as auto scaling groups, launch templates, target tracking and step scaling policies, and how load balancers and autoscaling interact to absorb traffic spikes. Reviews different load balancer types and selection criteria, integration with service discovery, and operational concerns for maintaining availability and performance at scale.
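As a quick refresher on the three algorithms named above, here is a minimal sketch in Python. The backend names, the `active` connection-count mapping, and the `HashRing` class are illustrative assumptions, not the API of any particular load balancer; MD5 is used only as a convenient, stable hash.

```python
import bisect
import hashlib
import itertools


def round_robin(backends):
    """Return an iterator that cycles through backends in a fixed rotation."""
    return itertools.cycle(backends)


def least_connections(active):
    """Pick the backend with the fewest active connections.

    `active` maps backend name -> current connection count (hypothetical input).
    """
    return min(active, key=active.get)


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, backends, vnodes=100):
        # Each backend is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (self._hash(f"{b}#{i}"), b) for b in backends for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def lookup(self, key):
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[idx][1]
```

The property worth pointing out in an interview is that removing a backend from the ring only remaps the keys that backend owned; keys owned by the surviving backends stay where they were.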

Medium · Technical

You're migrating a monolithic web app to microservices. Describe how load balancing and autoscaling approaches should evolve during migration and after cutover. Cover traffic routing (API gateway), incremental routing, health checks per service, autoscaling granularity per microservice, and monitoring/observability changes you must implement to safely operate the system.
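One way to picture incremental routing at the gateway is a deterministic percentage split. The sketch below is an assumption-laden illustration: the `route` function, the `"monolith"` and `"microservice"` labels, and the choice of hashing a per-user key are all hypothetical, not a specific gateway's feature.

```python
import hashlib


def route(request_key, new_service_percent):
    """Send a stable slice of traffic to the new service.

    `request_key` is any stable identifier (for example a user ID), so the
    same caller always lands on the same side at a given percentage.
    """
    digest = hashlib.sha256(request_key.encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return "microservice" if bucket < new_service_percent else "monolith"
```

Because the bucket is derived from the key rather than from a random draw, raising the percentage only moves additional callers to the new service; nobody already on it is bounced back to the monolith.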

Medium · Technical

How would you implement connection draining and zero-downtime deployments for services using long-lived WebSocket or Server-Sent Events connections? Describe required load balancer settings (idle timeouts, deregistration delay), application lifecycle hooks (SIGTERM handling), client reconnection strategies, and testing approaches to verify behavior under rollouts.
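The application side of draining can be sketched as follows. This is a minimal illustration under stated assumptions: `DrainState` and `install_sigterm_handler` are hypothetical names, connection IDs are plain strings, and a real server would also send each client a "please reconnect" message before closing. The grace period is assumed to be shorter than the load balancer's deregistration delay.

```python
import signal
import threading
import time


class DrainState:
    """Tracks open long-lived connections and whether the instance is draining."""

    def __init__(self):
        self._lock = threading.Lock()
        self._open = set()
        self._draining = False

    def healthy(self):
        # A readiness endpoint would report failure once draining starts,
        # so the load balancer stops sending new connections here.
        with self._lock:
            return not self._draining

    def register(self, conn_id):
        with self._lock:
            if self._draining:
                return False  # refuse new connections while draining
            self._open.add(conn_id)
            return True

    def unregister(self, conn_id):
        with self._lock:
            self._open.discard(conn_id)

    def open_count(self):
        with self._lock:
            return len(self._open)

    def drain(self, grace_seconds, poll=0.05):
        """Start draining and wait up to grace_seconds for clients to leave.

        Returns the number of connections still open at the deadline.
        """
        with self._lock:
            self._draining = True
        deadline = time.monotonic() + grace_seconds
        while self.open_count() > 0 and time.monotonic() < deadline:
            time.sleep(poll)
        return self.open_count()


def install_sigterm_handler(state, grace_seconds=30):
    """Hypothetical wiring: on SIGTERM, drain in a background thread."""

    def handler(signum, frame):
        threading.Thread(
            target=state.drain, args=(grace_seconds,), daemon=True
        ).start()

    signal.signal(signal.SIGTERM, handler)
```

A rollout test can then assert that `drain` returns 0 within the grace period while synthetic clients reconnect to other instances, and that any nonzero result is logged as a forced disconnect.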

Medium · Technical

Estimate the number of server instances and load balancer capacity needed for a streaming ingestion API that accepts 1,000 concurrent TCP connections, each sending 5 KB/s on average. Show calculations for aggregate bandwidth, headroom for spikes, per-connection CPU/memory assumptions (e.g., 0.5 KB/s -> 1% CPU), and how you would size LB and network capacity. State your assumptions explicitly.
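The arithmetic can be laid out as a small script. Every parameter below other than the 1,000 connections and 5 KB/s is an assumption chosen for illustration: 2x spike headroom, the prompt's example ratio read as 1% of one core per 0.5 KB/s, 8-core instances run at 60% target utilization, one spare instance, and 1 KB = 1,000 bytes.

```python
import math


def estimate(connections=1000, kb_per_conn=5.0, headroom=2.0,
             cpu_pct_per_kbps=1.0 / 0.5, cores_per_instance=8,
             target_utilization=0.6, spare_instances=1):
    """Back-of-the-envelope sizing; all defaults are stated assumptions."""
    aggregate_kbps = connections * kb_per_conn        # KB/s across all clients
    aggregate_mbit = aggregate_kbps * 8 / 1000        # Mbit/s (1 KB = 1000 B)
    peak_mbit = aggregate_mbit * headroom             # Mbit/s with spike headroom
    # cpu_pct_per_kbps is "% of one core per KB/s"; divide by 100 to get cores.
    peak_cores = aggregate_kbps * headroom * cpu_pct_per_kbps / 100
    usable_cores = cores_per_instance * target_utilization
    instances = math.ceil(peak_cores / usable_cores) + spare_instances
    return {
        "aggregate_mbit": aggregate_mbit,
        "peak_mbit": peak_mbit,
        "peak_cores": peak_cores,
        "instances": instances,
    }
```

With these defaults the network side is small (40 Mbit/s average, 80 Mbit/s at peak), while the example CPU ratio dominates: 200 cores at peak, or 43 instances including the spare. A measured per-connection CPU cost would change the instance count substantially, which is why the assumption should be stated and then validated with a load test.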

Easy · Technical

Explain differences between Layer 4 (transport) and Layer 7 (application) load balancers. Describe features typically available at each layer (TLS termination, path-based routing, header inspection, WebSocket support), their performance characteristics and typical cloud examples. Provide guidance on when to prefer L4 over L7 for a production service.

Hard · Technical

New autoscaled instances repeatedly fail health checks because the application's cold start takes more than 2 minutes. Propose a redesign of the autoscaling and LB interaction that meets SLAs while minimizing cost. Discuss baked AMIs/container images, warm pools, lifecycle hooks (initialization before registration), container image pre-pull, and alternative architectures such as serverless for burst handling.
