InterviewStack.io

Load Balancing and Horizontal Scaling Questions

Covers principles and mechanisms for distributing traffic and scaling services horizontally. Includes load balancing algorithms such as round robin, least connections, and consistent hashing; health checks, connection draining, and sticky sessions; and session management strategies for stateless and stateful services. Explains when to scale horizontally versus vertically, how to plan capacity, and the trade-offs of each approach. Also includes infrastructure-level autoscaling concepts such as Auto Scaling groups, launch templates, and target tracking and step scaling policies, as well as how load balancers and autoscaling interact to absorb traffic spikes. Reviews load balancer types and selection criteria, integration with service discovery, and operational concerns for maintaining availability and performance at scale.
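
To make the three algorithms named above concrete, here is a minimal Python sketch of each selection strategy. The class names, the `vnodes` parameter, and the backend labels are illustrative assumptions, not part of any particular load balancer's API; production balancers add weighting, health state, and concurrency control on top of this.

```python
import bisect
import hashlib
import itertools


class RoundRobin:
    """Cycle through backends in a fixed order."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Pick the backend with the fewest in-flight connections."""

    def __init__(self, backends):
        self.active = {b: 0 for b in backends}

    def acquire(self):
        # Ties go to the backend listed first.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        self.active[backend] -= 1


class ConsistentHashRing:
    """Map keys to backends on a hash ring with virtual nodes (assumed: 100 per backend)."""

    def __init__(self, backends, vnodes=100):
        self._ring = sorted(
            (self._hash(f"{b}#{i}"), b) for b in backends for i in range(vnodes)
        )
        self._points = [h for h, _ in self._ring]

    @staticmethod
    def _hash(value):
        # First 8 bytes of SHA-256 as a stable 64-bit ring position.
        return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

    def pick(self, key):
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._points, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

The property that usually motivates consistent hashing is visible here: building a second ring without one backend only remaps the keys that backend owned, while every other key keeps its original assignment.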

Easy · Technical
74 practiced
Explain differences between Layer 4 (transport) and Layer 7 (application) load balancers. Describe features typically available at each layer (TLS termination, path-based routing, header inspection, WebSocket support), their performance characteristics and typical cloud examples. Provide guidance on when to prefer L4 over L7 for a production service.
Easy · Technical
62 practiced
Explain the main differences between horizontal and vertical scaling. Provide concrete examples (e.g., adding CPU/RAM to an existing VM vs. adding more instances behind a load balancer) and discuss operational trade-offs including downtime, complexity, cost, single point of failure, and the impact on stateful components such as databases and caches. Give scenarios where vertical scaling is still the right choice and describe migration considerations from vertical to horizontal.
Hard · System Design
53 practiced
Design a load balancing and autoscaling model for a multi-tenant SaaS platform where tenants require different isolation and scaling preferences. Decide between shared pools with per-tenant quotas, per-tenant target groups, or dedicated tenant clusters. Discuss routing options, security/isolation (network and compute), billing and metering, noisy neighbor mitigation, and operational overhead.
Medium · Technical
58 practiced
Write a Terraform HCL snippet (resource declarations and attributes, not provider blocks) to create an AWS Launch Template, an Auto Scaling Group (using the Launch Template), and attach instances to an Application Load Balancer target group. Include attributes for health_check_type, health_check_grace_period, and deregistration_delay to ensure graceful deregistration. Focus on correct resource relationships.
Medium · Technical
55 practiced
How would you implement connection draining and zero-downtime deployments for services using long-lived WebSocket or Server-Sent Events connections? Describe required load balancer settings (idle timeouts, deregistration delay), application lifecycle hooks (SIGTERM handling), client reconnection strategies, and testing approaches to verify behavior under rollouts.
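
As a starting point for the application-lifecycle part of this question, the following Python sketch shows one possible shape for SIGTERM-driven draining. The `Drainer` class, its method names, and the 30-second default grace period are hypothetical; the grace period is assumed to be shorter than the load balancer's deregistration delay so that connections close before the target is removed.

```python
import signal
import threading
import time


class Drainer:
    """Track open connections and drain them after SIGTERM (hypothetical helper)."""

    def __init__(self, grace_seconds=30.0):
        self.grace_seconds = grace_seconds
        self.draining = threading.Event()
        self._active = 0
        self._cond = threading.Condition()

    def health_status(self):
        # Report 503 while draining so the load balancer stops routing new traffic.
        return 503 if self.draining.is_set() else 200

    def connection_opened(self):
        # Refuse new long-lived connections once draining has begun.
        with self._cond:
            if self.draining.is_set():
                return False
            self._active += 1
            return True

    def connection_closed(self):
        with self._cond:
            self._active -= 1
            self._cond.notify_all()

    def handle_sigterm(self, signum=None, frame=None):
        self.draining.set()

    def wait_for_drain(self):
        # Block until all connections close or the grace period expires.
        deadline = time.monotonic() + self.grace_seconds
        with self._cond:
            while self._active > 0:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return False
                self._cond.wait(remaining)
        return True

    def install(self):
        # Must be called from the main thread.
        signal.signal(signal.SIGTERM, self.handle_sigterm)
```

In a real service the drain step would also send each WebSocket or Server-Sent Events client a close or reconnect hint, so that clients reconnect (ideally with jittered backoff) to an instance that is not being retired.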
