InterviewStack.io

Capacity Planning and Resource Optimization Questions

Covers forecasting, provisioning, and operating compute, memory, storage, and network resources efficiently to meet demand and service-level objectives. Key skills include monitoring utilization metrics such as CPU usage, memory consumption, storage I/O, and network throughput; analyzing historical trends and workload patterns to predict future demand; and planning capacity additions, safety margins, and buffer sizing. Candidates should understand vertical versus horizontal scaling, autoscaling policy design and cooldowns, right-sizing instances or containers, workload placement and isolation, load-balancing algorithms, and the use of spot or preemptible capacity for interruptible workloads. Practical topics include storage planning and archival strategies, database memory tuning and buffer sizing, batching and off-peak processing, model compression and inference optimization for machine learning workloads, alerting and dashboards, stress and validation testing of planned changes, and methods to verify that capacity decisions meet both performance and cost objectives.

Medium · System Design
Design an autoscaling policy for a Kubernetes Deployment serving HTTP traffic with diurnal patterns and occasional spikes. Specify the metrics to use (CPU, custom queue length, request latency), thresholds, target utilization, cooldowns, minimum and maximum replicas, and how you would combine reactive and predictive components to meet a p95 latency SLO while minimizing cost.
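One way to ground an answer is to sketch the reactive half of such a policy in Python. Everything below is an illustrative assumption rather than a reference implementation: the targets, bounds, and cooldown follow the Kubernetes HPA-style rule desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), and a predictive component (for example, scheduled pre-scaling ahead of the known diurnal peak) would be layered on top of it.

```python
# Sketch of the reactive component; all targets and bounds are example values.
import math
import time

def desired_replicas(current_replicas: int,
                     cpu_util: float,           # observed average CPU utilization, e.g. 0.72
                     queue_len_per_pod: float,  # observed custom metric (queue length)
                     target_cpu: float = 0.60,
                     target_queue: float = 30.0,
                     min_replicas: int = 4,
                     max_replicas: int = 60) -> int:
    """HPA-style rule: scale on whichever metric demands the most replicas."""
    by_cpu = math.ceil(current_replicas * cpu_util / target_cpu)
    by_queue = math.ceil(current_replicas * queue_len_per_pod / target_queue)
    return max(min_replicas, min(max_replicas, max(by_cpu, by_queue)))

class ScaleDownCooldown:
    """React to spikes immediately; hold scale-downs back to avoid oscillation."""
    def __init__(self, scale_down_seconds: float = 300.0):
        self.scale_down_seconds = scale_down_seconds
        self._last_change = float("-inf")

    def apply(self, current: int, proposed: int, now=None) -> int:
        now = time.monotonic() if now is None else now
        if proposed > current:                      # scale up right away
            self._last_change = now
            return proposed
        if proposed < current and now - self._last_change >= self.scale_down_seconds:
            self._last_change = now                 # scale down only after the window
            return proposed
        return current
```

Asymmetric behaviour (fast up, slow down) is what lets the same policy absorb spikes without thrashing during the diurnal ramp-down; the p95 latency SLO is protected indirectly by keeping the CPU and queue targets well below saturation.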
Easy · Technical
Explain what an autoscaler cooldown period is and why cooldowns are necessary. Describe one scenario where a cooldown that is too short leads to oscillations and another scenario where a cooldown that is too long causes SLA violations. Finally, outline a simple method for choosing an appropriate cooldown for a web application.
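A rough way to frame the "choosing a cooldown" part is to require that the cooldown cover the system's feedback delay: instance startup time plus the time for new capacity to show up in the metric the autoscaler reads. The heuristic and safety factor below are assumptions for illustration, not a standard formula.

```python
# Heuristic sketch: cooldown >= time for a scaling action to be visible in metrics.
def suggest_cooldown_seconds(instance_startup_s: float,
                             metric_scrape_interval_s: float,
                             metric_window_samples: int = 3,
                             safety_factor: float = 1.5) -> float:
    """Cover startup plus the metric averaging window, with a margin on top."""
    feedback_delay = instance_startup_s + metric_scrape_interval_s * metric_window_samples
    return feedback_delay * safety_factor

# Example: 60 s pod startup, 15 s scrape interval, 3-sample average
# -> roughly (60 + 45) * 1.5 ≈ 158 s before acting on the next scaling decision.
```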
Medium · Technical
You manage real-time ML model serving: 1,000 inference requests/sec with a p95 latency SLO of 200ms. Discuss practical capacity and optimization approaches including model compression (quantization, pruning), batching strategies, CPU vs GPU serving, autoscaling, cold-start mitigation, profiling, and cost trade-offs to achieve SLOs while minimizing infrastructure spend.
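A capacity answer here usually starts with a latency budget for dynamic batching. The sketch below uses an assumed latency model (a callable mapping batch size to measured inference time) to pick the largest batch that still fits the SLO and then sizes replicas for the offered load; the numbers in the example are made up.

```python
# Back-of-the-envelope sketch for dynamic batching under a latency SLO.
import math

def plan_batching(target_rps: float,
                  slo_p95_ms: float,
                  per_batch_latency_ms,          # callable: batch_size -> inference ms
                  max_batch_window_ms: float = 20.0,
                  max_batch_size: int = 64):
    """Find the largest batch size whose fill time plus inference time fits the SLO,
    then compute the replica count needed for the target throughput."""
    best = None
    for b in range(1, max_batch_size + 1):
        # time to accumulate b requests at the offered load, capped by the batch window
        fill_ms = min(max_batch_window_ms, (b / target_rps) * 1000.0)
        total_ms = fill_ms + per_batch_latency_ms(b)
        if total_ms <= slo_p95_ms:
            best = (b, total_ms)
    if best is None:
        return None
    b, total_ms = best
    per_replica_rps = b / (per_batch_latency_ms(b) / 1000.0)  # batches processed back to back
    replicas = math.ceil(target_rps / per_replica_rps)
    return {"batch_size": b, "est_latency_ms": total_ms, "replicas": replicas}

# Example with a made-up latency curve: 10 ms fixed cost + 1.5 ms per item.
print(plan_batching(1000, 200, lambda b: 10 + 1.5 * b))
```

The same budgeting exercise shows where compression helps: quantization or pruning shrinks per_batch_latency_ms, which either raises the feasible batch size or cuts the replica count for the same SLO.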
Hard · Technical
Explain how buffer pool sizing in an OLTP database affects read latency and IO amplification when the working set is slightly larger than available RAM. Using cache-miss curves and cost modeling, propose a method to choose buffer size that minimizes total cost (memory cost + IO cost), and describe experiments to measure the 'knee' in the hit-rate curve.
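The cost-modeling part can be made concrete with a small search over candidate buffer sizes, pricing memory against the physical reads implied by the measured miss rate at each size. The hit-rate curve and prices below are toy assumptions; in practice the curve would come from replaying a representative trace at several buffer sizes.

```python
# Minimal cost-model sketch; prices and the example hit-rate curve are assumptions.
def choose_buffer_gb(candidate_sizes_gb, hit_rate_at,   # callable: size_gb -> hit rate in [0, 1]
                     read_ops_per_s: float,
                     memory_cost_per_gb_month: float,
                     io_cost_per_million_reads: float):
    """Return the buffer size minimizing monthly memory cost plus physical-read cost."""
    seconds_per_month = 30 * 24 * 3600
    best_size, best_cost = None, float("inf")
    for size in candidate_sizes_gb:
        misses_per_s = read_ops_per_s * (1.0 - hit_rate_at(size))
        io_cost = misses_per_s * seconds_per_month / 1e6 * io_cost_per_million_reads
        total = size * memory_cost_per_gb_month + io_cost
        if total < best_cost:
            best_size, best_cost = size, total
    return best_size, best_cost

# Toy hit-rate curve with a visible "knee" around 48 GB of a ~64 GB working set.
curve = {32: 0.90, 40: 0.95, 48: 0.985, 56: 0.995, 64: 0.999}
print(choose_buffer_gb(sorted(curve), curve.get, read_ops_per_s=20000,
                       memory_cost_per_gb_month=5.0, io_cost_per_million_reads=0.10))
```

With these toy numbers the optimum lands just past the knee rather than at the maximum size, which is the point of the exercise: beyond the knee, extra memory buys very few additional hits.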
Easy · Technical
You operate a microservice that handles 500 requests/sec. Each request consumes 0.05 CPU cores and 20 MB memory. You want per-instance p95 CPU utilization <= 60% and 20% memory headroom. Implement (describe or write) a Python function signature compute_instances(requests_per_sec, cpu_per_request, mem_per_request_mb, cpu_target_util, mem_headroom_pct) that returns the integer number of instances required. Explain the calculation and rounding strategy.
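One reasonable interpretation is sketched below. The question's signature does not fix an instance shape, so the sketch adds cores_per_instance and mem_per_instance_mb as keyword defaults (an assumption), and it reads mem_per_request_mb as memory held for roughly one second of in-flight requests. Partial instances are rounded up with ceil, and the result is the maximum of the CPU-bound and memory-bound counts.

```python
# Sketch of one interpretation; the instance shape and the memory model are assumptions.
import math

def compute_instances(requests_per_sec: float,
                      cpu_per_request: float,        # cores consumed per request
                      mem_per_request_mb: float,
                      cpu_target_util: float,        # e.g. 0.60
                      mem_headroom_pct: float,       # e.g. 0.20
                      cores_per_instance: float = 4.0,
                      mem_per_instance_mb: float = 8192.0) -> int:
    """Instances needed so CPU stays at or below the target utilization
    and memory keeps the requested headroom."""
    total_cpu_cores = requests_per_sec * cpu_per_request         # 500 * 0.05 = 25 cores
    total_mem_mb = requests_per_sec * mem_per_request_mb         # assumes ~1 s of in-flight state
    usable_cores = cores_per_instance * cpu_target_util          # 4 * 0.60 = 2.4 cores
    usable_mem = mem_per_instance_mb * (1.0 - mem_headroom_pct)  # 8192 * 0.80 MB
    by_cpu = math.ceil(total_cpu_cores / usable_cores)           # always round UP:
    by_mem = math.ceil(total_mem_mb / usable_mem)                # partial instances don't exist
    return max(by_cpu, by_mem)

# Example from the question: 500 rps, 0.05 cores and 20 MB per request.
print(compute_instances(500, 0.05, 20, 0.60, 0.20))   # -> 11 with the assumed instance shape
```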
