InterviewStack.io

Infrastructure Scaling and Capacity Planning Questions

Operational and infrastructure-level planning to ensure systems meet current demand and projected growth. Topics include demand forecasting, headroom planning, and three-to-five-year capacity roadmaps; autoscaling policies and metrics-driven scaling using CPU, memory, and custom application metrics; load testing, benchmarking, and performance validation methodologies; cost modeling and right-sizing in cloud environments, including the trade-offs between managed services and self-hosted solutions; designing non-disruptive upgrade and migration strategies; multi-region and availability-zone deployment strategies and their implications for data placement and latency; instrumentation and observability for capacity metrics; and mapping business growth projections onto infrastructure acquisition and scaling decisions. Candidates should demonstrate how to translate requirements into capacity plans and how to validate assumptions with experiments and measurements.

Easy · Technical
Differentiate load testing, stress testing, soak testing, and benchmarking for data pipelines and storage systems. For a newly built ETL pipeline, recommend which of these tests you would run, what each test should validate (throughput, stability, failure modes), and which metrics you would collect to support capacity planning decisions.
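
For illustration, here is a minimal sketch of a harness that doubles as a load test (a short run at the working batch size) and a soak test (the same loop with duration_s raised to hours); process_batch and its sleep-based workload are hypothetical stand-ins for a real ETL stage.

```python
# Minimal load/soak harness sketch; process_batch is a stand-in for the
# real ETL stage and simply sleeps in proportion to batch size.
import time

def process_batch(records):
    # Hypothetical workload: ~10 ms of "processing" per 1,000 records.
    time.sleep(0.01 * len(records) / 1000)

def run_test(batch_size, duration_s):
    """Drive fixed-size batches for duration_s; report capacity metrics."""
    latencies, processed = [], 0
    start = time.monotonic()
    while time.monotonic() - start < duration_s:
        t0 = time.monotonic()
        process_batch(range(batch_size))
        latencies.append(time.monotonic() - t0)
        processed += batch_size
    elapsed = time.monotonic() - start
    latencies.sort()
    return {
        "throughput_records_per_s": round(processed / elapsed),
        "p95_batch_latency_s": round(latencies[int(0.95 * (len(latencies) - 1))], 4),
        "batches_completed": len(latencies),
    }

# Load test: short run at the working batch size. Soak test: same call with
# duration_s raised to hours, watching for drift in these same metrics.
print(run_test(batch_size=1000, duration_s=5))
```
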
Easy · Technical
Explain common autoscaling signals for cloud data services: CPU utilization, memory usage, request throughput (RPS), request latency (p95/p99), queue length/backlog, and custom application metrics such as consumer lag. For each metric, describe when it is an appropriate scaling signal and its typical pitfalls (noise, latency, cardinality), and give one concrete example of a service (stateless API, stream consumer, batch worker) that should use it.
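
As a concrete illustration of the consumer-lag case, here is a hedged sketch of the scaling arithmetic a stream consumer might use; the per-consumer drain rate, catch-up window, and replica bounds are assumptions rather than values from any particular autoscaler.

```python
# Hedged sketch of lag-based scaling math for a stream consumer; the
# drain rate and target catch-up window are illustrative assumptions.
import math

def desired_replicas(backlog_msgs, per_consumer_rate_mps,
                     target_drain_s, current, min_r=1, max_r=50):
    """Replicas needed to clear the backlog within target_drain_s."""
    needed = math.ceil(backlog_msgs / (per_consumer_rate_mps * target_drain_s))
    # Dampen scale-down to avoid flapping on noisy lag readings.
    if needed < current:
        needed = max(needed, current - 1)
    return max(min_r, min(max_r, needed))

# Example: 120k-message backlog, each consumer drains 500 msg/s,
# and we want the lag cleared within 60 s.
print(desired_replicas(120_000, 500, 60, current=3))  # -> 4
```
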
Medium · Technical
Write a Python script or clear pseudocode that reads a CSV of historic batch jobs (columns: job_id, avg_runtime_seconds, avg_cpu_seconds, avg_concurrent_runs) and outputs a recommended minimum cluster vCPU count to meet a target average job latency. Explain the math and assumptions (parallelism, headroom), and provide sample output for three example jobs.
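
One possible answer, sketched under the assumption that jobs are CPU-bound and scale near-linearly with allocated vCPUs; the 600-second latency target and 70% utilization ceiling are illustrative choices, and the sample jobs are invented to show the arithmetic.

```python
# Sketch assuming CPU-bound jobs that scale near-linearly with vCPUs.
# The latency target and utilization ceiling are illustrative choices.
import csv
import io
import math

TARGET_LATENCY_S = 600   # desired average job runtime
MAX_UTILIZATION = 0.7    # headroom: plan to run at <=70% of capacity

def required_vcpus(row):
    """vCPUs so one run finishes in TARGET_LATENCY_S, times concurrency."""
    cpu_s = float(row["avg_cpu_seconds"])
    # CPU-bound model: runtime ~= cpu_seconds / vcpus, so each run needs
    # cpu_seconds / target_latency vCPUs (never less than one full vCPU).
    per_run = max(1.0, cpu_s / TARGET_LATENCY_S)
    return per_run * float(row["avg_concurrent_runs"])

def recommend(csv_file):
    total = sum(required_vcpus(r) for r in csv.DictReader(csv_file))
    return math.ceil(total / MAX_UTILIZATION)

sample = io.StringIO(
    "job_id,avg_runtime_seconds,avg_cpu_seconds,avg_concurrent_runs\n"
    "etl_a,1200,2400,2\n"   # 2400/600 * 2 = 8 vCPUs
    "etl_b,300,300,4\n"     # max(1, 0.5) * 4 = 4 vCPUs
    "etl_c,3600,7200,1\n"   # 7200/600 * 1 = 12 vCPUs
)
print(recommend(sample))    # (8 + 4 + 12) / 0.7 -> ceil(34.3) = 35 vCPUs
```

Note that avg_runtime_seconds does not appear in the model itself; comparing it against avg_cpu_seconds is still a useful sanity check on how parallel each job already is.
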
Easy · Technical
A medium-sized company must choose between a managed cloud data warehouse (BigQuery or Snowflake) and a self-hosted Presto/Trino cluster. List at least eight technical and operational factors to evaluate (e.g., elasticity, operational overhead, cost predictability, concurrency, latency, vendor lock-in, security, backup/recovery) and explain how each factor impacts capacity planning decisions.
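
To show how a few of these factors feed the capacity plan, here is a deliberately crude break-even sketch; every price and rate below is a hypothetical placeholder, not a quote for BigQuery, Snowflake, or any Trino hosting option.

```python
# Crude break-even sketch; all prices and rates are hypothetical
# placeholders, not quotes from any vendor.
def managed_monthly_cost(tb_scanned, price_per_tb=5.0):
    # Managed warehouses often bill per data scanned: cost tracks usage.
    return tb_scanned * price_per_tb

def self_hosted_monthly_cost(nodes, node_hourly=1.2,
                             ops_hours=40, eng_hourly=100.0):
    # Always-on cluster plus the operational labor a managed service hides.
    return nodes * node_hourly * 730 + ops_hours * eng_hourly

for tb in (100, 1_000, 5_000):
    m = managed_monthly_cost(tb)
    s = self_hosted_monthly_cost(nodes=10)
    print(f"{tb:>5} TB scanned/month: managed ${m:,.0f} vs self-hosted ${s:,.0f}")
```

The point of such a model is the crossover: per-scan pricing wins at low volume, while a busy, steadily loaded cluster can amortize its fixed cost, shifting the capacity-planning question from query cost to node-count forecasting.
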
Medium · Technical
Given 18 months of CPU utilization data for a service with daily and weekly seasonality, explain a statistical approach for computing an appropriate headroom percentage and dynamic scaling thresholds. Include seasonality decomposition, percentile selection, trend extrapolation, and how to set conservative buffers for infrequent spikes while minimizing cost.
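
A sketch of one such approach, using per-(weekday, hour) percentiles in place of a full decomposition; it assumes an hourly CPU series as a pandas Series with a DatetimeIndex, and the 0.99 quantile, 90-day horizon, and 80% scale-out threshold are illustrative choices.

```python
# Percentile-based headroom sketch; assumes an hourly CPU utilization
# Series with a DatetimeIndex. Quantiles and horizon are assumptions.
import numpy as np
import pandas as pd

def headroom_plan(cpu: pd.Series, horizon_days=90):
    # Seasonal profile: p99 utilization per (weekday, hour) bucket captures
    # daily and weekly cycles without a full decomposition.
    buckets = cpu.groupby([cpu.index.dayofweek, cpu.index.hour])
    seasonal_peak = buckets.quantile(0.99).max()

    # Linear trend on daily means, extrapolated over the planning horizon.
    daily = cpu.resample("D").mean()
    x = np.arange(len(daily))
    slope, _ = np.polyfit(x, daily.values, 1)
    trend_growth = max(0.0, slope * horizon_days)

    # Spike buffer: gap between the absolute max and the global p99 is a
    # conservative allowance for rare excursions.
    spike_buffer = cpu.max() - cpu.quantile(0.99)
    projected_peak = seasonal_peak + trend_growth + spike_buffer
    return {
        "scale_out_threshold": round(0.8 * projected_peak, 1),
        "capacity_target": round(projected_peak, 1),
    }
```

Bucketing by (weekday, hour) ties the threshold to the seasonal shape, so quiet hours are not permanently over-provisioned just to cover the weekly peak.
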
