InterviewStack.io

Performance Engineering and Cost Optimization Questions

Engineering practices and trade-offs for meeting performance objectives while controlling operational cost. Topics include setting latency and throughput targets and latency budgets; benchmarking, profiling, and tuning across the application, database, and infrastructure layers; memory, compute, serialization, and batching optimizations; asynchronous processing and workload shaping; capacity estimation and right-sizing of compute and storage to reduce cost; understanding cloud cost drivers, including network egress and storage tiering; trade-offs between real-time and batch processing; and monitoring to detect and prevent performance regressions. Candidates should describe measurement-driven approaches to optimization and be able to justify trade-offs among cost, complexity, and user experience.

Medium | Technical
58 practiced
Design a distributed cache invalidation strategy that minimizes stale reads for a multi-region service. Consider TTLs, explicit invalidation, versioned keys, pub/sub-based invalidation, and the risk of race conditions or partial invalidation.
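As one illustration of the versioned-keys option named in the question, here is a minimal single-process sketch. The class `VersionedCache`, its method names, and the two in-memory dictionaries are hypothetical stand-ins: in a real multi-region deployment the version map would live in a strongly consistent store and the cache would be a regional cache with TTLs, neither of which is modeled here.

```python
from typing import Any, Callable, Dict


class VersionedCache:
    """Toy model of versioned cache keys (hypothetical, in-memory only)."""

    def __init__(self) -> None:
        # Stand-in for a consistent per-entity version counter.
        self._versions: Dict[str, int] = {}
        # Stand-in for a regional cache; real entries would also carry a TTL.
        self._cache: Dict[str, Any] = {}

    def _key(self, entity_id: str) -> str:
        # The current version is part of the key, so bumping the version
        # makes every older entry unreachable without deleting it.
        return f"{entity_id}:v{self._versions.get(entity_id, 0)}"

    def read(self, entity_id: str, load: Callable[[str], Any]) -> Any:
        key = self._key(entity_id)
        if key not in self._cache:
            self._cache[key] = load(entity_id)  # cache miss: go to the source
        return self._cache[key]

    def invalidate(self, entity_id: str) -> None:
        # Called after a write to the source of truth.
        self._versions[entity_id] = self._versions.get(entity_id, 0) + 1


db = {"u1": "a"}
cache = VersionedCache()
first = cache.read("u1", db.get)    # loads from db and caches under u1:v0
db["u1"] = "b"
stale = cache.read("u1", db.get)    # still served from u1:v0
cache.invalidate("u1")
fresh = cache.read("u1", db.get)    # key is now u1:v1, so db is re-read
```

The sketch sidesteps the hard parts the question asks about: reading the version must itself be cheap and consistent across regions, and a read that races with a write can still populate the new key with old data unless the load is ordered after the version bump.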
Medium | Technical
54 practiced
Describe how you would configure a realistic distributed load test using tools like Locust, Gatling, or k6 to model sessions with think times, authenticated flows, and background data warm-up. Explain how to collect and interpret latency percentiles and error budgets from the test.
Medium | Technical
60 practiced
Implement, in Python pseudo-code, a thread-safe batching layer that aggregates messages and flushes to a backend either when 500 messages have been collected or when 100 ms have elapsed since the first message in the batch. Include a flush on graceful shutdown.
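One possible shape for such a batching layer is sketched below, assuming a single background worker thread and a caller-supplied `flush_fn` callback. The names `Batcher`, `submit`, `close`, and `flush_fn` are illustrative, and error handling for a failing backend is deliberately left out.

```python
import threading
import time
from collections import deque
from typing import Any, Callable, Deque, List


class Batcher:
    """Flushes on max_size messages or max_wait_s after the first message."""

    def __init__(self, flush_fn: Callable[[List[Any]], None],
                 max_size: int = 500, max_wait_s: float = 0.1) -> None:
        self._flush_fn = flush_fn
        self._max_size = max_size
        self._max_wait_s = max_wait_s
        self._cond = threading.Condition()
        self._batch: List[Any] = []           # batch currently being filled
        self._full: Deque[List[Any]] = deque()  # size-triggered batches
        self._deadline = 0.0                  # valid only while _batch is non-empty
        self._closed = False
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, msg: Any) -> None:
        with self._cond:
            if self._closed:
                raise RuntimeError("batcher is closed")
            if not self._batch:
                # The timer starts at the first message of a batch.
                self._deadline = time.monotonic() + self._max_wait_s
            self._batch.append(msg)
            if len(self._batch) >= self._max_size:
                self._full.append(self._batch)
                self._batch = []
            self._cond.notify()

    def _run(self) -> None:
        while True:
            with self._cond:
                while True:
                    if self._full:
                        batch = self._full.popleft()
                        break
                    now = time.monotonic()
                    if self._batch and (self._closed or now >= self._deadline):
                        batch, self._batch = self._batch, []
                        break
                    if self._closed:
                        return  # nothing left to flush
                    self._cond.wait(self._deadline - now if self._batch else None)
            # Flush outside the lock so producers are not blocked by the backend.
            self._flush_fn(batch)

    def close(self) -> None:
        """Graceful shutdown: flush whatever is pending, then stop the worker."""
        with self._cond:
            self._closed = True
            self._cond.notify_all()
        self._worker.join()


flushed: List[List[int]] = []
batcher = Batcher(flushed.append, max_size=3, max_wait_s=0.05)
for i in range(7):
    batcher.submit(i)
batcher.close()  # the partial batch is flushed before close() returns
```

In this sketch `submit` never blocks on the backend, so a slow `flush_fn` lets full batches queue up without bound. A production version would likely add backpressure (for example a bounded queue) and retry or dead-letter handling around the flush call.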
Medium | Technical
59 practiced
Estimate capacity for a service with an average load of 2000 RPS and a peak of 8000 RPS. Each request consumes roughly 15 ms of CPU time and 1 MB of working memory while active. Using a 4-vCPU, 8 GB VM costing $0.08/hr, calculate the number of VMs needed for average and peak load with 30% headroom, and the monthly cost. Show calculations and assumptions.
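The arithmetic for this kind of estimate can be sketched as below. Several inputs are assumptions not stated in the question: "30% headroom" is read as provisioning 1.3 times the computed CPU demand (reading it as a 70% utilization target would give a larger fleet), a month is taken as 730 hours, and request latency is taken to equal the 15 ms of CPU time when estimating in-flight memory. The function names are illustrative. Integer arithmetic is used so the rounding is exact.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def vms_needed(rps: int, cpu_ms_per_request: int = 15,
               vcpus_per_vm: int = 4, headroom_pct: int = 30) -> int:
    # CPU-milliseconds demanded per second, scaled by (100 + headroom)%.
    demand = rps * cpu_ms_per_request * (100 + headroom_pct)
    # One vCPU supplies 1000 CPU-ms per second; the factor 100 matches
    # the percent scaling applied to demand above.
    capacity_per_vm = vcpus_per_vm * 1000 * 100
    return ceil_div(demand, capacity_per_vm)


def in_flight_memory_mb(rps: int, latency_ms: int = 15,
                        mb_per_request: int = 1) -> int:
    # Little's law: concurrent requests = arrival rate * time in system.
    # ASSUMPTION: latency equals CPU time (no I/O wait).
    return ceil_div(rps * latency_ms, 1000) * mb_per_request


def monthly_cost_cents(vms: int, cents_per_hour: int = 8,
                       hours_per_month: int = 730) -> int:
    return vms * cents_per_hour * hours_per_month


for label, rps in (("average", 2000), ("peak", 8000)):
    vms = vms_needed(rps)
    dollars = monthly_cost_cents(vms) / 100
    print(f"{label}: {vms} VMs, ${dollars:.2f}/month, "
          f"~{in_flight_memory_mb(rps)} MB in flight")
```

Under these assumptions the service is CPU-bound: 2000 RPS needs 30 vCPUs (39 with headroom, so 10 VMs at about $584 per month) and 8000 RPS needs 120 vCPUs (156 with headroom, so 39 VMs at about $2,277.60 per month), while in-flight memory stays in the tens to low hundreds of megabytes against 8 GB per VM. If real latency includes I/O wait, the memory figure grows proportionally and should be rechecked.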
Hard | System Design
53 practiced
Architect a system that supports ad-hoc analytics queries with sub-500ms latency for interactive dashboards while also running full nightly aggregations. Propose a hybrid architecture that balances cost and freshness and explain the trade-offs.
