InterviewStack.io

Performance Engineering and Cost Optimization Questions

Engineering practices and trade-offs for meeting performance objectives while controlling operational cost. Topics include setting latency and throughput targets and latency budgets; benchmarking, profiling, and tuning across the application, database, and infrastructure layers; memory, compute, serialization, and batching optimizations; asynchronous processing and workload shaping; capacity estimation and right-sizing for compute and storage to reduce cost; understanding cost drivers in cloud environments, including network egress and storage tiering; trade-offs between real-time and batch processing; and monitoring to detect and prevent performance regressions. Candidates should describe measurement-driven approaches to optimization and be able to justify trade-offs among cost, complexity, and user experience.

Hard | Technical
Create a cost-performance model comparing serverless functions and a managed container cluster for a bursty event-driven workload with 1M events/day, median processing time 200ms, and 99th percentile 2s. List model inputs, assumptions, and the decision thresholds you would use to recommend one platform over the other.
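One way to start on the model is a small calculator that makes the inputs and the decision threshold explicit. The sketch below is illustrative only: every price is a placeholder rather than a real vendor rate, the 0.35 s mean processing time and the 10x peak-to-average ratio are assumptions (the question gives only the median and the 99th percentile), and the container cluster is assumed to be statically provisioned for peak concurrency. All class and function names are hypothetical.

```python
import math
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class Workload:
    events_per_day: int = 1_000_000
    mean_seconds: float = 0.35      # assumed; source gives median 200 ms, p99 2 s
    peak_to_average: float = 10.0   # assumed burstiness of the event stream


@dataclass
class ServerlessPricing:            # placeholder rates, not real vendor prices
    per_million_requests: float = 0.20
    per_gb_second: float = 0.0000167
    memory_gb: float = 0.5


@dataclass
class ContainerPricing:             # placeholder rates, not real vendor prices
    per_node_hour: float = 0.10
    concurrent_per_node: int = 20
    target_utilization: float = 0.6
    min_nodes: int = 2


def serverless_daily_cost(w: Workload, p: ServerlessPricing) -> float:
    # Pay per request plus per GB-second of billed execution time.
    request_cost = w.events_per_day / 1e6 * p.per_million_requests
    compute_cost = w.events_per_day * w.mean_seconds * p.memory_gb * p.per_gb_second
    return request_cost + compute_cost


def container_daily_cost(w: Workload, p: ContainerPricing) -> float:
    # Little's law: concurrency = arrival rate * mean service time, sized at peak.
    avg_rate = w.events_per_day / SECONDS_PER_DAY
    peak_concurrency = avg_rate * w.peak_to_average * w.mean_seconds
    usable_slots = p.concurrent_per_node * p.target_utilization
    nodes = max(p.min_nodes, math.ceil(peak_concurrency / usable_slots))
    return nodes * p.per_node_hour * 24


def recommend(w: Workload, s: ServerlessPricing, c: ContainerPricing,
              margin: float = 0.2) -> str:
    # Decision threshold: choose containers only when they are cheaper by at
    # least `margin`, since a cluster adds operational overhead.
    if container_daily_cost(w, c) < serverless_daily_cost(w, s) * (1 - margin):
        return "containers"
    return "serverless"
```

A fuller answer would add cold-start latency against the p99 target, autoscaling lag, and engineering time as further inputs. The sketch covers only the direct compute cost.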
Hard | System Design
Design a workload-shaping system to gracefully handle a 100x traffic surge from an API while preserving high-value traffic. Include admission control, priority queues, token-bucket limits, backpressure mechanisms, and how you would surface degraded behavior to customers.
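The token-bucket and admission-control pieces can be sketched together: a single bucket limits the overall rate, and low-priority requests are rejected once the bucket drops to a reserved floor, so the remaining tokens stay available for high-value traffic. This is a minimal single-process sketch, not a distributed limiter. The names `TokenBucket`, `admit`, and `low_priority_reserve` are hypothetical, and the injectable `clock` exists only to make the behavior testable.

```python
import time


class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity      # start full
        self.clock = clock
        self.last = clock()

    def _refill(self):
        # Add tokens for the elapsed time, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self, reserve=0.0):
        """Take one token only if at least `reserve` tokens would remain."""
        self._refill()
        if self.tokens - 1 >= reserve:
            self.tokens -= 1
            return True
        return False


def admit(bucket, priority, low_priority_reserve):
    # High-priority traffic may drain the bucket completely; low-priority
    # traffic must leave `low_priority_reserve` tokens untouched.
    reserve = 0.0 if priority == "high" else low_priority_reserve
    return bucket.try_acquire(reserve)
```

A rejected request would typically receive an explicit signal such as HTTP 429 with a Retry-After header, which is one way to surface degraded behavior to customers and apply backpressure to well-behaved clients.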
Medium | Technical
For a read-heavy relational database workload, outline optimization strategies including indexing, read replicas, materialized views, partitioning, and denormalization. Provide criteria for choosing among these options.
Medium | Technical
Design a distributed cache invalidation strategy to minimize stale reads for a multi-region service. Consider TTLs, explicit invalidation, versioned keys, pub/sub-based invalidation, and the risk of race conditions or partial invalidation.
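Of the options listed, versioned keys combined with TTLs are straightforward to illustrate: a write bumps a per-item version, readers build the cache key from the current version, and entries under old versions are never read again and simply age out. The sketch below uses in-memory dictionaries as stand-ins for a shared version store and a regional cache. `VersionedCache`, `loader`, and the key format are hypothetical names chosen for illustration.

```python
import time


class VersionedCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._versions = {}   # stand-in for an authoritative version store
        self._entries = {}    # cache key -> (value, expires_at)
        self._ttl = ttl_seconds
        self._clock = clock

    def _key(self, name):
        # The version is part of the key, so a bump orphans older entries.
        return f"{name}:v{self._versions.get(name, 0)}"

    def get(self, name, loader):
        key = self._key(name)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]
        # Miss or expired: load from the source of truth and cache with a TTL.
        value = loader(name)
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, name):
        # Call only after the write to the source of truth has committed.
        self._versions[name] = self._versions.get(name, 0) + 1
```

Ordering matters for the race conditions the question mentions. If the version is bumped before the write commits, a reader can cache the old value under the new key and serve it until the TTL expires. In a multi-region deployment the version bump itself has to propagate, for example over pub/sub, and the TTL bounds staleness when that propagation is delayed or partially lost.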
Hard | Technical
Given a histogram of request latencies and CPU utilization for the last 90 days, describe an algorithm or step-by-step method to select instance types and autoscaling thresholds that minimize cost while meeting the p95 latency SLO. Provide pseudocode or clear decision rules.
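One possible set of decision rules, sketched under strong assumptions: the 90 days of data have already been grouped into CPU-utilization buckets with the latencies observed in each, and each candidate instance type has a measured throughput at which it still meets the SLO. The function names, the 10-point safety margin, and the 1.2x headroom factor are all hypothetical choices for illustration.

```python
import math


def p95(samples):
    # Nearest-rank 95th percentile, using integer arithmetic for the rank.
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def scale_out_threshold(latency_by_cpu_bucket, slo_ms, safety_margin=10):
    """latency_by_cpu_bucket maps a CPU % upper bound to observed latencies (ms).

    Returns the highest CPU level whose p95 still met the SLO, minus a margin
    so that scaling starts before the SLO is at risk.
    """
    safe = 0
    for cpu in sorted(latency_by_cpu_bucket):
        if p95(latency_by_cpu_bucket[cpu]) > slo_ms:
            break
        safe = cpu
    return max(safe - safety_margin, 0)


def cheapest_instance(candidates, peak_rps, headroom=1.2):
    """candidates: iterable of (name, hourly_price, rps_at_slo) tuples.

    Returns (name, instance_count, hourly_cost) for the lowest-cost option
    that serves peak load with headroom while staying within the SLO.
    """
    best = None
    for name, price, rps_at_slo in candidates:
        count = math.ceil(peak_rps * headroom / rps_at_slo)
        cost = count * price
        if best is None or cost < best[2]:
            best = (name, count, cost)
    return best
```

In practice the per-instance throughput figures would come from load tests on each candidate type, and the chosen threshold would be validated against the historical traffic before rollout.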
