InterviewStack.io

Performance Optimization and Latency Engineering Questions

Covers systematic approaches to measuring and improving system performance and latency at the architecture and code levels. Topics include profiling and tracing to find where time is actually spent, forming and testing hypotheses, optimizing critical paths, and validating improvements with measurable metrics. Candidates should be able to distinguish CPU-bound work from I/O-bound work, analyze latency versus throughput trade-offs, evaluate where caching and content delivery networks (CDNs) help or hurt, recognize database and network constraints, and propose strategies such as query optimization, asynchronous processing patterns, resource pooling, and load balancing. Also includes performance testing methodologies, reasoning about trade-offs and risks, and describing end-to-end optimization projects and their business impact.

Medium | Technical
Given a simplified OLTP schema below, propose concrete indexes to optimize the supplied queries and explain the trade-offs (write amplification, disk usage, index-only scans):
Tables:
users(id PK, email text, created_at timestamptz)
orders(id PK, user_id FK, total numeric, status text, created_at timestamptz)
line_items(id PK, order_id FK, product_id int, qty int)
Queries:
1) SELECT * FROM orders WHERE user_id = ? ORDER BY created_at DESC LIMIT 20
2) SELECT COUNT(*) FROM orders WHERE status = 'failed' AND created_at > now() - interval '30 days'
3) SELECT SUM(total) FROM orders WHERE created_at BETWEEN ? AND ?
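One possible starting point for this question, sketched under stated assumptions: the script below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs anywhere. The index names are hypothetical, the `now() - interval '30 days'` expression is replaced by a bound parameter because SQLite lacks that syntax, and SQLite's planner is not PostgreSQL's, so the printed plans only illustrate the idea. Each index adds work to every write on orders and consumes disk, which is the trade-off the question asks about.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; index names are hypothetical.
DDL = """
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    total NUMERIC,
    status TEXT,
    created_at TEXT
);
-- Query 1: equality on user_id, then rows already sorted by created_at DESC.
CREATE INDEX idx_orders_user_created ON orders (user_id, created_at DESC);
-- Query 2: partial index that only stores failed orders, keeping it small.
CREATE INDEX idx_orders_failed_created ON orders (created_at)
    WHERE status = 'failed';
-- Query 3: range on created_at with total stored in the index, so the
-- aggregate can be answered from the index alone.
CREATE INDEX idx_orders_created_total ON orders (created_at, total);
"""

QUERIES = {
    "q1": ("SELECT * FROM orders WHERE user_id = ? "
           "ORDER BY created_at DESC LIMIT 20", (1,)),
    "q2": ("SELECT COUNT(*) FROM orders "
           "WHERE status = 'failed' AND created_at > ?", ("2024-01-01",)),
    "q3": ("SELECT SUM(total) FROM orders "
           "WHERE created_at BETWEEN ? AND ?", ("2024-01-01", "2024-02-01")),
}


def explain_plans():
    """Return the planner's description of each query as a single string."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    plans = {}
    for name, (sql, params) in QUERIES.items():
        rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
        # Column 3 of each EXPLAIN QUERY PLAN row holds the readable detail.
        plans[name] = " | ".join(row[3] for row in rows)
    conn.close()
    return plans


if __name__ == "__main__":
    for name, plan in explain_plans().items():
        print(name, "->", plan)
```

In PostgreSQL the third index would more commonly be written with an INCLUDE (total) clause, and index-only scans there also depend on the visibility map being reasonably up to date, so the plans should be checked with EXPLAIN on realistic data.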
Hard | Technical
Your service shows increased latency only at high concurrency: CPU usage remains low but many requests are blocked on I/O. Provide a systematic investigation and mitigation plan involving OS-level (kernel tunables, syscalls), network-level (socket buffers, backlog), and application-level (async IO, worker pools) actions. Include tools you would use and staged mitigations to lower user impact.
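As an illustration of one application-level mitigation named in the question (a bounded worker pool over async I/O), here is a minimal sketch using asyncio. The function names, the request count, and the sleep that stands in for a socket or disk wait are all hypothetical. It covers only one piece of a full answer and does not address the OS-level or network-level parts.

```python
import asyncio


async def run_bounded(n_requests, limit, io_delay=0.001):
    """Run n_requests simulated I/O calls with at most `limit` in flight."""
    sem = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def fake_io_call(i):
        nonlocal in_flight, peak
        async with sem:  # callers beyond the limit wait here, not in the kernel
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(io_delay)  # stand-in for a socket or disk wait
            in_flight -= 1
            return i

    # gather returns results in the order the coroutines were passed in.
    results = await asyncio.gather(*(fake_io_call(i) for i in range(n_requests)))
    return results, peak


if __name__ == "__main__":
    results, peak = asyncio.run(run_bounded(20, 4))
    print("completed:", len(results), "peak concurrency:", peak)
```

Capping in-flight work this way trades some queueing delay inside the application for less contention on the downstream resource; the right limit is something to measure under load rather than guess.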
Medium | Technical
A production Java service is showing p99 latency spikes during periods of load, and GC logs indicate long pauses. Describe a methodical approach to diagnose and mitigate GC-induced latency: which GC flags/logs to collect, how to interpret pause times, GC algorithm choices (G1, ZGC, Shenandoah), and code-level mitigations (allocation hotspots, object pooling, escape analysis).
Easy | Behavioral
Tell me about a time you led or contributed to improving a service's latency or throughput. Use the STAR method: describe the Situation, the Task and SLOs, the Actions you took (profiling, code changes, infra changes), the Results with measurable metrics, and what you learned or changed in process afterward.
Easy | Technical
What is a flame graph and how do you use it to find CPU hotspots that affect latency? Explain how stack-sampling produces a flame graph, what wide vs tall blocks indicate, and how you'd combine flame graphs with heap or blocking profiles to form a hypothesis about a latency regression.
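To make the stack-sampling part of this question concrete, the sketch below shows how sampled stacks can be collapsed into counts and how a frame's width follows from them. The sample data and function names are hypothetical. The semicolon-joined "folded" form is the text format commonly fed to flame graph tooling, but this is a simplified model, not any particular profiler's implementation.

```python
from collections import Counter


def fold_stacks(samples):
    """Collapse root-first stack samples into folded-stack counts."""
    return Counter(";".join(stack) for stack in samples)


def inclusive_widths(folded):
    """Width of each frame path: samples whose stack passes through it."""
    widths = Counter()
    for stack, count in folded.items():
        frames = stack.split(";")
        for depth in range(1, len(frames) + 1):
            widths[";".join(frames[:depth])] += count
    return widths


# Hypothetical samples: each tuple is one stack captured by the sampler.
SAMPLES = [
    ("main", "handle_request", "parse_json"),
    ("main", "handle_request", "parse_json"),
    ("main", "handle_request", "parse_json"),
    ("main", "handle_request", "render"),
    ("main", "gc"),
]

if __name__ == "__main__":
    folded = fold_stacks(SAMPLES)
    for path, width in sorted(inclusive_widths(folded).items()):
        print(f"{width:>3}  {path}")
```

A wide block means many samples passed through that frame, so it accounts for a large share of sampled CPU time; a tall tower only means a deep call chain. In this toy data, parse_json appears in 3 of 5 samples, which would make it the first hotspot to examine.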
