InterviewStack.io

Performance and Code Optimization Questions

Covers techniques and decision-making for improving application and code performance at every level, from algorithms and memory access patterns to frontend bundling and runtime behavior. Candidates should be able to profile and identify bottlenecks, apply low-level optimizations such as loop unrolling, function inlining, cache-friendly access patterns, reduced branching, and smart memory layouts, and use compiler optimizations effectively. It also includes higher-level application and frontend optimizations such as code splitting and lazy loading, tree shaking and dead code elimination, minification and compression, dynamic imports, service-worker-based caching, prefetching strategies, server-side rendering versus client-side rendering trade-offs, static site generation considerations, and bundler optimization with tools like webpack, Vite, and Rollup. Emphasize measuring first and avoiding premature optimization, and explain the trade-offs between performance gains and added complexity or maintenance burden. At senior levels, expect the ability to make intentional trade-off decisions and to justify which optimizations are worth their complexity for a given system and workload.

Hard · Technical
A Kubernetes cluster shows pods experiencing CPU throttling which increases p99 latency during sustained bursts. Diagnose likely causes and propose remediation: tuning requests/limits, node sizing, QoS classes, vertical/horizontal autoscaling, cgroup settings, CPU pinning, and admission controls. Which metrics reveal throttling and how would you alert on them?
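On the metrics part of this question, one common signal is the share of CFS enforcement periods in which a container was throttled. The kernel exposes the `nr_periods` and `nr_throttled` counters in the cgroup `cpu.stat` file, and cAdvisor exports them as `container_cpu_cfs_periods_total` and `container_cpu_cfs_throttled_periods_total`. The following is a minimal Python sketch of that ratio computed from two samples. The sample values and the 20% alert threshold are assumptions made for illustration, not recommendations.

```python
def parse_cpu_stat(text: str) -> dict[str, int]:
    """Parse the 'key value' lines of a cgroup cpu.stat file into a dict."""
    stats: dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttle_ratio(before: dict[str, int], after: dict[str, int]) -> float:
    """Fraction of CFS periods in which the cgroup was throttled between two samples."""
    periods = after["nr_periods"] - before["nr_periods"]
    throttled = after["nr_throttled"] - before["nr_throttled"]
    return throttled / periods if periods > 0 else 0.0


# Two hypothetical cpu.stat samples taken some seconds apart (values are made up).
sample_t0 = "nr_periods 1000\nnr_throttled 50\nthrottled_usec 400000\n"
sample_t1 = "nr_periods 1600\nnr_throttled 200\nthrottled_usec 2300000\n"

ratio = throttle_ratio(parse_cpu_stat(sample_t0), parse_cpu_stat(sample_t1))
ALERT_THRESHOLD = 0.20  # assumed threshold; tune per workload
should_alert = ratio > ALERT_THRESHOLD
print(f"throttled in {ratio:.0%} of periods; alert={should_alert}")
```

Working from counter deltas rather than absolute values matters because the counters are cumulative since the cgroup was created; an alert on the raw totals would never clear.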
Medium · Technical
CI build times for a backend repo are 40 minutes, slowing deployments and developer feedback loops. Propose concrete build and bundler optimizations (incremental builds, artifact caching, parallel compilation, dependency pruning, and optimizing webpack/Vite/Rollup configs). Describe how you'd measure ROI and mitigate risks such as cache invalidation or inconsistent builds.
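For the artifact caching and cache invalidation part of this question, one widely used approach is to derive the cache key from the content of the files that determine the cached output, such as dependency lockfiles. Below is a minimal Python sketch of that idea. The function name `cache_key`, the `deps` prefix, and the file names are hypothetical and do not correspond to any particular CI system's API.

```python
import hashlib


def cache_key(prefix: str, file_contents: dict[str, bytes]) -> str:
    """Build a deterministic cache key from the paths and contents of input files."""
    digest = hashlib.sha256()
    # Sort paths so the key does not depend on dict or filesystem ordering.
    for path in sorted(file_contents):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        # Hash each file separately so content boundaries are unambiguous.
        digest.update(hashlib.sha256(file_contents[path]).digest())
    return f"{prefix}-{digest.hexdigest()[:16]}"


# Hypothetical lockfile contents, inlined here so the example is self-contained.
inputs = {
    "package-lock.json": b'{"lockfileVersion": 3}',
    "requirements.txt": b"requests==2.31.0\n",
}
key = cache_key("deps", inputs)
print(key)
```

Because the key changes whenever any input changes, a stale cache entry is simply never looked up again, which sidesteps most manual invalidation. The remaining risk is an input that affects the build but is missing from the hashed set.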
Medium · Technical
A Python web handler spends 80% CPU serializing and filtering a list of 10k JSON objects per request. Propose a prioritized optimization plan: measure/profile, algorithmic changes (filter earlier, use streaming/generators), vectorized libraries or C-extensions, caching precomputed results, or moving heavy work to background jobs. Explain how you'd validate each step in production-like conditions.
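The "filter earlier, use streaming/generators" step named in this question can be sketched as follows. This is a minimal illustration using only the standard library. The record shape (`id`, `status`) and the helper names `filter_records` and `stream_json_array` are assumptions for the example, and whether it helps in a given handler should be confirmed by profiling.

```python
import json
from typing import Iterable, Iterator


def filter_records(records: Iterable[dict], wanted_status: str) -> Iterator[dict]:
    """Lazily yield only matching records, so unwanted ones are never serialized."""
    return (r for r in records if r.get("status") == wanted_status)


def stream_json_array(records: Iterable[dict]) -> Iterator[str]:
    """Yield a JSON array piece by piece instead of building one large string."""
    yield "["
    first = True
    for record in records:
        if not first:
            yield ","
        # Compact separators avoid emitting whitespace between tokens.
        yield json.dumps(record, separators=(",", ":"))
        first = False
    yield "]"


# Hypothetical input: 10k records, half of them "active".
records = [
    {"id": i, "status": "active" if i % 2 == 0 else "inactive"}
    for i in range(10_000)
]

# A real handler would pass the generator to a streaming response;
# the chunks are joined here only so the result can be inspected.
body = "".join(stream_json_array(filter_records(records, "active")))
```

Streaming mainly reduces peak memory and time to first byte; the CPU saving comes from filtering before serialization, so the two effects are worth measuring separately.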
Easy · Technical
You're on-call for a Linux-based microservice that reports increased latency. Describe the first three profiling or instrumentation tools/techniques you would use to identify the bottleneck (e.g., flamegraphs, perf, pprof, eBPF, strace). For each tool explain what data it reveals, what kinds of problems it's best at finding, and a simple command or workflow to get an initial result.
Medium · Technical
Design a benchmarking suite for an HTTP service to measure throughput and latency under realistic traffic. Describe how you'd generate realistic test data (payloads, distribution of endpoints), tools you'd use (wrk, k6, locust), warmup period, concurrency patterns, and how to avoid common benchmarking pitfalls such as client-side bottlenecks, caching effects, or incorrect timeouts.
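The warmup and latency reporting parts of this question can be illustrated with a small post-processing step: discard samples recorded during the warmup window, then report percentiles instead of a mean. The sketch below is in Python and uses the nearest-rank percentile definition. The sample format `(seconds_since_start, latency_ms)` and the synthetic data are assumptions for the example, not the output format of wrk, k6, or locust.

```python
import math


def percentile(sorted_values: list[float], p: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("no samples to summarize")
    rank = math.ceil(p * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def summarize(samples: list[tuple[float, float]], warmup_s: float) -> dict[str, float]:
    """Drop samples taken before warmup_s, then report count, p50, and p99."""
    steady = sorted(latency for t, latency in samples if t >= warmup_s)
    return {
        "count": len(steady),
        "p50": percentile(steady, 50),
        "p99": percentile(steady, 99),
    }


# Synthetic data: 5 slow warmup requests, then 100 requests with latencies 1..100 ms.
warmup = [(float(t), 500.0) for t in range(5)]
steady_state = [(5.0 + i, float(i + 1)) for i in range(100)]

summary = summarize(warmup + steady_state, warmup_s=5.0)
print(summary)
```

Including the warmup samples here would place five 500 ms outliers at the top of the distribution and distort the p99, which is one reason a warmup period is reported separately or discarded.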
