InterviewStack.io

Performance Fundamentals and Troubleshooting Questions

Core skills for identifying, diagnosing, and resolving general performance problems across applications and systems. Topics include establishing baselines and metrics, using monitoring and profiling tools to determine whether an issue is CPU-bound, memory-bound, I/O-bound, or network-bound, and applying systematic troubleshooting workflows. Candidates should be able to prioritize fixes, recommend temporary mitigations and long-term solutions, and explain when to escalate to specialists. This canonical topic covers general performance awareness, common diagnostic tools, and basic remediation approaches for slow systems and resource exhaustion.

Easy | Technical | 56 practiced
A Linux server shows sustained high CPU utilization. Provide a step-by-step troubleshooting checklist to determine whether the load is user-space CPU, kernel CPU, or I/O wait. Include specific commands (top/htop, mpstat, iostat, vmstat, pidstat, perf) and what specific output patterns would indicate each cause.
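One way to reason about this question is to look at the counters that tools such as top and mpstat summarize. The sketch below is a minimal illustration, not a reference answer: it takes two samples of the aggregate `cpu` line from `/proc/stat` and reports which group of states dominated the interval. The sample strings, the function names, and the three-way grouping are assumptions made for this example.

```python
# Field order of the aggregate "cpu" line in /proc/stat (first eight counters).
CPU_FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]

def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into named tick counters."""
    values = line.split()[1:1 + len(CPU_FIELDS)]
    return dict(zip(CPU_FIELDS, (int(v) for v in values)))

def cpu_breakdown(before, after):
    """Return the percentage of elapsed CPU time spent in each state."""
    deltas = {name: after[name] - before[name] for name in CPU_FIELDS}
    total = sum(deltas.values())
    if total == 0:
        return {name: 0.0 for name in CPU_FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}

def dominant_cause(pct):
    """Pick the largest of three groups (a grouping chosen for this sketch)."""
    groups = {
        "user-space": pct["user"] + pct["nice"],
        "kernel": pct["system"] + pct["irq"] + pct["softirq"],
        "io-wait": pct["iowait"],
    }
    return max(groups, key=groups.get)

# Hypothetical samples; on a live host, read the first line of /proc/stat
# twice, a few seconds apart, instead of using these strings.
sample_before = "cpu  100 0 50 800 50 0 0 0 0 0"
sample_after = "cpu  900 0 100 850 150 0 0 0 0 0"

pct = cpu_breakdown(parse_cpu_line(sample_before), parse_cpu_line(sample_after))
print(dominant_cause(pct), pct)
```

The same split appears as the us, sy, and wa columns in top. Per-process tools such as pidstat and perf are still needed to find which process or code path is responsible.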
Medium | Technical | 74 practiced
Explain how swap usage affects performance on Linux. Describe how you would distinguish swap-related performance problems from other memory issues (commands and metrics), and list three remediation strategies with pros and cons for each.
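A useful distinction when answering is between swap that is merely occupied and swap that is actively being paged in and out. As a rough sketch, the code below computes swap-in and swap-out rates from two snapshots of `/proc/vmstat`, using the `pswpin` and `pswpout` page counters. The snapshot strings and function names are assumptions for illustration.

```python
def parse_vmstat(text):
    """Parse 'name value' lines (the /proc/vmstat layout) into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters

def swap_rates(before, after, interval_s):
    """Return pages swapped in and out per second between two snapshots."""
    return {
        key: (after[key] - before[key]) / interval_s
        for key in ("pswpin", "pswpout")
    }

# Hypothetical snapshots taken 5 seconds apart; on a live host, read
# /proc/vmstat twice instead of using these strings.
snapshot_before = "pgfault 5000\npswpin 1000\npswpout 2000"
snapshot_after = "pgfault 9000\npswpin 1500\npswpout 2000"

rates = swap_rates(parse_vmstat(snapshot_before), parse_vmstat(snapshot_after), 5)
print(rates)
```

Sustained nonzero rates suggest the working set no longer fits in RAM, whereas a large but static amount of used swap with rates near zero is often harmless. The si and so columns of vmstat report the same kind of activity.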
Hard | Technical | 59 practiced
Explain how eBPF (extended Berkeley Packet Filter) can be used for low-overhead performance troubleshooting in production. Provide three concrete use cases (for example: tracing syscalls for latency, tracking TCP retransmissions, measuring scheduler latency), describe the probes you would attach, and explain how you'd ensure safety and minimal performance impact.
Hard | System Design | 58 practiced
Propose a plan to evaluate and reduce tail latency in a distributed storage system used for read-modify-write cycles. Discuss architectural changes (quorum adjustments, leader placement), request scheduling, batching strategies, speculative retries, and backpressure mechanisms. Explain how you would measure impact on p99 and overall throughput.
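Measuring impact on p99 requires an agreed definition of the percentile. The sketch below uses the nearest-rank method, which is one common convention and only an assumption here; monitoring systems often use interpolation or histogram buckets instead. The latency values are invented for illustration.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of the data at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical request latencies in milliseconds, before and after a change.
baseline = [10] * 98 + [500, 900]
candidate = [12] * 98 + [40, 60]

for name, data in (("baseline", baseline), ("candidate", candidate)):
    print(name, "p50:", percentile(data, 50), "p99:", percentile(data, 99))
```

In this made-up data the median gets slightly worse while the tail improves sharply, which is the kind of trade-off the question asks candidates to quantify alongside throughput.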
Medium | Technical | 109 practiced
Explain how connection pooling helps prevent resource exhaustion and how it affects latency behavior under load. Describe the metrics you would monitor to detect connection pool saturation and outline a process to tune pool size for a given application and database (including safety margins).
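A starting point for pool sizing is Little's law: the average number of connections in use is roughly the request rate multiplied by the mean time each request holds a connection. The sketch below applies that estimate, adds headroom, and caps the result at each application instance's share of the database connection limit. The function name, parameter names, and default headroom are assumptions for illustration; real tuning should be validated with load tests.

```python
import math

def suggest_pool_size(requests_per_s, mean_hold_s, headroom=0.5,
                      db_max_connections=None, app_instances=1):
    """Estimate a per-instance pool size from Little's law plus a safety margin."""
    in_use = requests_per_s * mean_hold_s           # average busy connections
    size = max(1, math.ceil(in_use * (1 + headroom)))
    if db_max_connections is not None:
        # Never let all instances together exceed the database limit.
        per_instance_cap = db_max_connections // app_instances
        size = min(size, per_instance_cap)
    return size

# Hypothetical workload: 200 requests/s, each holding a connection for 250 ms.
print(suggest_pool_size(200, 0.25))
print(suggest_pool_size(200, 0.25, db_max_connections=100, app_instances=4))
```

When the cap is the binding constraint, as in the second call, requests will queue for connections under load, so wait time for a connection and the count of pending acquisitions are the metrics to watch for saturation.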
