InterviewStack.io

Performance Fundamentals and Troubleshooting Questions

Core skills for identifying, diagnosing, and resolving general performance problems across applications and systems. Topics include establishing baselines and metrics, using monitoring and profiling tools to determine whether an issue is CPU-bound, memory-bound, I/O-bound, or network-bound, and applying systematic troubleshooting workflows. Candidates should be able to prioritize fixes, recommend temporary mitigations and long-term solutions, and explain when to escalate to specialists. This canonical topic covers general performance awareness, common diagnostic tools, and basic remediation approaches for slow systems and resource exhaustion.

Easy | Technical
68 practiced
Explain why p50 alone is insufficient for performance monitoring of user-facing services. Provide an example showing how p50 and p99 could tell different stories and one practical consequence of relying only on p50.
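As a rough illustration of how the two percentiles can diverge, the sketch below uses a made-up sample of 100 requests (97 fast, 3 slow) and a nearest-rank percentile. The sample values and the `nearest_rank` helper are assumptions chosen for the example, not data from a real service.

```python
import math
from typing import List


def nearest_rank(sorted_values: List[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


# Hypothetical sample: 97 requests at 40 ms and 3 requests at 2000 ms.
latencies_ms = sorted([40.0] * 97 + [2000.0] * 3)

p50 = nearest_rank(latencies_ms, 50)
p99 = nearest_rank(latencies_ms, 99)
print(f"p50={p50} ms, p99={p99} ms")  # prints "p50=40.0 ms, p99=2000.0 ms"
```

In this sample a p50-only dashboard reports a healthy 40 ms while 3 in every 100 requests take 2 seconds. One practical consequence is that an alert keyed to p50 would likely stay silent while a meaningful share of users see timeouts or very slow pages.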
Medium | Technical
75 practiced
You are creating a performance run for a database-backed API. Propose a set of load-test scenarios (3-5) that will surface common backend bottlenecks, including the traffic profile, dataset size, and success criteria for each scenario.
Easy | Technical
69 practiced
Write a Python function that parses a log file of HTTP request durations and returns the p50, p95, and p99 latency in milliseconds. Input: a newline-delimited file where each line is: "timestamp request_id duration_ms". Assume file fits in memory. Provide function signature: def percentiles(file_path: str) -> Dict[str, float].
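One possible answer is sketched below. It assumes the duration is the last whitespace-separated field on each line, skips blank or malformed lines, and uses the nearest-rank definition of a percentile. The question does not specify a percentile method or how to treat bad lines, so those choices are assumptions.

```python
import math
from typing import Dict, List


def percentiles(file_path: str) -> Dict[str, float]:
    """Return p50, p95, and p99 latency in ms from a log file whose lines
    look like "timestamp request_id duration_ms"."""
    durations: List[float] = []
    with open(file_path, "r", encoding="utf-8") as fh:
        for line in fh:
            parts = line.split()
            if len(parts) < 3:
                continue  # assumption: silently skip blank or short lines
            try:
                durations.append(float(parts[-1]))
            except ValueError:
                continue  # assumption: skip lines with a non-numeric duration

    if not durations:
        raise ValueError("no valid duration samples found")

    durations.sort()
    n = len(durations)

    def nearest_rank(pct: int) -> float:
        # Smallest sample with at least pct percent of samples at or below it.
        rank = max(1, math.ceil(pct * n / 100))
        return durations[rank - 1]

    return {"p50": nearest_rank(50), "p95": nearest_rank(95), "p99": nearest_rank(99)}
```

Sorting costs O(n log n), which is reasonable given the stated assumption that the file fits in memory. An interpolating method (for example the one used by `statistics.quantiles`) would give slightly different values on small samples, so it is worth stating which definition is in use.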
Hard | System Design
71 practiced
You must design a lightweight observability plan for a greenfield microservice expected to be latency-sensitive. Specify what metrics, traces, and logs to collect initially, what sampling rates to use, and a plan to increase observability as traffic grows while keeping costs in check.
Hard | Technical
72 practiced
A production service shows increasing CPU steal time on VMs in a cloud provider. Explain what CPU steal means, why it matters for latency-sensitive services, and three remediation options (short-term and long-term) you would evaluate.
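For context, steal time is the time a virtual CPU was ready to run but the hypervisor was running something else on the physical CPU. On Linux it appears as the eighth counter on the aggregate `cpu` line of `/proc/stat`. The sketch below estimates the steal percentage between two snapshots of that line. The snapshot strings are made-up values, and the function names are hypothetical helpers for this example.

```python
from typing import List


def parse_cpu_line(line: str) -> List[int]:
    """Parse the aggregate "cpu" line of /proc/stat into its first eight
    counters: user, nice, system, idle, iowait, irq, softirq, steal."""
    fields = line.split()
    if len(fields) < 9 or fields[0] != "cpu":
        raise ValueError("expected an aggregate 'cpu' line with a steal field")
    # guest and guest_nice are left out because the kernel already
    # counts them inside user and nice.
    return [int(value) for value in fields[1:9]]


def steal_percent(before: str, after: str) -> float:
    """Share of CPU time reported as steal between two snapshots."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total


# Hypothetical snapshots taken a few seconds apart.
snapshot_1 = "cpu 100 0 50 800 10 0 0 40 0 0"
snapshot_2 = "cpu 200 0 100 1500 20 0 0 180 0 0"
print(f"steal: {steal_percent(snapshot_1, snapshot_2):.1f}%")  # prints "steal: 14.0%"
```

On a real host the two snapshots would come from reading the first line of `/proc/stat` twice with a short sleep in between. Tools such as `top` and `vmstat` report the same quantity in their `st` column.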
