InterviewStack.io

System Resource Management and Monitoring Questions

Monitor and manage operating system and hardware-level resources to ensure application performance and stability. Topics include CPU utilization and context switching, system load trends, memory usage including heap and stack behavior, paging and swapping effects, disk I/O operations and free space, and network bandwidth utilization and packet loss. Know the diagnostic tools and commands for observing these signals, recognize patterns of resource contention and exhaustion such as out-of-memory conditions and high I/O wait, and understand mitigation techniques including tuning, resource limits, throttling, caching, capacity planning, and vertical or horizontal scaling.

Easy, Technical
Explain paging and swapping on Linux. Describe the performance impacts of heavy paging/swapping and how to detect it using commands like free, vmstat, and sar -W. Explain vm.swappiness and when tuning it may improve or worsen latency.
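
As a starting point for the detection part of this question, here is a minimal Python sketch that derives swap-in and swap-out rates from two samples of /proc/vmstat, using the same pswpin and pswpout counters that sar -W reports per second. The function names and the one-second default interval are illustrative assumptions, not part of any standard tool.

```python
import time


def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_rates(before, after, interval_s):
    """Return pages swapped in and out per second between two samples."""
    return {
        "pswpin_per_s": (after["pswpin"] - before["pswpin"]) / interval_s,
        "pswpout_per_s": (after["pswpout"] - before["pswpout"]) / interval_s,
    }


def sample_swap_rates(path="/proc/vmstat", interval_s=1.0):
    """Take two samples from a live Linux host (hypothetical helper)."""
    with open(path) as f:
        before = parse_vmstat(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_vmstat(f.read())
    return swap_rates(before, after, interval_s)
```

Sustained non-zero values for both rates usually deserve more attention than a single burst of swap-out, since pages that are written out and never read back cost little latency.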
Medium, Technical
You receive an alert that a Java process is consuming 95% CPU on a production host. Outline step-by-step triage actions: the commands and tools you'd run (top/jstack/jmap/jcmd/jstat/perf), how to interpret thread dumps versus CPU profiles, and what immediate mitigations you might take to reduce impact while preserving forensic information.
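
One common step in this kind of triage is matching the hot thread IDs shown by top -H against the native thread IDs (nid) in a jstack thread dump. The Python sketch below indexes a saved dump by native thread ID. It assumes thread header lines begin with a quoted thread name and contain nid= in either hexadecimal (0x...) or decimal form, since the format differs between JDK versions; the function name is hypothetical.

```python
import re

# Header lines look roughly like:
#   "main" #1 prio=5 os_prio=0 tid=0x... nid=0x1a2b runnable [0x...]
# Assumption: nid may be hex (0x...) or decimal depending on the JDK version.
NID_PATTERN = re.compile(r'^"(?P<name>.*)".*\bnid=(?P<nid>0x[0-9a-fA-F]+|\d+)\b')


def threads_by_native_id(jstack_text):
    """Map native thread ID (as an int) to the thread's name and stack lines."""
    threads = {}
    current = None
    for line in jstack_text.splitlines():
        match = NID_PATTERN.match(line)
        if match:
            current = {"name": match.group("name"), "stack": []}
            # int(..., 0) accepts both "0x1a2b" and "12345"
            threads[int(match.group("nid"), 0)] = current
        elif not line.strip():
            current = None  # a blank line ends the current thread's block
        elif current is not None:
            current["stack"].append(line.strip())
    return threads
```

Given a thread ID from top -H -p <pid>, looking it up in the returned dict shows which Java thread was burning CPU at the moment the dump was taken. Saving several dumps a few seconds apart before restarting the process preserves this evidence.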
Hard, Technical
You observe intermittent spikes in load average on several hosts without corresponding high CPU or disk metrics. Propose a systematic debugging plan: list kernel-level phenomena that can explain this (blocked processes in D-state, NFS hangs, stuck IRQ handlers), which /proc fields and commands to inspect, and how to capture evidence of the root cause.
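
Because the Linux load average counts tasks in uninterruptible sleep as well as runnable ones, one way to capture evidence during a spike is to snapshot which processes are in D state. The Python sketch below reads the state field from each /proc/<pid>/stat file. It is a simplified illustration: it scans only process entries, not the individual threads under /proc/<pid>/task, and the function names are hypothetical.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    The comm field is wrapped in parentheses and may itself contain spaces
    or parentheses, so the last ')' is used as the delimiter.
    """
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or the line was malformed
        if state == "D":
            tasks.append((pid, comm))
    return sorted(tasks)
```

Running this in a loop during a spike and recording /proc/<pid>/wchan (and /proc/<pid>/stack, where root access allows) for each hit gives a record of what the blocked tasks were waiting on.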
Medium, System Design
Design a set of memory metrics to monitor in a containerized environment. For host-level and container-level monitoring, list the specific metrics to collect (e.g., RSS, cache, anon, memory.limit_in_bytes, swap), sampling frequencies, required labels, recommended alert thresholds, and tradeoffs between sampling frequency and storage costs.
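
To make the container-level metrics concrete, the Python sketch below derives a working-set figure from a cgroup's usage, limit, and memory.stat contents. It follows the convention used by cAdvisor of subtracting inactive file cache from usage. The function names and the returned field names are illustrative assumptions, and any alert threshold applied to the ratio is a design choice rather than a standard.

```python
def parse_memory_stat(text):
    """Parse cgroup memory.stat 'name value' lines into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def container_memory_summary(usage_bytes, limit_bytes, stat):
    """Summarize container memory against its limit.

    usage_bytes and limit_bytes are expected to come from the cgroup's usage
    and limit files. Pass limit_bytes=None when the container has no limit.
    """
    # cgroup v1 exposes total_inactive_file; cgroup v2 exposes inactive_file.
    inactive_file = stat.get("total_inactive_file", stat.get("inactive_file", 0))
    working_set = max(usage_bytes - inactive_file, 0)
    return {
        "usage_bytes": usage_bytes,
        "working_set_bytes": working_set,
        "limit_bytes": limit_bytes,
        "working_set_ratio": working_set / limit_bytes if limit_bytes else None,
    }
```

Alerting on the working-set ratio rather than raw usage avoids paging on containers whose usage is mostly reclaimable page cache.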
Hard, Technical
Explain NUMA (Non-Uniform Memory Access) architecture and how it impacts multi-socket server performance. Describe detection methods for NUMA imbalance (numastat, /proc/zoneinfo, perf), how to bind processes and memory with numactl, and when to use hugepages or CPU pinning to improve locality.
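
For the detection part, the per-node counters that numastat summarizes are also exposed under /sys/devices/system/node/node<N>/numastat. The Python sketch below computes a per-node miss ratio from numa_hit and numa_miss. The function names are hypothetical, and what ratio counts as an imbalance is workload-dependent, so no threshold is suggested here.

```python
import glob
import os


def parse_node_numastat(text):
    """Parse a node's numastat file ('name value' lines) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def numa_miss_ratio(stat):
    """Fraction of this node's allocations that were intended for another node."""
    total = stat["numa_hit"] + stat["numa_miss"]
    return stat["numa_miss"] / total if total else 0.0


def miss_ratio_by_node(sys_root="/sys/devices/system/node"):
    """Return {node_name: miss_ratio} for every NUMA node found under sys_root."""
    ratios = {}
    for path in sorted(glob.glob(os.path.join(sys_root, "node[0-9]*", "numastat"))):
        with open(path) as f:
            stat = parse_node_numastat(f.read())
        ratios[os.path.basename(os.path.dirname(path))] = numa_miss_ratio(stat)
    return ratios
```

These counters are cumulative since boot, so comparing two snapshots taken during the workload of interest says more than a single reading.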
