Database Troubleshooting and Diagnostics Questions

This topic covers systematic approaches and techniques for diagnosing database issues and restoring healthy operation. Topics include identifying symptoms, gathering diagnostic data from error logs and system views, analyzing slow queries with explain plans and profiling, diagnosing connection and authentication failures, detecting and resolving deadlocks and blocking, handling capacity and storage issues, troubleshooting replication and consistency problems, verifying backups and restores, and investigating corruption. Candidates should be familiar with database-specific diagnostic tools, monitoring and alerting metrics, indexing and query optimization strategies, and effective communication of findings to application and infrastructure teams.

Easy, Technical

Given a MySQL slow query log entry, explain what each field means and how you'd use those logs to prioritize tuning work. Describe how to enable slow query logging safely in production, common pitfalls (like log rotation and sample bias), and how to correlate slow log entries with application traces.
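As a starting point for an answer, here is a minimal sketch that pulls the statistics line out of one slow log entry. The sample entry follows the layout MySQL writes by default, but its values are invented, and the helper name `parse_slow_entry` and the derived `examined_per_sent` ratio are illustrative assumptions rather than part of any MySQL tooling.

```python
import re

# Sample entry in the layout MySQL writes to the slow query log.
# All values here are made up for illustration.
SAMPLE_ENTRY = """\
# Time: 2024-01-15T10:23:45.123456Z
# User@Host: app[app] @ web01 [10.0.0.5]  Id:    42
# Query_time: 3.204115  Lock_time: 0.000120 Rows_sent: 10  Rows_examined: 1250000
SET timestamp=1705314225;
SELECT * FROM orders WHERE customer_email = 'a@example.com';
"""

STATS_RE = re.compile(
    r"^# Query_time:\s*(?P<query_time>[\d.]+)\s+"
    r"Lock_time:\s*(?P<lock_time>[\d.]+)\s+"
    r"Rows_sent:\s*(?P<rows_sent>\d+)\s+"
    r"Rows_examined:\s*(?P<rows_examined>\d+)",
    re.MULTILINE,
)


def parse_slow_entry(entry: str) -> dict:
    """Extract timing and row counts from a single slow log entry (hypothetical helper)."""
    m = STATS_RE.search(entry)
    if m is None:
        raise ValueError("no '# Query_time:' line found")
    rows_sent = int(m["rows_sent"])
    rows_examined = int(m["rows_examined"])
    # The statement is whatever is left after header comments and the SET timestamp line.
    sql_lines = [
        line for line in entry.splitlines()
        if line and not line.startswith("#") and not line.startswith("SET timestamp=")
    ]
    return {
        "query_time_s": float(m["query_time"]),   # wall-clock execution time
        "lock_time_s": float(m["lock_time"]),     # time spent acquiring locks
        "rows_sent": rows_sent,                   # rows returned to the client
        "rows_examined": rows_examined,           # rows read by the server
        # A high ratio suggests a missing or unused index (illustrative metric).
        "examined_per_sent": rows_examined / max(rows_sent, 1),
        "sql": " ".join(sql_lines),
    }


entry = parse_slow_entry(SAMPLE_ENTRY)
```

Ranking entries by total time per query shape, and then by the examined-to-sent ratio, is one reasonable way to prioritize; tools such as `pt-query-digest` do this aggregation more thoroughly.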

Medium, Technical

Explain how to use database wait_event and lock-wait statistics together with OS-level metrics to determine whether slow queries are CPU-bound or I/O-bound. Specify which DB and OS signals you would combine (e.g., wait_event types, iowait, context switches, disk latency) and how to form a confident diagnosis.
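One way to structure an answer is to require that the database view and the OS view agree before naming a bottleneck. The sketch below encodes that idea as a small heuristic. The `Signals` fields, the `classify` function, and every threshold are assumptions chosen for illustration; real cutoffs depend on hardware and workload.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    cpu_busy_pct: float         # user + system CPU, e.g. from top or vmstat
    iowait_pct: float           # %iowait, e.g. from iostat or vmstat
    disk_await_ms: float        # average device latency, e.g. from iostat -x
    active_sessions: int        # sessions currently running a statement
    sessions_waiting_io: int    # active sessions whose wait event type is I/O
    sessions_waiting_lock: int  # active sessions whose wait event type is a lock


def classify(s: Signals, *, cpu_hot: float = 80.0, iowait_hot: float = 20.0,
             await_hot_ms: float = 20.0, share: float = 0.5) -> str:
    """Label the likely bottleneck. All thresholds are illustrative assumptions."""
    if s.active_sessions == 0:
        return "idle"
    io_share = s.sessions_waiting_io / s.active_sessions
    lock_share = s.sessions_waiting_lock / s.active_sessions
    # Lock waits look slow from the application but are neither CPU nor I/O problems.
    if lock_share >= share:
        return "lock-bound"
    db_says_io = io_share >= share
    os_says_io = s.iowait_pct >= iowait_hot or s.disk_await_ms >= await_hot_ms
    # Only call it I/O-bound when both the database and the OS point the same way.
    if db_says_io and os_says_io:
        return "io-bound"
    if s.cpu_busy_pct >= cpu_hot and not db_says_io and not os_says_io:
        return "cpu-bound"
    return "inconclusive"


cpu_case = classify(Signals(92.0, 2.0, 1.5, 20, 1, 0))
io_case = classify(Signals(35.0, 40.0, 60.0, 20, 15, 1))
lock_case = classify(Signals(35.0, 2.0, 1.0, 20, 1, 14))
mixed_case = classify(Signals(50.0, 5.0, 5.0, 20, 12, 0))
```

Returning "inconclusive" when the two views disagree is deliberate: it prompts collecting more samples instead of forcing a diagnosis.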

Easy, Technical

You're the on-call SRE for a PostgreSQL instance that triggered a 'high CPU and slow queries' alert. Describe the immediate diagnostic steps you would take to gather evidence for triage. List specific system commands and PostgreSQL queries you'd run (examples: top/ps/iostat/vmstat, `pg_stat_activity`, `pg_stat_statements`, server logs), which files to collect, and how to create a consistent snapshot to hand off to DBAs.
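For the "consistent snapshot" part, a minimal sketch is to plan every capture under one shared timestamp so the files can be lined up later. The code below only builds the plan and does not run anything. The function name `build_snapshot_plan`, the file naming scheme, and the selected columns are assumptions; the `pg_stat_statements` query assumes the extension is installed and uses the PostgreSQL 13+ column names (`total_exec_time`, `mean_exec_time`), which were `total_time` and `mean_time` in earlier versions.

```python
from datetime import datetime, timezone

# Non-idle sessions with their wait state, oldest statement first.
ACTIVITY_SQL = (
    "SELECT pid, usename, state, wait_event_type, wait_event, "
    "now() - query_start AS runtime, query "
    "FROM pg_stat_activity WHERE state <> 'idle' ORDER BY query_start"
)

# Top statements by cumulative execution time (PostgreSQL 13+ column names).
STATEMENTS_SQL = (
    "SELECT calls, total_exec_time, mean_exec_time, query "
    "FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 20"
)


def build_snapshot_plan(now=None):
    """Return (output_filename, argv) pairs that share one UTC timestamp prefix."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    steps = [
        ("top", ["top", "-b", "-n", "1"]),         # one batch-mode process listing
        ("vmstat", ["vmstat", "1", "5"]),          # five one-second samples
        ("iostat", ["iostat", "-x", "1", "5"]),    # extended per-device stats
        ("pg_stat_activity", ["psql", "-X", "-A", "-c", ACTIVITY_SQL]),
        ("pg_stat_statements", ["psql", "-X", "-A", "-c", STATEMENTS_SQL]),
    ]
    return [(f"{stamp}_{name}.txt", argv) for name, argv in steps]


plan = build_snapshot_plan(datetime(2024, 1, 15, 10, 23, 45, tzinfo=timezone.utc))
```

Each `argv` could then be passed to `subprocess.run` with its output redirected to the paired filename, and the resulting files bundled with the relevant server log segment for the handoff.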

Hard, Technical

Design a post-incident review template and investigation process for database incidents that produces accurate RCA, remains blameless, and results in actionable remediation. Include required fields (timeline, SLI graphs, root cause hypothesis), timelines for draft and final reports, how to validate remediation, and how to track recurring patterns across incidents.

Medium, Technical

Your database occasionally hits the configured max_connections limit and begins refusing connections. Propose immediate mitigation steps and longer-term fixes, including connection pooling strategies (PgBouncer pool modes), application changes, connection throttling, and the monitoring you would implement. Discuss trade-offs between pooler modes and how each handles connection bursts.
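A useful first step in an answer is simple arithmetic: compare what the application fleet can open against what the server can accept. The sketch below does that comparison. The function name `connection_budget`, the `headroom` parameter, and the example numbers are assumptions for illustration; `reserved` stands in for slots the server holds back for administrative access.

```python
def connection_budget(max_connections: int, reserved: int, app_instances: int,
                      pool_per_instance: int, headroom: int = 10) -> dict:
    """Compare worst-case client demand with the usable server-side connection budget."""
    # Usable slots after reserved admin connections and a safety margin
    # for monitoring, migrations, and ad hoc sessions.
    budget = max_connections - reserved - headroom
    # Worst case: every application instance fills its local pool at once.
    demand = app_instances * pool_per_instance
    return {
        "budget": budget,
        "demand": demand,
        "overcommit": max(demand - budget, 0),
        # Largest per-instance pool that still fits without a shared pooler.
        "max_pool_per_instance": budget // app_instances,
    }


# Hypothetical fleet: 12 instances, each with a pool of 20, against max_connections = 200.
result = connection_budget(max_connections=200, reserved=3,
                           app_instances=12, pool_per_instance=20)
```

In this made-up case the fleet can ask for 240 connections against a budget of 187, so either the per-instance pools shrink to 15 or a shared pooler sits in front of the database and multiplexes many client connections onto fewer server connections.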
