InterviewStack.io

Database Selection and Trade Offs Questions

This topic covers how to evaluate and choose data storage systems and architectures based on workload characteristics and business constraints. Coverage includes: differences between relational and non-relational families such as document stores, key-value stores, wide-column stores, graph databases, time-series databases, and search engines; mapping query patterns and latency requirements to storage options; trade-offs between strong consistency and eventual consistency and their impact on availability and complexity; partition key design, replication strategies, and high-availability considerations; operational concerns including backups, monitoring, vendor and cost trade-offs, and migration or hybrid strategies; and when to adopt polyglot persistence. Senior-level discussion includes selecting specific managed services and reasoning about expected load patterns, failure modes, and operational burden.

Medium | System Design
57 practiced
Design the data storage architecture for a social feed service with 10M users and 1B posts. Requirements: personalized feeds with p95 read latency < 100 ms, write throughput of 50k posts/min, full-text search over posts, and eventual consistency acceptable for feed freshness. Map query patterns (fanout-on-write vs fanout-on-read) to storage options (wide-column, key-value cache, search engine) and justify replication, caching, and consistency trade-offs.
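As a starting point for an answer, the two fanout strategies can be contrasted with a toy in-memory model. This is only a sketch: `ToyFeedStore`, its method names, and the average follower count passed to `feed_writes_per_sec` are hypothetical and not part of the question, and the precomputed feed assumes posts arrive in timestamp order.

```python
from collections import defaultdict, deque

POSTS_PER_MIN = 50_000  # write throughput from the requirements


def feed_writes_per_sec(avg_followers):
    """Fanout-on-write turns each post into one feed entry per follower.

    avg_followers is an assumed figure; the question does not give one.
    """
    return POSTS_PER_MIN / 60 * avg_followers


class ToyFeedStore:
    """In-memory stand-in for a feed store (hypothetical, for illustration)."""

    def __init__(self, feed_len=100):
        self.followers = defaultdict(set)         # author -> follower ids
        self.following = defaultdict(set)         # user -> authors followed
        self.posts_by_author = defaultdict(list)  # author -> [(ts, post_id)]
        # Bounded per-user feed, newest first (what a cache might hold).
        self.feeds = defaultdict(lambda: deque(maxlen=feed_len))

    def follow(self, follower, author):
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def publish(self, author, ts, post_id):
        # Always store the post under its author (needed for fanout-on-read).
        self.posts_by_author[author].append((ts, post_id))
        # Fanout-on-write: push the post into every follower's feed now.
        for follower in self.followers[author]:
            self.feeds[follower].appendleft((ts, post_id))

    def read_precomputed(self, user, limit=10):
        # Cheap read: the feed was materialized at write time.
        return [post_id for _, post_id in list(self.feeds[user])[:limit]]

    def read_on_demand(self, user, limit=10):
        # Fanout-on-read: gather and sort followed authors' posts per request.
        merged = [p for author in self.following[user]
                  for p in self.posts_by_author[author]]
        merged.sort(reverse=True)
        return [post_id for _, post_id in merged[:limit]]
```

With an assumed average of 200 followers, `feed_writes_per_sec(200)` comes to roughly 167k feed writes per second from about 833 posts per second. That write amplification is the cost fanout-on-write pays for cheap reads, and it is one reason answers often treat accounts with very large follower counts separately.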
Hard | Technical
33 practiced
Design a benchmarking plan to compare the performance of a managed and a self-hosted datastore for a real-time leaderboard with 1M players and 100k QPS. Include workload generation (key distributions), measurement of p50/p95/p99 latencies, failure injection, long-running soak tests, and statistical criteria for making a selection that also accounts for operational considerations.
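Two pieces of such a plan are easy to sketch: generating a skewed key distribution and summarizing latency percentiles. The helper names, the Zipf exponent `s=1.1`, and the nearest-rank percentile method are illustrative assumptions, not something the question prescribes. A real benchmark would more likely use an established load generator and histogram library.

```python
import math
import random


def zipf_weights(n_keys, s=1.1):
    """Unnormalized Zipf weights: the key at rank r gets weight 1 / r**s."""
    return [1.0 / (rank ** s) for rank in range(1, n_keys + 1)]


def sample_keys(n_keys, n_requests, s=1.1, seed=42):
    """Draw request keys so that a few hot keys dominate the workload."""
    rng = random.Random(seed)  # fixed seed keeps runs comparable
    return rng.choices(range(n_keys), weights=zipf_weights(n_keys, s),
                       k=n_requests)


def percentile(samples, p):
    """Nearest-rank percentile for p in (0, 100]."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(latencies_ms):
    """Report the three percentiles named in the question."""
    return {name: percentile(latencies_ms, p)
            for name, p in (("p50", 50), ("p95", 95), ("p99", 99))}
```

Running the same seeded key sequence against both datastores keeps the comparison fair. Percentiles are best computed per run and compared across repeated runs rather than from a single pass.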
Hard | Technical
42 practiced
While performing a cluster-wide backup of a distributed database, you must ensure a globally consistent snapshot for cross-table transactions without halting writes for more than 2 seconds. Describe distributed snapshot algorithms or MVCC-based approaches you could use, how to coordinate snapshot points, and operational steps for reliable restore.
Hard | Technical
31 practiced
You're evaluating a managed proprietary DB with unique features but are concerned about vendor lock-in. Propose a strategy to minimize lock-in while leveraging managed features: abstraction layers, data export/import automation, compatibility layers, regular migration testing, and metrics to measure migration risk and readiness.
Medium | System Design
38 practiced
Design a backup and recovery strategy for a distributed NoSQL cluster (e.g., Cassandra) storing user profiles. Requirements: RPO <= 15 minutes, RTO <= 1 hour for node or region failure, and minimal impact on production performance. Outline snapshot frequency, incremental backups, anti-entropy/repair, cross-region replication, and recovery steps for node and regional failures.
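A candidate plan can be sanity-checked against the RPO and RTO targets with simple arithmetic. The sketch below assumes a simplified model: worst-case data loss is the incremental backup interval plus the time to ship a backup off the node, and recovery time is restore plus replay or repair. The `BackupPlan` class and all of its fields are hypothetical, and the numbers in the usage note are placeholders rather than measurements.

```python
from dataclasses import dataclass


@dataclass
class BackupPlan:
    """Hypothetical model of a backup plan; all durations are in minutes."""
    incremental_interval_min: float  # time between incremental backups
    upload_lag_min: float            # time to ship a backup off the node
    restore_min: float               # time to restore the latest snapshot
    replay_min: float                # time to replay incrementals / repair

    def worst_case_rpo_min(self):
        # Data written just after a backup starts is at risk until the
        # next backup has completed and been shipped.
        return self.incremental_interval_min + self.upload_lag_min

    def estimated_rto_min(self):
        return self.restore_min + self.replay_min

    def meets(self, rpo_target_min=15, rto_target_min=60):
        # Defaults mirror the question: RPO <= 15 minutes, RTO <= 1 hour.
        return (self.worst_case_rpo_min() <= rpo_target_min
                and self.estimated_rto_min() <= rto_target_min)
```

For example, `BackupPlan(10, 3, 35, 15).meets()` passes under this model, while stretching the incremental interval to 15 minutes fails on RPO because the upload lag pushes the worst case to 18 minutes.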
