Design a partitioning and sharding strategy for a multi-tenant user database in which tenant sizes range from a few users to millions of users. Discuss schema choices (shared vs isolated), shard key selection, tenant isolation, hotspot mitigation for large tenants, monitoring signals to trigger re-sharding, and a safe tenant relocation process with minimal downtime.
Architect a storage platform for 10 PB of data with mixed workload characteristics: frequent small random reads that require low latency, occasional very large sequential writes, and strict retention and compliance requirements. Describe storage tiering, where to place metadata, erasure coding vs replication, hot/cold data movement, index design, backup and restore strategy, and how to handle hotspots and rebalancing.
Design a distributed rate-limiting and quota enforcement system for an API platform that supports multiple clients, fairness, priority tiers, per-endpoint limits, and global enforcement across regions. Describe enforcement points (edge vs centralized), token distribution, storage choices for counters, handling bursts and clock skew, and telemetry to monitor quota usage and throttling incidents.
Design an autoscaling and placement strategy for stateful services (e.g., caches or databases) across multiple regions that meets strict SLOs while minimizing cloud costs. Address instance sizing and families, the tradeoffs between reserved capacity and spot instances, predictive vs reactive scaling, data locality and affinity rules, and how to prevent thrashing during scale-up and scale-down operations.
Design a near-real-time change-data-capture (CDC) and event-sourcing pipeline that guarantees exactly-once processing semantics and preserves ordering per aggregate ID for downstream analytics. Specify the source capture technique, transport system, partitioning strategy, deduplication approach, transactional guarantees, sink idempotency, schema evolution strategy, and backfill procedures.