Storage Services and Data Management Questions

Know the primary storage options. Object storage (S3, Azure Blob Storage, GCS) holds unstructured data at scale and is highly available and cost-effective. Block storage (EBS, Azure Managed Disks) backs VM disks and is optimized for IOPS and throughput. Relational databases (RDS, Azure SQL, Cloud SQL) suit structured data with relationships; NoSQL databases (DynamoDB, Cosmos DB, Firestore) suit flexible schemas and scale. Understand durability and consistency models, and know when to use each storage type based on data characteristics and access patterns.

Easy | Technical

Describe block storage (e.g., EBS, Azure Managed Disks). What are the key performance characteristics (IOPS, throughput, latency) and typical ML use cases such as ephemeral scratch space for distributed training, database backing disks, or caching? Compare instance store vs attached block volumes and explain implications for fault tolerance and checkpointing.
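
One way to make the checkpointing point concrete: instance store contents are lost when the instance stops or fails, so checkpoints belong on an attached volume or other durable location, written so that a crash never leaves a half-written file behind. The sketch below is a minimal illustration, not a framework API; `save_checkpoint`, the `ckpt-<step>.bin` naming, and the idea that `durable_dir` is a mount point on an attached block volume are all assumptions.

```python
import os
import tempfile


def save_checkpoint(state: bytes, durable_dir: str, step: int) -> str:
    """Write a checkpoint atomically into durable_dir (assumed to be on an attached volume)."""
    final_path = os.path.join(durable_dir, f"ckpt-{step:08d}.bin")
    # Write to a temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=durable_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(state)
            f.flush()
            os.fsync(f.fileno())  # force the bytes to the device before publishing
        os.replace(tmp_path, final_path)  # atomic rename: readers see old or new, never partial
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path
```

A training loop would call `save_checkpoint(serialized_state, "/mnt/checkpoints", step)` every N steps (the path is hypothetical) and keep only scratch data on the instance store.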

Easy | Technical

Explain ACID vs BASE database paradigms. For online feature updates in a production serving system, which transactional properties are most critical (atomicity, consistency, isolation, durability) and when might you accept BASE (eventual consistency) instead? Give concrete examples of features that require transactional guarantees.
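
As a small illustration of atomicity for a feature that needs it, the sketch below uses SQLite as a stand-in for a transactional store. The `features` table, the `remaining_credit` and `purchase_count` columns, and `record_purchase` are hypothetical names chosen for the example. Both columns must change together or not at all; a counter such as a page-view total, by contrast, could usually tolerate eventual consistency.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE features (
        user_id          TEXT PRIMARY KEY,
        remaining_credit REAL NOT NULL CHECK (remaining_credit >= 0),
        purchase_count   INTEGER NOT NULL
    )
    """
)
conn.execute("INSERT INTO features VALUES ('u1', 100.0, 0)")
conn.commit()


def record_purchase(user_id: str, amount: float) -> bool:
    """Update both features in one transaction; roll back both if either fails."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "UPDATE features SET purchase_count = purchase_count + 1 WHERE user_id = ?",
                (user_id,),
            )
            conn.execute(
                "UPDATE features SET remaining_credit = remaining_credit - ? WHERE user_id = ?",
                (amount, user_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the count update is undone too.
        return False


def read_features(user_id: str):
    return conn.execute(
        "SELECT remaining_credit, purchase_count FROM features WHERE user_id = ?",
        (user_id,),
    ).fetchone()
```

Calling `record_purchase("u1", 500.0)` against a balance of 100 returns `False` and leaves `purchase_count` unchanged, which is the behavior a fraud or credit-limit feature would typically need.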

Hard | System Design

Design an ingestion pipeline that moves streaming feature updates from Kafka into an online store with exactly-once semantics. Address deduplication, idempotent writes, ordering per key, backpressure handling, replay/backfill support, storage technology choices, and operational monitoring. Include trade-offs between at-least-once with idempotency vs exactly-once processing.
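
The at-least-once-plus-idempotency option can be sketched without any Kafka client code. The example below assumes updates are partitioned by entity key, so each key sees monotonically increasing offsets, and that the online store can compare an incoming offset with the last one it applied. `FeatureUpdate` and `OnlineStore` are hypothetical names, and the in-memory dict stands in for a real key-value store with conditional writes.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class FeatureUpdate:
    key: str      # entity id, assumed to also be the partition key
    offset: int   # assumed monotonically increasing per key
    value: float


class OnlineStore:
    """In-memory stand-in for an online store that supports conditional writes."""

    def __init__(self) -> None:
        self._rows: Dict[str, Tuple[int, float]] = {}  # key -> (last offset, value)

    def apply(self, update: FeatureUpdate) -> bool:
        """Apply the update only if it is newer than what is stored for the key."""
        current = self._rows.get(update.key)
        if current is not None and update.offset <= current[0]:
            return False  # duplicate or stale redelivery: safe to drop
        self._rows[update.key] = (update.offset, update.value)
        return True

    def get(self, key: str) -> float:
        return self._rows[key][1]


store = OnlineStore()
batch = [FeatureUpdate("user:1", 10, 0.5), FeatureUpdate("user:1", 11, 0.7)]
applied_first = [store.apply(u) for u in batch]    # first delivery
applied_replay = [store.apply(u) for u in batch]   # redelivery after a consumer restart
```

Because a replayed batch is rejected by the offset check, the consumer can commit offsets after writing and tolerate redelivery. The cost is that the store must keep the last applied offset per key, and a deliberate backfill needs a way to bypass or reset that check.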

Medium | System Design

Design a backup and restore plan for model artifacts and datasets for a company with ~1,000 production models and ~10 PB of training data. Cover object versioning, cross-region replication, retention policies, recovery time objectives (RTO) and recovery point objectives (RPO), cost controls, and how you would periodically test restores.
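
A quick back-of-the-envelope estimate helps when setting an RTO at this scale. The sketch below converts a data volume and a sustained restore throughput into hours; the 100 Gbit/s aggregate throughput and the 50 TB artifact size are assumed figures for illustration, not numbers from the question, and `restore_hours` is a hypothetical helper.

```python
PB = 10**15  # decimal petabyte, in bytes
TB = 10**12  # decimal terabyte, in bytes


def restore_hours(data_bytes: float, throughput_gbit_s: float) -> float:
    """Hours needed to move data_bytes at a sustained throughput given in Gbit/s."""
    bits = data_bytes * 8
    seconds = bits / (throughput_gbit_s * 1e9)
    return seconds / 3600


full_restore = restore_hours(10 * PB, 100)       # all training data
artifacts_restore = restore_hours(50 * TB, 100)  # assumed total size of model artifacts

print(f"full dataset: {full_restore:.1f} h, model artifacts: {artifacts_restore:.2f} h")
```

At the assumed rate, restoring all 10 PB takes on the order of nine days while the artifacts take about an hour, which is one argument for separate RTO tiers for serving-critical artifacts and bulk training data.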

Medium | Technical

For a model registry storing metadata (model name, version, metrics, stage) and large binaries (model weights), decide whether to store binaries inside a transactional database or in object storage with DB metadata pointers. Provide schema examples for both approaches, discuss transactional and performance trade-offs, backup/restore implications, and operational considerations for atomic updates and rollbacks.
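
The pointer approach might look like the following sketch, with SQLite standing in for the metadata database and a dict standing in for an object storage bucket. The `model_versions` schema, the `models/<sha256>` key layout, and the `put_object` and `register` helpers are assumptions for illustration. The binary is uploaded first under a content-addressed key, and the metadata row is committed second, so a failed registration can leave an orphaned object but never a row that points at missing weights.

```python
import hashlib
import sqlite3

object_store = {}  # stand-in for an object storage bucket: key -> bytes


def put_object(data: bytes) -> str:
    """Store bytes under a content-addressed key and return that key."""
    key = f"models/{hashlib.sha256(data).hexdigest()}"
    object_store[key] = data
    return key


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE model_versions (
        name            TEXT    NOT NULL,
        version         INTEGER NOT NULL,
        stage           TEXT    NOT NULL,
        metrics_json    TEXT    NOT NULL,
        artifact_key    TEXT    NOT NULL,  -- pointer into object storage
        artifact_sha256 TEXT    NOT NULL,  -- lets readers verify the download
        PRIMARY KEY (name, version)
    )
    """
)


def register(name: str, version: int, weights: bytes, metrics_json: str) -> str:
    """Upload the weights, then commit the metadata row that points at them."""
    key = put_object(weights)
    digest = key.rsplit("/", 1)[1]
    with conn:  # commits on success, rolls back on exception
        conn.execute(
            "INSERT INTO model_versions VALUES (?, ?, 'staging', ?, ?, ?)",
            (name, version, metrics_json, key, digest),
        )
    return key
```

Rolling back a promotion then becomes a small metadata update (changing `stage`) rather than a rewrite of large binaries. Orphaned objects left by failed registrations would need a periodic garbage-collection pass.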
