InterviewStack.io

Optimization Under Constraints Questions

Technical approaches for optimizing code and systems under constraints such as limited memory, strict frame or latency budgets, network bandwidth limits, or device-specific limitations. Topics include profiling and instrumentation to identify bottlenecks, algorithmic complexity improvements, memory and data structure trade-offs, caching and data locality strategies, parallelism and concurrency considerations, and platform-specific tuning. The emphasis is on measurement-driven optimization, benchmarking, the risk of premature optimization, graceful degradation strategies, and communicating performance trade-offs to product and engineering stakeholders.

Easy | Technical
Explain data locality and cache locality and why they matter for ML systems. Give two concrete examples where improving locality reduces latency or memory pressure in training or serving.
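One way to make the locality argument concrete is a deliberately crude model that counts how often consecutive accesses land on a different cache line when a row-major array is walked in row order versus column order. The function name `cache_line_fetches`, the 64-byte line size, and the 4-byte element size are assumptions for illustration. Real caches hold many lines at once, so this counts line changes between consecutive accesses and does not predict actual miss rates.

```python
def cache_line_fetches(rows, cols, order, elem_bytes=4, line_bytes=64):
    """Count how often consecutive accesses move to a different cache line
    when walking a row-major rows x cols array in the given order."""
    if order == "row":
        indices = ((r, c) for r in range(rows) for c in range(cols))
    elif order == "col":
        indices = ((r, c) for c in range(cols) for r in range(rows))
    else:
        raise ValueError("order must be 'row' or 'col'")

    fetches = 0
    last_line = None
    for r, c in indices:
        byte_offset = (r * cols + c) * elem_bytes  # row-major layout
        line = byte_offset // line_bytes
        if line != last_line:  # access left the previously touched line
            fetches += 1
            last_line = line
    return fetches


row_fetches = cache_line_fetches(256, 256, "row")
col_fetches = cache_line_fetches(256, 256, "col")
print(row_fetches, col_fetches)
```

Under these assumptions the row-order walk changes lines once per 16 elements, while the column-order walk changes lines on every access. The same reasoning applies when choosing the memory layout of an embedding table or a batch of features so that the hot loop reads contiguous memory.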
Hard | Technical
You need to train a very large transformer model but cross-machine network bandwidth is limited to 10 Gbps. Design a training strategy that balances model parallelism and data parallelism, and explain how to minimize communication overhead while preserving convergence properties.
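A back-of-the-envelope estimate can help frame an answer before any design work. The sketch below uses the standard bandwidth term for a ring all-reduce, where each worker sends roughly 2(n - 1)/n times the gradient payload. The function name, the 1-billion-parameter model, and the 8-worker setup are hypothetical, and the estimate ignores latency, protocol overhead, and any overlap of communication with computation.

```python
def ring_allreduce_seconds(num_params, bytes_per_param, num_workers, link_gbps):
    """Bandwidth-only estimate of one ring all-reduce over the gradients.

    Each worker sends about 2 * (n - 1) / n times the payload size.
    Latency, protocol overhead, and compute/communication overlap are ignored.
    """
    if num_workers < 2:
        return 0.0
    payload_bytes = num_params * bytes_per_param
    bytes_on_wire = 2 * (num_workers - 1) / num_workers * payload_bytes
    link_bytes_per_s = link_gbps * 1e9 / 8  # gigabits/s -> bytes/s
    return bytes_on_wire / link_bytes_per_s


# Hypothetical numbers: 1B parameters, 8 data-parallel workers, 10 Gbps links.
fp32_sync = ring_allreduce_seconds(1_000_000_000, 4, 8, 10.0)
fp16_sync = ring_allreduce_seconds(1_000_000_000, 2, 8, 10.0)
# Synchronizing once every 4 micro-batches amortizes the cost per micro-batch.
fp16_amortized = fp16_sync / 4
print(f"{fp32_sync:.2f}s {fp16_sync:.2f}s {fp16_amortized:.2f}s")
```

With these assumed numbers a full-precision synchronization costs about 5.6 seconds per step, half-precision gradients about 2.8 seconds, and accumulating over four micro-batches brings the amortized cost to about 0.7 seconds. Whether reduced-precision gradients or less frequent synchronization preserve convergence has to be checked empirically for the model in question.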
Hard | Technical
Feature retrieval from a remote feature store accounts for 60% of serving latency. Propose architectural and engineering changes to reduce user-visible latency. For each change, explain trade-offs in freshness, storage cost, consistency, and complexity.
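One of the simpler changes to discuss is an in-process cache in front of the remote store, which trades freshness (bounded by a time-to-live) and memory for fewer network round trips. The sketch below is a minimal illustration, not a production design: the class `TTLFeatureCache`, the `fetch_fn` callback, and the fake feature payload are all hypothetical, and the cache is not thread-safe as written.

```python
import time
from collections import OrderedDict


class TTLFeatureCache:
    """In-process LRU cache with a per-entry time-to-live (TTL)."""

    def __init__(self, fetch_fn, ttl_seconds, max_entries, clock=time.monotonic):
        self._fetch_fn = fetch_fn      # hypothetical remote lookup: key -> features
        self._ttl = ttl_seconds        # upper bound on staleness of a cached value
        self._max_entries = max_entries
        self._clock = clock            # injectable so expiry can be tested
        self._entries = OrderedDict()  # key -> (expires_at, value), oldest first
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = self._fetch_fn(key)         # slow path: go to the remote store
        self._entries[key] = (now + self._ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return value


remote_calls = []

def fake_remote_fetch(key):  # stand-in for the real feature store client
    remote_calls.append(key)
    return {"user_id": key, "clicks_7d": 3}

fake_now = [0.0]
cache = TTLFeatureCache(fake_remote_fetch, ttl_seconds=30, max_entries=2,
                        clock=lambda: fake_now[0])
cache.get("u1")      # miss: fetched remotely
cache.get("u1")      # hit: served from memory
fake_now[0] = 31.0
cache.get("u1")      # expired: fetched again
print(cache.hits, cache.misses, remote_calls)
```

The TTL makes the freshness trade-off explicit: a longer TTL raises the hit rate but serves older features, and `max_entries` caps the memory cost. Consistency across replicas is not addressed here, since each process would hold its own copy.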
Medium | Technical
A training job requires 20 GB of GPU memory but you only have a 12 GB GPU. List concrete strategies to fit training within memory (code or infra changes), explain trade-offs for each (performance, complexity, accuracy), and recommend an order of attempts.
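Gradient accumulation is one strategy a candidate might list early, since it reduces activation memory by processing smaller micro-batches without changing the effective batch size. The framework-free sketch below, with hypothetical helper names and made-up data, checks the underlying claim for a one-parameter linear model: averaging the gradients of equal-sized micro-batches reproduces the full-batch gradient of a mean loss.

```python
def mse_grad(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Average per-micro-batch gradients, touching one micro-batch at a time."""
    if len(xs) % micro_batch_size != 0:
        raise ValueError("this sketch assumes equal-sized micro-batches")
    steps = len(xs) // micro_batch_size
    total = 0.0
    for i in range(steps):
        lo = i * micro_batch_size
        hi = lo + micro_batch_size
        # Dividing by `steps` plays the role of scaling each micro-batch loss
        # by 1 / steps before accumulating its gradient.
        total += mse_grad(w, xs[lo:hi], ys[lo:hi]) / steps
    return total


# Made-up data for illustration only.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.2, 7.9]
full = mse_grad(0.3, xs, ys)
accum = accumulated_grad(0.3, xs, ys, micro_batch_size=2)
print(abs(full - accum) < 1e-9)
```

The equivalence holds only when the loss is a mean over independent samples. Layers that depend on batch statistics, such as batch normalization, see the smaller micro-batch, so accuracy may differ. The cost is more forward and backward passes per optimizer step.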
Medium | Technical
Implement a PyTorch DataLoader-like generator in Python that reads image file paths from disk and yields batches with asynchronous prefetching and simple augmentation. Your implementation should minimize GPU idle time and be safe for multi-worker use.
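A minimal standard-library sketch of one possible shape for an answer follows. It is not PyTorch's implementation: the function `prefetching_batches`, its parameters, and the `load_fn` / `augment_fn` callbacks are hypothetical names. A background thread fills a bounded queue while the consumer works on the current batch, and each worker reads a disjoint slice of the paths with its own random generator. Threads mainly help when loading is I/O-bound; CPU-heavy decoding in CPython would likely need processes instead.

```python
import queue
import random
import threading

_DONE = object()  # sentinel marking the end of the stream


def prefetching_batches(paths, batch_size, load_fn, augment_fn=None,
                        prefetch=2, worker_id=0, num_workers=1, seed=0):
    """Yield lists of samples while a background thread prepares the next ones."""
    shard = paths[worker_id::num_workers]  # disjoint slice per worker
    rng = random.Random(seed + worker_id)  # per-worker RNG, no shared global state
    ready = queue.Queue(maxsize=prefetch)  # bounded: caps memory held by ready batches

    def produce():
        try:
            for start in range(0, len(shard), batch_size):
                batch = []
                for path in shard[start:start + batch_size]:
                    sample = load_fn(path)
                    if augment_fn is not None:
                        sample = augment_fn(sample, rng)
                    batch.append(sample)
                ready.put(batch)           # blocks when the queue is full
            ready.put(_DONE)
        except Exception as exc:           # forward loader errors to the consumer
            ready.put(exc)

    threading.Thread(target=produce, daemon=True).start()

    while True:
        item = ready.get()
        if item is _DONE:
            return
        if isinstance(item, Exception):
            raise item
        yield item


def fake_load(path):          # stand-in for reading and decoding an image
    return [len(path), 0, 1]

def random_flip(sample, rng):  # stand-in for a simple augmentation
    return sample[::-1] if rng.random() < 0.5 else sample

paths = [f"img_{i}.jpg" for i in range(10)]
batches = list(prefetching_batches(paths, batch_size=4,
                                   load_fn=fake_load, augment_fn=random_flip))
print([len(b) for b in batches])
```

The bounded queue is the main design choice: it lets loading run ahead of the consumer by at most `prefetch` batches, which hides I/O latency without unbounded memory growth. If the consumer stops iterating early, the daemon thread stays blocked on the full queue until the process exits, which a fuller implementation would handle with an explicit shutdown signal.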
