Discussion of practical projects and side work you have built or contributed to across domains. Candidates should be prepared to explain their role, architecture and design decisions, services and libraries chosen, alternatives considered, trade-offs made, challenges encountered, debugging and troubleshooting approaches, performance optimizations, testing strategies, and lessons learned. This includes independent side projects, security labs and capture-the-flag practice, bug bounty work, coursework projects, and other hands-on exercises. Interviewers may probe how you identified requirements, prioritized tasks, collaborated with others, measured impact, and what you would do differently in hindsight.
Medium · System Design
Design an automated pipeline to detect dataset and model drift and trigger retraining. Specify drift metrics (feature distribution shifts, embedding drift, confidence drops), detection methods, retraining triggers, human-in-the-loop validation, evaluation gates, and deployment cadence in production.
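One concrete drift metric from the list above is a feature-distribution shift score. Below is a minimal sketch of the Population Stability Index (PSI), a common choice for numeric features; the binning scheme, the 1e-6 empty-bin floor, and the sample data are illustrative assumptions, not a production implementation.

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline sample and a live sample.

    Rule of thumb: PSI < 0.1 is stable, 0.1-0.25 is moderate shift,
    > 0.25 is major drift (a plausible retraining trigger).
    """
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0
    def bin_fractions(sample):
        counts = [0] * bins
        for x in sample:
            i = min(int((x - lo) / width), bins - 1)
            counts[i] += 1
        # small floor avoids log(0) when a bin is empty
        return [max(c / len(sample), 1e-6) for c in counts]
    e, a = bin_fractions(expected), bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Illustrative data: identical distributions score near 0, a shifted one scores high.
baseline = [i / 100 for i in range(100)]
shifted = [i / 100 + 0.5 for i in range(100)]
```

In a real pipeline, a PSI above the chosen threshold would raise a retraining ticket rather than retrain blindly, feeding the human-in-the-loop validation step the question asks about.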
Hard · Technical
You are leading a cross-functional effort to integrate a generative AI feature (e.g., summarization or code generation) into an existing product. Describe your plan to collect and annotate data, choose between open-source models or cloud APIs, build an MVP, implement safety filters and guardrails, define business metrics for success, and coordinate engineering, legal, and ops for rollout.
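The "safety filters and guardrails" step above can be illustrated with a minimal output filter. This is a sketch only: the pattern list is a hypothetical placeholder, and real guardrails layer classifiers, PII detectors, and policy rules on top of anything this simple.

```python
import re

# Hypothetical blocklist for illustration; not an exhaustive safety policy.
BLOCKED_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),     # SSN-like pattern (PII leakage)
    re.compile(r"(?i)\bpassword\s*[:=]"),      # credential-looking output
]

def guard_output(text):
    """Return (allowed, text); block generated text that matches any pattern."""
    for pat in BLOCKED_PATTERNS:
        if pat.search(text):
            return False, "[blocked by safety filter]"
    return True, text
```

A filter like this would sit between the model call and the product surface, with blocked events logged for the legal/ops review loop the question describes.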
Hard · Technical
You have a BERT-base-like model and must achieve p95 inference latency under 10ms on commodity CPUs. Propose a prioritized engineering plan including model distillation, quantization to int8, operator fusion and pruning, conversion to ONNX and using ONNX Runtime, CPU threading and affinity, and benchmarking plan. Explain expected accuracy and latency trade-offs.
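The benchmarking part of the plan can be sketched independently of the model work. Below is a minimal p95 harness; `fake_inference` is a stand-in for the real call (e.g. an ONNX Runtime `session.run`), and the warm-up/iteration counts are illustrative.

```python
import statistics
import time

def p95_latency_ms(fn, warmup=10, iters=200):
    """Measure p95 wall-clock latency of fn() in milliseconds.

    Warm-up iterations let thread pools and caches settle before timing,
    which matters when comparing quantized/fused variants fairly.
    """
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1000.0)
    return statistics.quantiles(samples, n=100)[94]  # 95th percentile cut point

# Stand-in workload so the harness is self-contained.
def fake_inference():
    sum(i * i for i in range(1000))
```

Running the same harness before and after each optimization (distillation, int8 quantization, fusion, pruning) gives the per-step latency/accuracy trade-off the question asks you to explain.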
Easy · Technical
You have a one-file Python training script. Describe step-by-step how to instrument it with an experiment tracking tool (choose MLflow or Weights & Biases): initialization, logging hyperparameters and metrics, logging artifacts (model weights, plots), and configuring a remote artifact store (e.g., S3). Include short example code snippets or CLI commands in prose.
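The call pattern the question asks for can be sketched without a running tracking server. The stand-in class below mirrors the shape of the MLflow API (`mlflow.start_run`, `mlflow.log_params`, `mlflow.log_metric`, `mlflow.log_artifact`) but writes to a local directory; in a real setup you would point `mlflow.set_tracking_uri` at a server and configure S3 as the artifact store.

```python
import json
import os
import tempfile

class LocalTracker:
    """Hypothetical stand-in mirroring the MLflow call pattern,
    logging to a local directory instead of a tracking server."""

    def __init__(self, run_dir):
        self.run_dir = run_dir
        os.makedirs(run_dir, exist_ok=True)
        self.metrics = []

    def log_params(self, params):            # ~ mlflow.log_params
        with open(os.path.join(self.run_dir, "params.json"), "w") as f:
            json.dump(params, f)

    def log_metric(self, key, value, step):  # ~ mlflow.log_metric
        self.metrics.append({"key": key, "value": value, "step": step})

    def log_artifact(self, path):            # ~ mlflow.log_artifact
        dest = os.path.join(self.run_dir, os.path.basename(path))
        with open(path) as src, open(dest, "w") as out:
            out.write(src.read())

# Instrumenting a training loop with this pattern:
run_dir = tempfile.mkdtemp()
tracker = LocalTracker(run_dir)
tracker.log_params({"lr": 3e-4, "batch_size": 32})
for step in range(3):
    tracker.log_metric("loss", 1.0 / (step + 1), step)
```

The one-file script keeps its structure; only the three logging calls (params at start, metrics per step, artifacts at the end) are added.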
Hard · Technical
In a multi-node distributed training job, node 3 intermittently throws CUDA OOM after several hours. Outline a thorough debugging plan: what logs and traces to collect (NCCL, CUDA, dmesg), how to check for memory leaks or fragmentation, how to validate that batch sizes and model sharding are consistent across ranks, and short-term mitigations to keep training running.
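The "validate that batch sizes and model sharding are consistent across ranks" step can be sketched as a plain consistency check. In a real job the per-rank snapshots would be gathered with something like `torch.distributed.all_gather_object`; here the gathered configs are a hypothetical example where rank 3 silently got a larger micro-batch.

```python
def check_rank_consistency(rank_configs):
    """Report any setting whose value differs across ranks -- a common
    hidden cause of one rank OOMing while the others survive."""
    mismatches = {}
    for key in rank_configs[0]:
        values = [cfg.get(key) for cfg in rank_configs]
        if len(set(values)) > 1:
            mismatches[key] = values
    return mismatches

# Hypothetical per-rank snapshot; index == rank.
configs = [
    {"micro_batch": 8, "shard_count": 16},
    {"micro_batch": 8, "shard_count": 16},
    {"micro_batch": 8, "shard_count": 16},
    {"micro_batch": 12, "shard_count": 16},  # node 3 diverges
]
```

A mismatch here points the debugging effort at the launcher or config distribution rather than at fragmentation or leaks, which is exactly the kind of triage ordering the question rewards.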