Production Readiness and Professional Standards Questions
Addresses the engineering expectations and practices that make software safe and reliable in production and that reflect professional craftsmanship. Topics include writing production-suitable code with robust error handling and graceful degradation, attention to performance and resource usage, secure and defensive coding practices, observability and logging strategies, release and rollback procedures, designing modular and testable components, selecting appropriate design patterns, ensuring maintainability and ease of review, deployment safety and automation, and mentoring others by modeling professional standards. At senior levels this also includes advocating for long-term quality, reviewing designs, and establishing practices for low-risk change in production.
Easy · Technical
As an AI Engineer deploying an ML model to production, list and explain at least five logging best practices you would apply. Cover: what to log (inputs, outputs, metadata), log levels, structured JSON logs, PII redaction and retention, correlation IDs and request tracing, sampling strategy for high-volume fields, and cost/retention trade-offs.
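A minimal sketch of two of the practices the question names, structured JSON logs and PII redaction with a correlation ID. The field names and the PII list are illustrative assumptions, not a standard; a real system would drive redaction from a data-classification policy.

```python
import json
import logging
import uuid

# Hypothetical set of fields treated as PII in this sketch.
PII_FIELDS = {"email", "phone", "ssn"}

def redact(record: dict) -> dict:
    """Replace PII field values with a fixed placeholder before logging."""
    return {k: ("[REDACTED]" if k in PII_FIELDS else v) for k, v in record.items()}

def log_prediction(logger, model_version, features, prediction, correlation_id=None):
    """Emit one structured JSON log line for a single inference call.

    The correlation_id lets downstream services tie their own log lines
    back to this request; one is generated if the caller has none.
    """
    entry = {
        "level": "INFO",
        "event": "prediction",
        "model_version": model_version,
        "correlation_id": correlation_id or str(uuid.uuid4()),
        "features": redact(features),
        "prediction": prediction,
    }
    logger.info(json.dumps(entry, sort_keys=True))
    return entry  # returned so callers/tests can inspect what was logged
```

Because each line is a single JSON object, log pipelines can index on `model_version` or `correlation_id` without fragile regex parsing.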
Medium · Technical
You have SHAP-based feature attributions produced for several retrains of the same model. Propose statistical tests and thresholds to assert attribution stability (e.g., rank correlation, sign-consistency, bootstrap confidence intervals), and explain how you would fail model promotion if stability is inadequate.
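A stdlib-only sketch of two of the suggested tests: Spearman rank correlation (classic no-ties formula) and sign consistency between per-feature attribution vectors from two retrains, combined into a promotion gate. The thresholds are illustrative assumptions, not recommended values.

```python
def spearman_rho(x, y):
    """Spearman rank correlation for equal-length vectors without ties."""
    n = len(x)
    rank = lambda v: {i: r for r, i in enumerate(sorted(range(n), key=lambda i: v[i]))}
    rx, ry = rank(x), rank(y)
    d2 = sum((rx[i] - ry[i]) ** 2 for i in range(n))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def sign_consistency(x, y):
    """Fraction of features whose attribution sign agrees across retrains."""
    agree = sum(1 for a, b in zip(x, y) if (a >= 0) == (b >= 0))
    return agree / len(x)

def stability_gate(old_attr, new_attr, min_rho=0.8, min_sign=0.9):
    """Fail promotion (return False) if either stability check is inadequate.

    min_rho / min_sign are placeholder thresholds; in practice they would
    be calibrated, e.g. from bootstrap confidence intervals over retrains.
    """
    return (spearman_rho(old_attr, new_attr) >= min_rho
            and sign_consistency(old_attr, new_attr) >= min_sign)
```

In a CI pipeline, `stability_gate` returning False would block the model-promotion step and surface the two statistics in the run report.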
Hard · Technical
You are responsible for validating that a training pipeline complies with GDPR: detect PII in datasets, ensure deletion requests remove data from feature stores and backups, and measure privacy guarantees if applying DP-SGD. Describe tests, tooling, and auditing steps you would implement to provide verifiable compliance.
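One of the verifiable-compliance tests could look like the following sketch: process a deletion request against a toy, dict-backed feature store and then audit that no table still references the user. The store shape and function names are hypothetical; a real feature store would need the same check run against its online store, offline store, and backup snapshots.

```python
def process_deletion_request(feature_store: dict, user_id: str) -> None:
    """Remove every row keyed by user_id from each table (in-place, toy store)."""
    for table in feature_store.values():
        table.pop(user_id, None)

def verify_deletion(feature_store: dict, user_id: str) -> bool:
    """Audit step: True only if no table still references the deleted user."""
    return all(user_id not in table for table in feature_store.values())
```

Run as a scheduled audit job, `verify_deletion` failures would be logged with the offending table name to provide the evidence trail regulators expect.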
Medium · Technical
Given a PyTorch single-request inference function 'def infer_batch(model, inputs): return model(inputs)', implement a micro-batching wrapper 'def batched_infer(model, request_generator, max_batch_size, timeout_ms)' that accumulates incoming requests into a batch tensor up to max_batch_size or timeout and then calls infer_batch. Ensure predictions are mapped back correctly to each request and discuss padding or variable-length handling.
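A simplified, synchronous sketch of the batching logic only: plain Python lists stand in for tensors, `infer_batch` is passed in directly instead of `model`, batches flush by size, and the `timeout_ms` path (which needs an async queue to flush partial batches) is deliberately elided. It shows the core invariant the question asks for: predictions map back to the correct request.

```python
def batched_infer(infer_batch, request_generator, max_batch_size):
    """Group (request_id, input) pairs into batches of up to max_batch_size,
    call infer_batch once per batch, and map predictions back by request_id.
    """
    results = {}
    batch_ids, batch_inputs = [], []

    def flush():
        if not batch_inputs:
            return
        preds = infer_batch(batch_inputs)      # one model call per batch
        assert len(preds) == len(batch_ids)    # model must preserve order/size
        results.update(zip(batch_ids, preds))
        batch_ids.clear()
        batch_inputs.clear()

    for req_id, inp in request_generator:
        batch_ids.append(req_id)
        batch_inputs.append(inp)
        if len(batch_inputs) >= max_batch_size:
            flush()
    flush()  # final partial batch
    return results
```

With real tensors, the flush step would also pad variable-length inputs to a common shape (and record each original length to strip the padding from outputs) before stacking the batch.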
Hard · System Design
Design an end-to-end load test for a recommendation API that uses an online feature store with strong consistency guarantees. The test should simulate realistic user sessions, feature freshness constraints, delayed label arrival for offline metrics, warm and cold cache patterns, and failure modes. Describe traffic profiles, ramp patterns, failure injection points, and the metrics to capture (latency percentiles, error rates, throughput, feature-staleness).
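For the metrics-capture part, a small sketch of how the load-test harness might reduce raw samples into the listed metrics: nearest-rank latency percentiles and an error rate from HTTP status codes. Both helpers are illustrative assumptions, not part of any particular load-testing tool.

```python
def percentile(samples, p):
    """Nearest-rank percentile: smallest sample >= p percent of all samples."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    k = max(1, -(-p * len(ordered) // 100))  # ceil(p/100 * n), at least 1
    return ordered[int(k) - 1]

def error_rate(status_codes):
    """Fraction of responses that were server errors (status >= 500)."""
    return sum(1 for s in status_codes if s >= 500) / len(status_codes)
```

The same reduction would be applied per traffic phase (warm cache vs. cold cache, before vs. after each failure injection), so regressions show up in the phase where they occur rather than being averaged away.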