Infrastructure Implementation and Operations Questions

Hands-on design, deployment, and operational management of infrastructure components and services. This includes setting up and configuring load balancers, database replication and high availability, caching layers, networking and network security, service discovery and routing, container deployment and orchestration, monitoring and observability, logging and alerting, backup and disaster recovery strategies, and runtime secrets management. Candidates should be able to walk through concrete implementations, explain trade-offs, demonstrate troubleshooting and performance tuning, and show how infrastructure components integrate to meet availability, scalability, and security requirements.

Medium / Technical

Your organization requires daily rotation of database credentials for compliance. How would you implement fully automated credential rotation for services running in Kubernetes (both long-running services and short-lived batch jobs) with minimal disruption and a complete audit trail?

Hard / Technical

Write a Prometheus recording rule and an alerting rule (YAML) that calculate the 99th-percentile (p99) processing latency for the metric 'etl_processing_seconds' over a 1h sliding window and fire an alert if the p99 exceeds 5s for three consecutive evaluation intervals. Provide only the two rules.
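
One possible shape for an answer is sketched below. It is not a definitive solution and rests on several assumptions that the question does not state: that 'etl_processing_seconds' is exposed as a Prometheus histogram (so an `etl_processing_seconds_bucket` series with an `le` label exists), that the rule group is evaluated every minute, and that a `job` label is a sensible grouping key. The group name, recorded series name, alert name, and `severity` label are hypothetical.

```yaml
groups:
  - name: etl_latency            # hypothetical group name
    interval: 1m                 # assumed evaluation interval
    rules:
      # Recording rule: p99 latency per job over a 1h sliding window.
      - record: job:etl_processing_seconds:p99_1h
        expr: |
          histogram_quantile(
            0.99,
            sum by (le, job) (rate(etl_processing_seconds_bucket[1h]))
          )
      # Alerting rule: with a 1m interval, "for: 2m" means the condition
      # must hold at three consecutive evaluations before the alert fires.
      - alert: EtlProcessingLatencyP99High
        expr: job:etl_processing_seconds:p99_1h > 5
        for: 2m
        labels:
          severity: warning      # hypothetical routing label
        annotations:
          summary: "ETL p99 processing latency above 5s (1h window)"
```

If the evaluation interval differs, the `for` duration would need to be scaled to match, since "three consecutive evaluation intervals" is expressed here only indirectly through time.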

Medium / Technical

Write a small Bash script using curl and jq that queries the Prometheus HTTP API to return the top 5 series by increase over 5 minutes for metric 'etl_records_processed_total'. The script should accept a Prometheus server URL as an argument and print label + value pairs.
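
The question asks for Bash, but the two parts that matter (the PromQL expression and the handling of the instant-query response) are independent of the scripting language. The sketch below shows them in Python using only the standard library, as an illustration rather than a model answer. It assumes the server exposes the standard `/api/v1/query` endpoint without authentication; the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# PromQL: the five series with the largest increase over the last 5 minutes.
QUERY = "topk(5, increase(etl_records_processed_total[5m]))"


def build_url(base_url: str, query: str = QUERY) -> str:
    """Return the instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": query})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def format_result(payload: dict) -> list[str]:
    """Turn an instant-vector response into '{labels} value' lines, largest first."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    rows = []
    for series in payload["data"]["result"]:
        labels = ",".join(f'{k}="{v}"' for k, v in sorted(series["metric"].items()))
        # Each sample is [unix_timestamp, "value-as-string"].
        rows.append((float(series["value"][1]), labels))
    rows.sort(key=lambda r: r[0], reverse=True)
    return [f"{{{labels}}} {value:g}" for value, labels in rows]


def top_series(base_url: str, timeout: float = 10.0) -> list[str]:
    """Query the server at base_url and return the formatted top-5 lines."""
    with urllib.request.urlopen(build_url(base_url), timeout=timeout) as resp:
        return format_result(json.load(resp))
```

A Bash version would follow the same structure: URL-encode the same query, request `/api/v1/query` with curl, and use jq to walk `.data.result[]`, printing `.metric` and `.value[1]` for each series.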

Easy / Technical

Explain the difference between Layer 4 (transport) and Layer 7 (application) load balancers. For a data ingestion API that accepts large batch uploads (multi-GB), which type would you choose and why? Include considerations about TLS termination, sticky sessions, request routing, and implications for large payload buffering.

Hard / Technical

Design a validation and reconciliation system to detect silent data corruption and drift between sources and transformed analytics tables. Include checksums, sampling rates, reconciliation frequency, alerting thresholds, and automated remediation or manual repair procedures.
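
The checksum and threshold parts of such a design can be illustrated with a small sketch. It is one possible approach, not a complete system, and it assumes that both sides can be read as rows keyed by a shared primary key and that the compared columns are expected to pass through the transformation unchanged. The function names and the 0.1% default threshold are hypothetical.

```python
import hashlib


def row_digest(row: dict, columns: list[str]) -> str:
    """SHA-256 over a canonical rendering of the selected columns of one row."""
    # repr() keeps None, "" and 0 distinct; \x1f separates the fields.
    canonical = "\x1f".join(repr(row[c]) for c in columns)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def reconcile(source_rows, target_rows, key: str, columns: list[str]) -> dict:
    """Compare per-row digests and report keys that are missing, extra, or changed."""
    src = {r[key]: row_digest(r, columns) for r in source_rows}
    tgt = {r[key]: row_digest(r, columns) for r in target_rows}
    return {
        "missing": sorted(src.keys() - tgt.keys()),      # in source only
        "extra": sorted(tgt.keys() - src.keys()),        # in target only
        "mismatched": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }


def should_alert(report: dict, source_count: int, threshold: float = 0.001) -> bool:
    """Alert when the share of discrepant keys exceeds the threshold."""
    discrepancies = sum(len(keys) for keys in report.values())
    return discrepancies / max(source_count, 1) > threshold
```

On large tables the same comparison would typically run on a sample or on per-partition aggregate checksums first, with the row-level diff reserved for partitions whose aggregates disagree.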
