InterviewStack.io

Infrastructure Automation and Provisioning Questions

Covers designing, implementing, and operating automated infrastructure provisioning and configuration using Infrastructure as Code practices and complementary automation patterns. Candidates should be able to select and author declarative infrastructure definitions with tools such as Terraform, CloudFormation, and Azure Resource Manager templates, and discuss configuration management tools such as Ansible, Puppet, or Chef. Core skills include modular and reusable code organization for multiple environments, variable and output management, remote state management and locking, idempotency and atomicity of operations, and version control integration for infrastructure artifacts. Candidates should understand testing and validation practices including linting, plan or dry-run validation, unit and integration testing of infrastructure changes, and drift detection and remediation. The topic includes strategies for safe changes and rollbacks, change coordination, error handling and recovery, and deployment patterns such as canary and blue-green where applicable. It also encompasses automation and orchestration patterns, immutable infrastructure and self-healing practices, autoscaling and scaling policies, automated patching and updates, secrets handling patterns using secret managers, and integrating observability and monitoring into automated workflows. Finally, candidates should be able to reason about trade-offs between imperative and declarative approaches, scaling Infrastructure as Code across large projects and teams, and security and compliance considerations for automated provisioning.

Medium · System Design
49 practiced
Design a remote state backend and locking strategy for a 100-person engineering organization using Terraform. Requirements: teams own separate services, state locking to prevent corruption, audit logs, cross-account access patterns, and minimal blast radius if a state file is compromised. Compare S3+DynamoDB, Consul, and Terraform Cloud and outline a migration plan from local state.
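One way to reason about the "minimal blast radius" requirement is to give every team, environment, and service its own state location, so that a compromised state file exposes only one service. The sketch below assumes the S3+DynamoDB option and derives values for the S3 backend's `bucket`, `key`, and `dynamodb_table` settings. The naming convention and the `state_location` helper are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateLocation:
    """Values that would feed the S3 backend's bucket, key, and dynamodb_table settings."""
    bucket: str
    key: str
    lock_table: str


def state_location(team: str, service: str, env: str) -> StateLocation:
    """Derive an isolated state location for one service.

    Assumed (hypothetical) convention: one bucket and one lock table per
    team and environment, one state key per service.
    """
    for part in (team, service, env):
        # Reject empty names and path separators so one service's key
        # cannot land inside another service's prefix.
        if not part or "/" in part:
            raise ValueError(f"invalid name component: {part!r}")
    return StateLocation(
        bucket=f"tfstate-{team}-{env}",
        key=f"{service}/terraform.tfstate",
        lock_table=f"tfstate-lock-{team}-{env}",
    )


if __name__ == "__main__":
    print(state_location("payments", "api", "prod"))
```

Under this assumed layout, access policies can be scoped per bucket, so a team's credentials never need read access to another team's state.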
Hard · System Design
83 practiced
Design a multi-region disaster recovery strategy executed via IaC for a stateful service with RTO < 15 minutes and RPO < 5 minutes. Discuss options for synchronous vs asynchronous replication, automated provisioning of failover infrastructure using IaC, DNS failover strategies, orchestration of cutover, and how you'd test and automate DR drills.
Medium · Technical
47 practiced
As the SRE responsible for platform reliability, propose an automated drift detection and remediation design for cloud resources provisioned by IaC. Include detection methods (terraform plan diffs, provider config APIs, cloud-native config services), alerting thresholds, classification of drift severity, remediation actions (auto-apply vs human review), and safeguards to avoid noisy false positives.
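The severity classification step can be sketched against Terraform's JSON plan representation (the output of `terraform show -json` on a saved plan), which lists each planned change under `resource_changes` with an `address`, a `type`, and `change.actions`. The severity rules, the `HIGH_RISK_TYPES` set, and the function names below are assumptions made for illustration, not a recommended policy.

```python
# Hypothetical set of resource types whose drift always needs human review.
HIGH_RISK_TYPES = {"aws_security_group", "aws_iam_policy", "aws_iam_role"}


def classify_drift(plan: dict) -> dict:
    """Map each drifted resource address to a severity label.

    `plan` is a parsed JSON plan; entries whose only action is
    "no-op" or "read" are treated as not drifted.
    """
    findings = {}
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if not actions or actions in (["no-op"], ["read"]):
            continue
        if "delete" in actions or rc.get("type") in HIGH_RISK_TYPES:
            severity = "high"      # destructive or security-sensitive
        elif "create" in actions:
            severity = "medium"    # resource missing from the cloud account
        else:
            severity = "low"       # in-place update only
        findings[rc["address"]] = severity
    return findings


def remediation(severity: str) -> str:
    """Assumed policy: only low-severity drift is auto-applied."""
    return "auto-apply" if severity == "low" else "human-review"


if __name__ == "__main__":
    sample = {
        "resource_changes": [
            {"address": "aws_instance.web", "type": "aws_instance",
             "change": {"actions": ["update"]}},
            {"address": "aws_security_group.db", "type": "aws_security_group",
             "change": {"actions": ["update"]}},
            {"address": "aws_s3_bucket.logs", "type": "aws_s3_bucket",
             "change": {"actions": ["no-op"]}},
        ]
    }
    for address, severity in classify_drift(sample).items():
        print(address, severity, remediation(severity))
```

Routing on severity rather than on the mere presence of a diff is one way to keep auto-remediation away from destructive or security-relevant changes.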
Hard · Technical
45 practiced
You inherit a large monolithic Terraform codebase with hundreds of resources and no modularization. Create a step-by-step migration plan to refactor into reusable, versioned modules with minimal production risk. Include how you'll split state, import/move resources, test the refactor, and roll out changes across environments.
Medium · Technical
54 practiced
Write an Ansible playbook snippet that idempotently installs Nginx, deploys a templated configuration file, and restarts the nginx systemd service only when the configuration changes. Also explain how you'd validate this playbook in CI using linting and dry-run checks before applying to production.
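The "restart only when the configuration changes" requirement rests on a change-detection pattern that can be illustrated outside Ansible. The sketch below is not an Ansible playbook; it is a minimal Python model of the idea behind a template task that notifies a handler: write the file only if its content differs, report whether anything changed, and trigger the restart only on a change. The `deploy_config` and `converge` names are hypothetical.

```python
from pathlib import Path
from typing import Callable


def deploy_config(rendered: str, dest: Path) -> bool:
    """Write `rendered` to `dest` only if the content differs.

    Returns True when the file was created or changed, False otherwise,
    so running it repeatedly with the same input is a no-op.
    """
    if dest.exists() and dest.read_text() == rendered:
        return False
    dest.write_text(rendered)
    return True


def converge(rendered: str, dest: Path, restart: Callable[[], None]) -> bool:
    """Deploy the config and call `restart` only when it changed."""
    changed = deploy_config(rendered, dest)
    if changed:
        restart()
    return changed


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        conf = Path(tmp) / "nginx.conf"
        port = 8080
        text = f"server {{ listen {port}; }}\n"
        restart = lambda: print("restart requested")
        print(converge(text, conf, restart))  # first run changes the file
        print(converge(text, conf, restart))  # second run is a no-op
```

The second call leaving the file untouched and skipping the restart is the same property a dry run checks for: a converged host should report no changes.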
