Remote Role
Base Salary 200-225k plus 20% bonus and performance equity
About the Role
We are seeking an experienced and well-rounded VP of Cloud Engineering, Operations & Delivery to lead our cloud practice across a diverse portfolio of industry verticals. This role sits at the intersection of technical authority, executive leadership, and forward-thinking innovation. It calls for someone who brings genuine cloud engineering depth while also driving strategy, client relationships, and organizational growth.
You will lead high-performing teams delivering complex, multi-cloud solutions across AWS, Azure, and Google Cloud Platform, setting the technical bar while ensuring the business delivers on its commitments. Critically, you will help shape and lead our evolution into an agentic AI-powered future — identifying opportunities to transform how our teams and our clients design, deploy, operate, and optimize cloud infrastructure using AI agents and intelligent automation.
The ideal candidate is a natural communicator who can shift seamlessly from an architecture discussion with engineers to a strategic briefing with a client's executive team — and be credible in both rooms. They are also someone who looks at today's manual, repetitive, or complex processes and asks: "How do we let intelligent agents handle this?"
- Serve as the senior technical authority for cloud architecture and infrastructure decisions across AWS, Azure, and GCP
- Advance and mature our Infrastructure as Code (IaC) practices and tooling, including GitHub, Jenkins, Terraform, Qualys, and SonarQube, to ensure consistency, security, and scalability across client environments
- Provide meaningful technical guidance and architectural direction to engineering teams — going beyond high-level oversight to engage substantively on design decisions, standards, and delivery quality
- Guide adoption of cloud-native patterns including Kubernetes (EKS/AKS/GKE), serverless, CI/CD automation, and event-driven architecture
- Lead architecture reviews and serve as the escalation point for complex technical challenges
- Ensure security and compliance are embedded into infrastructure from the ground up — spanning IAM design, network segmentation, secrets management, and frameworks such as SOC 2, NIST, CIS, HIPAA, and PCI-DSS
Agentic AI Strategy & Transformation
- Champion the adoption of AI agents and multi-agent systems to transform how cloud infrastructure is built, operated, and optimized — moving teams from reactive, manual workflows to intelligent, autonomous execution
- Identify high-value opportunities to introduce agentic workflows into engineering operations — including infrastructure provisioning, incident detection and remediation, cost optimization, compliance monitoring, security response, and deployment pipelines
- Lead the evaluation and adoption of agentic AI frameworks and platforms (e.g., LangGraph, AutoGen, Amazon Bedrock Agents, Azure AI Agent Service, Vertex AI Agent Builder) to build purpose-built agents that extend the capabilities of our engineering teams
- Define governance, guardrails, and human-in-the-loop checkpoints for agentic systems operating in cloud environments — ensuring autonomous actions are safe, auditable, and aligned with client expectations
- Collaborate with engineering and solutions teams to design agentic delivery pipelines — where AI agents assist in code generation, IaC validation, drift detection, security scanning, and release orchestration
- Work with peer technology teams to identify process transformation opportunities, helping envision, roadmap, and execute an agentic future state for cloud operations and engineering workflows
- Stay ahead of the rapidly evolving AI agent ecosystem and bring informed, practical perspectives on what is production-ready versus experimental
- Own the operational health of cloud environments across the client portfolio — including availability, performance, security posture, and cost efficiency
- Mature SRE practices across the organization: SLOs, error budgets, incident management, and blameless postmortems
- Drive FinOps discipline — optimizing cloud spend through right-sizing, commitment strategies, tagging governance, and anomaly detection — increasingly augmented by AI-driven insights and autonomous recommendations
- Define and enforce observability standards across logging, metrics, and tracing using Datadog and CloudWatch — and explore how agentic monitoring can move teams from alert fatigue to autonomous resolution
- Lead end-to-end delivery of cloud engineering engagements — from technical discovery and architecture through deployment, cutover, and steady-state operations
- Build scalable delivery frameworks, runbooks, and IaC-driven playbooks that can be applied consistently across verticals and client environments — and actively work to make those playbooks AI-executable over time
- Proactively identify technical risks and drive resolution before they become client issues
- Build, mentor, and retain a high-performing team of cloud engineers, DevOps engineers, SREs, and delivery managers — cultivating a team culture that embraces AI-augmented workflows as a force multiplier, not a threat
- Define clear career ladders, engineering standards, and technical growth paths that attract and retain top talent — including emerging skills in AI/ML infrastructure, prompt engineering, and agentic system design
- Foster a culture of engineering excellence, continuous learning, and genuine curiosity about what AI agents can unlock
Executive & Client Engagement
- Communicate cloud strategy, delivery status, and technical decisions clearly to executive stakeholders — both internally and with clients
- Help clients articulate and develop their agentic transformation roadmap — translating the potential of AI agents into concrete, phased business outcomes
- Participate in pre-sales and client-facing conversations with enough technical depth to build confidence and credibility
- Translate cloud provider roadmaps — including rapidly evolving AI and agent capabilities from AWS, Azure, and GCP — into strategic investments and differentiated service offerings
- Represent the engineering organization in leadership discussions, helping align technical capabilities with business growth objectives
Qualifications
- 12+ years of experience in cloud infrastructure, platform engineering, or DevOps — with at least 4 years in a senior leadership capacity
- Strong working knowledge of AWS, Azure, and GCP — you understand how these platforms work in practice, not just in principle; professional-level certifications are a plus
- Solid, proven experience with Infrastructure as Code — particularly Terraform — including best practices around module design, state management, GitOps workflows, and policy enforcement
- Demonstrated experience leading cloud delivery programs for enterprise clients across multiple industries
- Practical exposure to AI agents and agentic frameworks: you have built, deployed, or operated AI agent systems in a production or near-production context and understand how to design reliable, governed agentic workflows
- A creative, process-transformation mindset — you look at how work gets done today and can credibly envision how intelligent agents could do it better, faster, and more reliably tomorrow
- Working knowledge of Kubernetes in production environments and modern CI/CD practices
- Familiarity with cloud security frameworks and compliance requirements relevant to multi-vertical client environments
- A track record of building and developing high-performing engineering teams
- Exceptional communication skills — able to engage engineers at a technical level and translate that into clear, confident messaging for executives and clients alike
Preferred Qualifications
- Hands-on experience with agentic AI platforms such as LangGraph, AutoGen, Amazon Bedrock Agents, Azure AI Agent Service, or Vertex AI Agent Builder
- Experience designing multi-agent architectures — including agent orchestration, tool use, memory management, and human-in-the-loop design patterns
- Familiarity with LLM integration patterns in cloud-native applications — RAG pipelines, vector databases, embedding workflows, and model hosting on cloud infrastructure
- Experience in a managed services or solutions provider environment serving diverse industry verticals
- Background in platform engineering or Internal Developer Platform (IDP) development
- Familiarity with policy-as-code tools such as OPA, Sentinel, or Checkov
- AWS, Azure, and/or GCP professional-level certifications, including any AI/ML specialty certifications