Solution IT Inc. is looking for a Director/AVP, Infrastructure & Cloud (IMT/Infra Leader) for one of its clients in Miami, FL.
Job Title: Director/AVP Infrastructure & Cloud (IMT/Infra Leader)
Summary:
We are seeking a senior leader to own our infrastructure and cloud strategy, drive AI-first operations, and build next-generation automation and ITSM capabilities across our global delivery organization. The ideal candidate combines deep hands-on experience in cloud (preferably AWS) and infrastructure with strong leadership, stakeholder management, and a track record of building high-performing, automation-driven operations teams.
Responsibilities:
Strategy, Leadership & Stakeholder Management
- Define and own the infrastructure, cloud, and AIOps strategy aligned to overall business and product goals.
- Act as primary technology partner to senior client stakeholders, internal business leaders, and product teams; translate business outcomes into a clear technical roadmap.
- Build, mentor, and scale a high-performing global (onshore/offshore) team across cloud, SRE, infra, and automation disciplines.
- Drive an "AI-first" and "automation-first" culture in operations and service delivery, setting standards, playbooks, and best practices.
- Own budget planning and optimization for infrastructure, tools, and cloud spend; establish financial guardrails and cost-efficiency initiatives.
Cloud, Infrastructure & Reliability
- Own the architecture, reliability, scalability, and security posture of cloud and on-premises infrastructure, with a strong focus on AWS.
- Establish best practices for high availability, DR, backup, capacity planning, performance, and observability.
- Define and enforce SLOs/SLAs, error budgets, and resilience practices in partnership with SRE and product teams.
- Ensure robust security, compliance, identity/access management, and governance across environments.
AI, Automation, Agentic Ops & AIOps
- Define and drive the roadmap for AIOps, Agentic Operations, and automation of infra/operations workflows (incident triage, root cause analysis, remediation, capacity management, change management).
- Evaluate, select, and implement AI and AIOps platforms/tools (e.g., observability, log analytics, anomaly detection, predictive alerting, intelligent runbooks).
- Lead the design of agentic workflows that use LLMs/AI agents to automate common operational tasks, knowledge retrieval, and ticket handling.
- Industrialize automation across infra, cloud, and ITSM (self-healing, auto-remediation, ChatOps, runbook automation, infrastructure as code).
Next-Generation ITSM & Service Delivery
- Own the vision and roadmap for modern ITSM capabilities (ITIL-aligned but automation-driven) across Incident, Problem, Change, Request, CMDB, and Knowledge Management.
- Integrate ITSM with monitoring, observability, AIOps, and collaboration tools to deliver end-to-end, automated service workflows.
- Define and track operational KPIs (MTTR, incident volume, change success rate, availability, deployment frequency, automation coverage, etc.) and use data to continuously improve.
- Partner with account and sales teams to design solution offerings, respond to RFPs, and position our Infra/Cloud/AIOps capabilities as a differentiator in the market.
Offshore Leadership & AI Solutioning on AWS
- Provide architectural guidance and technical leadership to offshore engineering and operations teams building AI-enabled solutions on AWS.
- Review and approve solution architectures for AI/ML workloads (e.g., LLM integration, data pipelines, MLOps, vector databases, model hosting) on AWS.
- Establish standards, reference architectures, and reusable components so teams can rapidly build compliant, scalable AI solutions for clients.
- Coach and upskill team members on cloud-native AI services, infrastructure as code, DevOps, and SRE practices.
Required Experience:
- 15-18+ years in Infrastructure / Cloud / Operations roles, with at least 5-7 years in a senior leadership (Director/Head/VP/AVP) capacity.
- Strong, hands-on background in AWS (multi-account strategy, networking, security, containerization, serverless, observability).
- Demonstrated experience leading large, distributed teams (including offshore/nearshore) in a software product or IT services/vendor environment.
- Proven track record of implementing AIOps/observability platforms and driving automation of operations at scale.
- Experience designing or transforming ITSM organizations and processes, with a focus on automation and AI-assisted workflows.
- Strong understanding of SRE principles, reliability engineering, and modern DevOps practices.
- Exposure to building or integrating AI/ML solutions in production (LLMs, chatbots, agentic workflows, or predictive analytics) preferred.
- Excellent stakeholder management, communication, negotiation, and executive presentation skills.
Skills & Competencies:
- Deep expertise in AWS cloud architecture, networking, security, and cost optimization.
- Strong knowledge of observability and AIOps tools (e.g., Datadog, New Relic, Dynatrace, Elastic, Splunk, PagerDuty, ServiceNow, or similar).
- Practical experience with automation and IaC tools (Terraform, CloudFormation, Ansible, CI/CD pipelines).
- Understanding of ITIL, ITSM platforms, and process design; ability to modernize and automate ITSM.
- Familiarity with AI/ML concepts, LLMs, vector stores, and MLOps practices; comfort working with data and AI engineers.
- Ability to define a vision and roadmap, then translate them into actionable plans for cross-functional, global teams.
Work Site: Miami, FL
Duration: 12+ months
Expected Start Date: Immediate
Number of Positions: 01
Please send your responses back to