Senior Infrastructure Platform Engineer / Azure Platform Lead
Location: Charlotte, NC
Onsite Requirement: Yes
Number of Days Onsite: 3 days (Hybrid, Charlotte, NC)
We are seeking a highly skilled Senior Infrastructure Platform Engineer / Azure Platform Lead with strong hands-on expertise in Azure cloud infrastructure, Kubernetes (AKS), Infrastructure as Code (IaC), CI/CD automation, and observability. The ideal candidate will play a critical role in platform stability, scalability, security, and reliability while leading high-severity incidents and mentoring junior engineers. Strong communication skills and the ability to collaborate with US-based stakeholders are essential.
Key Responsibilities
Infrastructure & Platform Management
Manage and maintain enterprise-scale Azure infrastructure platforms, including:
OS and platform patching
Service upgrades
Certificate lifecycle management
Ensure high availability, security, and compliance of cloud infrastructure.
Azure & Kubernetes (AKS) Operations
Design, deploy, and manage infrastructure using Infrastructure as Code (IaC):
Terraform
ARM Templates
Work hands-on with Azure cloud services, including:
Azure Networking
Azure Firewall integrations
Identity and access management
Operate and maintain Azure Kubernetes Service (AKS):
Cluster upgrades using an N-1 upgrade strategy
Node pool management and autoscaling
Troubleshooting Istio service mesh
Implementing and managing network policies
Certificate management within Kubernetes environments
CI/CD & Automation
Design and manage CI/CD pipelines for both infrastructure and application deployments using YAML pipelines
Govern and manage agent pools (legacy and modern)
Automate:
Image updates
Scaling strategies
Deployment validations
Ensure pipeline reliability, security, and efficiency.
Incident Management & Reliability
Lead and manage high-severity (P0/P1) incidents, including:
Rapid triage and root cause analysis (RCA)
Coordinating cross-functional teams
Driving resolution through break-fix and long-term corrective actions
Improve platform reliability through proactive monitoring and continuous improvement.
Observability & Monitoring
Work hands-on with observability and monitoring tools, including:
Dynatrace
Prometheus
Grafana
Build and maintain:
SLO dashboards
Alerting strategies
Alert routing and escalation mechanisms
Automate service discovery and monitoring integrations.
Leadership & Collaboration
Provide technical leadership and guidance to junior engineers
Mentor team members on best practices in cloud, Kubernetes, and DevOps
Collaborate effectively with customers and stakeholders in US time zones
Participate in architectural discussions and contribute to platform roadmap decisions
Required Skills & Qualifications
8-12 years of experience in Infrastructure, Cloud, or Platform Engineering
Strong expertise in Azure Cloud and AKS
Proficiency in Terraform, ARM, and IaC best practices
Solid understanding of Kubernetes networking, security, and service mesh
Experience with CI/CD pipelines and automation
Strong troubleshooting and incident management skills
Excellent communication and stakeholder management skills
Proven experience in team leadership and mentoring
Preferred Qualifications
Experience in SRE practices
Exposure to multi-cluster or multi-region Kubernetes setups
Azure certifications (AZ-104, AZ-305, or similar)