Kubernetes Engineer/Administrator - Azure AKS
Position Overview
We are seeking an experienced Kubernetes Engineer/Administrator to join our team. This role focuses on managing and scaling our enterprise-grade Azure Kubernetes Service (AKS) infrastructure. You will be responsible for designing, implementing, and maintaining production Kubernetes clusters that support critical enterprise workloads across multiple Azure regions.
Primary Responsibilities
Azure Kubernetes Service (AKS) Management
- Design, deploy, and manage enterprise-scale AKS clusters across multiple Azure regions
- Implement and maintain private AKS clusters with advanced networking configurations
- Configure and manage customer-managed encryption keys (CMK) for cluster disk encryption
- Implement blue/green deployment strategies for zero-downtime cluster upgrades
- Manage AKS cluster lifecycle including upgrades, node pool scaling, and disaster recovery
- Optimize cluster performance, cost, and resource utilization
- Implement AKS Fleet Manager for multi-cluster management and orchestration
- Configure AKS Automatic for simplified cluster operations and auto-scaling
- Manage AKS Managed Namespaces for improved multi-tenancy and resource isolation
Required Qualifications
Technical Skills - Azure & Kubernetes
- 5+ years of hands-on Kubernetes experience in production environments
- 2+ years of Azure Kubernetes Service (AKS) experience required
- Strong Terraform expertise with proven ability to build reusable, production-ready modules
- Deep understanding of Kubernetes architecture, networking, storage, and security
- Experience with private AKS clusters and Azure Private Link/Private Endpoints
- Proficiency with Azure networking: VNets, subnets, NSGs, private DNS zones, VNet peering
- Strong understanding of Azure managed identities, Workload Identity, and RBAC
- Experience with Azure Key Vault integration (CSI driver, disk encryption sets)
- Hands-on experience with customer-managed encryption keys in Azure
- Experience with Azure Container Registry including geo-replication and vulnerability scanning
- Knowledge of AKS advanced features (Fleet Manager, AKS Automatic, Managed Namespaces) is a plus
Infrastructure as Code & Automation
- Advanced Terraform skills with module development experience
- Git version control and branching strategies (GitHub)
- GitOps tools: ArgoCD
- GitHub Actions for CI/CD pipelines
- Infrastructure testing and validation practices
Platform & Tools
- Azure CLI and Azure PowerShell
- kubectl, helm, kustomize
- Linux system administration
- Scripting: Bash, Python, or PowerShell
- Container technologies: Docker, containerd
- GitHub workflows and Actions
Azure Certifications
- Azure Solutions Architect Expert (AZ-305)
- Azure Security Engineer Associate (AZ-500)
- Azure Administrator Associate (AZ-104)
Soft Skills
- Strong analytical and troubleshooting abilities
- Excellent documentation skills with focus on knowledge sharing
- Collaborative team player with mentoring capabilities
- Effective communication for both technical and business audiences
- Self-motivated with ability to manage complex projects
Additional Details
Security & Compliance
- Implement and maintain private networking architectures with Azure Private Endpoints
- Configure and manage Workload Identity (OIDC) and user-assigned managed identities
- Integrate Azure Policy for governance, compliance, and security enforcement
- Implement Kubernetes RBAC and Azure RBAC integration
- Manage secrets integration with Azure Key Vault using CSI drivers
- Ensure secure communication between AKS and Azure PaaS services
- Implement network policies and pod security standards
Service Mesh & Advanced Networking
- Deploy and manage Linkerd service mesh for secure service-to-service communication
- Implement mTLS between services with automatic certificate rotation
- Configure traffic splitting, load balancing, and observability with Linkerd
- Troubleshoot service mesh networking and performance issues
- Integrate service mesh metrics with Azure Monitor
Infrastructure as Code (IaC)
- Develop and maintain Terraform modules for AKS and supporting Azure infrastructure
- Build reusable, production-ready Terraform patterns following Azure best practices
- Implement infrastructure automation and GitOps workflows
- Manage Terraform state, version control, and module lifecycle
- Create and maintain comprehensive documentation for infrastructure patterns
GitOps & CI/CD
- Design and implement GitOps workflows using ArgoCD for application deployments
- Build and maintain CI/CD pipelines using GitHub Actions for Kubernetes workloads
- Integrate AKS with Azure Container Registry (ACR) for container image management
- Implement automated testing and validation for infrastructure and application changes
- Manage deployment strategies (rolling updates, blue/green, canary)
- Maintain GitHub Actions workflows for infrastructure provisioning and testing
Azure Platform Integration
- Integrate AKS with Azure services including
- Configure and maintain private endpoints for all Azure services
- Implement VNet integration and subnet delegation patterns
- Design and implement service connectivity across Azure regions
Monitoring, Observability & Operations
- Implement comprehensive monitoring and alerting with Azure Monitor
- Configure Log Analytics workspaces and integrate with AKS
- Build dashboards and alerts for cluster health, performance, and security
- Leverage Linkerd metrics and distributed tracing for service observability
- Troubleshoot complex cluster, networking, and application issues
- Conduct capacity planning and cost optimization
- Participate in on-call rotation for production support
- Perform post-incident analysis and implement preventive measures