Lead Associate Principal, AWS Cloud Engineering
SALARY: $175k - $190k plus 15% bonus
LOCATION: CHICAGO, IL
Hybrid 3 days onsite
Open to H-1B
Looking for a lead cloud engineer with heavy Kubernetes experience. Experience working in a financial services or highly regulated environment is preferred. The role involves applying best practices for the management, architecture, configuration, high availability, disaster recovery, administration, and automation of Kubernetes clusters and containerized workloads built on cloud-native technologies. The ideal candidate is passionate about cloud-native technologies and Kubernetes ecosystem tools, uses them to deliver complex project initiatives and implement mission-critical systems, and keeps current with trends in the Kubernetes and CNCF spaces to identify areas for improvement, with a steady eye on the extensive regulatory and compliance demands on our company (e.g., CIS, NIST).
- Reports to the Director of Platform Automation and Cloud Engineering
- Design, configure, implement and manage Kubernetes clusters and maintain a fully automated workflow for provisioning and managing a complex, highly available container orchestration environment using infrastructure as code
- Develop and maintain Kubernetes operators, controllers, and custom resources to extend cluster functionality and automate application lifecycle management
- Manage DevOps development activities and complex development tasks that will involve working with tools such as Docker, Kafka, container runtimes, and Kubernetes ecosystem tools
- Lead and participate in Kubernetes cluster build-outs, upgrades, software installation, maintenance, and support, including, but not limited to, patches, security fixes, end-of-life preparation, and version upgrades
- Implement and manage Kubernetes networking solutions, service mesh architectures, runtime security policies, and RBAC configurations to ensure secure and efficient cluster operations
- Ensure the reliability of the Kubernetes platforms and containerized services your area of responsibility provides, and manage them to both specific and implied SLAs, to help the organization meet internal and external quality standards for the cloud platform
- Assess and plan for capacity needs within Kubernetes clusters and the underlying cloud platform and forecast accordingly
- Implement and manage initiatives within your assigned area of responsibility with accountability for results and compliance with all controls and security requirements
- Lead in the development of technology roadmaps and end-of-life technology plans for Kubernetes versions, container runtimes, and related cloud-native technologies
- Write and maintain documentation of relevant Kubernetes architectures, systems, procedures and processes
- Effectively communicate project and operational service issues to senior management promptly with observations, decisions, and recommendations for corrective measures
- Manage and participate in the implementation of production changes during defined maintenance windows, and support the on-call rotation
- Maintain appropriate work/personal balance within your team
- Serve as a point of escalation within the team for Kubernetes and containerization support issues
Qualifications & Experience
The requirements listed are representative of the knowledge, skill, and/or ability required. Reasonable accommodations may be made to enable individuals with disabilities to perform the primary functions.
- [Required] Strong consultative, communication, teamwork, and analytical skills are a must, as you will regularly interact with various teams distributed across the US
- [Required] Working knowledge of Kubernetes architecture, container orchestration, and cloud-native infrastructure design and components, such as: etcd, networking, storage, and container runtimes
- [Required] Extensive hands-on experience with Kubernetes cluster creation, maintenance, support, and administration in production environments
- [Required] Deep understanding and practical implementation experience with Kubernetes networking (CNI plugins, service types, ingress controllers), runtime security (Pod Security Standards, OPA/Gatekeeper, network policies), and Role-Based Access Control (RBAC)
- [Required] Experience architecting, implementing, and maintaining highly available, mission-critical Kubernetes environments for 24/7 availability
- [Required] Experience working in an environment with a defined production change control process
- [Required] Demonstrated history of meeting deadlines and ability to work well under pressure
Technical Skills & Background
- [Required] Production-level hands-on experience with AWS cloud services and implementing Kubernetes on AWS (EKS or self-managed clusters)
- [Required] Extensive experience with Infrastructure as Code using Terraform for provisioning and managing cloud infrastructure and Kubernetes resources
- [Required] Strong hands-on development skills with demonstrable coding experience in Go or Python (Go strongly preferred for Kubernetes operator/controller development). Candidates must be able to provide specific examples of production code they have written.
- [Required] Hands-on experience with Kubernetes ecosystem tools including: Helm, kubectl, container runtimes (containerd, CRI-O), and monitoring/observability tools
- [Required] Experience with CI/CD tools such as Jenkins, GitLab CI, or GitHub Actions
- [Required] Experience with version control using GitHub or similar platforms
- [Required] Experience with configuration management tools such as Ansible, Puppet, or Chef
- [Strongly Preferred] Hands-on experience with Kubernetes operator/controller development using operator frameworks (Kubebuilder, Operator SDK, or similar). This can be demonstrated through either contributions to open-source Cloud Native Computing Foundation (CNCF) projects or development of in-house Kubernetes operators/controllers. Note: If you have contributed to open-source CNCF projects, please include your GitHub profile link or links to notable pull requests in your resume.
- Engineering, etc.), or equivalent combination of education and experience required
- [Required] 7+ years of experience in IT systems installation, operations, administration, and maintenance of cloud systems / virtualized servers, with demonstrated significant experience in Kubernetes and container orchestration platforms
- [Preferred] Experience working in a financial services or highly regulated environment