Job ID: T#5870 - Lead Kubernetes Platform Engineer
PLEASE NOTE: This is a 12-month contract-to-hire position, and candidates must meet the Client's full-time conversion policies. Those who depend on work permit sponsorship now or at any time in the future (e.g., H1B, OPT, CPT) do not meet Client requirements for this opening.
We are seeking a Senior Kubernetes Platform Engineer to join our organization as we grow and transform our Technology landscape. This individual will serve as a technical lead responsible for designing, building, and operating enterprise-grade Kubernetes platforms on AWS. The role encompasses advanced platform engineering tasks including architecture design, infrastructure as code development, containerized system deployment, monitoring/alerting, troubleshooting, and documentation. The individual will provide technical leadership and mentorship to junior engineers, consult with development teams to determine platform requirements, manage cloud resources effectively, act as a subject matter expert in container orchestration and AWS services, and leverage deep technical knowledge to drive the planning and execution of complex cloud-native initiatives.
What Will You Do?
Design, build, and maintain scalable Kubernetes platforms on AWS that enable development teams to deploy and run containerized applications efficiently and reliably.
Take the lead on architecting and implementing cloud-native solutions using Kubernetes, AWS services, and infrastructure as code that meet business and technical requirements.
Develop and maintain Terraform modules and configurations for provisioning and managing Kubernetes clusters, AWS infrastructure, and related platform services.
Deliver platform engineering efforts both independently and by leading other team members through complex technical challenges.
Act as a technology advocate, independently seeking opportunities where Kubernetes, containerization, and cloud technologies can be utilized to improve platform capabilities and developer experience.
Provide technical guidance and mentorship to junior and mid-level engineers, fostering a collaborative team environment and promoting best practices in container orchestration and cloud architecture.
Seek opportunities to expand technical knowledge in emerging Kubernetes features, AWS services, and cloud-native technologies.
Take ownership of platform services and infrastructure at scale. Make informed decisions on evolving, modernizing, and optimizing the Kubernetes platform and AWS architecture.
Implement and maintain monitoring, logging, and alerting solutions for containerized workloads and platform health.
Support platform releases and critical incidents that occur outside of business hours.
Collaborate with security teams to implement and maintain security best practices for Kubernetes and AWS environments.
Perform other duties as assigned.
What Will Our Ideal Candidate Have?
Five years of hands-on experience with Kubernetes in production environments, including cluster administration, troubleshooting, and optimization.
Extensive AWS expertise including EKS, EC2, VPC, IAM, S3, RDS, CloudWatch, Load Balancers, and other core services.
Advanced Terraform experience - ability to design and implement infrastructure as code strategies, create reusable modules, and manage complex cloud infrastructure.
Container expertise - deep knowledge of Docker, container networking, service mesh technologies (Istio, Linkerd), and container security best practices.
CNCF ecosystem experience - hands-on work with common Kubernetes controllers and CNCF projects such as cert-manager, external-dns, cluster-autoscaler, ingress-nginx, Kyverno, Crossplane, or similar tools for platform automation and management.
Platform Engineering & Delivery - Advanced skills including the ability to determine platform architecture and design strategies, implement automated testing and validation, use monitoring and feedback loops to ensure platform reliability, and maintain high availability of production systems.
CI/CD pipeline experience - hands-on work with Jenkins, GitLab CI, ArgoCD, Flux, or similar tools for building automated deployment pipelines for containerized applications.
Experience with microservices architectures and supporting infrastructure including API gateways, service discovery, and distributed tracing.
Observability tools - experience with Prometheus, Grafana, ELK/EFK stack, Datadog, or similar monitoring and logging platforms.
Experience with Helm charts and Kubernetes operators for application deployment and management.
Strong understanding of networking including Kubernetes networking models, CNI plugins, ingress controllers, and AWS networking services.
Experience working with highly collaborative, Agile teams in a DevOps or platform engineering capacity.
Eagerness and willingness to learn new cloud-native technologies and stay current with the Kubernetes ecosystem.
Problem Solving - Strong problem solver who uses data, metrics, and proofs of concept to find creative solutions to complex infrastructure challenges, reflects on solutions by measuring their impact, and uses that information to optimize platform performance and reliability. Adept at making architectural decisions that involve numerous factors with broad implications.
Communication - Strong communicator who can explain complex Kubernetes and AWS concepts to both technical and non-technical audiences, document platform architecture and procedures clearly, collaborate effectively across teams, quickly identify core issues in technical discussions, give and receive constructive feedback, and ensure all voices are heard; an attentive and empathetic listener.
Leadership - Advanced leadership skills with the ability to take ownership when there is no clear direction, inspire and motivate team members, and effectively influence technical decisions across the organization.
What is a Must Have?
Bachelor's degree in Computer Science or a related field, or its equivalent in work experience.
3 years minimum of production Kubernetes experience.
3 years minimum of AWS cloud engineering experience.
Hands-on experience with infrastructure as code tools (Terraform required).
Proven experience designing and operating containerized systems at scale.