Software Engineer, Cloud Engineering (Kubernetes)

  • Mountain View, CA
  • Posted 5 hours ago | Updated 5 hours ago

Overview

On Site
Up to $240,000
Full Time
Accepts corp to corp applications

Skills

GCP
Argo
Terraform
Google Cloud Platform
Kubernetes
Artificial Intelligence
Flux
Traefik
Istio
Docker Swarm
Distributed systems
DevOps

Job Details

Role: Software Engineer - Cloud Engineering, Kubernetes
Location: Bay Area, CA (Full-Time)
Job Description
The Cloud Infrastructure team at our client manages and scales a Kubernetes-based, cloud-native AI platform across multiple cloud providers.

They define service level objectives (SLOs), optimize resource allocation, enforce security compliance, and drive cost efficiency for the Multi-Cloud Platform. As a key team member, you will architect and operate a highly scalable, resilient Kubernetes infrastructure to support Big Data and AI workloads.
Your role involves designing advanced cluster management strategies, optimizing fleet capacity, and enhancing observability at scale.
Key Responsibilities
Design, build, and scale Kubernetes-based infrastructure for a multi-cloud AI platform.
Architect and optimize large-scale Kubernetes clusters (scheduling, networking, workload orchestration).
Develop Kubernetes controllers/operators in Python/Go/Rust to automate cluster management.
Enhance observability and monitoring for real-time cluster health tracking.
Automate fleet management (node pools, autoscaling) across AWS/Google Cloud Platform/Azure.
Implement Kubernetes security policies, RBAC, and compliance standards.
Collaborate with ML teams to optimize Kubernetes for AI/ML workloads.
Required Skills
5-7+ years managing large-scale Kubernetes clusters (EKS/GKE/AKS) in production.
Expertise in Kubernetes internals: controllers, operators, CNI, security policies.
Hands-on experience with Terraform, Ansible, and multi-cloud infrastructure (AWS/Azure/Google Cloud Platform).
Proficiency in Go/Python/Rust for developing Kubernetes operators.
BS/MS in Computer Science (PhD preferred).
Nice-to-Have
Chaos engineering, SOC2/GDPR compliance, open-source contributions.
Prometheus/Grafana, Argo/Flux, and service meshes (Istio/Envoy).
Compensation: Competitive salary + equity + benefits.
Let me know if you'd like to explore this further or need additional details.

About TekReliance