Must-Haves:
Google Cloud Platform
CI/CD
Terraform
Kubernetes Clusters
MLOps experience is a strong nice-to-have.
Job description:
KEY RESPONSIBILITIES
Own and maintain CI/CD pipelines across multiple engineering teams (GitHub Actions, Cloud Build, or equivalent)
Manage Google Cloud Platform infrastructure using Terraform and infrastructure-as-code tooling
Administer GKE clusters, containerized workloads, and Kubernetes infrastructure
Support MLOps pipeline needs including model deployment workflows and container registry management
Implement and maintain monitoring, alerting, and observability tooling (Cloud Monitoring, Grafana, etc.)
Drive reliability engineering practices: SLOs, incident response, and runbooks
Manage secrets, IAM policies, and platform security configurations
REQUIRED QUALIFICATIONS
3+ years of DevOps or platform engineering experience
Hands-on experience with Google Cloud Platform services: GKE, Cloud Build, Artifact Registry, Cloud Storage, IAM
Proficiency in Terraform or equivalent infrastructure-as-code tooling
Strong Kubernetes administration skills (deployment, scaling, networking, RBAC)
Experience building and maintaining CI/CD pipelines
Solid scripting ability in Python, Bash, or similar
PREFERRED QUALIFICATIONS
Experience supporting ML/AI workloads (model serving infrastructure, GPU provisioning)
Google Cloud Professional Cloud DevOps Engineer or Professional Cloud Architect certification
Familiarity with ArgoCD, Helm, or GitOps workflows
Experience with retail or high-transaction-volume systems