Job Details
We are hiring an experienced MLOps Engineer with hands-on expertise in Google Cloud Platform and Vertex AI. You'll be responsible for building and maintaining scalable machine learning infrastructure, automating workflows, and enabling robust AI/ML deployments in production environments.
Key Responsibilities:
Develop, automate, and manage ML pipelines using Vertex AI Pipelines, Kubeflow, and Cloud Composer
Deploy and monitor models in production using Vertex AI and CI/CD workflows (Cloud Build, GitHub Actions, etc.)
Work closely with ML engineers and data scientists to productionize models and manage model versioning, retraining, and rollback strategies
Manage infrastructure-as-code using Terraform, Deployment Manager, or similar tools
Implement observability and monitoring (logging, metrics, alerts) using Cloud Monitoring, Prometheus, or Grafana
Ensure security, governance, and compliance of ML workflows within the Google Cloud Platform ecosystem
Optimize cost, performance, and scalability of ML systems in production
Required Skills:
5+ years of experience in DevOps, MLOps, or Cloud ML Engineering, including recent production experience on Google Cloud Platform
Strong hands-on experience with Vertex AI, Cloud Functions, BigQuery, and GCS
Proficiency with tools like TFX, Kubeflow, Docker, and Kubernetes (GKE preferred)
Expertise in CI/CD, GitOps, and workflow orchestration
Programming skills in Python (ML workflows) and Bash/Terraform (infra scripting)
Solid understanding of model lifecycle, pipeline automation, and ML monitoring
Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field
Nice to Have:
Google Cloud Professional Machine Learning Engineer or Professional Cloud DevOps Engineer certification
Familiarity with LLMs, RAG, or Vertex AI Search & Conversation
Experience with multi-region deployments or hybrid cloud setups
Exposure to Data Governance and Responsible AI practices