DevOps Engineer – Cloud, Automation, and MLOps
Job Description:
We are looking for an experienced DevOps Engineer with strong expertise in cloud infrastructure automation, CI/CD, and MLOps. The role involves designing and implementing scalable solutions for infrastructure provisioning, configuration management, and machine learning workflows. You will work with modern DevOps tools and best practices to ensure secure, reliable, and automated deployments across cloud environments.
Responsibilities:
- Automate infrastructure provisioning and deployments using Terraform, CloudFormation, and other IaC tools.
- Architect and implement MLOps pipelines for machine learning models using Kubeflow, MLflow, or similar frameworks.
- Build and maintain CI/CD pipelines using GitLab and Jenkins, integrating automated testing and deployment.
- Manage configuration automation with Ansible, Puppet, or Chef.
- Implement monitoring and alerting using Prometheus, Grafana, ELK Stack, and cloud-native tools.
- Ensure security and compliance across infrastructure and deployments (IAM, encryption, vulnerability scanning).
- Document cloud onboarding processes and maintain knowledge bases using Confluence.
- Collaborate with development and operations teams to hand over support responsibilities and maintain operational excellence.
Technical Skills & Tools:
- Cloud Platforms: AWS, Azure, Google Cloud Platform
- Infrastructure as Code: Terraform, CloudFormation
- CI/CD: GitLab, Jenkins, GitHub Actions
- Configuration Management: Ansible, Puppet, Chef
- Containers & Orchestration: Docker, Kubernetes
- MLOps: Kubeflow, MLflow
- Monitoring & Logging: Prometheus, Grafana, ELK Stack
- Scripting: Python, Bash
- Version Control: Git
- Security: IAM, SSL/TLS, compliance frameworks
DevOps, Terraform, CloudFormation, AWS, Azure, Google Cloud Platform, CI/CD, GitLab, Jenkins, GitHub Actions, Ansible, Puppet, Chef, Docker, Kubernetes, MLOps, Kubeflow, MLflow, Python, Bash, Prometheus, Grafana, ELK, Monitoring, Logging, Cloud Security, IAM, Automation, Infrastructure as Code, IaC, Cloud Infrastructure, Machine Learning Operations, Confluence, Git, Cloud Deployment, Continuous Integration, Continuous Delivery