Overview
On Site
Full Time
Skills
Reporting
Innovation
Training
Testing
Workflow
Collaboration
Machine Learning Operations (ML Ops)
Python
Machine Learning (ML)
TensorFlow
PyTorch
scikit-learn
Kubernetes
Docker
Continuous Integration
Continuous Delivery
DevOps
Jenkins
GitHub
Grafana
Cloud Computing
Microsoft Azure
Artificial Intelligence
Auditing
Regulatory Compliance
Computer Science
Data Science
Job Details
Can work in Plano, TX or Camas, WA
As a Machine Learning Operations Engineer, you will design, implement, and maintain the systems that bridge the gap between machine learning development and deployment. You will work with data scientists, data engineers, and platform teams to ensure models are monitored, versioned, governed, and continuously improved. You will report to the Vice President, Technology Innovation.
The Day-to-Day:
- Build, maintain, and improve Machine Learning pipelines for training, testing, deployment, and monitoring
- Develop CI/CD workflows tailored for ML environments, including model versioning and reproducibility
- Implement monitoring systems for model drift, performance, and reliability
- Automate retraining and deployment workflows using Kubernetes, Docker, and cloud services
- Collaborate with security and governance teams to ensure compliance with internal and external regulations
- Partner with data scientists to translate experiments into production-ready pipelines
- Contribute to the evolution of scalable AI platforms using Azure Machine Learning, NVIDIA NIMs, and NeMo services
Your Qualifications:
- 10+ years of experience developing data-related solutions and software
- 5+ years of experience with Machine Learning Operations, DevOps, or related disciplines
- 5+ years of experience with Python and strong experience with ML frameworks (TensorFlow, PyTorch, scikit-learn)
- Hands-on expertise with Kubernetes, Docker, and CI/CD tools (Azure DevOps, Jenkins, GitHub Actions)
- Experience with monitoring tools (Prometheus, Grafana, MLflow, Weights & Biases)
- Deep knowledge of cloud-native AI services, especially Microsoft Azure AI
- Practical experience with NVIDIA NIMs and NeMo services for deployment and fine-tuning of foundation models
- Familiarity with model governance, audit, and compliance frameworks
- Bachelor's degree in Computer Science, Data Science, or equivalent work experience