MLOps Engineer Distributed Systems (Ray)

Overview

On Site
$60 - $65
Accepts corp to corp applications
Contract - W2
Contract - Independent
Contract - 6 Month(s)

Skills

Ray

Job Details

Job Title: MLOps Engineer Distributed Systems (Ray)

Location: Austin, TX
Duration: Long Term

Job Summary:

We are looking for a skilled and motivated MLOps Engineer with deep expertise in distributed machine learning and a strong understanding of Ray for scalable and efficient ML infrastructure. You will be responsible for automating and maintaining ML pipelines, enabling reproducible training and inference, and working alongside Data Scientists and ML Engineers to productionize models at scale.

Key Responsibilities:

Design, build, and manage end-to-end MLOps pipelines using Ray for training, tuning, serving, and monitoring ML models

Leverage Ray Train for distributed model training across CPU/GPU clusters

Implement scalable hyperparameter optimization using Ray Tune

Deploy ML models into production using Ray Serve with FastAPI/Flask

Integrate data preprocessing pipelines with Ray Data and orchestrate workflows via Airflow or Kubernetes

Maintain and monitor deployed models, tracking model drift and data quality so that performance and accuracy hold up over time

Collaborate with Data Science and DevOps teams to align on scalable ML architecture

Ensure reproducibility, versioning, and CI/CD practices using MLflow, GitHub Actions, Jenkins, or similar tools

Develop cloud infrastructure (AWS, Google Cloud Platform, or Azure) with autoscaling and cost optimization in mind

Required Skills & Experience:

3+ years in MLOps, Machine Learning Engineering, or related roles

Hands-on experience with Ray Core, Ray Tune, Ray Train, Ray Serve

Strong programming experience in Python and familiarity with libraries such as PyTorch, TensorFlow, and scikit-learn

Solid understanding of containerization and orchestration using Docker, Kubernetes, Helm

Experience building CI/CD pipelines for ML using tools like Jenkins, GitHub Actions, Azure DevOps

Familiarity with cloud platforms (AWS, Azure, or Google Cloud Platform) and distributed compute environments

Experience with model tracking/versioning tools like MLflow, DVC, or Weights & Biases

Strong communication skills and ability to work in cross-functional teams
