MLOps Platform Engineer

Reston, VA, US • Posted 9 hours ago • Updated 1 hour ago
Contract W2
Travel Required
On-site
$62/hr

Job Details

Skills

  • 3+ years of hands-on experience with AWS services, including EKS, EC2, S3, IAM, CloudWatch, and ECR
  • Strong experience operating and troubleshooting Kubernetes (preferably AWS EKS)
  • Proficiency in containerization (Docker) and orchestration concepts
  • Strong programming/scripting experience in Python and Bash
  • Experience building and managing CI/CD pipelines (GitLab or equivalent)
  • Familiarity with machine learning workflows, including training, inference, and model monitoring
  • Experience with infrastructure-as-code (Terraform or CloudFormation)
  • Experience supporting production platforms, including incident management and root cause analysis

Summary

MLOps Platform Engineer • Reston, VA • On-site • W2

Description:



The Data Modeling Analytics & AI Engineering team is seeking an experienced MLOps Platform Engineer to design, build, and support enterprise-grade machine learning operations capabilities. This role will play a key part in enabling scalable, reliable, and secure ML model development and deployment across our cloud and container platforms.

This is a hands-on engineering role requiring strong expertise in AWS, Kubernetes (EKS), CI/CD automation, containerization, and ML platform operations. The ideal candidate will have solid engineering fundamentals combined with practical knowledge of ML workflows, deployment patterns, and platform reliability.

Key Responsibilities


Platform Engineering & Operations

Engineer, manage, and support MLOps platform components across AWS and EKS-based environments.

Oversee deployment, configuration, and operation of infrastructure used for ML training, batch inference, and real-time model serving.

Ensure platform availability, resilience, and performance across dev, test, and production environments.

Implement role-based access controls (RBAC), network policies, and scalable namespace designs within EKS.

Model Deployment & CI/CD Automation

Build and support CI/CD pipelines (GitLab) for model packaging, container image builds, vulnerability scanning, and automated deployment flows.

Enable standardized model release processes including environment promotion, versioning, and rollback workflows.

Integrate CI/CD with ML frameworks, model repositories, artifacts, and runtime environments.

Container & Kubernetes Workloads

Design and manage EKS workloads supporting containerized ML jobs and microservices.

Implement auto-scaling, resource quotas, cluster optimization, and multi-tenant workload isolation.

Support GPU- and CPU-based training and inference workloads.

Monitoring, Observability & Optimization

Implement logging, monitoring, and alerting for ML pipelines, model endpoints, batch jobs, and platform components.

Analyze compute, storage, and data transfer usage to optimize cost efficiency across ML workloads.

Perform incident response, root cause analysis, and long-term remediation planning.

Collaboration & Enablement

Partner with Data Scientists, ML Engineers, and application teams to operationalize end-to-end machine learning solutions.

Provide technical guidance on best practices for ML model lifecycle management, deployment patterns, and scalable architectures.

Contribute to documentation, runbooks, onboarding materials, and internal knowledge bases.


Required Qualifications


3+ years of hands-on experience with AWS services, including EKS, EC2, S3, IAM, CloudWatch, and ECR.

Strong experience operating and troubleshooting Kubernetes (preferably AWS EKS).

Proficiency in containerization (Docker) and orchestration concepts.

Strong programming/scripting experience in Python and Bash.

Experience building and managing CI/CD pipelines (GitLab or equivalent).

Familiarity with machine learning workflows, including training, inference, and model monitoring.

Experience with infrastructure-as-code (Terraform or CloudFormation).

Experience supporting production platforms, including incident management and root cause analysis.


Preferred Qualifications


Experience managing data analytics platforms and tools (e.g., Domino, SageMaker).

Experience with ML lifecycle tools such as MLflow or similar.

Experience supporting GPU-based workloads or distributed training environments.

Familiarity with enterprise MLOps architectures and patterns (batch, real-time, microservices).

Understanding of data processing frameworks and feature pipelines.


Other Competencies


Strong analytical, troubleshooting, and problem-solving skills.

Effective communication and documentation abilities.

Ability to collaborate across engineering, analytics, and product teams.

Self-motivated with the ability to drive initiatives independently.

Ability to work in a complex, regulated enterprise environment.

Thanks & Regards

Nitin Sharma

  • Dice Id: 91140876
  • Position Id: 2026-169

Company Info

About Cliff Services Inc

Cliff Services Inc. is an IT services and consulting company that plans and implements IT business solutions and services for clients in retail, healthcare, finance, education, food, and other industries. Drawing on our technology and industry expertise, we provide scalable business solutions and help our clients achieve their business objectives through technology.
