MLOps Engineer

Remote • Posted 1 day ago • Updated 1 day ago

Full Time

No Travel Required

Remote

$120,000 - $155,000/yr

Job Details

Skills

TensorFlow
PyTorch
Amazon SageMaker
Natural Language Processing
Machine Learning Operations (ML Ops)

Summary

MLOps Engineer

Location: Remote (preference for East Coast collaboration hours)

Salary: DOE

The MLOps Engineer ensures that models and inference pipelines built by ML Engineers can be safely promoted across Dev, QA, and Prod, meet operational SLAs, and evolve without introducing instability or uncontrolled cost.

This is a production operations role, focused on runtime behavior, infrastructure, and reliability.

What You’ll Do

Design, deploy, and operate end‑to‑end production ML pipelines across Dev, QA, and Prod environments.
Set up and manage AWS SageMaker pipelines, endpoints, and monitoring for large-scale inference workloads, including embedding generation, named entity recognition, reranking, and video processing.
Own GPU and CPU infrastructure selection, scaling, and optimization, including instance benchmarking, autoscaling behavior, and load testing.
Deploy, monitor, and operate inference services that support hundreds of thousands of queries per day across text, image, and video pipelines.
Establish standardized ML deployment patterns at AP, including:

Containerization and orchestration strategies
Environment isolation (Dev / QA / Prod)
Versioned promotion, rollback, and recovery mechanisms

Implement monitoring, alerting, drift detection, and evaluation metrics for production ML systems, tracking latency, error rates, throughput, and model/data drift.
Enable A/B testing and controlled rollout strategies for ML models in production, in partnership with engineering and product teams.
Partner closely with ML Engineers, Data Scientists, DevOps, and Platform teams to:

Operationalize new models and pipeline improvements
Promote systems across environments safely
Ensure deployments meet reliability, scale, and cost targets

Manage high-throughput I/O and data movement for large collections of media assets (text, images, video), avoiding CPU, network, and storage bottlenecks.
Reduce operational risk by enforcing reproducibility, observability, security, and cost controls across all production ML systems.

What This Role Owns (and What It Does Not)

This role owns:

Deployment, scaling, and runtime operation of ML systems
ML infrastructure configuration and orchestration
Monitoring, alerting, A/B testing infrastructure, and drift detection
Reliability, cost control, and production governance

Required Skills & Experience

Hands‑on experience deploying and operating ML inference systems in production.
Strong experience with AWS SageMaker, including pipelines, endpoints, monitoring, and multi‑environment deployments.
Expertise deploying ML models using PyTorch and TensorFlow from an operational and serving perspective.
Proven experience with model deployment and orchestration, including containerized inference and autoscaling.
Experience selecting, evaluating, and optimizing compute resources (GPU/CPU) for production ML workloads.
Experience setting up monitoring, evaluation metrics, and A/B testing frameworks for ML systems in production.
Ability to collaborate effectively with ML Engineers, Data Scientists, and platform teams in a shared ownership model.

Strongly Preferred

Operational experience supporting ML systems involving:

Transformer‑based NLP models (e.g., BERT‑family models)
Computer vision models
Ranking and reranking systems

Familiarity with running systems that use common ML model types, such as:

Convolutional and feed‑forward neural networks
Ranking algorithms
Approximate Nearest Neighbor methods (e.g., HNSW)

Experience running ML workloads over large‑scale text, image, and video datasets.

Dice Id: saicon
Position Id: 8989026

Contact the job poster

Sujeet Salver

Recruiter @ Saicon Consultants Inc.
