MLOps Engineer

Overview

On Site
$60+
Contract - W2
Contract - 12 Month(s)

Skills

Python
MLflow
Kubeflow

Job Details

Job Title: MLOps Engineer
Location: South San Francisco, CA (Hybrid 3 days/week onsite)
About the Role:
We're seeking a seasoned Machine Learning Engineer (Operations) with deep expertise in AWS-native tools, machine learning pipelines, and production-level model deployment. You will be responsible for building and optimizing scalable, secure, and efficient ML systems in a hybrid environment, with a focus on automation, monitoring, and integration.
ML Model Lifecycle:
Build, train, deploy, and manage ML models using AWS services like SageMaker, EC2, S3, and Lambda.
Optimize performance and resource usage for cost-effective ML operations.
Leverage Amazon Bedrock for deploying and managing foundation models in GenAI use cases.
Data Processing & Pipelines:
Design and maintain automated ML pipelines using AWS Glue and Step Functions.
Implement seamless data ingestion, transformation, and storage strategies.
LLM & GenAI Experience:
Implement and optimize Large Language Models (LLMs) for real-world applications.
Monitor performance, fine-tune parameters, and enhance model accuracy via prompt engineering.
Infrastructure & Deployment:
Containerize ML models with Docker and manage orchestration via Kubernetes (EKS) or ECS.
Use CloudFormation or Terraform for infrastructure as code (IaC) setup.
Integrate CI/CD tools to automate ML workflow deployments.
Monitoring & Reliability:
Implement robust logging, monitoring, and alerting via Amazon CloudWatch.
Ensure production-grade model reliability and scalability.
Security & Governance:
Apply security best practices in containerization, data handling, and infrastructure.
Ensure compliance with data governance policies.
Integration & External Systems:
Integrate ML workflows with external systems (e.g., Veeva PromoMats or similar platforms).
Required Qualifications:
Strong understanding of machine learning concepts, algorithms, and best practices.
Proven hands-on experience with:
Amazon SageMaker, EC2, S3, Lambda
Amazon Textract for document data extraction
AWS Glue, Step Functions
Amazon Bedrock, RDS, DynamoDB
Docker, Kubernetes (EKS) or ECS
Amazon CloudWatch
Python for data processing, automation, and scripting
Demonstrable experience with LLM optimization, prompt engineering, and GenAI applications.
Experience implementing testing practices (unit/integration) in Python.
Familiarity with CI/CD tools and external system integrations.
Preferred Qualifications:
AWS Certifications (e.g., ML Specialty, DevOps Engineer)
Experience with Terraform or AWS CloudFormation
Background in content management/regulatory systems like Veeva
Knowledge of container security, resource optimization, and performance tuning
Soft Skills:
Strong analytical and problem-solving mindset.
Self-driven with a proactive, results-oriented approach.
Commitment to continuous improvement in MLOps practices.

About ePace Technologies, Inc