Job Details
MLOps Lead with AWS, Python, and DevOps experience - Dallas preferred for onsite work; remote considered for exceptional candidates, who must work CET hours
An in-person interview is required
Requisition Name: C&DE-AUTO-MLOps Lead
Start Date: 9/15/2025
Duration: 28 Weeks
Services Location: TX/Dallas
Description Of Services:
Build & Automate ML Pipelines: Design, implement, and maintain CI/CD pipelines for machine learning models, ensuring automated data ingestion, model training, testing, versioning, and deployment.
Operationalize Models: Collaborate closely with data scientists to containerize, optimize, and deploy their models to production, focusing on reproducibility, scalability, and performance.
Infrastructure Management: Design and manage the underlying cloud infrastructure (AWS) that powers our MLOps platform, leveraging Infrastructure-as-Code (IaC) tools to ensure consistency and cost optimization.
Monitoring & Observability: Implement comprehensive monitoring, alerting, and logging solutions to track model performance, data integrity, and pipeline health in real-time. Proactively address issues like model or data drift.
Governance & Security: Establish and enforce best practices for model and data versioning, auditability, security, and access control across the entire machine learning lifecycle.
Tooling & Frameworks: Develop and maintain reusable tools and frameworks to accelerate the ML development process and empower data science teams.
Deliverables:
- Process flows
- Mentoring and knowledge transfer to client project team members
- Participate as primary, co-, and/or contributing author on all project deliverables associated with assigned areas of responsibility
- Participate in data conversion and data maintenance
- Provide best-practice and industry-specific solutions
- Advise on and provide alternative (out-of-the-box) solutions
- Provide thought leadership as well as hands-on technical configuration/development as needed
- Participate as a team member of the functional team
- Perform other duties as assigned
Acceptance Criteria:
Cloud Expertise: Extensive hands-on experience in designing and implementing MLOps solutions on AWS. Proficient with core services like SageMaker, S3, ECS, EKS, Lambda, SQS, SNS, and IAM.
Coding & Automation: Strong coding proficiency in Python. Extensive experience with automation tools, including Terraform for IaC and GitHub Actions.
MLOps & DevOps: A solid understanding of MLOps and DevOps principles. Hands-on experience with MLOps frameworks like SageMaker Pipelines, Model Registry, Weights & Biases, MLflow, or Kubeflow, and orchestration tools like Airflow or Argo Workflows.
Containerization: Expertise in developing and deploying containerized applications using Docker and orchestrating them with ECS and EKS.
Model Lifecycle: Experience with model testing, validation, and performance monitoring. A good understanding of ML frameworks like PyTorch or TensorFlow is required to collaborate effectively with data scientists.
Communication: Excellent communication and documentation skills, with a proven ability to collaborate with cross-functional teams (data scientists, data engineers, and architects).