Title: ML Ops Engineer
Location: Houston, TX 77002
Duration: 8 months
Pay Range: $70/hr - $78/hr
The Company offers the following benefits for this position, subject to applicable eligibility requirements: medical insurance, dental insurance, vision insurance, 401(k) retirement plan, life insurance, long-term disability insurance, short-term disability insurance, paid parking/public transportation, and, as applicable, paid time off, paid sick and safe time, paid vacation time, paid parental leave, and paid holidays annually.
Must-have: Strong MLOps experience and hands-on experience with AWS, MS Azure, and Snowflake in building or supporting production Machine Learning/data platforms.
Job Summary
We are seeking an MLOps Engineer to design, deploy, monitor, and maintain machine learning solutions in production across AWS, MS Azure, and Snowflake environments. This role will partner with data scientists and cloud teams to operationalize Machine Learning models, automate pipelines, and build reliable, secure, and scalable Machine Learning platforms.
The ideal candidate has strong experience in the end-to-end Machine Learning lifecycle, cloud-native deployment, CI/CD automation, model monitoring, and production data pipelines, with hands-on expertise in AWS, Azure, and Snowflake.
Key Responsibilities
Design and implement end-to-end Machine Learning pipelines for data ingestion, feature engineering, model training, validation, deployment, and monitoring
Deploy and manage Machine Learning models in production across AWS, Azure, and Snowflake-based ecosystems
Build batch and real-time inference pipelines using cloud-native and platform-native services
Automate model packaging, testing, release, and rollback using CI/CD best practices
Integrate Machine Learning workflows with services such as AWS SageMaker, AWS Lambda, Azure Machine Learning, Azure Data Factory, and Snowflake
Build and maintain orchestration workflows using tools such as Airflow, Azure Data Factory, or similar platforms
Implement experiment tracking, model registry, and model governance processes
Monitor model accuracy, drift, latency, throughput, pipeline failures, and infrastructure usage
Establish deployment strategies such as canary, shadow, and blue-green releases, along with rollback mechanisms
Collaborate with cross-functional teams to move models from research to production
Ensure security, compliance, traceability, and access control for models and data across cloud environments
Optimize platform performance, reliability, and cost across AWS, Azure, and Snowflake
Document architecture, deployment standards, and operational procedures
Required Qualifications
Master's or advanced degree (PhD) in Computer Science, Computer Engineering, or a similar field
Five or more years of relevant experience
Proven experience in MLOps, Machine Learning engineering, platform engineering, or DevOps
Strong hands-on experience with AWS, MS Azure, and Snowflake
Strong programming skills in Python and SQL
Experience deploying and managing Machine Learning models in production
Experience with cloud Machine Learning services such as AWS SageMaker and Azure Machine Learning
Experience building data pipelines and integrating with Snowflake
Knowledge of CI/CD pipelines, infrastructure automation, and model versioning
Experience with containerization and orchestration tools such as Docker and Kubernetes
Experience with workflow orchestration tools such as Airflow, Azure Data Factory, or similar
Familiarity with model monitoring, logging, alerting, and observability
Solid understanding of data engineering concepts, APIs, and distributed processing
Strong troubleshooting, communication, and cross-team collaboration skills
Preferred Qualifications
Experience with Snowflake Cortex AI, Snowpark, or Machine Learning workloads in Snowflake
Experience with AWS Bedrock, Azure OpenAI, or production LLM workflows
Experience with real-time inference, event-driven pipelines, and serverless architectures
Familiarity with feature stores, vector databases, and RAG-based systems
Experience with Terraform, CloudFormation, or Azure infrastructure-as-code tools
Understanding of security, compliance, and governance requirements for regulated environments
Experience with production A/B testing, shadow deployment, and rollback strategies