Job Details
We have a contract role, ML Ops Support Engineer (Hybrid), for our client in Reading, PA. Please let me know if you or anyone you know would be interested in this position.
Position Details:
ML Ops Support Engineer - Hybrid - Reading, PA
Location: Reading, PA 19607 (Hybrid)
Project Duration: 8+ month contract
Job Description:
We are looking for an ML Ops L2 Support Engineer to provide 24/7 production support for machine learning (ML) and data pipelines. The role requires on-call support, including weekends, to ensure high availability and reliability of ML workflows. The candidate will work with Dataiku, AWS, CI/CD pipelines, and containerized deployments to maintain and troubleshoot ML models in production.
Key Responsibilities:
Incident Management & Support:
- Provide L2 support for ML Ops production environments, ensuring uptime and reliability.
- Troubleshoot ML pipelines, data processing jobs, and API issues.
- Monitor logs, alerts, and performance metrics using Dataiku, Prometheus, Grafana, or AWS tools such as CloudWatch.
- Perform root cause analysis (RCA) and resolve incidents within SLAs.
- Escalate unresolved issues to L3 engineering teams when needed.
Dataiku Platform Management:
- Manage Dataiku DSS workflows, troubleshoot job failures, and optimize performance.
- Monitor and support Dataiku plugins, APIs, and automation scenarios.
- Collaborate with Data Scientists and Data Engineers to debug ML model deployments.
- Perform version control and CI/CD integration for Dataiku projects.
Deployment & Automation:
- Support CI/CD pipelines for ML model deployment (Bamboo, Bitbucket, etc.).
- Deploy ML models and data pipelines using Docker, Kubernetes, or Dataiku Flow.
- Automate monitoring and alerting for ML model drift, data quality, and performance.
Cloud & Infrastructure Support:
- Monitor AWS-based ML workloads (SageMaker, Lambda, ECS, S3, RDS).
- Manage storage and compute resources for ML workflows.
- Support database connections, data ingestion, and ETL pipelines (SQL, Spark, Kafka).
Security & Compliance:
- Ensure secure access control for ML models and data pipelines.
- Support audit, compliance, and governance for Dataiku and ML Ops workflows.
- Respond to security incidents related to ML models and data access.
Required Skills & Experience:
- Experience: 5+ years in ML Ops, Data Engineering, or Production Support.
- Dataiku DSS: Strong experience in Dataiku workflows, scenarios, plugins, and APIs.
- Cloud Platforms: Hands-on experience with AWS ML services (SageMaker, Lambda, S3, RDS, ECS, IAM).
- CI/CD & Automation: Familiarity with GitHub Actions, Jenkins, or Terraform.
- Scripting & Debugging: Proficiency in Python, Bash, and SQL for automation and debugging.
- Monitoring & Logging: Experience with Prometheus, Grafana, CloudWatch, or ELK Stack.
- Incident Response: Ability to handle on-call support, weekend shifts, and SLA-based issue resolution.
Preferred Qualifications:
- Containerization: Experience with Docker, Kubernetes, or OpenShift.
- ML Model Deployment: Familiarity with TensorFlow Serving, MLflow, or Dataiku Model API.
- Data Engineering: Experience with Spark, Databricks, Kafka, or Snowflake.
- ITIL/DevOps Certifications: ITIL Foundation, AWS ML certifications, Dataiku certification.
Work Schedule & On-Call Requirements:
- Rotational on-call support (including weekends and nights).
- Shift-based monitoring for ML workflows and Dataiku jobs.
- Flexible work schedule to handle production incidents and critical ML model failures.
Process Flows
- Mentor client project team members and transfer knowledge to them.
- Participate as primary author, co-author, and/or contributing author on all project deliverables within assigned areas of responsibility.
- Participate in data conversion and data maintenance.
- Provide best-practice and industry-specific solutions.
- Advise on and provide alternative (out-of-the-box) solutions.
- Provide thought leadership as well as hands-on technical configuration/development as needed.
- Participate as a team member of the functional team.
- Perform other duties as assigned.