Job Details
Must-Have Skills: Python, Scala, Kubernetes, scikit-learn, TensorFlow, and an ML background.
Title: Machine Learning Engineer
Location: McLean, VA
Duration: 10-month contract
Role Overview
We are seeking a highly skilled Machine Learning Engineer (Contractor) to support the operations, maintenance, and scaling of end-to-end machine learning infrastructure for an enterprise ML decisioning platform. This role is heavily focused on ML platform operations, model training pipelines, and infrastructure reliability, rather than ML feature development.
The contractor will work closely with full-time ML engineers, data engineers, and software engineers to ensure the stability, scalability, and efficiency of model training and deployment pipelines that support mobile and web customer experiences.
Team & Platform Context
- The team owns an ML decisioning and recommender system platform used for marketing, servicing, and campaign arbitration across mobile and web channels.
- Responsibilities span data pipelines, APIs, feature development, and ML training pipelines.
- The platform currently runs weekly model refits and is expected to scale to 2x volume due to enterprise growth and integration initiatives.
- ML training pipelines are built on Kubeflow and Kubernetes.
- Current models are Random Forest-based, with a roadmap to move toward Transformer models later in the year.
Key Responsibilities
- Own operations and maintenance (O&M) of ML model training pipelines (approx. 80% of the role)
- Support and optimize Kubeflow-based training pipelines running on Kubernetes
- Perform cluster optimization, version upgrades, and infrastructure scaling
- Enhance and extend automation across ML pipelines
- Ensure compliance with tech mandates, cyber/vulnerability remediation, and platform upgrades
- Support weekly model refit and release cycles
- Partner closely with:
  - Full-time ML engineers on feature delivery when bandwidth allows
  - Existing data engineering contractor to jointly oversee end-to-end ML infrastructure
- Help scale pipelines to support increased customer volume and enterprise integrations
Required Skills & Qualifications
- Strong hands-on experience with Python
- Solid expertise in Kubernetes (required)
- Experience supporting ML training pipelines in production environments
- Familiarity with Kubeflow or similar ML orchestration frameworks
- Experience with ML frameworks such as:
  - Scikit-learn
  - PyTorch
  - TensorFlow
  - Spark ML / MLlib
- Strong understanding of ML infrastructure, reliability, and operational excellence
- Ability to work in fast-paced environments with frequent model releases
Nice to Have
- Experience with Transformer models
- Prior exposure to high-frequency ML model release cycles
- Experience working on large-scale enterprise ML platforms
- Financial services or regulated-industry experience