Overview
On Site
Depends on Experience
Accepts corp to corp applications
Contract - Independent
Contract - W2
Contract - 12 Month(s)
Able to Provide Sponsorship
Skills
Machine Learning (ML)
Cloud Computing
Continuous Delivery
Continuous Improvement
Continuous Integration
DevOps
Docker
Grafana
Kubernetes
PyTorch
TensorFlow
Terraform
scikit-learn
Google Cloud Platform
Open Source
Regulatory Compliance
Management
Conflict Resolution
Computer Science
Data Science
Effective Communication
Amazon Web Services
Apache Spark
Collaboration
Java
Mentorship
Python
Workflow
Problem Solving
FOCUS
Pivotal
Microsoft Azure
Programming Languages
MLOps
Job Details
Lead ML Infrastructure Engineer
SFO, CA / Dallas, TX / Chicago, IL / NYC, NY
About the Role
We are seeking a highly skilled Lead ML Infrastructure Engineer to spearhead the development, deployment, and scaling of machine learning infrastructure. This pivotal role involves collaborating closely with data scientists, ML engineers, and operations teams to build robust, scalable, and efficient machine learning pipelines. The ideal candidate will be passionate about pushing the boundaries of ML infrastructure and will possess a deep understanding of cloud platforms, containerization, and big data technologies.
Responsibilities
- Lead the design, implementation, and maintenance of scalable ML infrastructure solutions.
- Collaborate with data science and ML teams to optimize model deployment workflows.
- Develop and manage CI/CD pipelines to automate deployment processes.
- Architect and implement containerized environments using Docker and Kubernetes.
- Ensure infrastructure security, reliability, and compliance across cloud platforms.
- Optimize resource utilization and cost-efficiency in cloud environments.
- Drive best practices in Infrastructure as Code (IaC) with tools like Terraform.
- Stay current with the latest advancements in ML frameworks, cloud services, and infrastructure tooling.
- Mentor junior team members and promote a culture of continuous improvement.
Requirements
- Proven experience as a Machine Learning Engineer or Infrastructure Engineer, with a focus on ML infrastructure.
- Strong expertise in Python and Java.
- Hands-on experience working with cloud platforms, with a strong preference for Google Cloud Platform; experience with AWS and Azure is also valuable.
- Familiarity with popular machine learning frameworks such as TensorFlow and PyTorch, along with libraries like scikit-learn.
- Solid understanding of DevOps principles and experience with CI/CD pipelines.
- Experience with Infrastructure as Code tools, especially Terraform.
- Proficiency in containerization technologies including Docker and Kubernetes.
- Knowledge of big data processing tools like Apache Spark and Hadoop is highly preferred.
- Excellent problem-solving abilities combined with effective communication skills.
- Ability to work collaboratively in a fast-paced, dynamic environment.
Preferred Qualifications
- Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
- Certifications in cloud platforms (e.g., Google Cloud Platform Professional Cloud Architect, AWS Certified Solutions Architect).
- Demonstrated experience leading a team or managing complex infrastructure projects.
- Contributions to open-source ML or DevOps projects.
- Experience with monitoring and logging tools such as Prometheus, Grafana, and the ELK stack.