MLOps/LLMOps Engineer
Remote
Contract - W2
Job Details
Key Responsibilities:
- Design and implement LLM-specific deployment architectures with Docker containers for both batch and real-time inference
- Configure GPU infrastructure on-premises or in the cloud with appropriate CI/CD pipelines for model updates
- Build comprehensive monitoring and observability systems with appropriate logging, metrics, and alerts
- Implement load balancing and scaling solutions for LLM inference, including model sharding if necessary
- Create automated workflows for model retraining, versioning, and deployment
- Optimize infrastructure costs through efficient resource allocation, spot instances, and right-sized compute strategies
- Collaborate with the Cyber team on implementing appropriate security controls for GenAI applications
- Develop automated testing frameworks to ensure consistent output quality across model updates
Expected Skillset:
- DevOps + ML: Expertise in Kubernetes, Docker, CI/CD tools, and MLflow or similar platforms
- Cloud & Infrastructure: Understanding of GPU instance options, cloud services (AWS/Azure/Google Cloud Platform), and optimization techniques
- Automation: Proficiency in Python, Bash, and infrastructure-as-code tools like Terraform or Ansible
- LLM-Specific Frameworks: Experience with tools like TensorBoard, MLflow, or equivalent for scaling LLMs
- Performance Optimization: Knowledge of techniques to monitor and improve inference speed, throughput, and cost
- Collaboration: Ability to work effectively across technical teams while adhering to enterprise architecture standards
Oscar Associates Limited (US) is acting as an Employment Business in relation to this vacancy.