MLOps Engineer
We are seeking an MLOps Engineer to productionize and scale ML and GenAI systems, with a focus on LLM deployment, orchestration, and reliability in production environments.
Key Responsibilities
Deploy, manage, and scale ML/DL models in production
Build and operate Kubernetes-based infrastructure for ML workloads
Handle model packaging, serialization, and versioning
Design scalable inference systems (batch and real-time)
Deploy local LLMs and optimize them for latency, throughput, and cost
Implement GenAI workflows (RAG, prompt pipelines, orchestration)
Build and manage agentic systems with tool integration
Design and manage LLM memory (short-term, long-term, vector stores)
Integrate and manage API gateways for model access, routing, and rate limiting
Monitor model performance, drift, and system reliability
Requirements
Strong Kubernetes fundamentals (pods, services, autoscaling, deployments)
Hands-on experience with ML/DL models and serialization
Proven experience in model deployment, scaling, and monitoring
Experience with local LLM deployment and optimization
Solid understanding of LLM memory patterns (context windows, retrieval, persistence)
Experience with API gateways, load balancing, and service routing
Familiarity with GenAI workflows (RAG, orchestration frameworks)
Experience building agentic / multi-step LLM systems
Proficiency in Python and modern ML/infra tooling