Job Details
Role: ML Ops and LLM Ops Architect
Location: Remote (U.S.)
Job Type: Full-Time
Experience: 10+ Years
Job Description
We are seeking an experienced ML Ops and LLM Ops Architect to design, implement, and optimize large-scale AI/ML pipelines and infrastructure. The ideal candidate will have strong expertise in AI model deployment, automation, and large language model (LLM) operations, along with hands-on experience in MLOps tools and frameworks to ensure production-grade reliability and scalability.
Key Responsibilities
- Automate AI/ML model deployment and set up real-time monitoring for ML pipelines.
- Integrate existing and new codebases into customer CI/CD pipelines.
- Implement best practices and build Proofs of Concept (POCs) for efficient, scalable model operations.
- Work with databases for querying and model testing; use version control systems such as Git.
- Develop APIs and services using frameworks like Flask and FastAPI.
- Utilize and fine-tune Large Language Models (LLMs) with toolchains such as LangChain and LLM APIs.
- Implement experiment tracking and reproducibility using frameworks like MLflow.
- Design and automate data pipelines using tools such as Airflow, Kafka, and RabbitMQ.
- Automate CI/CD pipelines to manage data, code, and model changes.
- Ensure robust containerization and orchestration using Docker and Kubernetes.
- Collaborate with customer architecture teams to drive deployment strategy and coordinate with AI/ML developers to deliver optimized solutions.
Required Skills & Qualifications
- 10+ years of experience in Machine Learning Operations (MLOps) or related fields.
- Proven experience in LLM-based system design, deployment, and optimization.
- Strong programming skills in Python and working knowledge of AI/ML frameworks (TensorFlow, PyTorch, etc.).
- Hands-on expertise with CI/CD, MLflow, LangChain, Docker, Kubernetes, Airflow, Kafka, and FastAPI.
- Experience with data engineering and real-time model monitoring pipelines.
- Ability to lead discussions with architecture and engineering teams and drive project implementation.
- Excellent communication and technical leadership skills.
Preferred Skills
- Experience working in cloud AI environments such as Azure ML, Amazon SageMaker, or Google Cloud Vertex AI.
- Familiarity with RAG (Retrieval-Augmented Generation) systems and vector databases.
- Exposure to AIOps and large-scale distributed systems.
V2 Innovations Inc is an Equal Opportunity Employer and welcomes applicants from all backgrounds. We provide equal employment opportunities to all employees and applicants and comply with all EEO and affirmative action guidelines, embracing diversity, inclusion, and fairness in our hiring process. Diversity fuels innovation. Inclusion powers success.