Job Title: AI Operations Platform Consultant (LLM & Kubernetes)
Experience: 8+ Years
Location & Work Schedule
- Location: Charlotte, NC OR Jersey City, NJ (candidate may choose location)
- Work Model: Hybrid – 3 days per week onsite
- Schedule: Monday–Friday, standard business hours
Job Overview
We are seeking an experienced AI Operations Platform Consultant to support the deployment, operation, and optimization of Large Language Model (LLM) inference platforms in a mission-critical, enterprise environment. The ideal candidate will have strong hands-on expertise with Kubernetes (OpenShift) and LLM deployment frameworks such as TensorRT-LLM and Triton Inference Server, along with experience managing MLOps/LLMOps pipelines in production.
This role focuses on ensuring high availability, performance, scalability, and operational excellence for AI inference services.
Must-Have Skills
- Large Language Models (LLMs)
- Kubernetes / OpenShift
Key Responsibilities
AI Platform Deployment & Operations
- Deploy, operate, and troubleshoot containerized AI services at scale on Kubernetes (OpenShift) for mission-critical applications (see the readiness-check sketch after this list)
- Deploy, configure, tune, and optimize LLMs using TensorRT-LLM and Triton Inference Server
- Manage scalable infrastructure for deploying and operating LLM-based inference services
- Support production-grade AI inference platforms with high availability and performance requirements
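To give candidates a concrete sense of the day-to-day work, below is a minimal sketch of the kind of health check run when operating a Triton endpoint. It uses the tritonclient Python package; the endpoint URL and model name are hypothetical placeholders, not details of our environment.

```python
# Minimal Triton health check: server liveness/readiness plus model readiness.
# Requires: pip install tritonclient[http]
# The URL and model name below are hypothetical placeholders.
import sys

import tritonclient.http as httpclient

TRITON_URL = "triton.example.internal:8000"  # hypothetical endpoint
MODEL_NAME = "llama-chat"                    # hypothetical model name

def main() -> int:
    client = httpclient.InferenceServerClient(url=TRITON_URL)
    # Server-level health: the process is up and accepting requests.
    if not (client.is_server_live() and client.is_server_ready()):
        print("Triton server is not live/ready")
        return 1
    # Model-level health: the model loaded and can serve inference.
    if not client.is_model_ready(MODEL_NAME):
        print(f"model {MODEL_NAME!r} is not ready")
        return 1
    print("server and model ready")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

In an OpenShift deployment, a check like this typically backs a readiness probe against Triton's /v2/health/ready HTTP endpoint rather than running as a standalone script.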
MLOps / LLMOps
- Design, operate, and support MLOps / LLMOps pipelines for production inference workloads
- Deploy inference services using TensorRT-LLM and Triton Inference Server
- Monitor, maintain, and improve inference pipelines across environments
- Ensure reliable model lifecycle management, including updates and rollbacks (see the reload/rollback sketch below)
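As one illustration of lifecycle management, the sketch below drives Triton's model-control API to reload a model after a repository update, or to roll back by reloading once the repository points at the last known-good version. It assumes the server runs with --model-control-mode=explicit; the URL and model name are hypothetical.

```python
# Sketch of a model update/rollback step via Triton's model-control API.
# Assumes the server was started with --model-control-mode=explicit.
# The URL and model name are hypothetical placeholders.
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="triton.example.internal:8000")

def reload_model(name: str) -> None:
    """(Re)load a model so Triton picks up new files from the model repository."""
    client.load_model(name)
    if not client.is_model_ready(name):
        raise RuntimeError(f"model {name!r} failed to become ready")

def roll_back(name: str) -> None:
    """After the repository has been reverted to the last known-good version,
    unload the bad deployment and reload the previous one."""
    client.unload_model(name)
    reload_model(name)

reload_model("llama-chat")  # hypothetical model name
```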
Monitoring, Performance & Reliability
- Set up and operate monitoring for AI inference services, covering performance, availability, latency, and throughput (see the metrics sketch after this list)
- Troubleshoot issues related to model performance, scalability, load balancing, and container orchestration
- Implement best practices for observability, alerting, and system health monitoring
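For monitoring, Triton exposes Prometheus-format metrics (on port 8002 by default). The sketch below is a minimal spot check that scrapes a few inference metrics; the hostname is a hypothetical placeholder, and in production these metrics would feed Prometheus/Grafana dashboards and alert rules rather than an ad hoc script.

```python
# Spot check of Triton's Prometheus metrics endpoint (port 8002 by default).
# The hostname is a hypothetical placeholder; metric names follow Triton's
# nv_inference_* / nv_gpu_* conventions.
import requests

METRICS_URL = "http://triton.example.internal:8002/metrics"

def scrape(prefixes: tuple[str, ...]) -> dict[str, str]:
    """Return raw Prometheus lines whose metric names match the prefixes."""
    body = requests.get(METRICS_URL, timeout=5).text
    samples = {}
    for line in body.splitlines():
        if line.startswith(prefixes):              # skip comments/other metrics
            name, _, value = line.rpartition(" ")  # "name{labels} value"
            samples[name] = value
    return samples

for name, value in scrape((
    "nv_inference_request_success",    # completed requests (counter)
    "nv_inference_queue_duration_us",  # cumulative queue time (counter)
    "nv_gpu_utilization",              # GPU utilization (gauge)
)).items():
    print(name, value)
```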
Model Optimization & Inference
- Apply model optimization techniques such as pruning, quantization, and knowledge distillation (see the quantization sketch after this list)
- Serve optimized models through Triton Inference Server with the TensorRT-LLM (TRT-LLM) backend
- Tune inference performance and ensure efficient GPU utilization
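As a small, self-contained example of one technique from this list, the PyTorch sketch below applies post-training dynamic quantization to a toy model. Production LLM quantization normally goes through TensorRT-LLM's own tooling; this is illustrative only.

```python
# Post-training dynamic quantization in PyTorch: weights of Linear layers
# are stored as int8, and activations are quantized on the fly at inference.
# The toy model is illustrative only.
import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic

model = nn.Sequential(        # stand-in for a real network
    nn.Linear(4096, 4096),
    nn.ReLU(),
    nn.Linear(4096, 4096),
).eval()

quantized = quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

with torch.inference_mode():
    out = quantized(torch.randn(1, 4096))
print(out.shape)  # torch.Size([1, 4096])
```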
Enterprise Operations & Governance
- Follow standard enterprise operational processes, including Incident, Change, and Event Management
- Support operational readiness and production stability for AI platforms
- Collaborate with cross-functional teams including infrastructure, security, and AI/ML teams
Required Qualifications
- Strong hands-on experience with Kubernetes, preferably OpenShift
- Proven experience deploying and operating LLMs in production environments
- Expertise with TensorRT-LLM and with Triton Inference Server architecture, configuration, and deployment
- Experience with containerization, microservices, and API-based inference services
- Strong troubleshooting skills in distributed, containerized systems
- Experience managing scalable AI infrastructure
- Understanding of enterprise-grade operational best practices