Lead MLOps / AI Platform Engineer

Hybrid in Charlotte, NC, US • Posted 3 days ago • Updated 3 days ago

Contract Independent

Contract W2

12 Months

Hybrid

$60 - $70/hr

Job Details

Skills

MLOps
Large Language Model
Cloud
Tensor
Triton

Summary

Job Description: Lead MLOps / AI Platform Engineer

Location: Charlotte, NC

Duration: Long Term

Visa Type: & Candidates

Role Overview

We are seeking a highly skilled Lead MLOps / AI Platform Engineer to design, build, and optimize our next-generation Generative AI and Large Language Model (LLM) infrastructure. This role bridges AI research and production deployment. You will orchestrate high-performance GPU environments (specifically NVIDIA H200s), optimize LLM inference, and maintain enterprise-grade infrastructure across both cloud (Google Cloud Platform and Azure) and on-premises environments.

Key Responsibilities

AI Inference Optimization & Serving

Deploy, scale, and manage large-scale language models using advanced inference frameworks such as vLLM, TensorRT-LLM, SGLang, and Triton Inference Server.
Implement and fine-tune performance optimization strategies, including Continuous Batching and advanced Parallelism techniques.
Conduct load testing, benchmarking, and profiling of LLM deployments using GuideLLM and Locust to ensure optimal latency and throughput.

Cloud & Infrastructure Orchestration

Architect and maintain scalable, secure infrastructure on Google Cloud Platform and Azure using Infrastructure as Code (Terraform).
Design and execute Cloud Networking, Landing Zones, and Organization Policies/Governance.
Manage secrets and secure workloads utilizing HashiCorp Vault.
Develop and champion Internal Developer Portals to streamline workflows for data science and product teams.

On-Premise & Kubernetes Engineering

Orchestrate ML workloads on Kubernetes, utilizing KServe, OpenShift AI / OpenShift Functions, and Helm charts/Operators.
Manage compute clusters with a heavy focus on advanced GPU Orchestration (Nvidia H200 ecosystems).
Demonstrate deep hands-on architecture expertise, including the know-how to "unfold" an LLM onto highly constrained or custom on-premises hardware setups.

Observability & SRE

Implement end-to-end ML Observability and monitoring frameworks using Arize AI.
Establish Site Reliability Engineering (SRE) best practices, maintaining strict SLOs/SLIs for model deployment pipelines and production APIs.

Required Skills & Qualifications

Core AI / MLOps Stack:

Inference Engines: vLLM, TensorRT-LLM, Triton Inference Server, SGLang
ML Frameworks/Ops: KServe, OpenShift AI, Arize AI, GenAI Platforms, RAG architecture
Performance & Testing: GuideLLM, Locust, Continuous Batching, Parallelism optimization

Infrastructure & Cloud Stack:

Cloud Providers: Google Cloud Platform (GCP), Microsoft Azure
Containerization & Orchestration: Kubernetes, OpenShift, Helm/Operators, GPU Orchestration
IaC & Automation: Terraform, Python
Security & Networking: HashiCorp Vault, Landing Zones, Org Policy & Governance

Hardware Sanity Check:

Mandatory Experience: Direct, hands-on experience working with NVIDIA H200 GPUs and optimizing workloads specifically for this architecture.

Dice Id: 90767752
Position Id: 8978823
Contact the job poster

Satyasri Bhanuteja

Recruiter @ SATCON Inc
