Senior On-Prem GenAI Platform Engineer

Charlotte, NC, US • Posted 7 hours ago • Updated 7 hours ago
Contract W2
Contract Independent
24 Months
On-site
$50 - $60/hr

Job Details

Skills

  • Artificial Intelligence
  • CUDA
  • Caching
  • Cloud Computing
  • Collaboration
  • DevOps
  • GPU
  • Generative Artificial Intelligence (AI)
  • Good Clinical Practice
  • Google Cloud Platform
  • Grafana
  • Kubernetes
  • Large Language Models (LLMs)
  • MIG
  • Machine Learning (ML)
  • Management
  • Microsoft Azure
  • Optimization
  • Orchestration
  • Research

Summary

Location: Charlotte, NC

Job Summary

We are seeking an experienced On-Prem GenAI Platform Engineer to build, optimize, and manage enterprise AI/ML platforms supporting Large Language Models (LLMs) and Generative AI workloads. The ideal candidate will have expertise in Kubernetes/OpenShift AI, GPU infrastructure, distributed systems, and LLM inference optimization.

Key Responsibilities
  • Build and operate on-prem Kubernetes/OpenShift AI platforms for GenAI and LLM workloads.
  • Design and optimize inference solutions using vLLM, TensorRT-LLM, Triton Inference Server, and SGLang.
  • Implement advanced optimization techniques including continuous batching, speculative decoding, KV caching, and quantization (FP8, AWQ, GPTQ).
  • Manage GPU orchestration using Run:AI, CUDA, NCCL, MIG, and tensor parallelism.
  • Deploy scalable ML serving frameworks using KServe, Helm, and Kubernetes Operators.
  • Monitor platform performance using Prometheus, Grafana, and Arize AI.
  • Collaborate with ML and research teams to productionize GenAI solutions.
Required Skills
  • LLM Inference: vLLM, TensorRT-LLM, Triton, SGLang
  • GPU & Distributed Systems: CUDA, NCCL, MIG, Tensor Parallelism
  • Platforms: Kubernetes, OpenShift AI, KServe, Helm
  • GPU Orchestration: Run:AI
  • Observability: Prometheus, Grafana, Arize AI
  • Performance Tuning: GuideLLM, Locust
  • GenAI/LLMOps & Platform Engineering
Preferred Skills
  • Hybrid Cloud (Azure/Google Cloud Platform)
  • Inferentia or alternative AI accelerators
  • Service Mesh and GPU Cluster Networking
  • Experience with enterprise AI/ML platform engineering
Experience
  • 8+ years in Platform Engineering, DevOps, SRE, ML Platform Engineering, or related fields.
  • Strong experience supporting enterprise-scale AI/ML and Generative AI environments.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 10410658
  • Position Id: 8986959
Contact the job poster
Sirisha Potluri

Recruiter @ Zuven Technologies Inc