Senior On-Prem GenAI Platform Engineer

Charlotte, NC, US • Posted 7 hours ago • Updated 7 hours ago

Contract W2

Contract Independent

24 Months

On-site

$50 - $60/hr

Job Details

Skills

Artificial Intelligence
CUDA
Caching
Cloud Computing
Collaboration
DevOps
GPU
Generative Artificial Intelligence (AI)
Google Cloud Platform
Grafana
Kubernetes
Large Language Models (LLMs)
MIG
Machine Learning (ML)
Management
Microsoft Azure
Optimization
Orchestration
Research

Summary

Location: Charlotte, NC

Job Summary

We are seeking an experienced On-Prem GenAI Platform Engineer to build, optimize, and manage enterprise AI/ML platforms supporting Large Language Models (LLMs) and Generative AI workloads. The ideal candidate will have expertise in Kubernetes/OpenShift AI, GPU infrastructure, distributed systems, and LLM inference optimization.

Key Responsibilities

Build and operate on-prem Kubernetes/OpenShift AI platforms for GenAI and LLM workloads.
Design and optimize inference solutions using vLLM, TensorRT-LLM, Triton Inference Server, and SGLang.
Implement advanced optimization techniques, including continuous batching, speculative decoding, KV caching, and quantization (FP8, AWQ, GPTQ).
Manage GPU orchestration with Run:AI, and scale multi-GPU workloads using CUDA, NCCL, MIG partitioning, and tensor parallelism.
Deploy scalable ML serving frameworks using KServe, Helm, and Kubernetes Operators.
Monitor platform performance using Prometheus, Grafana, and Arize AI.
Collaborate with ML and research teams to productionize GenAI solutions.

Required Skills

LLM Inference: vLLM, TensorRT-LLM, Triton, SGLang
GPU & Distributed Systems: CUDA, NCCL, MIG, Tensor Parallelism
Platforms: Kubernetes, OpenShift AI, KServe, Helm
GPU Orchestration: Run:AI
Observability: Prometheus, Grafana, Arize AI
Performance Benchmarking & Load Testing: GuideLLM, Locust
GenAI/LLMOps & Platform Engineering

Preferred Skills

Hybrid Cloud (Azure/Google Cloud Platform)
AWS Inferentia or other AI accelerators
Service Mesh and GPU Cluster Networking
Experience with enterprise AI/ML platform engineering

Experience

8+ years in Platform Engineering, DevOps, SRE, ML Platform Engineering, or related fields.
Strong experience supporting enterprise-scale AI/ML and Generative AI environments.

Dice Id: 10410658
Position Id: 8986959

Contact the job poster

Sirisha Potluri

Recruiter @ Zuven Technologies Inc
