Search Jobs | Dice.com

LLM Inference / AI Infrastructure Engineer

Charlotte, North Carolina

Yesterday

LLM Inference / AI Infrastructure Engineer. Location: Charlotte, NC. Duration: 9-12 months. JD: vLLM, TensorRT-LLM, Triton Inference Server, SGLang, Inference Optimization, Continuous Batching, Speculative Decoding, KV Cache / Prefix Caching, FP8 / AWQ / GPTQ, Tensor Parallelism, Kubernetes ML Serving, KServe, OpenShift AI, Helm / Operators, GPU Orchestration, Run:AI, Performance Benchmarking, CUDA / NCCL / MIG, Prometheus / Grafana, ML Observability. Skills sanity check: Have you worked on NVIDIA H200? If yes, chance

Contract

Depends on Experience

LLM Inference & GPU Systems Consultant

Charlotte, North Carolina

9d ago

Role: LLM Inference & GPU Systems Consultant. Location: Charlotte, NC (locals only). We are seeking an AI Infrastructure Runtime Engineer to build and maintain large-scale on-prem LLM infrastructure. This is an enterprise private GenAI environment running on NVIDIA H200 GPU clusters and an OpenShift AI deployment ecosystem. You will manage production inference internally, including self-hosting open-source LLMs like Llama. We are focused exclusively on inferencing; this role involves no model

Contract, Third Party

Depends on Experience

Cloud GenAI Governance & Observability Consultant

Hybrid in Charlotte, North Carolina

13d ago

Job Description. Role Overview: We are seeking a Senior Cloud GenAI Governance Engineer to manage enterprise-scale GenAI platform engineering. This role focuses on the critical gateway and security layers, managing the architecture flow from user API requests, through Model Armor, to the inference endpoint, and finally into telemetry and Arize AI. The core of this position is building responsible AI infrastructure and AI runtime governance. Key Responsibilities: Enterprise AI Security: Act as th

Third Party, Contract

$84

Senior Python Developer (with heavy AWS/Terraform focus)

Charlotte, North Carolina

6d ago

Primary Talent Partners has a new contract opening for a Senior Python Developer (with heavy AWS/Terraform focus) with our large power and utilities client in Charlotte, NC. This is a 9-month contract with potential for extension. Pay: $81.00 - $91.00/hr; W2 contract, no PTO, no benefits. ACA-compliant supplemental package available for enrollment. Candidates must be legally authorized to work in the United States and must be able to sit on Primary Talent Partners' W2 without sponsorship. Descript

Contract

$81 - $91

LLM Inference & GPU Systems Consultant

Charlotte, North Carolina

Today

Job Title: LLM Inference & GPU Systems Consultant. Location: Charlotte, NC (local candidates only). Duration: Long term. Must have: RunAI / LLM Inference & GPU / vLLM and TensorRT-LLM. Required Skills & Experience. Required Qualifications: 8+ years experience working as an LLM Systems Engineer or AI Infrastructure Runtime Engineer. 8+ years hands-on experience with NVIDIA H200 clusters and runtime optimization techniques (KV Cache, prefill/decode). Proficiency in OpenShift AI and GPU orchestration tool

Contract

Hybrid || LLM Inference & GPU Systems Consultant || Charlotte, NC

Charlotte, North Carolina

Today

TECHNOGEN, Inc. is a proven leader in providing full IT services, software development, and solutions for 15 years. TECHNOGEN is a Small & Woman Owned Minority Business with GSA Advantage Certification. We have offices in VA and MD, and offshore development centers in India. We have successfully executed 100+ projects for clients ranging from small businesses and non-profits to Fortune 50 companies and federal, state, and local agencies. Description: Local candidates preferred. Role Overview: We are se

Contract, Third Party

$0,00/-

Cloud Infrastructure Engineer

Charlotte, North Carolina

Today

Job Title: Cloud Infrastructure Engineer. Location: Charlotte, NC (5 days onsite). Duration: 12+ months. Primary Skills: vLLM, TensorRT-LLM, Triton Inference Server, SGLang, Kubernetes ML Serving, KServe, OpenShift AI, GPU Orchestration, Google Cloud Platform, Terraform, NVIDIA. Key Responsibilities: Design and manage scalable AI/ML infrastructure for GenAI and LLM workloads. Deploy and optimize LLM inference pipelines using vLLM, TensorRT-LLM, Triton Inference Server, and SGLang. Implement inference optimi

Third Party, Contract

$55/hr - $60/hr

Python lead with gen AI

Charlotte, North Carolina

21d ago

About the Role: We are seeking an experienced Python Lead with strong expertise in Generative AI to spearhead the design, development, and deployment of AI-powered solutions. This leadership role involves guiding a team of developers, collaborating with data scientists, and driving innovation by integrating cutting-edge AI models into enterprise applications. Key Responsibilities: Lead a team of Python developers in building scalable AI-driven applications. Architect and implement solutions using

Full-time

Depends on Experience

Sr. AI Software Engineer (C#/.Net, infrastructure & startup exp. req'd) - 100% REMOTE

Remote

Today

100% REMOTE. This Jobot Job is hosted by: Katherine Krull. Are you a fit? Easy Apply now by clicking the "Apply Now" button and sending us your resume. Salary: $170,000 - $270,000 per year. A bit about us: Come join a fast-growing, remote-first technology company building modern AI-driven infrastructure products for enterprise users! Job Title: Sr. AI Software Engineer. Job Location: 100% REMOTE (*Must currently reside in the US). Pay Range: $170k-$270k Base Salary + Equity + Comprehensive Medica

Full-time