AI Infrastructure & Experience Engineer

Mountain View, CA, US • Posted 2 days ago • Updated 2 days ago
Contract Corp To Corp
Contract W2
6 Months
No Travel Required
On-site
$79/hr

Job Details

Skills

  • Agile & Adaptable
  • Architectural Vision
  • Problem Solver
  • PYTHON
  • MODERN INFERENCE ENGINES
  • ORCHESTRATION FRAMEWORKS
  • SANDBOX ENVIRONMENTS
  • FRONTEND UI DEVELOPMENT
  • WEBSOCKETS
  • DEVICE-TO-DEVICE COMMUNICATION

Summary

Job Category: Technical

Job Title: AI Infrastructure & Experience Engineer

Duties: Key Responsibilities

 

Inference Optimization: Deploy and tune multiple LLMs and generative multimodal models on local inference hardware. Optimize performance metrics (TTFT, tokens/sec) via model quantization, caching strategies, and architecture-specific adjustments.

Systems Engineering & CUDA: Use deep knowledge of the CUDA environment to build custom kernels that maximize utilization of low-cost GPU compute.

Orchestration & Integration: Seamlessly bridge inference backends with orchestration layers (LiteLLM, Ollama, etc.) and frontends like OpenWebUI.

Rapid Prototyping: Build functional, high-fidelity demos showcasing model memory capabilities, agentic workflows, and context-aware web search.

Peripheral Connectivity: Implement communication protocols to bridge local AI compute with peripheral devices, including smart TVs, household appliances, and XR hardware.

 

Skills: Technical Qualifications

Recent experience in model optimization required.

Hardware & Compute: Proven experience with NVIDIA ecosystems and ARM64 architecture.

Systems Programming: Advanced proficiency in C++, Python, and Rust. Deep familiarity with CUDA and the ability to author/debug custom CUDA kernels for compute-intensive tasks.

AI/ML Frameworks: Extensive experience with modern inference engines (llama.cpp, TensorRT-LLM, Ollama) and orchestration frameworks (LiteLLM).

Software Engineering: Robust understanding of asynchronous programming (FastAPI), containerization (Docker/Kubernetes), sandbox environments, and API design for low-latency communication.

Full-Stack Prototyping: Ability to quickly spin up modern frontend UIs (React, Next.js, or similar) to present AI-driven intelligence to end users.

Communication Protocols: Familiarity with WebSockets, gRPC, and REST for device-to-device communication in a local network environment.

 

Education: Ideal Candidate Profile

 

The "Builder" Mindset: You are energized by the prospect of building proofs-of-concept in days rather than months. You thrive in environments where speed and creativity are paramount.

Problem Solver: You approach unsolved, messy engineering challenges with enthusiasm rather than trepidation.

Architectural Vision: You see the "big picture" of how AI becomes part of the consumer's daily life, not just how the model generates text.

Agile & Adaptable: You are comfortable working in a fast-paced environment where priorities shift based on the results of rapid experimentation.

Degree in Computer Science, Machine Learning, or an Artificial Intelligence specialization preferred but not required.

3 years of relevant industry experience required.

 

Skills and Experience:

Required Skills: 

INFERENCE OPTIMIZATION

NVIDIA ECOSYSTEMS

CUSTOM CUDA KERNEL DEVELOPMENT

ARM64 ARCHITECTURE

PYTHON

Additional Skills:

RUST

CUDA

MODERN INFERENCE ENGINES

LLAMA.CPP

TENSORRT-LLM

OLLAMA

ORCHESTRATION FRAMEWORKS

LITELLM

ASYNCHRONOUS PROGRAMMING

FASTAPI

CONTAINERIZATION

DOCKER

KUBERNETES

SANDBOX ENVIRONMENTS

API DESIGN

LOW-LATENCY COMMUNICATION

FRONTEND UI DEVELOPMENT

REACT

NEXT.JS

WEBSOCKETS

GRPC

REST

DEVICE-TO-DEVICE COMMUNICATION

PROBLEM SOLVING

ARCHITECTURAL VISION

AGILITY

ADAPTABILITY

Languages:

English (Read, Write, Speak)

Minimum Degree Required: Bachelor's Degree

Patents: No

Publications: No

Veteran Status: No

# of Positions: 1

  • Dice Id: 10110849
  • Position Id: 1681-10990-3874