AI Infrastructure & Experience Engineer

Mountain View, CA, US • Posted 2 hours ago • Updated 2 hours ago

Contract W2

4 Months

No Travel Required

On-site

Depends on Experience

OSI Engineering, Inc.

Job Details

Skills

API
ARM
Artificial Intelligence
C++
Caching
CUDA
Computer Hardware
GPU
Docker
Design Of Experiments
JavaScript
Machine Learning (ML)
Kubernetes
Optimization
React.js
Python
SPIN
Orchestration
Performance Metrics
WebSocket
Computer Science
Network
SEC
Debugging
Communication
Workflow
Rust

Summary

A global consumer device company based in Mountain View, CA is looking for an AI Infrastructure & Experience Engineer to join their team.

Key Responsibilities

Deploy and tune multiple LLMs and generative multimodal models on local inference hardware. Optimize performance metrics such as time to first token (TTFT) and tokens per second via model quantization, caching strategies, and architecture-specific adjustments.
Leverage deep knowledge of the CUDA environment to build custom kernels, ensuring maximum utilization of low-cost GPU compute.
Seamlessly bridge inference backends with orchestration layers (LiteLLM, Ollama, etc.) and frontends like OpenWebUI.
Build functional, high-fidelity demos showcasing model memory capabilities, agentic workflows, and context-aware web search.
Implement communication protocols to bridge local AI compute with peripheral devices, including smart TVs, household appliances, and XR hardware.

Qualifications

3 years of relevant industry experience required
Recent experience in model optimization required
Proven experience with the NVIDIA ecosystem and ARM64 architecture.
Advanced proficiency in C++, Python, and Rust. Deep familiarity with CUDA and the ability to author/debug custom CUDA kernels for compute-intensive tasks.
Extensive experience with modern inference engines (llama.cpp, TensorRT-LLM, Ollama) and orchestration frameworks (LiteLLM).
Robust understanding of asynchronous programming (FastAPI), containerization (Docker/Kubernetes), sandbox environments, and API design for low-latency communication.
Ability to quickly spin up modern frontend UIs (React, Next.js, or similar) to present AI-driven intelligence to end users.
Familiarity with WebSockets, gRPC, and REST for device-to-device communication in a local network environment.
Degree in Computer Science, or a specialization in Machine Learning or Artificial Intelligence, preferred but not required.

Type: Contract
Duration: 4 months, with possible extension
Work Location: Mountain View, CA (onsite)
Pay range: $64.00 - $79.00 (DOE)

Dice Id: 10365912
Position Id: 9005646

Company Info

About OSI Engineering, Inc.

OSI Engineering delivers professional engineering consultants and contractors to enable you to meet your time-to-market demands. Our technical knowledge of your specific technology streamlines the process of delivering the right engineer, with the right technical expertise, to add value with minimal ramp-up time. Additionally, on-call access to our highly skilled engineering pool enables your business to stay ahead of the curve.

Contact the job poster

Darshana Desai

Recruiter @ OSI Engineering, Inc.
