AI Infrastructure & Experience Engineer


OSI Engineering, Inc.
Job Details
Skills
- API
- ARM
- Artificial Intelligence
- C++
- Caching
- CUDA
- Computer Hardware
- GPU
- Docker
- Design Of Experiments
- JavaScript
- Machine Learning (ML)
- Kubernetes
- Optimization
- React.js
- Python
- SPIN
- Orchestration
- Performance Metrics
- WebSocket
- Computer Science
- Network
- SEC
- Debugging
- Communication
- Workflow
- Rust
Summary
A global consumer device company based in Mountain View, CA is looking for an AI Infrastructure & Experience Engineer to join their team!
Key Responsibilities
- Deploy and tune multiple LLMs and generative multimodal models on local inference hardware. Optimize performance metrics (TTFT, tokens/sec) via model quantization, caching strategies, and architecture-specific adjustments.
- Leverage deep knowledge of the CUDA environment to build custom kernels, ensuring maximum utilization of low-cost GPU compute.
- Seamlessly bridge inference backends with orchestration layers (LiteLLM, Ollama, etc.) and frontends like OpenWebUI.
- Build functional, high-fidelity demos showcasing model memory capabilities, agentic workflows, and context-aware web search.
- Implement communication protocols to bridge local AI compute with peripheral devices, including smart TVs, household appliances, and XR hardware.
Qualifications
- 3 years of relevant industry experience required
- Recent experience in model optimization required
- Proven experience with NVIDIA ecosystems and ARM64 architecture.
- Advanced proficiency in C++, Python, and Rust. Deep familiarity with CUDA and the ability to author/debug custom CUDA kernels for compute-intensive tasks.
- Extensive experience with modern inference engines (llama.cpp, TensorRT-LLM, Ollama) and orchestration frameworks (LiteLLM).
- Robust understanding of asynchronous programming (FastAPI), containerization (Docker/Kubernetes), sandbox environments, and API design for low-latency communication.
- Ability to quickly spin up modern frontend UIs (React, Next.js, or similar) to present AI-driven intelligence to end users.
- Familiarity with WebSockets, gRPC, and REST for device-to-device communication in a local network environment.
- Degree in Computer Science, Machine Learning or Artificial Intelligence Specialization preferred, but not required.
Type: Contract
Duration: 4 months with extension
Work Location: Mountain View, CA (onsite)
Pay range: $64.00 - $79.00 (DOE)
- Dice Id: 10365912
- Position Id: 9005646
- Posted 2 hours ago
Company Info
About OSI Engineering, Inc.
OSI Engineering delivers professional engineering consultants and contractors to enable you to meet your time-to-market demands. Our technical knowledge of your specific technology streamlines the process of delivering the right engineer with the right technical expertise to add value with minimal ramp-up time. Additionally, on-call access to our highly skilled engineering pool enables your business to stay ahead of the curve.