Mountain View, California
4d ago
Recent experience in model optimization required.
Hardware & Compute: Proven experience with NVIDIA ecosystems and ARM64 architecture.
Systems Programming: Advanced proficiency in C++, Python, and Rust. Deep familiarity with CUDA and the ability to author and debug custom CUDA kernels for compute-intensive tasks.
AI/ML Frameworks: Extensive experience with modern inference engines (llama.cpp, TensorRT-LLM, Ollama) and orchestration frameworks (LiteLLM).
Software Engineering: Robust understanding of asynchro
Contract
Depends on Experience



