Remote or Santa Clara, California • Today
We are seeking highly skilled and motivated software engineers to join our vLLM & MLPerf team. You will define and build benchmarks for MLPerf Inference, the industry-leading benchmark suite for inference system-level performance, as well as contribute to vLLM and optimize its performance to the extreme for those benchmarks on NVIDIA's latest GPUs.

What you'll be doing: Design and implement highly efficient inference systems for large-scale deployments of generative AI models. Define inference b
Full-time
USD 184,000.00 per year