Santa Clara, California • Today
Help design and ship an Always-On, low-overhead GPU profiling service that runs in production, scales across cluster environments, and delivers actionable insights for ML workloads. You will lead the architecture and hands-on delivery across system software, drivers, and CUDA to make profiling continuously available and reliable.

What you'll be doing:
Design the architecture for an Always-On profiling service, defining interfaces, data flows, and scalability guarantees for multi-process/GPU/node
Full-time
USD 272,000.00 - 431,250.00 per year
