SDET/SWE Lead

  • Sunnyvale, CA
  • Posted 5 days ago | Updated 1 day ago

Overview

On Site
$150,000 - $180,000
Full Time

Skills

LLMs
AI frameworks
Python
Nvidia DGX
Kubernetes
GPU platforms
H100

Job Details

Responsibilities

  • Design and develop high-performance AI frameworks for large-scale distributed computation
  • Optimize scalability and efficiency using Nvidia Dynamo Framework
  • Work with distributed dataflow programming to orchestrate GPU workloads using Python and Kubernetes
  • Integrate advanced LLMs into real-world applications, shaping the future of AI-driven software
  • Contribute to building test-automation infrastructure for Kubernetes on large-scale GPU clusters.
  • Help develop detailed test plans for different milestones and operationalize them in test-automation infrastructure.
  • Own and conduct end-to-end system, scale, and stress testing.
  • Work with SW leads and the Technical Program Manager to qualify releases.
  • Attract and help build downstream production engineering talent.
  • Model and foster a culture of humility and innovation in product delivery.

Experience:

  • 3-8+ years of experience in software engineering, ideally at a staff level
  • Strong expertise in distributed dataflow programming and distributed systems
  • Hands-on experience with LLMs and AI frameworks
  • Proficiency in Python, with experience orchestrating GPU workloads
  • Experience with Kubernetes for containerized application deployment and orchestration
  • Experience working with systems and systems software, cloud, and Kubernetes.
  • Experience with production testing and automation of Kubernetes deployments.

Preferred Qualifications:

  • Master's or similar qualification in a relevant field.
  • Experience with scalable test and automation infrastructure to productionize workloads.
  • Experience with GPU platforms (e.g., Nvidia DGX, H100) and high-performance computing environments.
  • Experience triaging customer bugs, prioritizing, and resolving issues in production.
  • Familiarity with AI developer frameworks, tools, and automation systems