Senior System Software Engineer, Enterprise MODS

Overview

On Site
USD 224,000.00 per year
Full Time

Skills

GPU
Robotics
Debugging
Field Operations
Storage
Collaboration
Computer Hardware
ROOT
Oracle Data Mining
Object Data Manager
IT Management
x86
ARM
Linux
Microsoft Windows
Firmware
UEFI
BIOS
BMC
Programming Languages
C
C++
Python
PCI Express
InfiniBand
Ethernet
Communication
Computer Science
Electrical Engineering
Embedded Systems
Cloud Computing
Roadmaps
Mentorship
Artificial Intelligence
HPC
Visualization
Innovation
Recruiting
Promotions
SAP BASIS
Law

Job Details

At NVIDIA, we're tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. As an NVIDIAN, you'll be immersed in a diverse, supportive environment where everyone is inspired to do their best work.

NVIDIA data center platforms such as GB200 NVL72 are redefining AI, HPC, and cloud computing. To support leading workloads globally, our diagnostic systems need to evolve across diverse hardware technologies. We're in search of a visionary technical leader to engineer and propel innovation in diagnostics for NVIDIA's partner ecosystem. This role is essential in shaping how we validate, debug, and optimize complex server platforms across ODM factories, Cloud Service Provider (CSP) deployments, and field operations.

What You'll Be Doing:
  • Develop diagnostic systems for NVIDIA data center platforms, including hardware and software tools that generate worst-case stress workloads for CPUs, GPUs, memory, storage, and interconnects.
  • Lead platform bring-up and integration, ensuring diagnostics are embedded early and effectively across the server lifecycle.
  • Drive hardware validation strategy in collaboration with architecture and hardware teams, crafting robust validation plans for new server generations.
  • Analyze root causes of complex failures, acting as a Level 2 engineering contact for critical issues and offering scalable solutions across the stack.
  • Develop diagnostics software to ensure quality and performance at scale across ODM and partner production lines.
  • Mentor and grow engineering teams, providing technical leadership and encouraging a culture of innovation and excellence.
  • Influence long-term strategy by developing diagnostic architecture and roadmaps for upcoming NVIDIA and partner products.

What we need to see:
  • Proven experience architecting diagnostics for complex server systems, especially at the SW/HW interface.
  • Deep systems knowledge: x86/ARM architectures, Linux/Windows OS internals, firmware (UEFI/BIOS), BMC, and platform security.
  • Ability to weigh tradeoffs in system development and drive optimal solutions with customers and multi-disciplinary teams.
  • Expertise in programming languages like C, C++, and Python for tool development and automation.
  • Familiarity with high-speed interconnects such as PCIe, InfiniBand, NVLink, and Ethernet.
  • Strong communication skills to engage with technical and executive teams.
  • BS/MS or equivalent experience in Computer Science, Electrical Engineering, or a related field.
  • 12+ years of engineering experience in diagnostics, embedded systems, or cloud platforms.

Ways to stand out from the crowd:
  • Experience driving diagnostics across rack-level or cluster-level deployments.
  • Background in cloud-scale infrastructure and partner engagement.
  • Demonstrated success in influencing product direction and vendor roadmaps.
  • Passion for mentoring and building high-performing teams.

NVIDIA is at the forefront of AI, HPC, and visualization. Our diagnostics are the nervous system of our platforms, ensuring reliability, performance, and innovation at scale. If you're a creative, driven architect ready to shape the future of diagnostics, we want to hear from you.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 224,000 USD - 356,500 USD for Level 5, and 272,000 USD - 425,500 USD for Level 6.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until September 13, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.