Overview
On Site
USD 136,000.00 - 212,750.00 per year
Full Time
Skills
Customer Support
Network Operations
Computer Hardware
IPC
Management
Software Design
FOCUS
Collaboration
Research and Development
Spectrum
Performance Metrics
ROOT
Software Development
Debugging
Electrical Engineering
Software Engineering
C
Python
Embedded Systems
Linux
Routers
SR-IOV
Firmware
BIOS
Operating Systems
Docker
Communication
Motivation
Computer Networking
Switches
InfiniBand
Ethernet
Network
HPC
Performance Testing
Artificial Intelligence
Stacks Blockchain
MPI
CUDA
Recruiting
Promotions
SAP BASIS
Law
Job Details
The NVIDIA Enterprise Experience (NVEX) Solutions Engineering team is looking for a senior Computer or Software Engineer who is ready to become an authority in ground-breaking network technology used in AI clusters. Our team of software engineers bridges the gap between the customer support teams and R&D, focusing on resolving tough problems from the front lines and providing the highest level of support for the InfiniBand, NVLink, and Spectrum-X network systems that interconnect GPUs and AI compute infrastructure.
Candidates must have a software development background in the networking industry, working for either a network hardware manufacturer or a software integrator. It is essential to have a proven grasp of in-field, production network operations and experience root-causing customer-found issues down to the source code level, primarily in C and Python. Breadth of experience is key. We want to see experience in multiple areas such as network operating systems (NOS), Linux network drivers and internals, network hardware, NIC software, SmartNICs, DPUs, embedded firmware, Software Defined Networking, and infrastructure management technologies. IPC, race conditions, finite state machines, event processing loops, queue management, network traffic and flow analysis, and software design gaps will be common areas of focus. The individual will get to work across many NVIDIA teams and will often interact with both internal and external customers, so superb interpersonal and communication skills are essential. Candidates will need to understand, root cause, and resolve complex issues, and provide detailed explanations of their findings.
What you will be doing:
- Assist various network and AI cluster support teams in reproducing, resolving, and root causing sophisticated customer issues
- Work with R&D teams to develop bug fixes, workarounds, and solutions for critical customers using NVIDIA's network technologies
- Become an authority in NVIDIA network technologies used in AI clusters such as InfiniBand, NVLink, and Spectrum-X
- Analyze network performance metrics and make tuning recommendations for high-performance, lossless networks
- Develop support and analysis tools to help analyze and root cause field issues
- Daily use of ground-breaking AI tools for software development, log and trace analysis, and source code debugging
- Occasional work on weekends or holidays to support customers
What we need to see:
- Minimum of a BS in Computer, Electrical, or Software Engineering (or equivalent experience)
- 5-10 years of experience in C programming in Linux and embedded systems
- Proficiency in Python
- At least 5 years of experience developing software for one or more of the following: Linux NIC drivers, switch ASICs and SDKs, embedded network device firmware, Linux-based network equipment (routers, switches, gateways, etc.), network operating systems, virtual routers, SDN stacks, virtual switching, DPDK, SR-IOV stacks
- At least 5 years of experience directly supporting end customers, partners, or integrators for network equipment and infrastructures
- Strong system software (firmware, BIOS, kernel, driver, operating system) expertise
- Experience with container environments (K8s and Docker)
- Professional-level communication skills, including adjusting communication to the technical level of the audience and staying calm and focused in negative situations
- Passion for learning innovative tech and motivation to work hard on ground-breaking products
Ways to stand out from the crowd:
- Background with AI infrastructure and HPC networking
- Experience programming switch and NIC ASICs and SDKs
- Experience with InfiniBand or other non-Ethernet network technologies
- Experience developing or supporting DPUs or SmartNICs
- Knowledge of HPC performance test tools and NVIDIA AI stacks (NCCL, MPI, DOCA, CUDA)
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 136,000 USD - 212,750 USD for Level 3, and 168,000 USD - 264,500 USD for Level 4.
You will also be eligible for equity and benefits.
Applications for this job will be accepted at least until August 11, 2025.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.