Software Development Engineer, AI Infrastructure

Overview

On Site
USD 99,400.00 per year
Full Time

Skills

Embedded Systems
Innovation
Management
Network
Switches
Training
Computer Hardware
Open Source
GPU
Artificial Intelligence
Specification Gathering
Presentations
Collaboration
System Testing
Customer Support
Python
Java
Data Structure
Algorithms
Operating Systems
Concurrent Computing
Linux
Git
Cloud Computing
Software Development
Microservices
Kubernetes
Analytical Skill
Problem Solving
Conflict Resolution
Communication
Software Engineering
Computer Science
Electrical Engineering

Job Details

WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences - the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world's most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.

AMD together we advance_

THE ROLE:

We are looking for a dynamic, upbeat software engineer to join our growing team. The team develops software to deploy and manage AI clusters at scale. You will develop controllers, agents and operators to configure, monitor and troubleshoot large-scale distributed systems comprising tens of thousands of GPUs, CPUs, network cards and switches. You will help optimize training and inference performance across hardware and software components. You will contribute to cutting-edge open-source projects like the AMD GPU Operator, Metrics Exporter, Container Toolkit and more.

THE PERSON:

The ideal candidate possesses an innovative and problem-solving mindset, has a keen interest in software engineering, distributed systems and AI, and is not afraid to work in a challenging, fast-paced environment. We set high standards for ourselves and focus relentlessly on customer success.

KEY RESPONSIBILITIES:
  • Contribute to the design and development of new software features and components
  • Write unit, integration, end-to-end, performance and scale tests
  • Contribute to the design and implementation of future product architectures offering improved capabilities, scale and security
  • Collaborate with other engineering teams and share information through specs, write-ups, presentations, etc.
  • Collaborate closely with system test and customer support teams to deploy the software and ensure customer success

PREFERRED EXPERIENCE:
  • Good knowledge of and hands-on experience with Go (preferred), Python, Java or a similar programming language
  • Good understanding of software engineering principles, data structures, algorithms, operating system concepts and concurrency
  • Familiarity with Linux, Git and modern software development tools and techniques
  • Familiarity with cloud-native software development principles and tools (microservice architectures, protobuf, gRPC, containers, Kubernetes, etc.)
  • Good analytical, problem-solving and communication skills

ACADEMIC CREDENTIALS:
  • Bachelor's or Master's degree in Computer/Software Engineering, Computer Science, Electrical Engineering or related technical discipline

LOCATION:

Santa Clara, CA


Benefits offered are described in AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.