Software Engineer, AI Systems Performance Modeling, Dojo

  • Palo Alto, CA
  • Posted 17 hours ago | Updated 5 hours ago

Overview

On Site
USD 132,000.00 - 390,000.00 per year
Full Time

Skills

Mapping
Distribution
FSD
Dojo
Training
Artificial Intelligence
Development Testing
Performance Analysis
Debugging
Testing
Computer Science
Electrical Engineering
C++
Deep Learning
Data Flow
Parallel Computing
Optimization
Computer Hardware
Analytical Skill
Modeling
Conflict Resolution
Problem Solving
Communication
Collaboration
Machine Learning (ML)

Job Details

Join Tesla's Dojo Performance Team to design and optimize cutting-edge system-level simulation frameworks for AI accelerators. You will simulate the performance of thousands of Dojo compute nodes operating in parallel to drive state-of-the-art machine learning (ML) workloads. This role centers on modeling large-scale AI training systems to evaluate the performance of new kernels and mapping strategies. By analyzing trade-offs between memory, compute, and communication across system resources, you will help push the boundaries of AI performance and efficiency.

Responsibilities
  • Develop system-level simulation frameworks to model the performance of massively parallel AI accelerators, including compute distribution, memory hierarchy, interconnects, and dataflow
  • Simulate and analyze how large-scale ML workloads, from FSD to LLMs, are mapped and executed across thousands of Dojo compute nodes
  • Collaborate with ML architects, kernel developers, and system engineers to ensure simulations reflect real-world AI training requirements
  • Design and implement tests to evaluate trade-offs in system resources, including memory bandwidth, capacity, latency, and compute, to optimize performance for large-scale AI workloads
  • Build and maintain software tools and frameworks to support simulation development, testing, and integration
  • Conduct performance analysis to identify bottlenecks and propose system-level optimizations
  • Stay current with advancements in ML model architectures, parallel computing, and system-level simulation techniques
  • Participate in code reviews, debugging, and testing to ensure robust and scalable simulation frameworks

Requirements
  • Degree in Computer Science or Electrical Engineering, proof of exceptional skills in related fields, or equivalent experience
  • Strong proficiency in C++ for developing high-performance simulation frameworks
  • Solid understanding of ML/deep learning model architectures, including how models are partitioned and mapped across multiple devices, and a good understanding of compute architecture, memory hierarchy, and dataflow
  • Experience in system-level simulation, parallel computing, or ML workload optimization
  • Knowledge of kernel development processes and how ML workloads are deployed on hardware accelerators
  • Familiarity with analytical simulation techniques for modeling high-level system behavior
  • Excellent problem-solving skills, with the ability to analyze complex systems and propose innovative solutions
  • Strong communication and collaboration skills to work effectively with cross-functional teams, including ML researchers, architects, and engineers
  • Ability to work onsite in our Palo Alto, CA office

Compensation and Benefits
Benefits

Along with competitive pay, as a full-time Tesla employee, you are eligible for the following benefits from day 1 of hire:
  • Aetna PPO and HSA plans: 2 medical plan options with $0 payroll deduction
  • Family-building, fertility, adoption and surrogacy benefits
  • Dental (including orthodontic coverage) and vision plans, both of which have options with a $0 paycheck contribution
  • Company-paid Health Savings Account (HSA) contribution when enrolled in the High Deductible Aetna medical plan with HSA
  • Healthcare and Dependent Care Flexible Spending Accounts (FSA)
  • 401(k) with employer match, Employee Stock Purchase Plans, and other financial benefits
  • Company-paid basic life, AD&D, short-term and long-term disability insurance
  • Employee Assistance Program
  • Sick and vacation time (flex time for salaried positions), and paid holidays
  • Back-up childcare and parenting support resources
  • Voluntary benefits, including critical illness, hospital indemnity, accident insurance, theft and legal services, and pet insurance
  • Weight Loss and Tobacco Cessation Programs
  • Tesla Babies program
  • Commuter benefits
  • Employee discounts and perks program

Expected Compensation

$132,000 - $390,000/annual salary + cash and stock awards + benefits
Pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. The total compensation package for this position may also include other elements dependent on the position offered. Details of participation in these benefit plans will be provided if an employee receives an offer of employment.
