Machine Learning Compute Efficiency Lead, Infrastructure & Planning

Cupertino, CA, US • Posted 1 day ago • Updated 6 hours ago
Full Time
On-site

Job Details

Skills

  • FOCUS
  • Software Engineering
  • Management
  • Collaboration
  • Root Cause Analysis
  • Customer Experience
  • Scheduling
  • Cost Reduction
  • Artificial Intelligence
  • Machine Learning (ML)
  • Cloud Computing
  • Training
  • PyTorch
  • JAX
  • Computer Cluster Management
  • Kubernetes
  • GPU
  • Computer Hardware
  • Capacity Management
  • Negotiations
  • Roadmaps
  • Finance
  • Decision-making
  • Modeling
  • Optimization

Summary

Apple's Platform Acceleration & Compute Efficiency (PACE) is a high-leverage team operating at the critical intersection of our ML organizations, underlying compute infrastructure, and core platform tooling. Our mission is to empower Apple's software engineering teams with efficient, scalable compute. By driving out operational friction and optimizing the broader machine learning ecosystem, we directly accelerate the pace of development across the company.

As foundation models become increasingly central to Apple's user experiences, maximizing the efficiency of our ML compute is paramount. In this role, you will focus relentlessly on compute efficiency, ensuring that Apple's models run as fast, reliably, and cost-effectively as possible. You will tackle massive optimization challenges, from maximizing hardware utilization across GPUs, TPUs, and custom Apple Silicon, to shaping workload scheduling and capacity allocation for large model serving.

We are seeking a Senior Architect with deep expertise in ML infrastructure to act as a linchpin for Apple's foundational inference strategy. You will be instrumental in defining, establishing, and monitoring compute efficiency metrics across the software engineering organization. By partnering closely with model developers and infrastructure providers, your work will directly reduce serving costs, shape core engineering decisions, and enable the highly efficient, scalable inference required to power Apple Intelligence for hundreds of millions of users.

- Own and support ML compute management for Apple's inference workloads (GPU, TPU, and custom silicon) to enable large-scale model serving.
- Collaborate closely with Apple Intelligence and ML engineering teams to understand roadmaps and resource pain points to develop and implement resource strategies.
- Optimize Apple's ML workloads by driving performance improvements, maximizing resource utilization, and reducing service costs through deep root cause analysis that shapes both engineering decisions and the end customer experience.
- Architect solutions for large-scale optimization problems, including capacity allocation, workload scheduling, and cost reduction, enabling Apple's AI-driven experiences.
- Advocate on behalf of Apple's ML engineers to bring a consolidated view of ML platform and model inference requirements to Apple's internal infrastructure platform providers and 3rd party public cloud providers.

- MS or PhD in a relevant field
- Direct experience with foundation model serving, inference, and training at scale
- Familiarity with PyTorch, JAX, cluster management (Slurm, Kubernetes), or GPU/TPU hardware
- Prior experience in efficiency, FinOps, or capacity planning
- Experience negotiating technical roadmaps with platform or infrastructure teams
- Background in technical and financial decision-making (TCO modeling, cost optimization)

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 90733111
  • Position Id: f218202f534a2d2708badddf47b79fcc
