Machine Learning Compute Efficiency Lead, Infrastructure & Planning

Cupertino, CA, US • Posted 30+ days ago • Updated 9 hours ago

Full Time

On-site

Job Details

Skills

FOCUS
Software Engineering
Management
Collaboration
Root Cause Analysis
Customer Experience
Cost Reduction
Cloud Computing
Computer Science
Computer Engineering
Systems Architecture
Scheduling
Fluency
Artificial Intelligence
Workflow
Analytical Skill
Communication
Presentations
Machine Learning (ML)
Training
PyTorch
JAX
Computer Cluster Management
Kubernetes
GPU
Computer Hardware
Capacity Management
Negotiations
Roadmaps
Finance
Decision-making
Modeling
Optimization

Summary

Apple's Platform Acceleration & Compute Efficiency (PACE) is a high-leverage team operating at the critical intersection of our ML organizations, underlying compute infrastructure, and core platform tooling. Our mission is to empower Apple's software engineering teams with efficient, scalable compute. By driving out operational friction and optimizing the broader machine learning ecosystem, we directly accelerate the pace of development across the company.

As foundation models become increasingly central to Apple's user experiences, maximizing the efficiency of our ML compute is paramount. In this role, you will focus relentlessly on compute efficiency, ensuring that Apple's models run as fast, reliably, and cost-effectively as possible. You will tackle massive optimization challenges, from maximizing hardware utilization across GPUs, TPUs, and custom Apple Silicon, to shaping workload scheduling and capacity allocation for large model serving.

We are seeking a Senior Architect with deep expertise in ML infrastructure to act as a linchpin for Apple's foundational inference strategy. You will be instrumental in defining, establishing, and monitoring compute efficiency metrics across the software engineering organization. By partnering closely with model developers and infrastructure providers, your work will directly reduce serving costs, shape core engineering decisions, and enable the highly efficient, scalable inference required to power Apple Intelligence for hundreds of millions of users.

- Own and support ML compute management for Apple's inference workloads (GPU, TPU, and custom silicon) to enable large-scale model serving.
- Collaborate closely with Apple Intelligence and ML engineering teams to understand roadmaps and resource pain points to develop and implement resource strategies.
- Optimize Apple's ML workloads by driving performance improvements, maximizing resource utilization, and reducing service costs through deep root cause analysis that shapes both engineering decisions and the end customer experience.
- Architect solutions for large-scale optimization problems, including capacity allocation, workload scheduling, and cost reduction, enabling Apple's AI-driven experiences.
- Advocate on behalf of Apple's ML engineers to bring a consolidated view of ML platform and model inference requirements to Apple's internal infrastructure platform providers and 3rd party public cloud providers.

- BS in Computer Science, Computer Engineering, or equivalent practical experience
- 7+ years in ML infrastructure, systems architecture, or efficiency/optimization roles at scale
- Strong conceptual understanding of foundation model inference/serving at scale and distributed training (data/tensor/pipeline parallelism), GPU/TPU utilization, memory hierarchies, and cluster scheduling
- AI-fluent and capable of quickly adapting to AI workflows and empowerment
- Proven track record of driving complex cross-org technical initiatives through influence, not authority
- Strong analytical skills with experience designing or interpreting utilization analyses, capacity models, or efficiency metrics
- Clear written and verbal communication, comfortable presenting to VPs and white-boarding with senior ML engineers

- MS or PhD in a relevant field
- Direct experience with foundation model serving, inference, and training at scale
- Familiarity with PyTorch, JAX, cluster management (Slurm, Kubernetes), or GPU/TPU hardware
- Prior experience in efficiency, FinOps, or capacity planning
- Experience negotiating technical roadmaps with platform or infrastructure teams
- Background in technical and financial decision-making (TCO modeling, cost optimization)

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.

Dice Id: 90733111
Position Id: f218202f534a2d2708badddf47b79fcc