Senior Site Reliability Engineer, Apple Data Platform SRE / Apple Services Engineering

Cupertino, CA, US • Posted 2 days ago • Updated 12 hours ago
Full Time
On-site

Job Details

Skills

  • Collaboration
  • Innovation
  • Mentorship
  • Analytics
  • Technical Direction
  • Computer Science
  • Reliability Engineering
  • IT Management
  • Apache Hadoop
  • HDFS
  • Apache HBase
  • Apache Spark
  • Data Lake
  • Amazon S3
  • Apache Airflow
  • Budget
  • Communication
  • IT Strategy
  • Leadership
  • Linux
  • Computer Networking
  • Management
  • Ceph
  • Storage
  • Kubernetes
  • Testing
  • Disaster Recovery
  • Capacity Management
  • Migration
  • Roadmaps
  • Data Security
  • Access Control
  • Regulatory Compliance

Summary

At Apple, we believe that innovation flourishes in an environment where ideas are challenged, collaboration is encouraged, and technology is pushed to its limits. This environment is only possible when diverse minds come together, bringing unique perspectives and experiences. Our people and their ideas inspire innovation in everything we do. Imagine what you could accomplish here! Join Apple and help us make the world a better place.

As a principal contributor and technical lead in our Apple Data Platform (ADP) SRE organization, you will apply SRE principles as you mentor our engineers and collaborate with partner teams, ensuring that large-scale analytics infrastructure runs reliably and efficiently. This role focuses on driving reliability standards, architectural consistency, and engineering excellence across peer SRE teams and partner engineering organizations - spanning the Hadoop, HBase, Spark, Data Lake, and Airflow ecosystems - through technical leadership, cross-functional alignment, and the development of platform-wide tooling, observability, and operational practices that raise the reliability bar for all of ADP. This role includes production on-call responsibilities.

Description

Apple Services Engineering (ASE) teams build and scale the platforms and infrastructure behind many of Apple's services - including iCloud, iTunes, Siri, and Maps. We are the foundation on which Apple's software developers build the products that our customers love. We are looking for a passionate and dedicated Technical Lead to drive SRE standards and engineering excellence across the entire Apple Data Platform organization. The Apple Data Platform (ADP) SRE Technical Lead partners with multiple SRE and engineering teams across the data platform - including teams responsible for Hadoop and HBase infrastructure, Spark, S3-compatible storage, and Airflow-orchestrated pipelines. Rather than owning a single vertical, this role sets the technical direction for how reliability is practiced across ADP: defining SLOs, establishing architectural review processes, developing shared tooling and automation, and ensuring that SRE principles are applied consistently as the platform scales. You will be a force multiplier - making every team around you more effective.

Minimum Qualifications

BS/MS in Computer Science or equivalent

12+ years of experience in Site Reliability Engineering, managing infrastructure and services at scale

5+ years of experience in technical leadership roles, with demonstrated ability to lead horizontally across teams without direct authority

Broad expertise across the data platform stack: Hadoop (HDFS, YARN), HBase, Apache Spark, Data Lake architectures, S3-compatible storage solutions, and Apache Airflow

History of defining and driving SLO/error budget frameworks and reliability practices across multiple teams or services

Demonstrable programming skills to develop shared tooling, lead code reviews, and set engineering standards

Strong written and verbal communication skills - able to present technical strategy to both engineers and leadership

Advanced knowledge of Linux, networking, and distributed systems fundamentals

Preferred Qualifications

15+ years of experience in SRE or related work managing infrastructure at scale

Experience with Ceph object storage operations

Kubernetes cluster operations experience, particularly running stateful data workloads

Experience with scale testing, disaster recovery, and capacity planning across distributed data systems

Experience driving multi-year platform migrations or large-scale architectural transitions

Ability to define the technical roadmap for a data platform organization and drive cross-functional alignment on architectural standards and best practices

Background in data security, access control, or compliance-sensitive data environments
  • Dice Id: 90733111
  • Position Id: 76c54d1c56bbdb05df227101063e7328