Senior Site Reliability Engineer

Washington, WA, US • Posted 30+ days ago • Updated 9 hours ago

Full Time

On-site

Job Details

Skills

Art
Open Source
Apache Cassandra
Apache ZooKeeper
Apache Kafka
Redis
Fleet Management
Software Engineering
FOCUS
Computer Science
Computer Engineering
Kubernetes
Internet
Dragon NaturallySpeaking
DNS
DHCP
LDAP
Server Virtualization
Operating Systems
Budget
Reliability Engineering
Process Improvement
Computer Hardware
Bootstrap
PXE
BIOS
Total Productive Maintenance
TPM
Provisioning
OpenStack
xCAT
Storage
Caching
Configuration Management
Orchestration
Puppet
Progress Chef
Ansible
Cloud Computing
Amazon Web Services
Amazon S3
Amazon EC2
Amazon CloudFront

Summary

The Apple Services Engineering team is one of the most exciting examples of Apple's long-held passion for combining art and technology. Join the Apple Services Engineering Cloud Service Infrastructure team as a Site Reliability Engineer to help support and scale cloud services for millions of Apple users.

We are building and supporting new and existing critical infrastructure systems and frameworks that provide services such as structured and unstructured storage, caching, queueing, searching, and much more at hyperscale. These form the platform upon which many iCloud and other backend systems at Apple are built. The team is responsible for the next-generation platform that will power Apple's infrastructure services. These services operate at extremely large scale and store exabytes of data. The platform will support a variety of services based on open-source software, such as Kubernetes, Cassandra, ZooKeeper, Kafka, and Redis, alongside internally developed services.

Description

The Apple Services Engineering Cloud Services SRE organization is looking for a strong, enthusiastic developer to join this group. This person will have a tremendous amount of individual responsibility and influence over the direction of the core platform behind many critical Apple internet services for years to come. You are someone with ideas and a real passion for software delivered as a service to improve reuse, efficiency, and simplicity. This engineer's work will affect hundreds of millions of users and be essential to the success of some of the most visible current and future Apple features.

We are domain experts in fleet management, systems, and software engineering. We build automations, instrument reliability tools, and respond to alerts and incidents that may pose a risk to the reliability of the platform. The team's focus is on infrastructure capabilities and processes, improving the reliability and efficiency of the systems at scale.

Minimum Qualifications

Bachelor's or Master's degree in Computer Science, Computer Engineering, or equivalent experience.

5+ years of experience developing platform services

Experience with large-scale server provisioning and maintenance (OpenStack Ironic, Metal3, MAAS, xCAT, NetBox, Tinkerbell)

Experience with development within the Kubernetes ecosystem, including the operator framework, controllers, and CRDs

Understanding of base internet infrastructure services, including DNS, DHCP, LDAP, server virtualization, and server monitoring, with experience in critical, large-scale distributed systems that combine hardware, operating systems, and software

Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts, with a keen eye for opportunities to eliminate toil through code and process improvements.

Preferred Qualifications

Hardware bootstrap and associated security (PXE, BIOS, TPM, secure boot, trusted computing)

Experience with hyperscale server provisioning and maintenance (OpenStack Ironic, Metal3, MAAS, xCAT, NetBox, Tinkerbell)

Structured or unstructured storage and caching

Automating operations processes via services and tools

Configuration management and fleet orchestration via Puppet, Chef, Ansible, or others

Cloud Services (AWS S3/EC2/CloudFront or equivalent)

Dice Id: 90733111
Position Id: cc48c6ead916979fe7ba0cca37e5ca0d