Storage Site Reliability Engineer - Apple Services Engineering

Washington, WA, US • Posted 30+ days ago • Updated 7 hours ago
Full Time
On-site

Job Details

Skills

  • IaaS
  • IO Stacks
  • Optimization
  • Operational Excellence
  • Innovation
  • Software Engineering
  • Incident Management
  • Backup
  • Recovery
  • Dashboard
  • Communication
  • Virtual Team
  • Reliability Engineering
  • Analytical Skill
  • Problem Solving
  • Conflict Resolution
  • Attention To Detail
  • Data Storage
  • Rust
  • C++
  • Java
  • C#
  • Scripting
  • Bash
  • Python
  • Perl
  • Operating Systems
  • Multithreading
  • Management
  • Computer Networking
  • Scalability
  • Software Testing
  • Computer Science
  • Data Structure
  • Algorithms
  • Concurrent Computing
  • Database
  • Storage
  • Unix
  • Linux

Summary

Apple Services Engineering (ASE) designs, builds, and operates the cloud infrastructure, server systems, and platform technologies that power many of Apple's most beloved experiences.

Within ASE, the Storage Platforms organization develops the systems that store, protect, and serve Apple's data at massive scale, with a mission to deliver storage that is durable, secure, highly available, and operated with excellence. Engineers on this team have the rare opportunity to work on low-level storage optimized for the underlying storage devices, large-scale distributed systems, and high-performance IO stacks operating at mission-critical levels of availability and durability.

Each component is being built from the ground up using first principles to unlock optimization opportunities at every layer of the stack. Being part of the Apple Services Engineering organization opens the door to cross-functional influence and significant organizational impact.

If you are passionate about large-scale distributed systems, operational excellence, and creating resilient platforms that enable innovation across Apple, we would love to hear from you.

Description

We are seeking a highly skilled, collaborative, and pragmatic Storage Site Reliability Engineer to join our team. In this role, you will help build and operate reliable, scalable storage infrastructure that supports rapidly growing platform needs. You will partner with cross-functional teams across software engineering, compute, networking, and infrastructure to design and implement automation, improve observability, strengthen incident response, and enhance the overall reliability of the platform.

The team contributes to all major aspects of storage deployment infrastructure, including maintenance automation, backup and recovery services, monitoring and alerting tooling, dashboards, deployment architecture, and database improvements focused on stability, performance, and scale. You will also play an important role in shaping the evolution of the platform as it scales by orders of magnitude.

Success in this role requires a passion for large-scale distributed systems, strong problem-solving ability, excellent communication, and a strong customer-focused mindset when working with internal platform users. Experience working effectively in a distributed team environment is highly valued.

Minimum Qualifications

3+ years of experience in Site Reliability Engineering or infrastructure engineering

Strong analytical and problem-solving skills, with careful attention to detail

Experience designing, building, or operating storage systems

2+ years of programming experience in one or more of the following languages: Rust, C++, Java, or C#

Experience with scripting languages such as Bash, Python, or Perl

Strong understanding of operating systems fundamentals, including multithreading, memory management, networking, storage, performance, and scalability

Bachelor's degree in Computer Science, a related engineering field, or equivalent practical experience

Preferred Qualifications

Excellent knowledge of software testing methodologies and practices

Deep understanding of core computer science concepts, including data structures, algorithms, and concurrency

Solid grasp of distributed systems fundamentals such as fault tolerance, consistency, and distributed rate limiting

Experience designing and operating large-scale distributed systems such as databases or storage platforms

Proficiency with UNIX/Linux

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions, and AI may have been used to create this description. The position description has been reviewed for accuracy, and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 90733111
  • Position Id: 9732a485744105b60c66bea601950acf