Storage Site Reliability Engineer - Apple Services Engineering

Washington, WA, US • Posted 4 days ago • Updated 1 day ago
Full Time
On-site

Job Details

Skills

  • IaaS
  • IO
  • Stacks Blockchain
  • Optimization
  • Operational Excellence
  • Innovation
  • Software Engineering
  • Incident Management
  • Backup
  • Recovery
  • Dashboard
  • Communication
  • Virtual Team
  • Reliability Engineering
  • Analytical Skill
  • Conflict Resolution
  • Problem Solving
  • Attention To Detail
  • Data Storage
  • Rust
  • C++
  • Java
  • C#
  • Scripting
  • Bash
  • Python
  • Perl
  • Operating Systems
  • Multithreading
  • Management
  • Computer Networking
  • Scalability
  • Software Testing
  • Computer Science
  • Data Structure
  • Algorithms
  • Concurrent Computing
  • Database
  • Storage
  • Unix
  • Linux

Summary

Apple Services Engineering (ASE) designs, builds, and operates the cloud infrastructure, server systems, and platform technologies that power many of Apple's most beloved experiences.

Within ASE, the Storage Platforms organization develops the systems that store, protect, and serve Apple's data at massive scale, with a mission to deliver storage that is durable, secure, highly available, and operated with excellence. Engineers on this team will have the rare opportunity to work on device-optimized low-level storage, large-scale distributed systems, and high-performance IO stacks operating at mission-critical levels of availability and durability.

Each component is being built from the ground up using first principles to unlock optimization opportunities at every layer of the stack. Being part of the Apple Services Engineering organization opens the door to exerting cross-functional influence and making a more significant organizational impact.

If you are passionate about large-scale distributed systems, operational excellence, and creating resilient platforms that enable innovation across Apple, we would love to hear from you.

We are seeking a highly skilled, collaborative, and pragmatic Storage Site Reliability Engineer to join our team. In this role, you will help build and operate reliable, scalable storage infrastructure that supports rapidly growing platform needs. You will partner with cross-functional teams across software engineering, compute, networking, and infrastructure to design and implement automation, improve observability, strengthen incident response, and enhance the overall reliability of the platform.

The team contributes to all major aspects of storage deployment infrastructure, including maintenance automation, backup and recovery services, monitoring and alerting tooling, dashboards, deployment architecture, and database improvements focused on stability, performance, and scale. You will also play an important role in shaping the evolution of the platform as it scales by orders of magnitude.

Success in this role requires a passion for large-scale distributed systems, strong problem-solving ability, excellent communication, and a strong customer-focused mindset when working with internal platform users. Experience working effectively in a distributed team environment is highly valued.

  • 3+ years of experience in Site Reliability Engineering or infrastructure engineering
  • Strong analytical and problem-solving skills, with careful attention to detail
  • Experience designing, building, or operating storage systems
  • 2+ years of programming experience in one or more of the following languages: Rust, C++, Java, or C#
  • Experience with scripting languages such as Bash, Python, or Perl
  • Strong understanding of operating systems fundamentals, including multithreading, memory management, networking, storage, performance, and scalability
  • Bachelor's degree in Computer Science, a related engineering field, or equivalent practical experience

  • Excellent knowledge of software testing methodologies and practices
  • Deep understanding of core computer science concepts, including data structures, algorithms, and concurrency
  • Solid grasp of distributed systems fundamentals such as fault tolerance, consistency, and distributed rate limiting
  • Experience designing and operating large-scale distributed systems such as databases or storage platforms
  • Proficient with UNIX/Linux

  • Dice Id: 90733111
  • Position Id: 9732a485744105b60c66bea601950acf