Site Reliability Engineer, Physical Infrastructure

Cupertino, CA, US • Posted 12 hours ago • Updated 12 hours ago
Full Time
On-site

Job Details

Skills

  • Unix Administration
  • Mentorship
  • Collaboration
  • Quality Management
  • DevOps
  • Software Development
  • Swift
  • Python
  • Shell Scripting
  • Bash
  • Management
  • Orchestration
  • Kubernetes
  • Docker
  • Computer Networking
  • TCP/IP
  • DNS
  • HTTP
  • Conflict Resolution
  • Problem Solving
  • Communication
  • Computer Science
  • Build Automation
  • Unix
  • Linux Administration
  • Command-line Interface
  • Performance Analysis
  • Capacity Management
  • Incident Management
  • Generative Artificial Intelligence (AI)
  • Large Language Models (LLMs)
  • Scripting

Summary

We are looking for a creative and highly motivated Site Reliability Engineer to join our team. You'll need both depth and breadth of knowledge in physical infrastructure for a large-scale distributed environment, along with experience in Unix systems administration, DevOps, and data center infrastructure. If you are passionate about solving complex problems at scale, we want to hear from you!

The Systems and Infrastructure team builds and manages world-class services and physical infrastructure for Apple software engineers worldwide to build, test, and release Apple's software.

About Our Team: We are a team dedicated to engineering excellence, reusable design, and simplicity. We foster a supportive, growth-focused culture where we mentor each other and work together to build resilient, high-quality systems.

  • 3+ years of experience as a Site Reliability Engineer, DevOps Engineer, or Systems Admin focused on physical infrastructure in a large-scale distributed environment
  • Strong software development skills in a language like Swift, Go, or Python, and a high degree of comfort with shell scripting (Bash)
  • Hands-on experience building and managing systems with container orchestration tools (Kubernetes, Docker)
  • Deep understanding of networking (TCP/IP, DNS, HTTP) and experience using observability tools (monitoring, logging, tracing) to diagnose complex issues
  • Excellent problem-solving and communication skills, with a strong sense of ownership and drive
  • BS/MS in Computer Science, Engineering or related field

  • Build automation tools that eliminate routine tasks. Every manual process is an opportunity to code a solution
  • Experience with Unix/Linux systems administration and command-line diagnostic tools
  • Proven experience leading initiatives to reduce technical debt, refactor systems, or improve performance and latency
  • Expertise in performance analysis and capacity planning for physical infrastructure
  • Demonstrated ability to lead incident response for high-impact outages
  • Familiarity with using Generative AI (GenAI) or Large Language Models (LLMs) to accelerate operational tasks, such as automating runbooks, generating scripts, or analyzing incident data
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 90733111
  • Position Id: bf22b3e2f3bf762b634955c512947473