The Triage Intelligence and Debug Engineering team operates at the deepest layers of Apple's software stack, with a mandate that spans the entire product ecosystem. The work we do directly shapes the reliability of over 2 billion active Apple devices: every crash triaged, every panic root-caused, and every automated pipeline we build contributes to the OS stability that hundreds of millions of people depend on every day. We partner closely with engineering teams across Apple to drive that reliability end-to-end, influencing how the entire organization detects, understands, and resolves systemic failures.
Description
We are looking for a curious and motivated Software Engineer with a passion for OS internals, automation, and intelligent systems. The impact of this role is broad and direct: your investigations will uncover failures affecting Apple's entire device ecosystem, your contributions will touch software running across iPhone, Mac, and Apple Silicon, and the automation pipelines you help build will shape how reliability engineering scales across Apple.
You will dig into the operating system to understand the true origin of crashes and panics: not just the symptom at the top of a stack trace, but the underlying system-level conditions that caused it. You will operationalize that understanding by encoding triage logic into automation pipelines, and you will explore how ML and AI techniques can make those pipelines smarter over time. You will collaborate across Software, Hardware, and Silicon teams to drive issues all the way to resolution.
This role is a strong fit for an engineer who is eager to learn how operating systems work at a deep level, excited to build scalable automation and intelligent tooling, and motivated by seeing their work have real, measurable impact.
Minimum Qualifications
Working knowledge of C and/or Swift for debug tooling or system-level development
Exposure to automation frameworks, CI/CD pipelines, or scalable test systems
Foundational understanding of OS internals: process and thread lifecycle, virtual memory, scheduling, synchronization, and system calls
Ability to read and reason through crash reports and kernel panics, including backtraces, register state, and basic memory analysis
Hands-on experience with debugging tools such as LLDB, GDB, or Instruments
Proficiency in Python for scripting, automation, and test infrastructure
Preferred Qualifications
Experience applying ML or AI techniques to systems problems such as crash clustering, log anomaly detection, failure classification, or intelligent alert prioritization
Familiarity with LLMs or generative AI tooling in an engineering context, such as prompt engineering, RAG pipelines, or AI-assisted debugging workflows
Prior coursework or project experience in OS internals, systems programming, or low-level debugging
Familiarity with Apple platform internals: XNU kernel, Darwin subsystems, IOKit, libdispatch, or dyld
Contributions to automation frameworks or developer tooling that improved engineering productivity
Experience or interest in collaborating across silicon, firmware, or platform systems teams
Enthusiasm for building tools and systems that empower other engineers