Senior Site Reliability Engineer (Linux)
We are looking for a Senior SRE (Linux) to build and operate the systems that run on every host across our infrastructure.
Our team owns OS configuration, automation, AMIs, and container base images across a large-scale hybrid environment (AWS + on-prem). If you enjoy working across layers—from automation to low-level debugging—this role sits right at that boundary.
What You’ll Do
• Build automation for Linux host lifecycle (config, patching, images)
• Own system services, base images, and infrastructure components
• Debug production issues across OS, performance, and service layers
• Work across codebases (C, Go, Python, Ruby) to diagnose and fix issues
• Lead projects from ambiguous problems to production
• Improve reliability through automation and system design
• Partner on security and FedRAMP requirements
• Participate in a sustainable on-call rotation (~16 days/year)
What Makes This Role Interesting
• Own the Linux layer across the entire platform
• Work across the stack: kernel → containers → automation
• Solve real tradeoffs: reliability, security, and developer experience
• High ownership with direct impact on how engineers run services
What We’re Looking For
• 7+ years working with Linux in production
• Strong automation skills (Python and/or Ruby; Ansible preferred)
• Experience debugging complex systems issues
• Comfortable working across cloud + on-prem environments
• U.S. Person required (FedRAMP; U.S.-based work)
Nice to Have
• Docker / Kubernetes
• AMIs or container image building
• Go, C, or other systems-level languages
• Experience with compliance environments (FedRAMP, NIST, etc.)
Team Culture
We value engineers who take ownership, dive deep, and build systems that make the “right thing” easy. Strong fundamentals matter more than specific tools.