Berkeley Heights, New Jersey • Today
Job Description: 8+ years in Site Reliability Engineering, Production Engineering, or equivalent roles. Deep expertise in distributed systems, resilience engineering, and large-scale production operations. Strong proficiency with observability stacks: metrics, logs, traces, Splunk, ELK, New Relic, synthetic monitoring, APM. Advanced experience with service-level objectives (SLOs), SLIs, error budgets, and reliability governance. Expertise in Kubernetes, container orchestration, and workload reliabi
Contract
Depends on Experience