Houston, Texas
•
Today
Job Description
As a Site Reliability Engineer, you will be responsible for:
Operational Excellence & Incident Management
- Maintain and monitor production systems for availability, latency, and performance.
- Lead incident response efforts, including communication, resolution, and postmortem documentation.
- Design and implement health checks, alerting systems, and automated remediation workflows.
- Drive root cause analysis and implement permanent resolutions for recurring issues.
Observ
Full-time