Job Details
Job Title: Site Reliability Engineer (SRE)
Location: [Remote/Hybrid/On-site]
Employment Type: Contract
Key Responsibilities:
Lead incident triage calls, drive quick resolutions, and conduct root cause analyses (RCAs).
Troubleshoot and resolve complex issues, documenting solutions for the team.
Collaborate with engineering, DevOps, and business teams to improve system reliability.
Optimize infrastructure using IaC (Terraform, Ansible), CI/CD, and observability tools.
Mentor junior SREs and establish SLOs, error budgets, and automation best practices.
Requirements:
Strong experience in SRE/DevOps, incident management, and cloud platforms (AWS, Google Cloud Platform, or Azure).
Expertise in monitoring tools (Prometheus, Grafana, Datadog) and automation.
Excellent stakeholder management and communication skills.