Overview
On Site
Depends on Experience
Contract - W2
Contract - Independent
Contract - 12 Month(s)
Skills
Site Reliability Engineer
SRE
Google Cloud Platform
GCP
Terraform
Ansible
Kubernetes
Job Details
Title: Site Reliability Engineer (SRE), Google Cloud Platform
Location: Atlanta, GA
Duration: Long-term Contract
Position Overview
We are seeking a Site Reliability Engineer (SRE) with strong experience in Google Cloud Platform (GCP). The ideal candidate has hands-on experience in reliability engineering, production operations, and cloud infrastructure automation. This is not a DevOps role but a true SRE position focused on scalability, reliability, and performance.
Key Responsibilities
- Design, build, and maintain scalable, reliable, and secure Google Cloud Platform infrastructure.
- Drive automation across deployment, monitoring, and incident response processes.
- Manage system uptime, reliability, and performance through proactive monitoring and capacity planning.
- Partner with development teams to improve application observability and operational readiness.
- Implement and maintain CI/CD pipelines focused on reliability and resilience.
- Troubleshoot complex production issues and drive post-incident analysis and RCA documentation.
- Define SLOs, SLIs, and SLAs to measure and improve system reliability.
- Optimize cost and performance across Google Cloud Platform environments.
Required Skills & Experience
- 8+ years of hands-on experience as a Site Reliability Engineer (not just DevOps).
- Strong expertise in Google Cloud Platform services.
- Experience with infrastructure-as-code tools (Terraform, Ansible, or similar).
- Solid background in Linux administration, networking, and system performance tuning.
- Strong scripting skills in Python, Bash, or Go.
- Proficiency with monitoring and logging tools such as Prometheus, Grafana, and Cloud Monitoring.
- Experience implementing CI/CD pipelines using Jenkins, GitLab, or similar tools.
- Deep understanding of incident management, postmortem analysis, and reliability best practices.
Nice to Have
- Google Professional Cloud DevOps Engineer or Professional Cloud Architect certification.
- Experience with Kubernetes (GKE) or container orchestration systems.
- Background in security and compliance in cloud environments.