Immediate Need !! Senior Site Reliability Engineer (Google Cloud Platform)

Hybrid in Atlanta, GA, US • Posted 1 day ago • Updated 1 day ago
Contract W2
Contract Independent
Contract Corp To Corp
12 Months
Hybrid
Depends on Experience

Job Details

Skills

  • SRE
  • GCP
  • Terraform
  • Kubernetes
  • Harness

Summary

Greetings from SRINAV INC.

Kindly find the below requirement and share your interest.

Position : Senior Site Reliability Engineer (Google Cloud Platform)

Location : Atlanta, GA (Hybrid, 3 days a week)
Duration : Long term

Job Summary

We are seeking a highly skilled and proactive Senior Specialist, Site Reliability Engineering (SRE) to help drive the reliability, scalability, and performance of our critical platforms. This role is ideal for a senior-level engineer who combines deep technical expertise in Google Cloud Platform with a passion for automation, observability, and operational excellence, and who thrives in distributed systems and modern cloud practices.

As a Senior Specialist, you'll work on complex reliability challenges, lead technical initiatives, and collaborate across engineering, product, and infrastructure teams to ensure our systems are resilient and efficient.

Job Description

  • Reliability Engineering

Architect and implement solutions that improve system reliability, scalability, and performance across Google Cloud Platform-based services.

Define and manage SLIs, SLOs, and error budgets for critical systems.

Automate operational tasks, reduce toil, and improve the reliability posture of our environments.

Influence system and application architecture to ensure reliability is designed from the beginning.

  • Incident Management and Root Cause Analysis

Serve as the technical lead during major incidents and drive restoration efforts.

Conduct detailed root cause analysis and deliver long-term corrective actions.

Champion and facilitate blameless postmortems and continuous improvement practices.

  • Cloud Architecture and Operations (Google Cloud Platform Focused)

Design, architect, and improve Google Cloud Platform infrastructure, including VPC design, Cloud DNS, load balancing, Cloud Armor or equivalent WAF and filtering services, cloud storage patterns, managed compute platforms such as GKE and Cloud Run, and data warehouse platforms such as BigQuery.

Collaborate with teams to implement resilient multi-zone and multi-region cloud architectures.

Lead the design and implementation of disaster recovery strategies and automated failover patterns within Google Cloud Platform.

Manage and optimize core Google Cloud Platform services such as IAM, service accounts, logging, and network controls.

Apply governance guardrails for secure multi-project environments using tools such as Google Cloud Platform Organization Policies, Cloud Identity, and related controls.

  • Automation and Infrastructure as Code

Build infrastructure using Terraform and maintain consistent, scalable IaC patterns.

Create automation using Python, Bash, PowerShell, or similar languages.

Participate in CI/CD pipeline improvements and ensure high-quality deployments into Google Cloud Platform environments.

  • Monitoring & Tooling

Enhance observability through metrics, logs, and tracing using tools such as Prometheus, Grafana, Google Cloud Operations Suite, or similar solutions.

Build dashboards, alerts, and automated remediation systems that support reliability and performance goals.

Analyze cloud-level logs such as VPC Flow Logs and Cloud Audit Logs to strengthen security and performance.

  • Technical Leadership

Collaborate with security and software engineering teams to drive reliability and cloud excellence.

Influence system design and architecture to embed reliability from the ground up.

Stay current with Google Cloud Platform capabilities and recommend improvements to enhance performance, security, and efficiency.

Required Skills

  • 7 or more years of experience in SRE, DevOps, cloud engineering, or infrastructure engineering.
  • Strong experience with Google Cloud Platform architecture, networking, identity, and managed services.
  • Expertise with Kubernetes and container platforms.
  • Hands-on experience implementing infrastructure as code using Terraform.
  • Strong proficiency with modern observability stacks.
  • Experience in Python, PowerShell, or similar languages.
  • Experience with orchestration platforms such as Harness.
  • Proven ability to diagnose and solve complex reliability problems in distributed systems.
  • Experience leveraging AI tools to enhance workflow automation, experimentation, and problem solving.
  • Excellent communication skills and the ability to influence outcomes across teams.

Preferred Skills

  • Experience in regulated or high-availability environments (e.g., financial services, healthcare).
  • Familiarity with chaos engineering, performance optimization, and capacity planning.
  • Software development background using languages such as Python or Go.
  • Experience designing multi-region, fault-tolerant architectures in Google Cloud Platform.

  • Dice Id: 10114049
  • Position Id: 8996758
Contact the job poster
Surendra Venkat

Recruiter @ Srinav Inc.