Immediate Need: Senior Site Reliability Engineer (Google Cloud Platform)

Hybrid in Atlanta, GA, US • Posted 1 day ago • Updated 1 day ago

Contract W2

Contract Independent

Contract Corp To Corp

12 Months

Hybrid

Depends on Experience

Job Details

Skills

SRE
GCP
Terraform
Kubernetes
Harness

Summary

Greetings from SRINAV INC.

Please review the requirement below and let us know if you are interested.

Position: Senior Site Reliability Engineer (Google Cloud Platform)

Location: Atlanta, GA (hybrid, 3 days a week)

Duration: Long term

Job Summary

We are seeking a highly skilled and proactive Senior Specialist, Site Reliability Engineering (SRE), to help drive the reliability, scalability, and performance of our critical platforms while bringing deep technical expertise in Google Cloud Platform. This role is ideal for a senior-level engineer who thrives in distributed systems and is passionate about automation, observability, operational excellence, and modern cloud practices.

As a Senior Specialist, you'll work on complex reliability challenges, lead technical initiatives, and collaborate across engineering, product, and infrastructure teams to ensure our systems are resilient and efficient.

Job Description

Reliability Engineering

Architect and implement solutions that improve system reliability, scalability, and performance across Google Cloud Platform-based services.

Define and manage SLIs, SLOs, and error budgets for critical systems.

Automate operational tasks, reduce toil, and improve the reliability posture of our environments.

Influence system and application architecture to ensure reliability is designed from the beginning.

Incident Management and Root Cause Analysis

Serve as the technical lead during major incidents and drive restoration efforts.

Conduct detailed root cause analysis and deliver long-term corrective actions.

Champion and facilitate blameless postmortems and continuous improvement practices.

Cloud Architecture and Operations (Google Cloud Platform Focused)

Design, architect, and improve Google Cloud Platform infrastructure, including VPC design, Cloud DNS, load balancing, Cloud Armor or equivalents for WAF and filtering, cloud storage patterns, managed compute platforms such as GKE and Cloud Run, and data warehouse platforms such as BigQuery.

Collaborate with teams to implement resilient multi-zone and multi-region cloud architectures.

Lead the design and implementation of disaster recovery strategies and automated failover patterns within Google Cloud Platform.

Manage and optimize core Google Cloud Platform services such as IAM, service accounts, logging, and network controls.

Apply governance guardrails for secure multi-project environments using tools such as Google Cloud Platform organization policies, Cloud Identity, and related controls.

Automation and Infrastructure as Code

Build infrastructure using Terraform and maintain consistent, scalable IaC patterns.

Create automation using Python, Bash, PowerShell, or similar languages.

Participate in CI/CD pipeline improvements and ensure high-quality deployments into Google Cloud Platform environments.

Monitoring & Tooling

Enhance observability through metrics, logs, and tracing using tools such as Prometheus, Grafana, Google Cloud Operations Suite, or similar solutions.

Build dashboards, alerts, and automated remediation systems that support reliability and performance goals.

Analyze cloud-level logs such as VPC Flow Logs and Cloud Audit Logs to strengthen security and performance.

Technical Leadership

Collaborate with security and software engineering teams to drive reliability and cloud excellence.

Influence system design and architecture to embed reliability from the ground up.

Stay current with Google Cloud Platform capabilities and recommend improvements to enhance performance, security, and efficiency.

Required Skills

7 or more years of experience in SRE, DevOps, cloud engineering, or infrastructure engineering.
Strong experience with Google Cloud Platform architecture, networking, identity, and managed services.
Expertise with Kubernetes and container platforms.
Hands-on experience implementing infrastructure as code using Terraform.
Strong proficiency with modern observability stacks.
Experience in Python, PowerShell, or similar languages.
Experience with orchestration platforms such as Harness.
Proven ability to diagnose and solve complex reliability problems in distributed systems.
Experience leveraging AI tools to enhance workflow automation, experimentation, and problem solving.
Excellent communication skills and the ability to influence outcomes across teams.

Preferred Skills

Experience in regulated or high-availability environments (e.g., financial services, healthcare).
Familiarity with chaos engineering, performance optimization, and capacity planning.
Software development background using languages such as Python or Go.
Experience designing multi-region, fault-tolerant architectures in Google Cloud Platform.

Dice Id: 10114049
Position Id: 8996758

Contact the job poster

Surendra Venkat

Recruiter @ Srinav Inc.
