Staff Platform Engineer – DevOps / SRE (GKE & Cloud Infrastructure)

Remote • Posted 2 hours ago • Updated 2 hours ago
Contract W2
Contract Independent
No Travel Required
Depends on Experience

Job Details

Skills

  • Cadence
  • Artificial Intelligence
  • Capacity Management
  • Continuous Integration
  • Continuous Delivery
  • GitHub
  • Google Cloud Platform
  • IaaS
  • Kubernetes
  • Regulatory Compliance
  • Python
  • Virtual Private Cloud
  • Virtual Private Network
  • Terraform
  • Helm

Summary

Location: Remote, USA
Team: Medicare Claims Engineering – Platform Initiative


Role Overview

As a Staff Platform Engineer, you will serve as the senior technical authority for the cloud infrastructure, deployment automation, and operational reliability of our mission-critical claims processing platform. This is a high-impact individual contributor role focused on a strategic initiative to migrate an existing RPA portfolio (Automation Anywhere) onto a modern, code-driven architecture built on Temporal.io, Python/Playwright, and Google Kubernetes Engine (GKE).

You will own the "ground-to-cloud" lifecycle—from GKE cluster topology and Terraform-managed infrastructure to hybrid networking and full-stack observability. Operating under strict HIPAA governance, you will ensure the platform is resilient, secure, and capable of handling complex, high-volume healthcare workflows.


Key Responsibilities

  • Infrastructure Leadership: Own the GKE architecture (regional private clusters, autoscaling, network policies) and lead architectural decisions on resiliency, capacity planning, and cost efficiency, including committed use discount (CUD) strategy.

  • Hybrid Connectivity: Design and manage complex hybrid networking from Google Cloud Platform to on-prem systems using Cloud VPN/Interconnect, VPC peering, and custom DNS resolution patterns.

  • Automation & IaC: Champion a pure Infrastructure as Code (Terraform) approach and manage robust CI/CD pipelines that incorporate container scanning, image signing, and progressive rollouts.

  • Security & Compliance: Partner with Security teams to implement HIPAA-aligned controls, including Identity-Aware Proxy (IAP) for internal apps, private load balancers, and rigorous audit logging.

  • SRE & Observability: Establish the SRE practice from the ground up—defining SLIs/SLOs, managing on-call rotations, and building the end-to-end observability stack (Managed Prometheus, Grafana, OpenTelemetry, and Cloud Trace).

  • Mentorship: Lead design and code reviews for all infrastructure changes. Mentor mid-to-senior level engineers on cloud-native best practices and SRE discipline.


Required Qualifications

  • Extensive Experience: Multiple years of senior-level experience in DevOps, SRE, or Platform Engineering for large-scale production environments.

  • K8s Expertise: Deep production expertise with Kubernetes (GKE preferred), specifically node pool design, Pod Disruption Budgets, Workload Identity, and Helm.

  • Google Cloud Platform Mastery: Strong hands-on experience with the Google Cloud Platform ecosystem, including Cloud SQL, VPC, Cloud Logging, Cloud Trace, and IAM.

  • Modern Tooling: Expert-level proficiency with Terraform and Helm.

  • Reliability Mindset: Proven track record with high-availability architectures, disaster recovery (RTO/RPO), and multi-zone failover strategies.

  • Problem Solving: Advanced troubleshooting skills across Linux, networking, and database layers.

  • AI-Assisted Development: Experience leveraging tools like GitHub Copilot to accelerate prototyping and increase test coverage.


Preferred Qualifications

  • Distributed Workflows: Experience operating Temporal.io or Cadence in a production setting.

  • Database Operations: Expertise in PostgreSQL at scale on Cloud SQL (high availability, point-in-time recovery, schema migrations).

  • Enterprise Networking: Hands-on experience with Dedicated/Partner Interconnect for hybrid cloud connectivity.

  • Data Systems: Familiarity with Elasticsearch operations for search/visibility workloads.

  • Advanced SRE: Experience with Chaos Engineering, AIOps, or resilience testing.

  • Regulated Industries: Background in Healthcare or Finance; deep understanding of HIPAA/PHI controls and audit requirements.

  • Dice Id: 10430747
  • Position Id: 8966933