Site Reliability Engineer (Edge Services), Infrastructure Services

Austin, TX, US • Posted 2 days ago • Updated 8 hours ago
Full Time
On-site

Job Details

Skills

  • Pivotal
  • Bridging
  • Continuous Integration
  • Continuous Delivery
  • Linux
  • Computer Networking
  • HTTP
  • HTTPS
  • TLS
  • Python
  • Grafana
  • FOCUS
  • Data Structure
  • Algorithms
  • Budget
  • Release Management
  • Incident Management
  • Management
  • Cloud Computing
  • Amazon Web Services
  • Google Cloud
  • Google Cloud Platform
  • Microsoft Azure
  • Terraform
  • Ansible
  • Kubernetes
  • Service Design
  • Fluency
  • Generative Artificial Intelligence (AI)
  • Software Engineering
  • Debugging
  • Workflow
  • Artificial Intelligence

Summary

We are seeking a proactive Site Reliability Engineer to champion the evolution of our production ecosystems. In this role, you will help drive our observability vision, moving beyond simple uptime metrics to build a sophisticated, data-driven reliability framework. You will play a pivotal role in ensuring our services are resilient, scalable, and observable, bridging the gap between complex distributed systems and seamless user experiences.

As a key member of the SRE team, your mission is to treat operations as a software problem. You will focus on designing and implementing a next-generation observability and alerting strategy that prioritizes high-cardinality data and meaningful signals over noise. You will spend your time building "self-healing" systems, reducing toil through aggressive automation, and partnering with development teams to bake reliability into the CI/CD pipeline. Your goal is to move us toward a proactive stance where performance bottlenecks are identified and mitigated before they impact the customer.

  • Understanding of Linux internals and deep networking expertise, including HTTP/2, HTTP/3 (QUIC), and HTTPS/TLS. You should be comfortable debugging protocol-level issues and optimizing traffic flow.
  • Proven ability to automate repetitive tasks and complex workflows using Python or Go.
  • Experience configuring and managing modern monitoring suites (e.g., Prometheus, Grafana, ClickHouse) with a focus on creating actionable, high-signal alerting.
  • Grasp of Data Structures and Algorithms (DSA) to write efficient, performant code and troubleshoot complex system bottlenecks.
  • Practical knowledge of SLIs, SLOs, Error Budgets, Release Management, and Incident Management to drive engineering priorities.

  • Experience managing cloud environments (AWS, Google Cloud Platform, or Azure) using Terraform, Ansible, or Pulumi.
  • Hands-on experience orchestrating, scaling, and securing containerized workloads via Kubernetes.
  • A track record of leading "blameless post-mortems" and using those insights to harden the system against future failures.
  • Ability to consult with product teams on service design to improve long-term maintainability.
  • A proactive engineering mindset focused on shifting from "fixing things when they break" to "designing things so they don't break" (or so they fail gracefully).
  • Practical fluency in applying Generative AI tools within SRE and software engineering workflows (from accelerating observability query construction and alert design to building AI-assisted debugging and triage capabilities that encode institutional knowledge into repeatable, context-aware workflows), with the engineering rigour to validate, own, and iterate on AI-assisted outputs in production-adjacent contexts.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 90733111
  • Position Id: 242f1e3acf2deea8ea53305154cf664e