Senior Customer Reliability Engineer (CRE)

  • California City, CA
  • Posted 5 days ago | Updated 1 day ago

Overview

On Site
Full Time
Part Time
Accepts corp to corp applications
Contract - W2
Contract - Independent

Skills

POSTGRES
DEVOPS
SCALA
CLOUD INFRASTRUCTURE
VPC
C++
AWS
NGINX
HASKELL
PYTHON

Job Details

Senior Customer Reliability Engineer (CRE)

Client: Arista, California City, CA (Hybrid, 3 days/week as per client)

This is not a traditional operations role. You'll own critical, hands-on operational tasks while leading efforts to eliminate manual toil through automation and systems engineering.

You'll work closely with engineering, tooling and platform teams to ensure operational excellence and system reliability across customer deployments.

Key Responsibilities

Phase 1: Stabilize and Map (0-6 Months)

  • Own operational workload: deployments, upgrades, incident response.
  • Ensure stability while identifying manual pain points.

Phase 2: Automate and Influence (6-18 Months)

  • Automate repetitive operational tasks using scripting and IaC.
  • Develop internal tooling; collaborate with platform teams to reduce manual effort.

Phase 3: Architect and Evangelize (2+ Years)

  • Define SLOs, improve observability and influence product design for reliability.
  • Promote SRE principles across engineering.

Required Skills

  • DevOps/SRE Experience: Strong background in Site Reliability or DevOps engineering.
  • Linux & Networking: Strong command of Linux systems and networking fundamentals (TCP/IP, DNS, routing).
  • Cloud Infrastructure: Hands-on experience with AWS (VPC, EC2, IAM, S3) and Terraform.
  • Monitoring & Observability: Experience building and managing telemetry pipelines (metrics, logs, traces).
  • Automation & Coding: Proficient in Python or Go, with strong Bash scripting skills.
  • Incident Management: Skilled at stabilizing crises and designing long-term prevention systems.

Preferred Skills

  • Experience with Kafka, Postgres, nginx, systemd.
  • Familiarity with Nix/NixOS (training provided if new).
  • Exposure to functional programming (Scala, Haskell, Rust, etc.) is a plus.

You will directly impact customer success while driving the evolution of Arista's reliability engineering culture, moving from manual fixes to automated, scalable systems.

