Site Reliability Engineer

Overview

On Site

$60 - $65

Contract - W2

Skills

Reliability Engineer

Amazon Web Services

Ansible

Budget

CHAOS

Cloud Computing

Computer Science

Continuous Delivery

Continuous Integration

Dashboard

DevOps

Docker

Documentation

Employment Authorization

FOCUS

GitHub

Good Clinical Practice

Google Cloud Platform

Grafana

High Availability

Incident Management

Jenkins

Leadership

Management

Mentorship

Microsoft Azure

Network Monitoring

Operational Excellence

Optimization

Pivotal

Python

Reliability Engineering

Root Cause Analysis

Scalability

Scripting

Service Level

Terraform

Workflow

Job Details

Job Title: Principal Site Reliability Engineer

Location: Washington, DC (Onsite Only)
Talent must reside in Washington, DC at time of submission
Position Type: Contract

Job Summary

We are seeking a Principal Site Reliability Engineer for a key
Randstad client based in Washington, DC. This senior-level position
plays a pivotal role in ensuring the reliability, scalability,
security, and performance of the organization's critical systems and
services. The ideal candidate will have deep technical knowledge in
SRE practices, infrastructure automation, CI/CD security, and
observability, along with strong leadership and mentoring
capabilities.

Responsibilities

Reliability & Operations

Define and manage Service Level Objectives (SLOs) and Service Level
Indicators (SLIs)

Own the error budget process

Lead incident response, root cause analysis, and postmortem documentation

Infrastructure Automation

Design and maintain cloud environments using Infrastructure as Code
(IaC) tools such as Terraform, Ansible, and CloudFormation

CI/CD Optimization & Security

Architect secure, high-performing CI/CD pipelines (e.g., GitHub
Actions, Jenkins)

Implement deployment strategies like canary, blue/green, and automated rollback

Observability & Telemetry

Develop observability solutions with metrics, logs, and traces using
tools like Prometheus, Grafana, Datadog, or ELK

Configure dashboards, alerts, and synthetic monitoring

Security & Compliance

Integrate security scanning tools (SAST, DAST, SBOM) into pipelines

Enforce security policies as code and ensure regulatory compliance

Cost & Capacity Management

Monitor cloud usage trends and optimize infrastructure for cost efficiency

Forecast resource requirements to maintain availability and performance

Internal Platform Enablement

Build reusable tools, platforms, and self-service frameworks

Improve developer workflows and consistency across teams

Mentorship & Technical Leadership

Serve as a technical mentor and thought leader

Establish and promote best practices in site reliability, operational
excellence, and secure system delivery

Required Qualifications

Education

Bachelor's degree in Computer Science, Engineering, or a related technical field

Experience

At least 5 years of experience in Site Reliability Engineering, DevOps, or
Platform Engineering

At least 3 years managing high-availability, cloud-native production
environments

Technical Skills

Cloud: Deep experience with AWS, Azure, or Google Cloud Platform (focus on Compute, IAM,
Networking, Monitoring)

IaC: Proficiency in Terraform, CloudFormation, Ansible

CI/CD: Hands-on experience with GitHub Actions, Jenkins, and modern
deployment strategies

Containers: Expertise with Docker, Kubernetes

Observability: Tools such as Prometheus, Grafana, ELK, Datadog, or CloudWatch

Programming & Scripting

Strong scripting skills in Python, Go, or Bash

Knowledge & Practices

Solid understanding of SRE principles (SLOs, incident management,
chaos engineering)

Experience building internal tools and documentation that promote best practices

Additional Information

Onsite Requirement: This is a non-remote role. Candidates must be
local to Washington, DC at the time of submission.

Work Authorization: [Insert if any restrictions apply e.g., USC/H1B, etc.]

Clearance Requirement: [Insert if applicable]

--

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
