Site Reliability Engineer

Overview

Full Time

Part Time

Accepts corp to corp applications

Contract - W2

Contract - Independent

Skills

Grafana

Terraform

Ansible

Management

Continuous Integration

Continuous Delivery

Operational Efficiency

Collaboration

Capacity Management

Performance Tuning

Root Cause Analysis

Regulatory Compliance

Disaster Recovery

Linux

Unix

Computer Networking

Cloud Computing

Amazon Web Services

Microsoft Azure

Google Cloud

Google Cloud Platform

Scripting

Python

Bash

Orchestration

Docker

Kubernetes

Incident Management

Job Details

Design, build, and maintain highly available, scalable, and reliable systems.
Implement monitoring, alerting, and observability using tools such as Prometheus, Grafana, ELK, and Datadog.
Automate infrastructure and deployments using Terraform, Ansible, or similar IaC tools.
Manage CI/CD pipelines for continuous delivery and operational efficiency.
Collaborate with development teams to improve system performance, reliability, and incident response.
Conduct capacity planning, performance tuning, and root cause analysis.
Ensure security and compliance, and maintain disaster recovery strategies.

Required Skills

Strong knowledge of Linux/Unix systems, networking, and cloud platforms (AWS, Azure, Google Cloud Platform).
Proficiency in scripting languages (Python, Bash, Go).
Experience with containers and orchestration (Docker, Kubernetes).
Familiarity with monitoring tools and incident management practices.
Understanding of SRE principles (SLIs, SLOs, SLAs).

About Purple Drive Technologies LLC