Site Reliability Engineer

Overview

Hybrid

Depends on Experience

Contract - W2

Contract - 12 Month(s)

Skills

Site Reliability Engineer

Red Hat OpenShift

Kubernetes

Azure

AKS

Terraform

Python

OpenShift

RKE

EKS

Ansible

Golang

Shell Scripting

Linux

DNS

DHCP

Kerberos

Windows Authentication expertise

Job Details

Title: Site Reliability Engineer
Location: Charlotte, NC; Plano, TX; Pennington, NJ
Hybrid: 3 Days Onsite (Mandatory)
Interview: Mandatory Onsite Interview
Shift: 9 AM to 5 PM EST. Must be flexible to work Tue-Sat or Sun-Thu (rotation every 2-3 months)
Duration: 12+ Months (Likely Extension)

Top Required Skills (Must Have)

Red Hat OpenShift / Kubernetes (5+ years in Enterprise Environment)
Azure / AKS (Strong Cloud Experience)
Terraform
Python

Job Description

The client is seeking a highly skilled Site Reliability Engineer (SRE) to join a mission-critical infrastructure engineering team supporting Kubernetes, OpenShift, cloud platforms, and automation frameworks.

You will be responsible for ensuring the uptime, reliability, performance, and scalability of the container and cloud platforms. The role requires deep technical expertise in Kubernetes, cloud-native tools, Linux engineering, Terraform, Python automation, and enterprise-scale system reliability.

Responsibilities

Maintain and support large-scale container platforms (OpenShift, Kubernetes, RKE, AKS, EKS, GKE) across on-prem and cloud environments.
Monitor and troubleshoot performance, security, networking, and deployment issues.
Perform incident and problem management, including blameless root cause analyses (RCAs).
Analyze and remediate vulnerabilities in container environments.
Collaborate with engineering, cloud, SRE, and operations teams.
Conduct deep-dive investigations into systemic reliability issues.
Implement automated solutions using Python, Ansible, Golang, and shell scripting.
Support CI/CD pipelines including Git, Jenkins, and GitOps frameworks.
Manage IAM components including Active Directory, Azure AD, and SSO / Ping Identity.
Provide Linux/Windows administration support across hybrid environments.
Use monitoring tools such as Prometheus, Dynatrace, Splunk, and Azure Monitor.

Required Skills

5+ years of hands-on experience with Kubernetes, OpenShift, RKE, AKS, and EKS.
Strong knowledge of Python, Ansible, Golang, and shell scripting.
Advanced expertise in Linux, DNS, DHCP, Kerberos, and Windows Authentication.
Strong cloud experience: Azure, AWS, or Google Cloud Platform.
Experience with Terraform and CI/CD (Git, Jenkins, GitOps).
Experience with container security and vulnerability remediation, plus FinOps awareness.
Strong troubleshooting, problem-solving, and reliability engineering skills.
Excellent communication and ability to work independently.


About Black Rock Group