Site Reliability Engineer

Overview

Hybrid
Depends on Experience
Contract - W2
Contract - 12 Month(s)

Skills

Site Reliability Engineer
Red Hat OpenShift
Kubernetes
Azure
AKS
Terraform
Python
OpenShift
RKE
EKS
Ansible
Golang
Shell Scripting
Linux
DNS
DHCP
Kerberos
Windows Authentication

Job Details

Title: Site Reliability Engineer
Location: Charlotte, NC; Plano, TX; Pennington, NJ
Hybrid: 3 Days Onsite (Mandatory)
Interview: Mandatory Onsite Interview
Shift: 9 AM to 5 PM EST. Must be flexible to work Tue-Sat or Sun-Thu (rotation every 2-3 months)
Duration: 12+ Months (Likely Extension)

Top Required Skills (Must Have)

  1. Red Hat OpenShift / Kubernetes (5+ years in Enterprise Environment)
  2. Azure / AKS (Strong Cloud Experience)
  3. Terraform
  4. Python

Job Description

The client is seeking a highly skilled Site Reliability Engineer (SRE) to join a mission-critical infrastructure engineering team supporting Kubernetes, OpenShift, cloud platforms, and automation frameworks.

You will be responsible for ensuring uptime, reliability, performance, and scalability of the container and cloud platforms. The role requires deep technical expertise across Kubernetes, cloud-native tools, Linux engineering, Terraform, Python automation, and enterprise-scale system reliability.

Responsibilities

  • Maintain and support large-scale Container Platforms (OpenShift, Kubernetes, RKE, AKS, EKS, GKE) across on-prem and cloud.
  • Monitor and troubleshoot performance, security, networking, and deployment issues.
  • Perform incident and problem management, conducting blameless RCAs.
  • Analyze and remediate vulnerabilities in container environments.
  • Collaborate with engineering, cloud, SRE, and operations teams.
  • Conduct deep-dive investigations into systemic reliability issues.
  • Implement automated solutions using Python, Ansible, Golang, Shell.
  • Support CI/CD pipelines including Git, Jenkins, and GitOps frameworks.
  • Manage IAM components including Active Directory, Azure AD, SSO / Ping Identity.
  • Provide Linux/Windows administration support across hybrid environments.
  • Utilize monitoring tools like Prometheus, Dynatrace, Splunk, Azure Monitor.

Required Skills

  • 5+ years of hands-on experience with Kubernetes, OpenShift, RKE, AKS, EKS.
  • Strong knowledge of Python, Ansible, Golang, Shell Scripting.
  • Advanced Linux, DNS, DHCP, Kerberos, Windows Authentication expertise.
  • Strong cloud experience: Azure, AWS, or Google Cloud Platform.
  • Experience with Terraform and CI/CD (Git, Jenkins, GitOps).
  • Experience with container security and vulnerability remediation, plus FinOps awareness.
  • Strong troubleshooting, problem-solving, and reliability engineering skills.
  • Excellent communication and ability to work independently.

About Black Rock Group