Site Reliability Engineer (SRE)

Overview

On Site

$50 - $58

Contract - W2

Contract - 12 Month(s)

Skills

Amazon Web Services

Analytical Skill

Ansible

AppDynamics

Bash

Capacity Management

Change Management

Cloud Computing

Collaboration

Configuration Management

Continuous Delivery

Continuous Integration

Dashboard

Data Engineering

Data Flow

DevOps

Extract, Transform, Load

FOCUS

Management

Finance

Good Clinical Practice

Google Cloud Platform

Grafana

Incident Management

Microsoft Azure

Operational Excellence

Performance Tuning

Python

Root Cause Analysis

SAFE

SaaS

Scalability

Scripting

Splunk

Job Details

Job Title: Site Reliability Engineer (SRE), Data Applications
Location: Charlotte, NC
Experience: 8+ Years

Job Overview:
We are seeking an experienced Site Reliability Engineer (SRE) with strong expertise in full-stack observability across data applications within SaaS, hybrid cloud, and on-premises environments. This role will focus on ensuring reliability, scalability, and performance of ETL pipelines, integrations, and data systems through proactive monitoring, alerting, and automation.

Key Responsibilities:

Design and implement robust observability solutions for data applications across SaaS, hybrid, and on-prem environments.
Monitor ETL pipelines, integrations, and data flows to ensure stability and performance.
Develop proactive alerts and build dashboards for ETL failures, latency spikes, and data loss or integrity issues.
Drive root cause analysis, incident response, and resolution for ETL and data pipeline-related issues.
Enhance reliability and scalability of systems through performance tuning and automation.
Implement and manage change management processes that ensure safe, coordinated deployments.
Collaborate closely with Data Engineering, DevOps, and Platform teams to ensure consistent operational excellence.

Required Skills:

Strong experience in full-stack observability and SRE support for large-scale data applications.
Hands-on expertise with observability tools such as Grafana, Splunk, Prometheus, and AppDynamics.
Proven experience with scripting and automation using Python and Bash.
Configuration management and automation experience with Ansible.
Deep understanding of ETL tools and frameworks, with a focus on reliability and performance.
Exposure to SaaS platform support, hybrid cloud, and on-prem infrastructure monitoring.
Experience in performance tuning, proactive monitoring, and capacity planning.
Strong understanding of incident management and change management best practices.

Preferred Qualifications:

Background in large-scale enterprise data systems or financial data environments.
Familiarity with CI/CD pipelines and cloud platforms (AWS, Azure, Google Cloud Platform).
Excellent analytical and troubleshooting skills.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
