SRE Operations Support Engineer

Remote • Posted 2 hours ago • Updated 2 hours ago

Contract Corp To Corp

Contract W2

Contract Independent

No Travel Required

Remote

$55 - $60/hr

Job Details

Skills

SRE
Azure
L1 Support
DB Support
Platform Support
SRE metrics

Summary

Role - SRE Operations Support Engineer

Remote

Minimum experience: 9+ years in IT

Should have experience in L1 support, DB support, and platform support.

Own SRE metrics such as SLOs, SLIs, Error Budgets, MTTR, MTBF, availability KPIs, and system productivity metrics.

Analytical mindset with strong problem‑solving skills.

Ability to handle pressure during high‑severity incidents.

We are seeking a highly skilled Site Reliability Engineer (SRE) to own the overall health, availability, performance, and resilience of our enterprise platform. The platform spans SQL Server, .NET, Java, React.js, microservices, and Kafka, and runs in a hybrid environment across Azure and on‑premises infrastructure.
The SRE will lead reliability engineering practices across the stack, manage infrastructure deployment pipelines using Terraform, drive application deployments through GitHub and Azure DevOps, ensure timely remediation of security vulnerabilities, and implement world‑class observability using Dynatrace and Splunk.

Key Responsibilities

Platform Reliability & Operations

· Own the end‑to‑end health, uptime, performance, and reliability of the platform across cloud (Azure) and on‑prem environments.
· Ensure resilience across application layers: .NET, Java, React.js, Microservices, and backend systems such as SQL Server and Kafka.
· Lead incident management, root cause analysis, and post‑incident reviews with a focus on continuous improvement.

Infrastructure Engineering & Automation

· Design, implement, and maintain cloud and on‑prem infrastructure using Terraform (IaC).
· Own and optimize CI/CD pipelines for infrastructure and applications in:
  o GitHub Actions
  o Azure DevOps
· Improve deployment automation, reliability, and release processes across all teams.

Observability, Monitoring & Proactive Operations

· Implement and enhance monitoring, alerting, dashboards, and analytics using:
  o Dynatrace (APM, RUM, synthetic monitoring, logs, metrics)
  o Splunk (log search, correlation, alerting)
· Build proactive monitoring workflows to detect issues before they impact customers.
· Own SRE metrics such as SLOs, SLIs, Error Budgets, MTTR, MTBF, availability KPIs, and system productivity metrics.
· Tune the performance of database and application services.

Security & Compliance

· Ensure all platform and application security vulnerabilities are identified and remediated on time.
· Partner with cybersecurity to ensure compliance with enterprise standards and policies.
· Automate security scans and integrate them into CI/CD pipelines.

Performance & Scalability

· Conduct performance analysis, load testing, and tuning across:
  o Microservices
  o SQL Server databases
  o Kafka clusters
  o Front‑end React.js applications
· Partner with engineering teams to design scalable, reliable system architectures.

Collaboration & Leadership

· Collaborate with development, architecture, infrastructure, and security teams.
· Advocate for SRE and DevOps culture: automation, reliability engineering, and blameless postmortems.
· Mentor developers and engineers on reliability best practices and tools.

Required Qualifications

· 5+ years of experience in SRE, DevOps, or Platform Engineering roles.
· Strong expertise in:
  o SQL Server administration and performance tuning
  o .NET, Java, Microservices architectures
  o React.js fundamentals
· Hands‑on experience with:
  o Azure Cloud services (VMs, AKS, App Services, Networking)
  o On‑prem servers and hybrid integrations
  o Terraform (writing, testing, maintaining modules)
  o CI/CD with GitHub and Azure DevOps
· Proficiency with observability tools:
  o Dynatrace (preferred)
  o Splunk
· Experience with Kafka (producers, consumers, performance, tuning).
· Strong understanding of SRE fundamentals:
  o SLO/SLI design
  o Error budgets
  o Distributed systems concepts
  o Incident response

Dice Id: 91134724
Position Id: 8907941