SRE Operations Support Engineer

Remote • Posted 2 hours ago • Updated 2 hours ago
Contract Corp To Corp
Contract W2
Contract Independent
No Travel Required
Remote
$55 - $60/hr

Job Details

Skills

  • SRE
  • Azure
  • L1 Support
  • DB Support
  • Platform Support
  • SRE metrics

Summary

Role: SRE Operations Support Engineer
Location: Remote
Experience: 9+ years of IT experience

Should have experience in L1 support, database support, and platform support.
Own SRE metrics such as SLOs, SLIs, Error Budgets, MTTR, MTBF, availability KPIs, and system productivity metrics.
Analytical mindset with strong problem‑solving skills.
Ability to handle pressure in high‑severity incidents.

We are seeking a highly skilled Site Reliability Engineer (SRE) to own the overall health, availability, performance, and resilience of our enterprise platform. The platform spans SQL Server, .NET, Java, React.js, microservices, and Kafka, and operates in a hybrid environment across Azure and on‑premises infrastructure.
The SRE will lead reliability engineering practices across the stack, manage infrastructure deployment pipelines using Terraform, drive application deployments through GitHub and Azure DevOps, ensure timely remediation of security vulnerabilities, and implement world‑class observability using Dynatrace and Splunk.

Key Responsibilities

Platform Reliability & Operations

  • Own the end‑to‑end health, uptime, performance, and reliability of the platform across cloud (Azure) and on‑prem environments.
  • Ensure resilience across application layers: .NET, Java, React.js, Microservices, and backend systems such as SQL Server and Kafka.
  • Lead incident management, root cause analysis, and post‑incident reviews with a focus on continuous improvement.

Infrastructure Engineering & Automation

  • Design, implement, and maintain cloud and on‑prem infrastructure using Terraform (IaC).
  • Own and optimize CI/CD pipelines for infrastructure and applications in:
    • GitHub Actions
    • Azure DevOps
  • Improve deployment automation, reliability, and release processes across all teams.

Observability, Monitoring & Proactive Operations

  • Implement and enhance monitoring, alerting, dashboards, and analytics using:
    • Dynatrace (APM, RUM, synthetic monitoring, logs, metrics)
    • Splunk (log search, correlation, alerting)
  • Build proactive monitoring workflows to detect issues before they impact customers.
  • Own SRE metrics such as SLOs, SLIs, Error Budgets, MTTR, MTBF, availability KPIs, and system productivity metrics.
  • Tune the performance of database and application services.

Security & Compliance

  • Ensure all platform and application security vulnerabilities are identified and remediated on time.
  • Partner with cybersecurity to ensure compliance with enterprise standards and policies.
  • Automate security scans and integrate them into CI/CD pipelines.

Performance & Scalability

  • Conduct performance analysis, load testing, and tuning across:
    • Microservices
    • SQL Server databases
    • Kafka clusters
    • Front‑end React.js applications
  • Partner with engineering teams to design scalable, reliable system architectures.

Collaboration & Leadership

  • Collaborate with development, architecture, infrastructure, and security teams.
  • Advocate for SRE and DevOps culture: automation, reliability engineering, and blameless postmortems.
  • Mentor developers and engineers on reliability best practices and tools.

Required Qualifications

  • 5+ years of experience in SRE, DevOps, or Platform Engineering roles.
  • Strong expertise in:
    • SQL Server administration and performance tuning
    • .NET, Java, Microservices architectures
    • React.js fundamentals

  • Hands‑on experience with:
    • Azure Cloud services (VMs, AKS, App Services, Networking)
    • On‑prem servers and hybrid integrations
    • Terraform (writing, testing, maintaining modules)
    • CI/CD with GitHub and Azure DevOps
  • Proficiency with observability tools:
    • Dynatrace (preferred)
    • Splunk
  • Experience with Kafka (producers, consumers, performance, tuning).
  • Strong understanding of SRE fundamentals:
    • SLO/SLI design
    • Error budgets
    • Distributed systems concepts
    • Incident response

Dice Id: 91134724
Position Id: 8907941