Senior Site Reliability Engineer

Posted 19 hours ago • Updated 12 minutes ago
Full Time
Part Time
6 Months

Job Details

Skills

  • Operational Excellence
  • FOCUS
  • Scalability
  • Service Level
  • KPI
  • Incident Management
  • Emerging Technologies
  • Optimization
  • Build Automation
  • Windows PowerShell
  • Collaboration
  • Regulatory Compliance
  • Security Operations
  • Reliability Engineering
  • Workflow
  • Microsoft Certified Professional
  • Artificial Intelligence
  • Terraform
  • Bash
  • Scripting
  • Active Directory
  • Virtual Machines
  • Operating Systems
  • Systems Architecture
  • Systems Design
  • DNS
  • Routing
  • Switches
  • Virtual Private Network
  • Network
  • Management
  • Meraki
  • Firewall
  • Network Security
  • Computer Networking
  • Puppet
  • Microsoft SQL Server
  • Leadership
  • Software Engineering
  • DevOps
  • Stakeholder Management
  • Auditing
  • PCI DSS
  • SOC 1 / SOC 2
  • ISO/IEC 27001
  • Scala
  • Java
  • SaaS
  • High Availability
  • Cloud Computing
  • Microsoft
  • Microsoft Azure
  • Amazon Web Services
  • Kubernetes
  • IMG
  • Configuration Management
  • Change Management
  • Content Management
  • Technical Direction

Summary

Duration: Long Term Contract

100% REMOTE

About the Role

We are seeking a highly skilled Senior Site Reliability Engineer (SRE) to join our team. This individual will serve as a technical leader responsible for the design, reliability, scalability, and operational excellence of our infrastructure platforms and services.

The successful candidate will possess deep expertise across cloud platforms, Kubernetes, networking, automation, and systems architecture while maintaining a strong focus on security, compliance, and operational resilience. This role requires a strategic thinker who can operate across organizational boundaries, partner effectively with Engineering and Product teams, and drive initiatives from conception through implementation.

As a senior member of the organization, you will be expected to take ownership of critical systems, champion reliability engineering practices, and influence technical direction across a complex hybrid-cloud environment.

Key Responsibilities

Reliability Engineering & Operations

  • Own the availability, reliability, performance, and scalability of production systems.
  • Lead technical response during critical incidents, driving rapid mitigation and long-term corrective actions.
  • Define and track Service Level Objectives (SLOs), Service Level Indicators (SLIs), and operational KPIs.
  • Participate in an on-call rotation and continually improve operational readiness and response processes.
  • Establish and maintain operational standards, runbooks, and incident management procedures.

Infrastructure & Platform Architecture

  • Design, implement, and evolve highly available cloud and hybrid-cloud architectures.
  • Serve as a technical authority for Azure infrastructure and cloud-native services.
  • Architect resilient, scalable solutions supporting both modern containerized workloads and traditional infrastructure.
  • Evaluate emerging technologies and provide strategic recommendations aligned with business objectives.
  • Drive platform standardization, modernization, and infrastructure optimization initiatives.

Cloud, Automation & DevOps

  • Develop and maintain Infrastructure-as-Code (IaC) solutions using Terraform.
  • Build automation frameworks and operational tooling utilizing Bash, PowerShell, and related technologies.
  • Partner closely with software engineering teams to improve deployment pipelines, observability, and operational maturity.
  • Promote DevOps principles and reliability-first engineering practices across the organization.
  • Identify opportunities to eliminate operational toil through automation and self-service capabilities.

Security, Compliance & Governance

  • Support infrastructure controls and evidence collection for compliance frameworks including:
    • PCI-DSS
    • SOC 1 / SOC 2
    • ISO 27001
    • Additional industry-standard security frameworks
  • Collaborate with Security, Risk, and Audit teams to ensure compliance requirements are met.
  • Incorporate security best practices into platform design and operational workflows.

Cross-Functional Leadership

  • Partner effectively with Engineering, Product, Security, Operations, and Executive stakeholders.
  • Communicate complex technical concepts clearly to audiences with varying levels of technical expertise.
  • Lead technical discussions, architecture reviews, and operational improvement initiatives.
  • Influence organizational standards and engineering best practices.
  • Operate effectively within a global, matrixed organization.

Required Qualifications

Technical Expertise

  • 8+ years of experience in Site Reliability Engineering, Platform Engineering, Infrastructure Engineering, or related disciplines.
  • Expert-level experience administering and architecting solutions within Microsoft Azure.
  • Strong working knowledge of AWS services and cloud operations.
  • Experience developing or integrating AI-driven operational workflows and automation.
  • Familiarity with Model Context Protocol (MCP) development and emerging AI infrastructure patterns.
  • Deep expertise operating Kubernetes in production environments.
  • Advanced experience with Terraform and Infrastructure-as-Code methodologies.
  • Expert-level Bash scripting and automation development.
  • Extensive Active Directory administration and architecture experience.
  • Strong experience troubleshooting virtual machines, operating systems, and distributed infrastructure.
  • Experience designing and operating hybrid-cloud environments.
  • Expertise in systems architecture and distributed systems design.
  • Strong networking fundamentals, including:
    • Firewalls and security policies
    • DNS
    • Routing and switching
    • VPN technologies
    • Network troubleshooting
  • Experience managing Meraki firewall policies and network security controls.
  • Experience with Cloudflare services and edge networking technologies.
  • Experience with configuration management platforms such as Puppet.
  • Experience administering and supporting Microsoft SQL Server environments.
  • Experience with Datadog, PagerDuty, and modern observability platforms.

Engineering Leadership

  • Demonstrated systems-thinking approach to solving complex technical challenges.
  • Strong ownership mentality with a track record of driving issues from identification through permanent resolution.
  • Experience collaborating closely with software engineering and product teams in a DevOps-oriented environment.
  • Ability to balance strategic architecture responsibilities with hands-on operational execution.
  • Exceptional verbal and written communication skills, along with strong stakeholder management skills.

Preferred Qualifications

  • Experience supporting regulated environments and audit programs, including PCI-DSS, SOC 1, SOC 2, and ISO 27001.
  • Experience with Scala and/or Java application ecosystems.
  • Experience supporting large-scale SaaS or high-availability production platforms.
  • Experience implementing platform engineering or internal developer platform initiatives.
  • Public cloud certifications such as:
    • Microsoft Certified: Azure Solutions Architect Expert
    • Azure Administrator Associate
    • AWS Solutions Architect
    • Certified Kubernetes Administrator (CKA)



Gopal Gupta
Technical Recruiter

A: 505 Knolle Court, Saint Augustine, FL 32092

  • Dice Id: 91022079
  • Position Id: 2026-49432