Sr. Site Reliability Engineer (Storage Platform)

Remote • Posted 7 hours ago • Updated 7 hours ago
Contract W2
Depends on Experience

Job Details

Skills

  • Automated Testing
  • Cloud Computing
  • Collaboration
  • Communication
  • Computer Networking
  • Ansible
  • Backup
  • Continuous Delivery
  • Continuous Integration
  • Disaster Recovery
  • Bash
  • CentOS
  • Ceph
  • Change Management
  • Fibre Channel
  • IT Service Management
  • Git
  • Golang
  • Grafana
  • High Availability
  • Hypervisor
  • IP
  • ITIL
  • Incident Management
  • Infrastructure Lifecycle Management
  • Intellectual Property
  • Kernel-based Virtual Machine
  • Computer Science
  • DNS
  • Documentation
  • Operating Systems
  • Level Design
  • Management
  • Migration
  • NetApp
  • OpenStack
  • Python
  • Dragon NaturallySpeaking
  • Enterprise Storage
  • Root Cause Analysis
  • SUSE Linux
  • Scalability
  • Kubernetes
  • Legacy Systems
  • Linux
  • Red Hat Enterprise Linux
  • Red Hat Linux
  • SDS
  • Scripting
  • Storage
  • Technical Writing
  • Telco
  • Terraform
  • Testing
  • Ubuntu
  • Virtual Machines
  • Wiki
  • Workflow
  • iSCSI

Summary

Job Title: Sr. Site Reliability Engineer (Storage Platform) Remote

Position: Contract to Hire

Job Summary
We are seeking a highly experienced Sr. Site Reliability Engineer (Storage Platform) to design, implement, and support Software Defined Storage (SDS) and Kubernetes platforms in a private cloud environment. This role focuses on scalability, resilience, automation, and performance using Infrastructure-as-Code and GitOps practices.

This is a deeply technical role requiring an expert-level understanding of Software Defined Storage and Kubernetes, along with extensive working knowledge of Linux operating systems. You will also collaborate with platform and SRE teams to maintain secure, performant, and multitenant-isolated services that serve high-throughput, mission-critical applications.

Key Responsibilities

  • Design, implement, and operate large-scale Software Defined Storage architectures across private and public cloud regions, following ITIL methodology.
  • Deploy and support enterprise storage platforms (Pure Storage, HPE, NetApp) and SDS solutions (Ceph, Longhorn).
  • Build self-service storage workflows for Kubernetes CSI and OpenStack consumers (VM and bare metal).
  • Develop Infrastructure-as-Code using Ansible, Terraform, Helm, and Git, with Python/Bash automation.
  • Implement CI/CD pipelines for infrastructure updates, patching, upgrades, testing, and rollback.
  • Build observability, alerting, and auto-remediation using GitOps and tools such as Prometheus, Loki, and Grafana.
  • Architect and maintain high availability, disaster recovery, and scale-out infrastructure.
  • Develop and review high-level and low-level design documents for storage infrastructure.
  • Perform deep troubleshooting across storage, Kubernetes, hypervisors, networking, and Linux systems.
  • Participate in on-call rotations, incident response, and root cause analysis.
  • Collaborate globally on change management, documentation, and operational best practices.

Must Have

  • 6+ years of experience managing enterprise storage and Kubernetes platforms on Linux.
  • Strong hands-on experience with SDS solutions (Ceph, Longhorn) and storage migrations from legacy systems.
  • Experience with block, file, and object storage, including Fibre Channel and IP-based protocols.
  • Experience with NVMe-oF or iSCSI fabrics.
  • Expert knowledge of Kubernetes and Linux systems (Ubuntu, RHEL/CentOS).
  • Proficiency with Infrastructure-as-Code (IaC) tools (Ansible, Terraform).
  • Strong scripting skills in Python and Bash (Go is a plus).
  • Strong working knowledge of enterprise DNS and its integration with Kubernetes.
  • Experience operating 24x7 mission-critical production environments.
  • Hands-on experience with KVM hypervisors (SUSE Harvester, OpenStack).
  • Strong written and verbal communication skills.
  • Proficiency with Git, CI/CD pipelines, and automated testing frameworks.
  • Ability to write technical documentation and contribute to community wikis or knowledge bases.
  • Bachelor's degree in computer science or equivalent professional experience.

Nice to Have

  • OpenStack Cinder multi-backend administration.
  • Backup platforms (Rubrik).
  • Understanding of CIS/NIST security and infrastructure lifecycle management.
  • ITIL Foundation/advanced certifications in support of ITSM standard methodology.
  • Background in telco, edge cloud, or large enterprise environments.
  • CNCF Certified Kubernetes Administrator (CKA), Certified Kubernetes Security Specialist (CKS), or Red Hat Specialist in Ceph Storage Administrator (EX125) certifications.
  • Master's degree in computer science, IT, engineering, or a related field preferred; equivalent experience and relevant industry certifications will also be considered.

  • Dice Id: prutx001
  • Position Id: 8905151