Sr. Site Reliability Engineer (Storage Platform)

Remote • Posted 30+ days ago • Updated 11 days ago

Contract W2

Remote

Depends on Experience

Job Details

Skills

Automated Testing
Cloud Computing
Collaboration
Communication
Computer Networking
Ansible
Backup
Continuous Delivery
Continuous Integration
Disaster Recovery
Bash
CentOS
Ceph
Change Management
Fibre Channel
IT Service Management
Git
Golang
Grafana
High Availability
Hypervisor
IP
ITIL
Incident Management
Infrastructure Lifecycle Management
Intellectual Property
Kernel-based Virtual Machine
Computer Science
DNS
Documentation
Operating Systems
Level Design
Management
Migration
NetApp
OpenStack
Python
Dragon NaturallySpeaking
Enterprise Storage
Root Cause Analysis
SUSE Linux
Scalability
Kubernetes
Legacy Systems
Linux
Red Hat Enterprise Linux
Red Hat Linux
SDS
Scripting
Storage
Technical Writing
Telco
Terraform
Testing
Ubuntu
Virtual Machines
Wiki
Workflow
iSCSI

Summary

Job Title: Sr. Site Reliability Engineer (Storage Platform) Remote

Position: Contract to Hire

Job Summary
We are seeking a highly experienced Sr. Site Reliability Engineer (Storage Platform) to design, implement, and support Software Defined Storage (SDS) and Kubernetes platforms in a private cloud environment. This role focuses on scalability, resilience, automation, and performance using Infrastructure-as-Code and GitOps practices.

This is a deeply technical role requiring expert-level understanding of Software Defined Storage and Kubernetes, plus extensive working knowledge of Linux operating systems. You will also collaborate with platform and SRE teams to maintain secure, performant, multitenant-isolated services for high-throughput, mission-critical applications.

Key Responsibilities

Design, implement, and operate large-scale Software Defined Storage architectures across private and public cloud regions, following ITIL methodology.
Deploy and support enterprise storage platforms (Pure Storage, HPE, NetApp) and SDS solutions (Ceph, Longhorn).
Build self-service storage workflows for Kubernetes CSI and OpenStack consumers (VM and bare metal).
Develop Infrastructure-as-Code using Ansible, Terraform, Helm, and Git, with Python/Bash automation.
Implement CI/CD pipelines for infrastructure updates, patching, upgrades, testing, and rollback.
Build observability, alerting, and auto-remediation using GitOps and tools such as Prometheus, Loki, and Grafana.
Architect and maintain high availability, disaster recovery, and scale-out infrastructure.
Develop and review high-level and low-level design documents for storage infrastructure.
Perform deep troubleshooting across storage, Kubernetes, hypervisors, networking, and Linux systems.
Participate in on-call rotations, incident response, and root cause analysis.
Collaborate globally on change management, documentation, and operational best practices.

Must Have

6+ years of experience managing enterprise storage and Kubernetes platforms on Linux.
Strong hands-on experience with SDS solutions (Ceph, Longhorn) and storage migrations from legacy systems.
Experience with block, file, and object storage, including Fibre Channel and IP-based protocols.
Experience with NVMe-oF or iSCSI fabrics.
Expert knowledge of Kubernetes and Linux systems (Ubuntu, RHEL/CentOS).
Proficiency with Infrastructure-as-Code (IaC) tools (Ansible, Terraform).
Strong scripting skills in Python and Bash (Go is a plus).
Strong working knowledge of enterprise DNS and its integration with Kubernetes.
Experience operating 24x7 mission-critical production environments.
Hands-on experience with KVM hypervisors (SUSE Harvester, OpenStack).
Strong written and verbal communication skills.
Proficiency with Git, CI/CD pipelines, and automated testing frameworks.
Ability to write technical documentation and contribute to community wikis or knowledge bases.
Bachelor's degree in computer science or equivalent professional experience.

Nice to Have

OpenStack Cinder multi-backend administration.
Backup platforms (Rubrik).
Understanding of CIS/NIST security and infrastructure lifecycle management.
ITIL Foundation/advanced certifications in support of ITSM standard methodology.
Background in telco, edge cloud, or large enterprise environments.
CNCF Certified Kubernetes Administrator (CKA), Certified Kubernetes Security Specialist (CKS), or Red Hat Specialist in Ceph Storage Administration (EX125) certifications.
Master's degree in computer science, IT, engineering, or a related field preferred; equivalent experience and relevant industry certifications will also be considered.

Dice Id: prutx001
Position Id: 8905151