SRE / DevOps Engineers - Helix

Remote • Posted 10 hours ago • Updated 10 hours ago
Contract W2
Contract Independent
No Travel Required
Depends on Experience

Job Details

Skills

  • Apache Helix
  • MEAN Stack
  • Business Continuity Planning
  • Performance Management
  • Preventive Maintenance

Summary

Role: SRE / DevOps Engineers - Helix
Location: Remote
Contract: C2C
Experience: 10+ years
Role Summary
We are seeking three SRE / DevOps Engineers to improve the reliability, observability, and operational readiness of business-critical platforms and services supporting the Helix program. Although titled SRE / DevOps, these roles are heavily operations-oriented and require strong production support, incident response, and Splunk-based monitoring experience.
Key Responsibilities
  • Lead complex initiatives to improve the reliability, availability, and operational readiness of business-critical platforms and services.
  • Own and support production operations, including implementation support, system health monitoring, and proactive issue identification.
  • Play a key role in Incident Management, including triage, coordination, root cause analysis, and driving post-incident remediation.
  • Support and participate in Business Continuity Planning activities, including failover readiness, disaster recovery testing, and recovery validation.
  • Design, implement, and maintain monitoring, alerting, and observability solutions, with a strong emphasis on Splunk-based logging and dashboards.
  • Automate operational workflows to reduce manual effort and improve mean time to detect and mean time to recover.
  • Partner with application, platform, and security teams to ensure services are built and deployed with operational excellence and reliability in mind.
  • Define and enforce SRE and DevOps standards, including SLIs/SLOs, alert hygiene, runbooks, and on-call best practices.
  • Lead and participate in post-incident reviews, ensuring root causes are addressed and preventive actions are implemented.
  • Mentor engineers on reliability engineering, incident response, and operational best practices.
  • Continuously evaluate and improve system performance, resiliency, and operational tooling across the platform lifecycle.
Additional Role Context
  • These roles are more operations-heavy than a traditional engineering-focused SRE title may suggest.
  • Strong Splunk experience is required, including dashboard creation, query development, log investigation, and trace-based troubleshooting across connected systems.
  • The team needs people who can navigate issues across integrated systems involved in the Helix VM process, including front-end and infrastructure-connected services.
  • The role supports a high-volume change environment, including multiple change request (CR) implementations in a single evening and operational coordination across workstreams.
  • Candidates should be informed up front that the role may require after-hours deployments, night support, and possible weekend work tied to CRs and future BCP events. CR activity typically begins around 9 PM ET.
  • Dallas is the preferred location for these operations resources to support onboarding and collaboration with the existing local team, though strong candidates outside Dallas may still be considered.
  • Engagement is expected through the end of the year, based on project demand.
Project Context: Helix
Helix is an internal platform initiative focused on enabling self-service infrastructure provisioning for application development teams. The program works with platform teams to expose APIs and automation for provisioning services such as VMs, storage, and related infrastructure resources. The goal is to reduce manual operational processes, improve governance, and support migration away from heritage environments through a strategic internal developer platform.
  • Dice Id: 91099596
  • Position Id: 8935164