Principal SRE Engineer

Remote • Posted 1 day ago • Updated 1 day ago
Contract Independent
Contract W2
Contract Corp To Corp
No Travel Required
Remote
$45 - $50/hr

Job Details

Skills

  • Principal SRE Engineer

Summary

Career Opportunity:
Job Title: Principal SRE Engineer

 

About CodeForce 360
Choosing a career is among the most critical decisions one can make, and it is important to weigh factors such as a company's record of success since its inception. But when you come across a company whose reputation rests on an unbroken run of success since the day it began, you don't need to think of anything else. That's precisely what some of our employees and prospective employees thought when they came across CodeForce 360.

 

Position Overview

Principal SRE Engineer

 

Requirements:

The most valuable profile for us is someone who has personally driven the operationalization of SRE frameworks, not just at a strategic level but through execution. This includes areas such as:

  • Defining and implementing SLIs/SLOs and reliability targets that align with the department's Golden Pathways
  • Building and operationalizing observability standards (metrics, logs, traces)
  • Designing new, or evolving existing, incident management and RCA practices
  • Driving automation and reliability engineering workflows
  • Establishing service health dashboards and telemetry pipelines
  • Working closely with engineering teams to embed reliability into development and operations

 

Ideally this would be someone who has stood up or significantly evolved SRE programs in complex enterprise environments and can help accelerate implementation of the practices we are defining.

 

This role would be very execution-focused – someone comfortable rolling up their sleeves, working with the engineering teams directly, and helping us operationalize the reliability model across our platforms.

 

  1. Design and Build Central SRE Operating View
    1. Implement golden-pathway telemetry across:
      i. Application Performance Monitoring (APM) – service response times, transaction bottlenecks
      ii. Logging & Tracing – correlated logs, structured tracing
      iii. Event & Alerting – actionable event definitions tied to severity
      iv. RCA/Tagging Compliance monitoring – auto-tagging and RCA lifecycle ingestion
    2. Build executive-level scorecards and dashboards via Grafana and ServiceNow Performance Analytics:
      1. Per-app reliability score
      2. SRE maturity score
      3. Mean time to detect/respond/restore (MTTx)
      4. Escalation patterns and failure root trends
  2. Enable Long-Term SRE Governance
    1. Establish SRE telemetry ingestion pipelines
    2. Design alert logic for low-quality signals
    3. Build RCA tagging enforcement playbooks
    4. Deliver runbooks and telemetry integration guides per application type
  3. Centralized SRE Golden Dashboard – Single Pane of Glass
    1. A central pillar of this initiative is the creation of a Centralized SRE Golden Dashboard, serving as a Single Pane of Glass for executive and operational visibility across all 40+ applications. The dashboard will:
      1. Aggregate key telemetry: reliability metrics, RCA themes, MTTR, incident volumes, tag compliance, alert noise, performance degradation, and resilience scoring.
      2. Display per-app SRE health scores based on the maturity framework.
      3. Include dynamic drilldowns into:
        1. Incident hygiene (tagging, closure quality, RCA ownership)
        2. SLAs/OLAs/SLIs/SLOs/error budgets, cleanly architected
        3. Alerting trends and noise correlation
        4. Capacity/resiliency warnings
      4. Serve as the definitive executive reporting source, used for monthly reviews, CIO/VP visibility, and roadmap investment decisions.
 

How to Apply

Job ID: JPC - 227332

 

For more information, please contact:

Bhushan Reddy

 

Qualified individuals will be contacted for an interview.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 10489662
  • Position Id: 219095-11612-