Principal SRE Engineer


Enfycon Inc
Job Details
Skills
- SRE
Summary
Urgently seeking a Principal SRE Engineer.
Remote role
Senior candidates required
Client: CVS Health
Any visa status is acceptable
Maximum of 2-3 interview rounds with CVS Health
Key Responsibilities
- Defining and implementing SLIs/SLOs and reliability targets that align with the department's Golden Pathways
- Building and operationalizing observability standards (metrics, logs, traces)
- Designing new and evolving existing incident management and RCA practices
- Driving automation and reliability engineering workflows
- Establishing service health dashboards and telemetry pipelines
- Working closely with engineering teams to embed reliability into development and operations
Design and Build Central SRE Operating View
Implement golden-pathway telemetry across:
- Application Performance Monitoring (APM) – service response times, transaction bottlenecks
- Logging & Tracing – correlated logs, structured tracing
- Event & Alerting – actionable event definitions tied to severity
- RCA/Tagging Compliance Monitoring – auto-tagging and RCA lifecycle ingestion
Build executive-level scorecards and dashboards via Grafana and ServiceNow Performance Analytics:
- Per-app reliability score
- SRE maturity score
- Mean time to detect/respond/restore (MTTx)
- Escalation patterns and root-cause failure trends
Enable Long-Term SRE Governance
- Establish SRE telemetry ingestion pipelines
- Design alert logic to identify and handle low-quality signals
- Build RCA tagging enforcement playbooks
- Deliver runbooks and telemetry integration guides per application type
Centralized SRE Golden Dashboard – Single Pane of Glass
- A central pillar of this initiative is the creation of a Centralized SRE Golden Dashboard, serving as a Single Pane of Glass for executive and operational visibility across all 40+ applications.
The dashboard will:
- Aggregate key telemetry: reliability metrics, RCA themes, MTTR, incident volumes, tag compliance, alert noise, performance degradation, and resilience scoring.
- Display per-app SRE health scores based on the maturity framework.
Include dynamic drilldowns into:
- Incident hygiene (tagging, closure quality, RCA ownership)
- SLAs/OLAs/SLIs/SLOs/error budgets, cleanly architected
- Alerting trends and noise correlation
- Capacity/resiliency warnings
- Serve as the definitive executive reporting source, used for monthly reviews, CIO/VP visibility, and roadmap investment decisions.
- Dice Id: 91142795
- Position Id: 8920404
- Posted 9 hours ago
Company Info
We can certainly recommend trying out our staffing services. Our meticulous approach, tailored strategies, and dedication to excellence make us a valuable partner in meeting your hiring needs effectively. Feel free to reach out to our team to discuss how we can support your organization's growth and success through our comprehensive staffing solutions.

