Principal SRE Engineer


VDart, Inc.
Job Details
Skills
- Site Reliability Engineer
- MTTR
- Telemetry
- SLO & SLIs
Summary
Role : Principal SRE Engineer
Duration : 6+ Months
Location : Dallas, TX (Remote)
Looking for a senior/principal-level SRE practitioner with strong hands-on experience implementing reliability practices at scale.
The type of profile that would be most valuable for us is someone who has personally driven the operationalization of SRE frameworks – not just at a strategic level, but through execution. This would include areas such as:
• Defining and implementing SLIs/SLOs and reliability targets that align with the department's Golden Pathways
• Building and operationalizing observability standards (metrics, logs, traces)
• Designing or evolving incident management and RCA practices
• Driving automation and reliability engineering workflows
• Establishing service health dashboards and telemetry pipelines
• Working closely with engineering teams to embed reliability into development and operations
- Ideally this would be someone who has stood up or significantly evolved SRE programs in complex enterprise environments and can help accelerate implementation of the practices we are defining.
- This role would be very execution-focused – someone comfortable rolling up their sleeves, working with the engineering teams directly, and helping us operationalize the reliability model across our platforms.
2. Design and Build a Central SRE Operating View
a. Implement golden-pathway telemetry across:
i. App Performance Monitoring (APM) – Service response times, transaction bottlenecks
ii. Logging & Tracing – correlated logs, structured tracing
iii. Event & Alerting – actionable event definitions tied to severity
iv. RCA/Tagging Compliance Monitoring – auto-tagging and RCA lifecycle ingestion
v. Build executive-level scorecards and dashboards via Grafana and ServiceNow Performance Analytics:
1. Per-app reliability score
2. SRE maturity score
3. Mean time to detect/respond/restore (MTTx)
4. Escalation patterns and root-cause failure trends
3. Enable Long-Term SRE Governance
a. Establish SRE telemetry ingestion pipelines
b. Design alert logic for low-quality signals
c. Build RCA tagging enforcement playbooks
d. Deliver runbooks and telemetry integration guides per application type
4. Centralized SRE Golden Dashboard – Single Pane of Glass
a. A central pillar of this initiative is the creation of a Centralized SRE Golden Dashboard that serves as a Single Pane of Glass for executive and operational visibility across all 40+ applications
The dashboard will:
1. Aggregate key telemetry: reliability metrics, RCA themes, MTTR, incident volumes, tag compliance, alert noise, performance degradation, and resilience scoring.
2. Display per-app SRE health scores based on the maturity framework.
3. Include dynamic drilldowns into:
a. Incident hygiene (tagging, closure quality, RCA ownership)
b. SLAs/OLAs/SLIs/SLOs/error budgets, cleanly architected
c. Alerting trends and noise correlation
d. Capacity/resiliency warnings
4. Serve as the definitive executive reporting source – used for monthly reviews, CIO/VP visibility, and roadmap investment decisions.
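The SLO, error-budget, and MTTx terms used in this posting can be made concrete with a short sketch. The Python below is illustrative only: the `Incident` record, the function names, and the field names are assumptions made for the example, not part of the client's tooling or of any Grafana or ServiceNow API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical incident record; field names are assumptions."""
    started: datetime    # when the failure began
    detected: datetime   # when an alert or a person noticed it
    restored: datetime   # when service was back to normal


def error_budget_remaining(slo_target: float, total: int, failed: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent).

    slo_target is the success-ratio objective, e.g. 0.999 for "three nines".
    """
    allowed_failures = (1.0 - slo_target) * total
    if allowed_failures == 0:
        # A 100% target (or zero traffic) leaves no budget to spend.
        return 0.0 if failed == 0 else float("-inf")
    return 1.0 - failed / allowed_failures


def mean_minutes(deltas: list[timedelta]) -> float:
    """Average of a list of durations, in minutes (NaN for an empty list)."""
    if not deltas:
        return float("nan")
    return sum(d.total_seconds() for d in deltas) / len(deltas) / 60.0


def mttx(incidents: list[Incident]) -> dict[str, float]:
    """Mean time to detect and mean time to restore, both measured from start."""
    return {
        "mttd_min": mean_minutes([i.detected - i.started for i in incidents]),
        "mttr_min": mean_minutes([i.restored - i.started for i in incidents]),
    }
```

For instance, a 99.9% SLO over 1,000,000 requests allows about 1,000 failures, so 250 failed requests would leave roughly three quarters of the budget unspent. A per-app reliability score of the kind described above could combine values like these, though the weighting would depend on the maturity framework the team defines.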
Key Skills: Site Reliability Engineer, Telemetry, RCA, SLA & SLIs, MTTR, Incident management
- Dice Id: 10330808
- Position Id: 95783-5195-
- Posted 12 hours ago
Company Info
VDart, headquartered in Atlanta, GA, is a global leader in digital talent solutions and IT staffing, delivering top technology professionals to businesses worldwide. With a strong presence across North America, Europe and Asia, we specialize in helping organizations navigate complex technology landscapes with the right expertise.
Through a strategic, client-focused approach, we have placed over 20,000 professionals across key industries and advanced technology solutions. Whether placing top talent in cutting-edge roles or providing strategic digital workforce solutions, our network of 4,000 specialists across 13 countries is committed to excellence, agility and impact.
Backed by 18 years of industry experience, we go beyond staffing to build long-term partnerships that accelerate digital transformation and drive sustained growth. Whether you need a technology partner to fuel innovation or specialized workforce solutions to maintain a competitive edge, VDart delivers the right people, skills and mindset to create a lasting impact in a digital-first world.