Staff Engineer - SRE - Retail & Pharmacy

  • Woonsocket, RI
  • Posted 8 hours ago | Updated 8 hours ago

Overview

Remote
On Site
USD 118,450.00 - 284,280.00 per year
Full Time

Skills

Health Care
Pharmacy
Root Cause Analysis
Service Level
KPI
Dashboard
Real-time
Reliability Engineering
Instrumentation
Use Cases
Data Storage
Leadership
Collaboration
Mentorship
Operational Efficiency
Continuous Improvement
Documentation
Scalability
Optimization
DevOps
Splunk
Dynatrace
Grafana
Scripting
Python
Java
Amazon Web Services
Microsoft Azure
Google Cloud
Google Cloud Platform
Version Control
Continuous Integration
Continuous Integration and Development
Git
Bitbucket
Jenkins
Team Leadership
Adobe Analytics
Microservices
Kubernetes
Docker
Incident Management
Change Management
Problem Management
Communication
CHAOS
Retail
Point Of Sale
Cloud Computing
API Management
Computer Science
Management
Finance
Coaching

Job Details

At CVS Health, we're building a world of health around every consumer and surrounding ourselves with dedicated colleagues who are passionate about transforming health care.

As the nation's leading health solutions company, we reach millions of Americans through our local presence, digital channels and more than 300,000 purpose-driven colleagues - caring for people where, when and how they choose in a way that is uniquely more connected, more convenient and more compassionate. And we do it all with heart, each and every day.

The Staff Engineer - SRE, Retail & Pharmacy will implement and maintain comprehensive observability solutions, providing real-time insights into the performance and overall health of systems to proactively identify and address potential issues. This role is responsible for investigating and resolving incidents quickly during critical situations and performing root cause analysis to prevent future recurrence. You will collaborate with cross-functional teams to build robust monitoring, alerting, and telemetry solutions, enabling proactive issue detection and resolution across distributed systems. As a senior member of the SRE team, you will drive best practices, mentor others, and shape the strategic evolution of our observability ecosystem in a complex, edge-centric architecture.

What You Will Do:
  • Observability Strategy & Implementation
    • Design and implement comprehensive observability solutions tailored for edge computing environments, including monitoring, logging, tracing, and metrics collection, to provide deep visibility into system performance and health across distributed remote facilities
    • Define and maintain Service Level Indicators (SLIs), Service Level Objectives (SLOs), and business KPIs to measure and enhance system reliability in edge and centralized infrastructure
    • Build and optimize dashboards, visualizations, and alerting systems to enable real-time insights and rapid incident response for edge nodes and remote facilities
    • Implement distributed tracing and log aggregation systems to troubleshoot complex issues in edge computing environments

  • System Reliability & Performance
    • Collaborate with engineering teams to ensure applications and infrastructure at edge locations are designed with observability in mind, incorporating best practices for instrumentation and monitoring in resource-constrained environments
    • Drive proactive identification of issues in edge facilities through advanced observability tools, reducing Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR) across distributed systems
    • Lead incident postmortems, analyzing root causes specific to edge environments and implementing observability-driven improvements to prevent recurrence

  • Tooling & Automation
    • Develop and maintain tools, scripts, and automation to enhance observability pipelines, optimizing for the unique challenges of edge computing, such as bandwidth limitations and intermittent connectivity
    • Evaluate and integrate industry-standard observability tools (e.g., Prometheus, Grafana, ELK Stack, OpenTelemetry) and recommend solutions tailored for edge computing use cases
    • Optimize observability data storage, retention, and querying to balance performance, cost, and scalability across a large number of remote facilities

  • Leadership & Collaboration
    • Mentor and guide junior SREs and engineers on observability best practices for edge computing, fostering a culture of reliability and proactive monitoring
    • Partner with solution, engineering, and business teams to align observability efforts with business objectives, ensuring seamless operation of edge and centralized systems
    • Lead cross-functional initiatives to improve observability, reliability, and operational efficiency across distributed edge infrastructure

  • Continuous Improvement
    • Stay current with emerging observability trends, tools, and methodologies, particularly those suited for edge computing and distributed systems, and advocate for their adoption
    • Contribute to the development of observability standards, runbooks, and documentation tailored for edge environments to ensure consistency and scalability
    • Drive cost optimization for observability infrastructure while maintaining high-quality monitoring and alerting capabilities across remote facilities

Minimum Qualifications:
  • 8+ years of experience in SRE, DevOps, or related technology roles
  • 5+ years of experience in delivering software in a large-scale environment with reliability and resilience concepts (multi-region, multi-cloud, containerization, etc.)
  • 5+ years of experience with observability and monitoring tools such as Splunk, Dynatrace, Datadog, Prometheus, Grafana, etc.
  • 3+ years of experience with programming/scripting languages (e.g., Python, Java) for automation and tooling in distributed environments
  • 3+ years of experience with cloud technologies (AWS, Microsoft Azure, Google Cloud)
  • 3+ years of experience with source control and continuous integration tools like Git/Stash, Bitbucket, or Jenkins
  • 2+ years of engineering team leadership or management experience
  • Experience using customer feedback tools such as Quantum Metric, Medallia, and Adobe Analytics
  • Deep understanding of microservices architecture and cloud-native technologies
  • Experience in configuring, supporting, and managing Rancher, Kubernetes, and/or Docker
  • Experience in Incident Management, Change Management, Infrastructure Support, and Problem Management concepts and processes
  • Excellent interpersonal and communication skills, including the ability to engage technical and non-technical stakeholders

Preferred Qualifications:
  • Expertise working in edge computing environments with a large number of remote facilities, managing observability for distributed, high-latency, or resource-constrained systems
  • Familiarity with chaos engineering principles to validate observability systems in edge environments
  • Experience with retail SRE organizations, including store systems such as Point of Sale (POS) and handhelds
  • Expertise in cloud development and deployment technologies, including containerization and multi-cloud configurations
  • Demonstrated understanding of API management and related platforms such as Apigee, Vordel, and DataPower

Education:
  • Bachelor's degree in Computer Science, Engineering, or related field required
  • Master's degree in Computer Science, Engineering, or related field preferred

Pay Range

The typical pay range for this role is:

$118,450.00 - $284,280.00

This pay range represents the base hourly rate or base annual full-time salary for all positions in the job grade within which this position falls. The actual base salary offer will depend on a variety of factors including experience, education, geography and other relevant factors. This position is eligible for a CVS Health bonus, commission or short-term incentive program in addition to the base pay range listed above. This position also includes an award target in the company's equity award program.

Our people fuel our future. Our teams reflect the customers, patients, members and communities we serve, and we are committed to fostering a workplace where every colleague feels valued and feels that they belong.

Great benefits for great people

We take pride in our comprehensive and competitive mix of pay and benefits - investing in the physical, emotional and financial wellness of our colleagues and their families to help them be the healthiest they can be. In addition to our competitive wages, our great benefits include:
  • Affordable medical plan options, a 401(k) plan (including matching company contributions), and an employee stock purchase plan.
  • No-cost programs for all colleagues including wellness screenings, tobacco cessation and weight management programs, confidential counseling and financial coaching.
  • Benefit solutions that address the different needs and preferences of our colleagues including paid time off, flexible work schedules, family leave, dependent care resources, colleague assistance programs, tuition assistance, retiree medical access and many other benefits depending on eligibility.

We anticipate the application window for this opening will close on: 01/30/2026

Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state and local laws.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.