Director, Site Reliability Engineering

  • Bentonville, AR
  • Posted 13 hours ago | Updated 1 hour ago

Overview

Remote
On Site
USD 156,000.00 - 312,000.00 per year
Full Time

Skills

Big Data
Scalability
Machine Learning (ML)
Generative Artificial Intelligence (AI)
Customer Service
Business Systems
Innovation
Continuous Improvement
Operational Excellence
Incident Management
Root Cause Analysis
Continuous Integration
Continuous Delivery
Integration Testing
Build Tools
Dashboard
Coaching
Mentorship
IaaS
DevOps
Leadership
Collaboration
Communication
Agile
Finance
Life Insurance
Military
Exceed
Software Asset Management
English
Computer Science
Computer Engineering
Information Systems
Software Engineering
Supervision
Reliability Engineering
System Administration
Management
IBM SmartCloud
Web Content
WCAG
Assistive Technology
Accessibility

Job Details

Position Summary...

What you'll do...

Are you passionate about pioneering cutting-edge technology leveraging GenAI and big data to revolutionize Walmart's customer service experiences? Do you dream of working on innovative systems that make a significant impact on hundreds of millions of customers across the globe? We are seeking a visionary and hands-on Director of Site Reliability Engineering (SRE) to lead and scale a world-class SRE organization. This leader will be responsible for building a high-performing team, driving operational and engineering excellence, and ensuring the availability, scalability, and performance of our systems.

About Team: Customer Care Technology

The Customer Care Technology team builds best-in-class customer service experiences for hundreds of millions of Walmart customers and customer service agents globally. We are a group of software engineers, data scientists, and machine learning experts pushing the boundaries of GenAI technology in complex enterprise applications. The Customer Care Technology team is part of the Enterprise Business Systems organization in Walmart Global Tech. We partner with our product and business teams to drive significant measurable business impact. Our mission is to help customers save money and live better.

What you'll do:
  • Build and Lead a High-Impact Team: Recruit, mentor, and retain top-tier Distinguished, Principal, and Staff-level SREs. Foster a culture of ownership, innovation, and continuous improvement.
  • Champion Operational Excellence: Establish and uphold best practices to ensure system reliability, availability, and performance. Drive incident response, root cause analysis, and postmortem processes that raise the operational bar.
  • Drive Engineering Excellence: Implement and scale CI/CD pipelines, enforce robust unit and integration test coverage, and promote engineering practices that accelerate delivery without compromising quality.
  • Develop Scalable Systems and Processes: Build tools, dashboards, and metrics to proactively monitor system health, detect anomalies, and automate remediation. Lead retrospectives to identify systemic improvements.
  • Foster a Strong Team Culture: Create an inclusive, collaborative, and high-trust environment. Provide coaching and career development opportunities to help engineers grow and thrive.
  • Communicate Vision and Strategy: Effectively communicate vision and strategy to cross-functional teams, from senior leadership to partner teams and engineers.

What you'll bring:
  • Bachelor's, Master's, or PhD degree from a reputed institution.
  • 10+ years of software engineering and/or site reliability experience in a related industry.
  • 5+ years of experience managing and mentoring engineering teams.
  • Proven experience leading SRE or infrastructure teams at scale.
  • Deep understanding of distributed systems, cloud infrastructure, and DevOps practices.
  • Strong leadership, communication, and cross-functional collaboration skills.
  • Track record of building high-performing teams and delivering reliable, scalable systems.
  • Excellent verbal and written communication skills, adept at communicating with executive levels, peers, and subordinates.
  • Demonstrated history of customer obsession and an agile mindset.
  • Strong sense of ownership and urgency.

At Walmart, we offer competitive pay as well as performance-based bonus awards and other great benefits for a happier mind, body, and wallet. Health benefits include medical, vision and dental coverage. Financial benefits include 401(k), stock purchase and company-paid life insurance. Paid time off benefits include PTO (including sick leave), parental leave, family care leave, bereavement, jury duty, and voting. Other benefits include short-term and long-term disability, company discounts, Military Leave Pay, adoption and surrogacy expense reimbursement, and more.

You will also receive PTO and/or PPTO that can be used for vacation, sick leave, holidays, or other purposes. The amount you receive depends on your job classification and length of employment. It will meet or exceed the requirements of paid sick leave laws, where applicable.

Live Better U is a Walmart-paid education benefit program for full-time and part-time associates in Walmart and Sam's Club facilities. Programs range from high school completion to bachelor's degrees, including English Language Learning and short-form certificates. Tuition, books, and fees are completely paid for by Walmart.

Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to a specific plan or program terms.

For information about benefits and eligibility, see One.Walmart.

Bellevue, Washington US-11075: The annual salary range for this position is $156,000.00-$312,000.00.

Bentonville, Arkansas US-10735: The annual salary range for this position is $130,000.00-$260,000.00.

Additional compensation includes annual or quarterly performance bonuses.

Additional compensation for certain positions may also include:

- Stock

Minimum Qualifications...

Outlined below are the required minimum qualifications for this position. If none are listed, there are no minimum qualifications.

Option 1: Bachelor's degree in computer science, computer engineering, computer information systems, software engineering, or related area and 6 years' experience in site reliability engineering, site and system administration, infrastructure management, or related area.

Option 2: 8 years' experience in site reliability engineering, site and system administration, infrastructure management, or related area.

3 years' supervisory experience.

Preferred Qualifications...

Outlined below are the optional preferred qualifications for this position. If none are listed, there are no preferred qualifications.

  • Experience in site reliability engineering, site and system administration, infrastructure management, or related area.
  • Master's degree in site reliability engineering, site and system administration, infrastructure management, or related area and 4 years' experience in site reliability engineering, site and system administration, infrastructure management, or related area.
  • SRE certification (for example, IBM Cloud Site Reliability Engineer).

We value candidates with a background in creating inclusive digital experiences, demonstrating knowledge in implementing Web Content Accessibility Guidelines (WCAG) 2.2 AA standards, assistive technologies, and integrating digital accessibility seamlessly. The ideal candidate would have knowledge of accessibility best practices and join us as we continue to create accessible products and services following Walmart's accessibility standards and guidelines for supporting an inclusive culture.

Primary Location...

2501 Se J St, Ste A, Bentonville, AR 72716-3724, United States of America

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions, and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.