Director Service Reliability Engineering

Overview

On Site
USD 117,700.00 - 197,900.00 per year
Full Time

Skills

Problem Management
Customer Facing
Computer Science
Software Engineering
Leadership
DevSecOps
IT Operations
HTML
Cascading Style Sheets
JavaScript
Backbone.js
Node.js
Android
iOS Development
Nginx
Java
Play Framework
Apache Tomcat
Docker
PostgreSQL
Couchbase
Apache Cassandra
Microsoft SSIS
Apache Kafka
Apache Spark
Analytics
Apache Hadoop
IBM Cognos Analytics
Tableau
OAuth
Microsoft Azure
Google Cloud
Google Cloud Platform
Amazon Web Services
Automated Testing
Akamai
Linux
Unix Administration
Scripting
Programming Languages
Python
Bash
Terraform
Orchestration
Kubernetes
Computer Networking
Database
Dynatrace
Splunk
Artificial Intelligence
Machine Learning (ML)
Disaster Recovery
Failover
Problem Solving
Conflict Resolution
Communication
Stakeholder Management
KPI
Accountability
Emerging Technologies
Reliability Engineering
Hospitality
Roadmaps
Mentorship
Collaboration
Innovation
Scalability
Continuous Improvement
Budget
Cloud Computing
Root Cause Analysis
Incident Management
Real-time
DevOps
Continuous Integration
Continuous Delivery
Management
Service Delivery
Computer Hardware
Presentations
Health Care
Life Insurance
Insurance
Recruiting
SAP BASIS
Law

Job Details

Job Description

JOB SUMMARY

As a leader of the Service Reliability Engineering - Experience organization, you will lead the team responsible for accelerating and automating the flow of operational activities and for ensuring the reliability, performance and scalability of our critical digital platforms. You will drive the SRE strategy for our digital experience team, implement best practices, and collaborate closely with cross-functional teams to enhance our infrastructure, observability, automation, incident/problem management and disaster recovery processes. You'll play a key role in modernizing Marriott's technology stack, fostering a culture of reliability and improving service availability across our customer-facing enterprise applications and their supporting infrastructure. As Director of SRE, you will have the opportunity to shape the reliability and scalability of mission-critical platforms, enabling seamless guest experiences across our global portfolio of brands. If you're enthusiastic about contributing to a resilient architecture and thrive in a collaborative environment, join us at the forefront of innovation. Be a key player in shaping the future of SRE at Marriott, working alongside like-minded individuals who share your passion for speed, confidence, and efficiency.

Qualifications:
  • Undergraduate degree in computer science, software engineering, or a related field (or equivalent experience)
  • 10+ years of experience in SRE, DevSecOps or IT operations
    • At least 5 years of experience in a leadership role within SRE, DevSecOps or IT operations
  • At least five years of experience in the following technologies. Certifications preferred:
    • Presentation Management - HTML, CSS, JavaScript, Backbone.js, Node.js, Android, iOS
    • Application Platforms - NGINX, Java, Akana, Play Framework, Tomcat, Docker, OpenShift
    • Application Data - PostgreSQL, Couchbase, Cassandra
    • Integration Services - Apache Kafka, Apache Spark, Akana
    • Analytics Platforms - Hadoop, dashDB, Cognos, Tableau
    • Security - ForgeRock, OpenID, OAuth, Ping Identity
    • Public Cloud - Azure, Google Cloud, AliCloud, Amazon Web Services
    • CI/CD - Harness (particularly in context of usage within SRE)
  • Experience with test automation
  • Working knowledge of, and a proven track record of implementing, disaster-indifferent architecture
  • Experience with CDN and Akamai tools
  • Linux/Unix system administration experience
  • Proficiency in scripting and programming languages (like Python, Go, Bash, Shell)
  • Hands-on experience with infrastructure as code (like Terraform), container orchestration (like Kubernetes), and reliability automation
  • Working knowledge of networking, databases and distributed systems
  • Deep knowledge of monitoring, logging and incident response tools (like Dynatrace, Splunk, OpsGenie, BigPanda, Prometheus)
  • Experience implementing and maintaining CI/CD pipelines for large-scale applications
  • Experience creating system architectures for disaster recovery implementation and failover during disasters
  • Familiarity with AI/ML-driven observability and predictive maintenance techniques
  • Exceptional problem solving, communication and stakeholder management skills

Competencies:
  • Experience leading, mentoring and developing high-performing SRE teams
  • Experience managing large, cross-functional vendor teams
  • Experience defining SLOs/SLIs, error budgets, and KPIs to drive accountability and performance
  • Ability to foster a culture of continuous improvement
  • Proven record of staying ahead of industry trends and informed about emerging technologies to enhance system reliability and efficiency
  • Experience in hospitality is preferred

CORE WORK ACTIVITIES:
  • Define and execute Marriott's SRE vision, aligning with business objectives and technology roadmaps
  • Build, mentor and lead a high-performing SRE team, fostering a culture of collaboration and innovation
  • Establish reliability, observability and automation goals to improve system uptime, performance and scalability
  • Partner with engineering, operations and security teams to drive best practices and continuous improvement
  • Implement reliability-focused engineering practices, including SLAs, SLOs/SLIs and error budgets
  • Design and maintain resilient, scalable and fault-tolerant architectures across cloud and hybrid environments
  • Develop strategies to proactively identify and mitigate risks to system performance and availability
  • Drive root cause analysis (RCA) and post-mortem processes to prevent recurring incidents
  • Champion automation in monitoring, deployment and incident resolution to reduce toil and enhance efficiency
  • Lead and optimize incident response processes, ensuring rapid detection, diagnosis, and resolution of system failures
  • Enhance observability by leveraging monitoring, logging and tracing solutions to provide real-time insights
  • Partner with DevOps teams to improve CI/CD pipelines and reduce deployment risk

Managing Projects and Priorities
  • Functions as a strategic senior technical expert within the team
  • Champions leaders' vision for product and service delivery
  • Makes and executes the necessary decisions to keep moving forward toward achievement of goals
  • Provides direction and assistance to other teams regarding projects
  • Determines the priorities, schedules, plans and resources needed to complete projects on schedule
  • Analyzes information and evaluates results to choose the best solution and solve problems
  • Reviews vendor proposals and selects appropriate vendor for services/technologies/hardware
  • Thinks creatively and practically to develop, execute and implement new project plans
  • Generates and provides accurate and timely results in the form of reports, presentations, etc.
  • Plans, develops, implements, and evaluates the quality of operations

The salary range for this position is $117,700 to $197,900 annually. In addition to the annual salary, the position will be eligible to receive an annual bonus and restricted stock units/stock grants.

Washington Applicants Only: Employees will accrue 0.04616 PTO balance for every hour worked and are eligible to receive a minimum of 7 holidays annually.

All locations offer coverage for medical, dental, vision, health care flexible spending account, dependent care flexible spending account, life insurance, disability insurance, accident insurance, adoption expense reimbursements, paid parental leave, educational assistance, 401(k) plan, stock purchase plan, discounts at Marriott properties, commuter benefits, employee assistance plan, and childcare discounts. Benefits are subject to terms and conditions, which may include rules regarding eligibility, enrollment, waiting period, contribution, benefit limits, election changes, benefit exclusions, and others.

Marriott HQ is committed to a hybrid work environment that enables associates to Be connected. Headquarters-based positions are considered hybrid, for candidates within a commuting distance to Bethesda, MD; candidates outside of commuting distance to Bethesda, MD will be considered for Remote positions.

The application deadline for this position is 5 days after the date of this posting, April 25, 2025.

Marriott International is an equal opportunity employer. We believe in hiring a diverse workforce and sustaining an inclusive, people-first culture. We are committed to non-discrimination on any protected basis, such as disability and veteran status, or any other basis covered under applicable law.

About the Team

Marriott International is the world's largest hotel company, with more brands, more hotels and more opportunities for associates to grow and succeed. Be where you can do your best work, begin your purpose, belong to an amazing global team, and become the best version of you.