Incident & Request Lead (Non-Production)

Overview

On Site
Depends on Experience
Full Time

Skills

Amazon Web Services
Change Request Management
ITIL
Incident Management
Leadership
Kubernetes
IT Operations
Stakeholder Management
Splunk
Continuous Integration
Grafana
MEAN Stack
DevOps
Team Management
Request Lead

Job Details

Required Skills & Experience

    • 8 to 10 years in Incident Management, IT Operations, or SRE leadership.

    • Experience managing teams (Incident Analysts, SREs).

    • Strong knowledge of AWS, Kubernetes, CI/CD pipelines, and observability tools like Splunk, Prometheus, or Grafana.

    • Deep familiarity with ITIL Incident, Problem, and Request management processes.

    • Excellent crisis handling, communication, and stakeholder management skills.

Responsibilities

    • Own the full lifecycle of incidents in non-production: detection, triage, resolution, and closure.

    • Be the escalation point when delivery teams run into problems.

    • Lead war rooms for major incidents, coordinating with DevOps, Infra, Security, and other teams.

    • Ensure incidents are escalated properly to all relevant teams.

    • Track and improve SLAs and metrics such as Mean Time to Detect (MTTD), Mean Time to Resolve (MTTR), and environment availability.
