Senior IT Incident Manager

Remote • Posted 5 hours ago • Updated 5 hours ago

Contract W2

Contract Independent

Up to $62/hr

Job Details

Skills

Incident Management
Major Incident Management
Incident Commander
Problem Management
Root Cause Analysis
RCA
Post-Incident Review
PIR
5 Whys
Fishbone Analysis
Fault Tree Analysis
IT Operations
IT Infrastructure
Cloud Operations
Azure
AWS
Hybrid Cloud
Microsoft 365
M365
Entra ID
Azure AD
Active Directory
MFA
SSO
Identity and Access Management
IAM
Networking
Route and Switch
Firewalls
Security Operations
SIEM
EDR
SOC
Monitoring
Observability
ITSM
EasyVista
ServiceNow
Jira Service Management
BMC Remedy
CMDB
Change Management
Change Control
ITIL
SLA Management
MTTR
MTTA
Operational Governance
Production Support
Enterprise Applications
Executive Communication
Operational Resilience
Continuous Improvement

Summary

Location / Remote: Remote (must live within the continental U.S.); quarterly travel to Atlanta, GA required

Employment Type: Indefinite W-2 or 1099/IC contract (will extend annually)

Compensation: Up to $55/hour W-2 or up to $62/hour 1099/IC (commensurate with experience)

Benefits: Medical, dental, vision, LTD/STD, HSA/FSA, term life, and optional supplemental insurance coverage available for W-2 employees (including family coverage if needed)

Job Summary:

We are seeking a Senior Problem & Incident Manager to serve as the central orchestration point for enterprise-wide incident response and problem management. This role is highly technical and process-driven, responsible for leading major incidents, driving structured root cause analysis, and strengthening operational governance across infrastructure, cloud, identity, and application environments.

Success in this role is measured by reduced repeat incidents, improved mean time to resolution (MTTR), stronger change controls, and clear executive-level communication during high-impact outages. This individual will operate as an Incident Commander during major events and will help formalize Problem Management processes in a growing, evolving IT environment.

Responsibilities:

Lead enterprise-wide incident management for Severity 1-3 incidents across infrastructure, cloud, identity, security, and applications.
Act as Incident Commander during major outages, coordinating cross-functional technical teams and vendors to restore services quickly and effectively.
Conduct structured Root Cause Analysis (RCA) and post-incident reviews (PIRs) using formal methodologies (e.g., 5 Whys, Fishbone, Fault Tree).
Own the Problem Management lifecycle, identifying recurring issues and driving permanent corrective actions.
Partner with infrastructure, cloud, application, and security teams to reduce change-related incidents and improve production readiness.
Participate in daily operational readouts and provide executive-ready communications regarding system health, impact analysis, and resolution status.
Track and report on key operational metrics, including MTTR, MTTA, repeat incident rates, change-related incidents, and problem backlog aging.
Support and mature ITSM processes, including incident, problem, and change management workflows.
Help define and document SOPs, playbooks, escalation paths, and governance standards in a developing process environment.
Ensure operational risks are identified, communicated, and mitigated in alignment with compliance and security expectations.

Required Skills & Experience:

8+ years of progressive experience in IT operations, infrastructure, cloud operations, or enterprise production environments.
Proven experience leading major incident response efforts in enterprise environments.
Hands-on experience conducting structured Root Cause Analysis and post-incident reviews.
Strong technical background in enterprise infrastructure (networking, compute, storage, virtualization).
Experience supporting Azure and/or AWS cloud environments.
Strong familiarity with Microsoft 365 and identity platforms (Active Directory, Entra ID, MFA, SSO).
Working knowledge of security tooling (SIEM, EDR, vulnerability management) and monitoring/observability platforms.
Experience with ITSM platforms (e.g., EasyVista, ServiceNow, Jira Service Management, BMC Remedy, or similar).
Strong written and verbal communication skills, with the ability to deliver executive-level incident communications.
Demonstrated ability to operate independently, prioritize effectively, and manage high-pressure situations.

Preferred Qualifications:

Experience building or maturing Problem Management processes from the ground up.
ITIL certification or strong familiarity with ITIL concepts.
Experience correlating change management events to incident trends.
Background in regulated or compliance-driven industries (e.g., utilities, energy, finance).
Experience working in environments with identity-heavy or access-controlled systems.

Dice Id: 10290115
Position Id: 8894347