job summary:
The Major Incident Manager (MIM) for a Healthcare IT Enterprise Operations Center (EOC) is a high-impact leadership role responsible for the end-to-end management of critical IT service disruptions that affect patient care, clinical workflows, and hospital operations. This is a senior-level role with decision authority within Core Tech, responsible for leading enterprise-level, Tier 1 incidents across the healthcare environment. The role goes beyond traditional incident coordination and requires command authority, strong executive communication, and a deep understanding of healthcare IT operations, including infrastructure, networking, clinical imaging, and EHR access. It exists because the current incident model is not meeting enterprise needs and requires experienced leadership to reset and stabilize Tier 1 incident response across multiple facilities. This role is accountable for incident outcomes, not coordination alone.
Role Summary:
The Major Incident Manager acts as the "Incident Commander" (single point of command), driving the swift restoration of critical healthcare services (e.g., EHR access, diagnostic imaging, and network connectivity) while maintaining transparent communication with executive leadership and clinical stakeholders. This role is responsible for incident outcomes, not just processes.
location: Telecommute
job type: Contract
salary: $70 - 90 per hour
work hours: 3 pm to 11 pm
education: Bachelor's
responsibilities:
Key Responsibilities
Incident Command & Coordination:
-Lead the Major Incident Bridge by facilitating 24/7 technical bridge calls and "war rooms" to triage and resolve Priority 1 (P1) and Priority 2 (P2) incidents impacting multiple health ministries or systems.
-MIM engagement covers incidents that impact multiple facilities, have wider user impact, or involve multiple technology failures.
-Lead real-time decision-making under pressure, balancing technical recovery with patient-safety and clinical impact.
-Resources will not typically be responsible for owning concurrent major incidents.
Clinical Impact Assessment:
-Quickly evaluate the scope of technical outages to determine their impact on patient safety and critical business applications.
-Outages may impact, for example, EHR systems, clinical imaging, and network dependencies.
-Identify trends, anomalies, and recurring failure patterns.
Stakeholder Communication:
-Issue regular, clear updates to senior leadership (CIO/CTO) and Service Portfolio leads using pre-defined communication templates and protocols.
-Maintain confidence and calm in high-pressure situations with senior leaders and clinical partners.
Post-Incident Management:
-Own and facilitate after-action reports and Post-Incident Reviews (PIRs) within 48 hours to identify root causes and drive preventive actions.
-Conduct root cause discussions and ensure corrective actions are identified and tracked.
-Communication and stakeholder management are the number-one success factor for this role and are treated as the top priority in candidate review.
Process Improvement:
-Own and continuously optimize the enterprise incident management process in alignment with ITIL best practices and healthcare regulatory requirements.
Vendor & Matrix Management:
-Coordinate with third-party vendors and internal cross-functional teams (Network, Security, Clinical Apps) to ensure rapid service recovery.
Monitoring and Alerting:
-Use AIOps, triage and monitoring tools, dashboards, and alerting systems across on-premises and cloud environments (e.g., SolarWinds, NetPath, ScienceLogic) to help reduce MTTD and MTTR.
Incident Response:
-Serve as an escalation point for complex operational incidents, guiding technical teams to swift and effective resolution for critical monitoring issues.
Performance Optimization:
-Analyze monitoring data and performance metrics (MTTD, MTTA, MTTR, Incident Recurrence Rate, SLA Compliance) to identify trends, anomalies, and potential issues, providing recommendations for improvement and capacity planning.
Automation:
-Identify and implement automation opportunities for major incident management and routine tasks to reduce manual workload and improve efficiency.
Collaboration and Documentation:
-Collaborate with cross-functional teams (Application, Network, Security, Cloud Enablement, Managed Service Provider, etc.) to maintain comprehensive Major Incident Management documentation, including standard operating procedures (SOPs) and runbooks.
Problem Management:
-Participate in root cause analysis (RCA) and post-incident reviews to prevent recurring issues and drive long-term solutions.
Compliance and Security:
-Ensure MIB processes comply with organizational security standards and regulatory requirements (e.g., SOC, HIPAA).
qualifications:
Shift requirements: Candidates must be flexible to support regular US business hours or a second shift (2:00 pm - 10:00 pm Central Time). Flexibility to work extended or off-hours during incidents is required.
Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.
At Randstad Digital, we welcome people of all abilities and want to ensure that our hiring and interview process meets the needs of all applicants. If you require a reasonable accommodation to make your application or interview experience a great one, please contact
Pay offered to a successful candidate will be based on several factors including the candidate's education, work experience, work location, specific job duties, certifications, etc. In addition, Randstad Digital offers a comprehensive benefits package, including: medical, prescription, dental, vision, AD&D, and life insurance offerings, short-term disability, and a 401K plan (all benefits are based on eligibility).
This posting is open for thirty (30) days.
It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.