Incident Manager

  • Reston, VA
  • Posted 12 hours ago | Updated 12 hours ago

Overview

On Site
Depends on Experience
Accepts corp to corp applications
Contract - W2

Skills

Amazon Web Services
Analytical Skill
Auditing
Change Management
Cloud Computing
Collaboration
Command-line Interface
Communication
Dashboard
Data Analysis
Database
Documentation
ExtraHop
Firewall
IT Infrastructure
ITIL
Incident Management
Issue Resolution
Leadership
Management
Microsoft Excel
Microsoft Power BI
Microsoft PowerPoint
SolarWinds
Trend Analysis
Recovery
Process Improvement
ServiceNow
Regulatory Compliance
Scripting
SQL
Splunk

Job Details

Job Title: Incident Manager
Location: Primary location is Reston, VA (5 days onsite)
Duration: 12+ months with possible extensions

The role includes rotational shifts, typically starting with a standard 9-5 schedule. Over time, based on performance, the candidate may be assigned to early (7-3) or late (11-7) shifts, including occasional night or weekend shifts, usually once a month.

Job Description:

Manage incidents 24/7 using Fannie Mae processes, lead technical triage, share insights from monitoring tools, and detail resolutions. Recommend process improvements, provide timely updates, assist in postmortems, and support operational enhancements. Maintain application uptime through troubleshooting, bug fixes, performance documentation, and collaboration with infrastructure teams.

Key Job Functions

  • Provide expert-level incident management in a 24/7/365 environment.
  • Lead triage and resolution of high-impact, complex incidents.
  • Act as the command center to minimize business disruption.
  • Use monitoring tools for quick root cause analysis and resolution.
  • Collaborate across teams for recovery and process enhancement.
  • Ensure timely stakeholder communication and proper escalation.
  • Deliver incident trend analysis and detailed reporting.
  • Participate in on-call rotation and shift work.
  • Present insights and metrics to senior leadership.

Required Knowledge & Skills

  • Skilled in leading large-scale incident calls with up to 150 participants.
  • Proficient in Microsoft Word, Excel, and PowerPoint; capable of presenting data-driven insights to senior leadership.
  • Holds AWS Cloud Certification (beyond Cloud Practitioner) and ITIL Certification.
  • Hands-on experience with Splunk and other transaction-level monitoring tools.
  • Experienced in using ServiceNow for incident and change management.
  • Capable of transaction-level troubleshooting in AWS cloud environments.
  • Familiar with monitoring tools like ExtraHop, SolarWinds, and Catchpoint.
  • Skilled at identifying trends in application health via dashboards and reports.
  • Experienced in working with compliance, audit, and support teams.
  • Able to present data analysis findings and lead remediation efforts.
  • Proficient in AWS Console/CLI, scripting, SQL, and tools like Power BI and Tableau.
  • Knowledgeable in IT infrastructure including servers, networks, databases, firewalls, and monitoring solutions.
  • Uses a structured, analytical approach to issue resolution.
  • Strong communicator, especially under pressure, with ability to guide technical teams.

Preferred Qualifications:

  1. AWS Certified Solutions Architect Associate Certification
  2. Experience with OpenTelemetry and SignalFx