Overview
On Site
Full Time
Skills
JIRA
Zoom
Incident manager
ServiceNow
PagerDuty
Slack
Job Details
Incident Manager in Oakland, CA (Hybrid). W2 candidates only.
Role: Incident Manager
Address: Oakland, CA (3 days onsite)
Full-time job
Please share this updated JD with all the resources.
Based on my interaction with the CK team, I have modified the JD for hiring SiteOps engineers. Please make sure the JD below is published for shortlisting candidates.
Responsibilities
- Manage incident management bridge calls with support teams, on-call application support teams, and management. Manage, escalate, report status on, and assist in coordinating repair efforts for all major incidents (P1 P4).
- Provide regular communication updates to customers, end users, and other stakeholders throughout the incident management cycle.
- Track and document incident updates in real time.
- Handle major incidents, which are highly escalated cases, with presence of mind and innovation.
- Support the development and execution of change management plans to drive adoption and utilization of new processes, systems, and technologies.
- Review changes, their priority, and their urgency, and perform risk analysis.
- Create problem tickets and their action items, and review root cause analyses and their closure.
- Conduct PIRs and prepare postmortem reports.
- Lead site reliability, disaster recovery, game day, switchover, and failover activities.
- Handle multiple monitoring tools such as ServiceNow, PagerDuty, Slack, Zoom, JIRA, etc.
- Perform quality audits and data analytics on incident tickets to ensure quality and uncover new trends.
- Meet the agreed SLAs and other KPIs, and produce process performance reports.
- Provide documentation for the Known Error Database (KEDB) or a similar repository.
- Develop processes and procedures that ensure incident management action items are tracked and completed.
- Ensure process adherence and meet quality norms.
- Provide management reporting on incident metrics and incident management performance.
Qualifications/Skills Required
- Degree in computer science, information technology, or a related field.
- Experience in incident management or a related field.
- Knowledge of cloud services (AWS, Azure, or Google Cloud Platform) is a must.
- Advanced proficiency in site reliability culture and principles, with the ability to demonstrate how to implement site reliability across platform teams while avoiding common pitfalls.
- Ability to plan and conduct site reliability testing.
- Experience in AMS (Application Management Services).
- Knowledge of incident management/change management/problem management processes and procedures.
- Experience with and knowledge of change management principles, methodologies, and tools.
- Excellent problem-solving and analytical skills.
- Excellent verbal and written communication and interpersonal skills.
- Ability to work independently and as part of a team.
- Ability to manage multiple tasks simultaneously.
Note: This is NOT an infrastructure support role. It is a semi-technical role that supports an environment hosted 100% in the cloud and drives application-related issues.
Thanks and Regards,
Rajiv Kumar
Associate Manager
Corporate Office: 650 Wilson Lane, Suite 201, Mechanicsburg, PA 17055
Testingxperts Inc DBA Damcosoft
P: +1 Ext 518
E: