Overview
Remote
$64,000 - $120,000
Full Time
No Travel Required
Skills
DFS-Element-00008-Integrated Command Center
Job Details
PURPOSE OF THE JOB AND ACCOUNTABILITY
- Overall in charge of Command Center Operations
- 24x7 Event/Alert/Incident Monitoring support for in-scope infra, Apps and Cloud Management
- Establish and assign accountability within the Command Center Team
- Be the first point of contact for all escalations and issue handling within and outside the team
- Assign responsibility and ensure team availability and resourcing
- Ensure Level 1, Level 1.5 and Level 2 team management and administrative tasks are handled (e.g. roster, leave, discipline, punctuality)
- Maintain and update information / documentation for smooth working of the team
- Handle Critical Situations / Escalations and create RCAs
- Own Runbook and SOP updates as and when required
- Conduct monthly reviews and make sure the team is regularly learning by conducting mock tests
- Manage end to end Command Center Operations and escalations
- SPOC for Command Center Team
- Resource planning and operations coverage
- Handling Shift Roster and Leave Management for Team
- Incident management and reporting
- Produce all standard Command Center reports as per contract
- Participate in change management activities and CAB meetings
- Onsite and offshore resource collaboration
- First level escalation contact point for Command Center
- Re-design the Command Center service plan based on the latest requirements
- Train the Command Center team based on changes in the environment
- Perform gap analysis and identify risks (if any)
- Service improvement
- Work on alert optimization
- Collaboration with Tools and Technology team
- Assist the PM team by reporting the top 20 issues each month
- Create PM tickets for recurring issues
- Perform gap analysis
- Initiate, coordinate and collaborate with the Tools and Technology Team, Service Desk and Vendor Management
- Reduce the workload of the technology tracks by performing instruction-based (SOP) Level 1.5 troubleshooting and trying to resolve incident tickets at the Command Center level.
- Review SOPs
- Own Event Management
- Initiate Incident Management for Event Based incidents
- Assist in High Severity Incidents: Initiate Critical Bridges and work closely with CIM/MIM Dedicated Team
- Act as a Situation Manager and assist the dedicated CIM/MIM Team (impact analysis, initiate bridge, inform MIM/CIM, inform Service Desk, inform Business, inform on-call person, send out hourly reports, send out closure report, check whether the issue is recurring or has been reported in the past, run a business bridge if required, etc.)
- Always take part in high-priority tickets and Critical Incident Management bridge calls (CIM/MIM)
- Provide environmental support and handle escalations
- Provide phone support 24x7x365 for Command Center Team
PRINCIPAL ACCOUNTABILITIES
- Ensuring optimal adherence to SLAs.
- Ensuring acknowledgement & action on all emails addressed to you/your team.
- Optimal adherence to documented SOPs.
- Handling escalations in shift.
- Ensuring Resource availability.
RESPONSIBILITIES
- Act as a Technical Manager or Technical Lead for Command Center Operations
- Perform team management tasks for the team
- Roster management
- Leave Management
- Team Management (discipline, punctuality, ownership, etc.)
- Be the first point of escalation for all issues pertaining to Command Center Operations
- Conduct team meetings and discuss current issues / challenges / best practices / group learning
- Shift huddles, Weekly meetings.
- Mentor, coach & motivate the team at all times.
- Manage & minimize attrition within the team as per prescribed levels.
- Encourage the team to upgrade their skills
- Guide team members in formulating a career plan and align appropriate trainings.
- Conduct & arrange trainings for the teams as per requirement.
- Maintain & update documents in a central repository for the benefit of the team.
- Identify & initiate service improvement plans for the team and ensure their successful closure.
- Monitor & evaluate individual efficiency & performance and provide constructive feedback.
- Ensure the team's compliance with approved SOPs & processes.
- Report metrics on the team's performance, productivity & efficiency to senior management.
- Define team KRAs and evaluate them yearly
- Interface with internal support teams (HR, Facilities, IT Support, Telecoms and Transport)
- Conduct internal audit review and assist in the external audit process and procedures.
- Risk Management (Risk Register, Mitigation plans and risk assessment projects).
- Technology perspective: Monitor in-scope infra, Apps and Cloud Management with various monitoring tools, for example:
- Monitoring Tool: Moogsoft, Splunk, iTOM, BigPanda, SolarWinds, SCOM, Dynatrace, AppDynamics, Netcool, Tivoli, HP NNM, HP OVO, LogicMonitor, Grafana, ScienceLogic, Nagios, Nimsoft, Zabbix, ManageEngine, Datadog, VMware, WhatsUp Gold, New Relic, SiteScope
- ITSM Tool: ServiceNow, Cherwell, Remedy, HPSC, HPSM, Salesforce, Service Desk Plus, etc.
- Batch Job Scheduler: Control-M, AutoSys, Redwood, Dollar Universe (DU), TWS, Tidal, IBM Workload Automation
- Update and maintain the Runbook, SOPs and all other relevant technical documents
- Ensure proper shift handover to the next shift.
- Escalation & Notification to the relevant teams & stakeholders to ensure SLA compliance & minimal impact on the business.
- Strictly adhere to the response and resolution timelines defined in the SLA
- Resolution applies where Level 1.5 troubleshooting is within the team's scope.
- Discuss operational challenges and constraints in team meetings and with the management to ensure timely resolution.
- Assist the team in the critical incident management process by involving the technical & incident management teams.
- Collaborate with respective stakeholders and provide accurate & timely updates
- Train on & absorb Level 1.5 troubleshooting and other operational tasks from the various technical tracks.
- Escalate, in a timely manner, any inconsistencies in the monitoring environment with respect to monitoring tool configuration, alert thresholds, alert message enrichment & false alerts
- Resolve any operational challenges, conflicts and constraints within the team or the project.
- Review team performance: hand over any incomplete tasks, open alerts, incidents and outage reports to the next shift.
EXPERIENCE & SKILL
- 4-5 years of university education after high school (B.Sc., BCA or Diploma)
- 1-2 years of diploma in Information Technology. Certification in ITIL, MCSE, MCSA, CCNA or RHCE preferred.
- 5-7 years of work experience in Information Technology, specifically in alert monitoring / management
- 2-3 years of experience as a Lead or Manager
- Preferably ITIL certified.
- Should be familiar with ITIL's Event, Incident, Problem and Change Management modules
- Sound knowledge of or experience with Windows/Unix Servers, AD, Network Devices, Database, Storage & Backup, Job Scheduling or Cloud computing.
- Should have worked in high-pressure work environments and be able to multitask.
- Excellent verbal & written communication skills.
- Hands-on experience with the following:
- Monitoring in-scope infra, Apps and Cloud Management with various monitoring tools, for example:
- Monitoring Tool: Moogsoft, Splunk, iTOM, BigPanda, SolarWinds, SCOM, Dynatrace, AppDynamics, Netcool, Tivoli, HP NNM, HP OVO, LogicMonitor, Grafana, ScienceLogic, Nagios, Nimsoft, Zabbix, ManageEngine, Datadog, VMware, WhatsUp Gold, New Relic, SiteScope
- ITSM Tool: ServiceNow, Cherwell, Remedy, HPSC, HPSM, Salesforce, Service Desk Plus, etc.
- Batch Job Scheduler: Control-M, AutoSys, Redwood, Dollar Universe (DU), TWS, Tidal, IBM Workload Automation
- Incident, Problem, Change lifecycle process
- Event to Incident management lifecycle
- Start/stop backup jobs.
- Backup monitoring tools like NetWorker / Legato / Veritas NetBackup
- Experience in: start, restart, error checking, tape management
- Generating reports through dashboards, Remedy, etc.
DIMENSIONS
- Candidate should be self-driven & self-motivated
- Ability to work as a cross-functional team player in a fast-paced environment where all information is shared.
- Ability to learn new tools, technologies & processes and train the team.
- Ability to work flexible hours from time to time as per requirement.
COMPETENCIES
- SOP adherence
- SLA adherence
- Excellent team handling skills.
- Excellent customer handling skills.
- Technical Expertise.
KEY BUSINESS CHALLENGES
- Meet or exceed SLAs.
- Keep up-to-date on new technologies and end customer technologies
- Keep customer satisfaction high
- Effectively manage resourcing levels to cater to 24/7 support.