Overview
Remote
Contract - W2
Skills
Dynatrace
SAN
Splunk
NOC
SCOM
LAN/WAN
Event Monitoring
TOC
Job Details
Title: Monitoring Engineer (Dynatrace, Splunk - NOC/TOC)
Location: Baltimore, MD (Remote Job)
Duration: 12+ months, contract-to-hire (CTH)
This role supports the First-to-Know capability of the Technical Operations Center (TOC) and serves as the centralized focal point for observability and event management at CareFirst. Event Monitoring Engineers monitor the performance and capacity of enterprise-wide systems, applications, and critical business processes, using a variety of tools to identify hardware, software, and environmental anomalies. The successful candidate will proactively improve processes, identify inefficiencies, ensure events are meaningful and actionable, and document new processes as they evolve. Proficiency in scripting and coding is a strong plus.
This role requires shift work. The CareFirst Technical Operations Center covers multiple shift types, including weekdays and weekends, and will eventually move to a 24/7 operation. Team members rotate shifts to cover all processes and are asked to be flexible in providing coverage outside their normal shift hours when the need arises. This is a contract position and can be performed fully remotely.
Responsibilities include:
Provide eyes-on-glass monitoring using Dynatrace and other monitoring tools
Support a 24x7 system monitoring service to proactively identify and assess problems
Provide oversight, coordination, and visibility for critical business processes
Perform system health checks, both manual and automated
Identify, investigate, verify, report, communicate, and escalate critical events
Review device logs, documentation, and analysis where applicable
Develop runbooks and manage documentation for repeatable processes (Lifecycle Management)
Follow basic triage steps, monitor production systems, and ensure their high availability
Facilitate and coordinate the necessary IT response to system problems
Continuously analyze events and eliminate noise and non-actionable event trends (Continual Service Improvement)
Provide event management support to service owners and IT managers
Author reports on trends and anomalies for Event Management and Monitoring key performance indicators (KPIs)
Communicate with stakeholders and facilitate open communication among all of them
Required Qualifications:
Associate of Arts or Associate of Science degree and 3+ years of experience, or an equivalent combination such as a bachelor's degree and 2+ years of experience, or no degree and at least 3 years in a NOC/TOC or command center role
3+ years of IT experience and an understanding of performance monitoring tools
3+ years of Dynatrace monitoring experience
2+ years operating in a command center in an Incident Management or Event Monitoring/Event Management role
Ability to assess monitoring events and respond or escalate accordingly
Knowledge of and experience with system and network infrastructures, such as LAN and WAN network technologies, server virtualization, enterprise storage area network (SAN) and backup, and database technologies