Operations Monitoring Engineer

Overview

Accepts corp to corp applications
Contract - W2
Contract - 6+ month(s)

Skills

New Relic
Change Management
Cloud Architecture
DevOps
System Monitoring
Optimization
Root Cause Analysis
Incident Management
Management
Dynatrace
Zabbix
Microsoft SCOM
Writing
Documentation
System Administration
Microsoft Windows
Unix
Linux
VMware
Software Development Methodology
SLA
KPI
Scripting
Python
Node.js
Ruby
Perl
Bash
Cloud Computing
Amazon Web Services
Microsoft Azure
Terraform
Ansible
ITIL
Operating Systems

Job Details

Job type: Operations Monitoring Engineer

Location: On-site 4 days a week in Fort Worth, TX

Must haves:

  • Experience with Dynatrace (preferred), CloudWatch, Datadog, or New Relic
  • Change management and incident management experience is strongly preferred

Job Description:

The Operations Monitoring Engineer will monitor and support our systems and manage alerting to ensure seamless operations. Ideal candidates will have 3-5 years of experience with Dynatrace, CloudWatch, or similar tools, and a solid understanding of cloud architecture and DevOps principles.

Key Responsibilities:

System Monitoring and Optimization: Monitor systems for faults, identify optimization opportunities, and implement tools and process changes to improve monitoring and alerting.

Incident Response and Root Cause Analysis: Work with major incident response teams on escalations and monitoring during major incidents.

Qualifications:

Self-Motivated: Ability to define, develop, and execute plans; manage system outages; and handle high-stress situations.

Availability: Able to work in a 24/7 environment and provide on-call support.

Experience: Proven experience interacting with stakeholders at all levels.

Technical Skills:

Bachelor's degree in Computer Science, Information Systems, or Engineering preferred.

Technical certifications or 5+ years of experience in event monitoring and alerting.

Experience with monitoring tools (Dynatrace, CloudWatch, Zabbix, SCOM).

Strong writing skills for documentation.

Proficient in distributed systems administration (Windows, Unix, Linux, VMware, etc.).

Knowledge of ITIL best practices (certification is a plus).

Familiarity with the software development lifecycle (SDLC).

Experience in SLA/KPI-driven environments.

ServiceNow proficiency.

General scripting/programming skills (Python, Node.js, Ruby, Perl, Bash/sh).

Preferred Qualifications:

Cloud certifications (AWS, Azure, etc.).

Experience with infrastructure as code tools (Terraform, Ansible, etc.).

ITIL V3 or V4 certification.

Advanced technical skills in various operating systems and environments.

Proven ability to improve monitoring and alerting processes.
