Overview
Hybrid
Depends on Experience
Contract - W2
Skills
Computer Science
Dynatrace
Amazon Web Services
KPI
Dashboard
Scripting
ITIL
Systems Architecture
Network Monitoring
Problem Management
Job Details
Our West Des Moines client is seeking a seasoned Observability and Monitoring Engineer to design and enhance enterprise-wide monitoring, alerting, logging, and observability solutions for their AWS-based systems. The ideal candidate is a hands-on engineer with excellent architecture skills and proficiency in consolidating various monitoring tools into a unified observability framework. If you meet the qualifications below, apply today to learn more!
Work model:
- Hybrid
What you will do:
- Define and implement standards for logging, metrics, event correlation, tracing, and alerting across various environments
- Develop centralized dashboards and alerting policies to deliver unified visibility across applications and services, databases (MS SQL Server, PostgreSQL, etc.), operating systems, file transfer systems (SFTP, managed transfer tools), AWS services (EC2, RDS, Lambda, S3, CloudWatch, CloudTrail, etc.), batch jobs, and scheduled processes
- Develop noise-free, actionable alert thresholds, escalation procedures, and runbooks
- Consolidate current tools (Graylog, Dynatrace, SolarWinds, Zabbix, Splunk) into a well-integrated ecosystem
- Assess current tool usage and provide modernization and consolidation recommendations as needed
- Maintain monitoring and logging platforms, automate deployments with IaC (CloudFormation) and CI/CD pipelines, and build reusable templates for fast application onboarding
- Create self-service dashboards and reports for technical and business stakeholders and document monitoring standards, logging schemas, dashboard naming, and alerting guidelines
- Establish SLIs, SLOs, and reliability KPIs for essential services
- Collaborate with scrum, security, and infrastructure teams to enhance reliability and reduce MTTR; assist with incident resolution, problem management, and root cause analysis
What you will need to be successful:
- Bachelor's degree in Computer Science or a related field
- 5+ years of hands-on monitoring and observability experience using Dynatrace
- Practical experience with monitoring and logging tools, including Graylog, SolarWinds, Zabbix, Splunk, or equivalents
- 5+ years of hands-on expertise with AWS architecture and services
- Strong understanding of event correlation, logs, metrics, and distributed tracing
- Proven experience in designing dashboards and KPIs across infrastructure, applications, and databases
- Excellent scripting and automation skills (PowerShell, Python, Bash) and working knowledge of CloudFormation or Terraform
- Strong understanding of systems architecture, network monitoring, and performance tuning
- Experience with ITIL frameworks for incident and problem management