Overview
On Site
Depends on Experience
Accepts corp to corp applications
Contract - Independent
Contract - W2
Contract - 12 Month(s)
Skills
AIOps
Moogsoft
ServiceNow
ITOM
DevOps
SRE
Job Details
Key Responsibilities:
- Lead AIOps strategy and implementation using Moogsoft and ServiceNow ITOM.
- Design and deploy intelligent event correlation and noise reduction mechanisms.
- Integrate monitoring, alerting, and observability tools across infrastructure and applications.
- Collaborate with DevOps, Infrastructure, and Application teams to improve reliability and performance.
- Drive automation of incident response and remediation workflows.
- Define and implement KPIs for system reliability, availability, and performance.
- Provide technical leadership and mentorship to SRE and operations teams.
- Ensure compliance with security and governance standards.
Required Skills & Qualifications:
- 8+ years of experience in Site Reliability Engineering or IT Operations.
- Hands-on experience with Moogsoft (event correlation, alert management) and ServiceNow ITOM (Discovery, Event Management, CMDB).
- Strong understanding of observability tools (e.g., Prometheus, Grafana, Splunk, AppDynamics).
- Experience with automation and orchestration tools (e.g., Ansible, Terraform, Jenkins).
- Proficiency in scripting languages (e.g., Python, Shell).
- Solid grasp of cloud platforms (AWS, Azure, Google Cloud Platform) and container technologies (Kubernetes, Docker).
- Excellent problem-solving, communication, and leadership skills.
Preferred Qualifications:
- Certifications in ServiceNow ITOM or Moogsoft.
- Experience with AIOps platforms beyond Moogsoft (e.g., BigPanda, Dynatrace).
- Familiarity with ITIL processes and Agile methodologies.