Monitoring and Observability Engineer (enterprise monitoring tools experience)

  • Posted 1 day ago | Updated 22 hours ago

Overview

Remote
$70 - $80
Contract - W2
Contract - 3 Month(s)

Skills

Enterprise Network Operations Center
AIOps
Broadcom DxNetOps
Micro Focus
IBM Tivoli
VMware
NOC

Job Details

Job Title: (Agile1) IT- Product Specialist, Expert
Location: Remote within the U.S.
Duration: 3 months with possibility of extension

*Per the HM: Currently this role will go until 12/25/2025, but will hopefully be extended.
**100% remote work / Can submit non-local candidates

 

Position Summary:
We are seeking a highly skilled, hands-on IT Product Specialist to support the rollout, integration, and Tier 2 operational support of EMS tools. This role demands deep technical expertise, broad infrastructure knowledge, and strong collaboration skills. The ideal candidate will have experience across the full Software Development Lifecycle (SDLC) and be instrumental in both project delivery and ongoing support, including upgrades, testing, and production readiness.
This position requires a strong technical foundation across infrastructure, monitoring, and automation domains, with practical experience in Operational Intelligence (OI), AIOps, Observability platforms, and containerized environments such as Kubernetes and Docker.
Key Responsibilities:
Project & Integration Work

  • Lead integration of monitoring and event management tools into an enterprise Observability and AIOps platform.
  • Build and optimize monitoring instrumentation to proactively detect and resolve issues across diverse infrastructure layers.
  • Develop dashboards, reports, and alerting mechanisms tailored to operational and business needs.
  • Perform hands-on development and scripting to support automation, integration, and data transformation.
  • Participate in all phases of the SDLC: requirements analysis, design, development, testing, deployment, and maintenance.
  • Collaborate with cross-functional teams including operations, application support, database, VMware, and enterprise NOC.
  • Ensure scalability and reliability of monitoring solutions across hybrid environments (on-prem, cloud, containers).

Tier 2 Operational Support

  • Perform Operational Intelligence (OI) application upgrades in DEV environments to validate fixes and enhancements.
  • Drive promotion of validated upgrades to TEST and PROD environments.
  • Support container upgrades and lifecycle management in Kubernetes/Docker or other container runtimes.
  • Manage and update SSL/TLS certificates across monitoring platforms.
  • Conduct Disaster Recovery (DR) testing, validate failover scenarios, and document outcomes.
  • Perform device discovery, onboarding, and decommissioning within monitoring platforms.
  • Administer Linux (RHEL) systems and support containerized environments.
  • Troubleshoot and resolve complex issues across infrastructure, applications, and monitoring layers.

Required Qualifications:

  • 5–7 years of experience in an Enterprise NOC or IT Operations environment.
  • 5–7 years of hands-on experience with enterprise monitoring tools (e.g., Broadcom DxNetOps, Micro Focus, IBM Tivoli).
  • Strong experience across the SDLC with a focus on automation, testing, and deployment.
  • Hands-on experience with Observability, AIOps, and Operational Intelligence platforms.
  • Proficiency in RESTful API integration and data transformation using JSON and XML.
  • Experience with scripting languages such as Bash, Python, or Perl.
  • Solid understanding of networking and server infrastructure (Windows, Linux, Unix, AIX).
  • Practical experience with container orchestration platforms (Kubernetes) and Docker or other container runtimes.
  • Strong documentation skills and attention to detail.
  • Ability to work independently and collaboratively in a fast-paced, dynamic environment.
  • Broad technical background across infrastructure, monitoring, automation, and cloud-native technologies.

Preferred:

  • Experience with observability tools such as Broadcom Operations Insight and Grafana.
  • Exposure to CI/CD pipelines and DevOps practices.
  • Experience with infrastructure upgrades, DR testing, and certificate management.