Sr Observability Engineer

Woodbridge Township, NJ, US • Posted 9 hours ago • Updated 9 hours ago
Contract Independent
Contract W2
Contract Corp To Corp
12 Months
No Travel Required
Able to Sponsor
On-site
Depends on Experience

Job Details

Skills

  • Grafana
  • Python
  • OCP
  • Dynatrace
  • Migration

Summary

Role: Sr Observability Engineer

Location: Iselin, NJ (5 Days Onsite)

Key requirements for this senior observability / monitoring platform role:

Core Responsibilities

  • Lead design, modernization, and operation of enterprise-scale observability and reliability platforms.
  • Support mission-critical and highly regulated environments.
  • Own metrics, monitoring, automation, and platform reliability across distributed, mainframe, and hybrid environments.
  • Drive migration from legacy tools (such as Geneos) to Prometheus and Grafana.
  • Lead teams building scalable, self-service observability platforms.

Platform Architecture & Strategy

  • Architect and operate platforms based on:
    • Prometheus
    • Grafana
    • Grafana Alloy/Agent
    • Grafana Mimir
  • Support:
    • 80K+ Linux servers
    • 40K+ Windows servers
    • OpenShift (OCP)
    • Cloud platforms
  • Define monitoring standards for on-prem, z/Linux, z/OS, and hybrid cloud environments.

Observability & Monitoring Engineering

  • Own monitoring and alerting for infrastructure, applications, and processes.
  • Develop and deploy exporters and integrations, including:
    • node_exporter
    • process_exporter
    • MQ exporters
    • Custom Python exporters
    • Dynatrace integrations
  • Handle:
    • High-cardinality metrics
    • Scrape tuning
    • Relabeling
    • Ingestion pipelines
    • Remote-write backpressure
  • Troubleshoot metric gaps and ingestion bottlenecks.
  • Design highly available BCP/DR monitoring architectures.

Automation

Lead Python-based automation for:

  • Exporter lifecycle management.
  • Dynamic target discovery.
  • Monitoring onboarding pipelines.
  • Inventory reconciliation and validation.
  • Event-driven remediation workflows.

Required Technologies

Observability Stack

  • Prometheus (advanced PromQL including joins, cardinality control, optimization)
  • Grafana
  • Grafana Agent
  • Grafana Alloy
  • Grafana Mimir
  • Dice Id: 90962964
  • Position Id: 9010823
Contact the job poster
Rajender Reddy

Manager - Recruiting & Operations @ ICS Global Soft, Inc.