Observability Principal Architect

Remote • Posted 2 days ago • Updated 2 days ago
Contract W2
Contract Independent
Contract Corp To Corp
12 Months
25% Travel Required
Remote
70+
Fitment

Job Details

Skills

  • Dynatrace | Azure Monitor | Azure Application Insights | Azure Log Analytics | LogicMonitor | ManageEngine

Summary

Enterprise Observability & AIOps Architect

(Application + Infrastructure + Platform + ITSM/AIOps)

Role Overview

We are seeking an Enterprise Observability & AIOps Architect with 15+ years of experience designing and modernizing enterprise-scale observability ecosystems across applications, infrastructure, cloud platforms, databases, integrations, and operational workflows.

The ideal candidate should possess strong expertise in:

·        AIOps & Event Correlation

·        ITSM Integration

·        Telemetry Governance

·        SRE & Operational Excellence

·        Enterprise Monitoring Rationalization

·        AI-driven Operational Transformation

This role requires both strategic architecture leadership and strong hands-on expertise across modern observability and AIOps platforms in large enterprise environments.

Key Responsibilities

Enterprise Observability Architecture

·        Lead enterprise-wide observability assessments across applications, infrastructure, cloud, databases, and operational workflows.

·        Define current-state and target-state observability architecture.

·        Develop monitoring rationalization and consolidation strategies across enterprise toolsets.

·        Establish standards for telemetry, tagging, service identity, alerting, dashboards, and governance.

·        Define scalable operating models aligned to SRE, ITSM, and platform engineering practices.

Application Observability

·        Architect observability solutions across:

APM | Distributed tracing | Logs & metrics | RUM & synthetics

·        Define SLI/SLO-driven monitoring and alerting strategies.

·        Improve service dependency visibility, transaction tracing, and telemetry quality.

·        Design monitoring patterns for microservices, APIs, Kubernetes, Azure-native, and legacy applications.

Infrastructure & Platform Observability

·        Design observability solutions for cloud infrastructure, middleware, databases, platform services, and batch ecosystems.

·        Assess alert quality, duplication, routing inefficiencies, and monitoring overlaps.

·        Define event correlation, severity models, enrichment standards, and operational ownership structures.

AIOps & Intelligent Operations

·        Design AIOps capabilities including:

o   Event correlation

o   Noise reduction

o   Intelligent alert prioritization

o   Anomaly detection

o   Predictive insights

o   Root-cause contextualization

·        Define AI-assisted operational workflows for incident reduction, MTTR optimization, and automated remediation.

ITSM & Operational Integration

·        Integrate observability platforms with ServiceNow, incident workflows, CMDB, and collaboration tools.

·        Define monitoring-to-incident operational workflows and governance standards.

·        Establish KPI-driven operational maturity frameworks.

Governance & Blueprinting

·        Develop enterprise standards, onboarding blueprints, engineering playbooks, and reusable observability patterns.

·        Create reference architectures, dashboard standards, and operational governance frameworks.

·        Define “Day-1 Observability” onboarding models for new services.

Required Experience

·        15+ years of experience in observability, infrastructure, SRE, production operations, platform engineering, or AIOps architecture.

·        Strong experience in enterprise-scale hybrid cloud and distributed environments.

·        Proven experience leading observability transformation and monitoring rationalization initiatives.

·        Experience working with executive leadership, enterprise architects, platform teams, and operations organizations.

·        Strong understanding of enterprise operational workflows, incident management, and reliability engineering.

 

Required Technical Expertise

Observability Platforms

Strong hands-on expertise in:

Dynatrace | Azure Monitor | Azure Application Insights | Azure Log Analytics | LogicMonitor | ManageEngine

Preferred:

Splunk | ELK/OpenSearch | Prometheus/Grafana | Datadog | New Relic | BigPanda | PagerDuty

Core Skills

·        Event correlation & alert engineering

·        Distributed tracing & topology mapping

·        AIOps & intelligent operations

·        Cloud monitoring & telemetry

·        Kubernetes & microservices observability

·        ITIL / ITSM integration

·        SRE principles & operational governance

Cloud & Platform Experience

Azure | AWS | Kubernetes | APIs & integrations | Middleware & distributed systems

 

Preferred Qualifications

·        Experience defining enterprise observability standards and governance models.

·        Experience with operational transformation initiatives involving AI/AIOps.

·        Strong workshop facilitation, stakeholder management, and executive presentation skills.

·        Certifications in Cloud, Observability, ITIL, SRE, or AIOps.

 

Success Criteria

·        Establish a unified enterprise observability architecture.

·        Reduce alert noise and operational inefficiencies.

·        Improve telemetry quality, service visibility, and incident response.

·        Enable scalable AIOps-driven operational workflows.

·        Deliver standardized onboarding, governance, and engineering blueprints.

·        Improve operational maturity, reliability, and service resilience.

 

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 91132356
  • Position Id: 8972621