Observability Principal Architect

Remote • Posted 30+ days ago • Updated 19 hours ago
Contract W2
24 Months
No Travel Required
Able to Sponsor
Remote
Depends on Experience

Job Details

Skills

  • Observability
  • Architect

Summary

Job Title: Observability Principal Architect
Location: 100% Remote, USA
Employment Type: Contract

Duration: Long-Term

Requires 15 to 25 years of experience.

The Observability Principal Architect will be responsible for the following:

Platform monitoring and platform observability at the principal architect level

Must have extensive, hands-on experience

Communicate with C-level executives, VPs, and SVPs

Design the company’s monitoring and observability

Protect company revenue (2.4 million)

Estimate how many staff are required

Will design company’s revenue

Reduce IT operations cost

Advise where to invest in tools and where not to

Highly impactful lead role

 

Job Description

Purpose

Lead the observability assessment and define a unified, AI-enabled, service-centric target-state architecture and transformation roadmap.

 

Key Responsibilities

  • Lead end-to-end engagement including stakeholder alignment, workshops, and governance
  • Define and execute assessment framework across:
    • Application, Data/Batch, and Infrastructure layers
    • Coverage, signal quality, architecture, workflows, and cost
  • Drive tool rationalization strategy (Strategic / Tactical / Reduce / Replace / Integrate)
  • Design target-state observability architecture, including:
    • Telemetry model (logs, metrics, traces, events)
    • Service identity and ownership model
    • Correlation and enrichment patterns
    • SLO-driven alerting strategy
  • Define AI/AIOps integration strategy:
    • Signal correlation and deduplication
    • Intelligent alert routing
    • Incident summarization and RCA acceleration
  • Establish integration patterns with ITSM and communication platforms
  • Own engineering blueprint / playbook for day-one observability and retrofit
  • Lead creation and review of all deliverables:
    • Assessment report, decision pack, architecture, standards, blueprint
  • Present findings and recommendations to executive stakeholders

Technical Skills

  • Deep experience with:
    • Dynatrace (APM, tracing)
    • Azure Monitor & Application Insights
    • Azure Log Analytics (KQL, ingestion, cost management)
    • LogicMonitor / ManageEngine (infra monitoring)
    • ActiveBatch (scheduler monitoring)
    • SQL Sentry (database monitoring)
  • Strong understanding of Azure cloud-native observability ecosystem
  • Experience integrating observability with ITSM (ServiceNow or equivalent) and notification platforms

 

Industry & Transformation Skills

  • Observability architecture across App + Data + Infra domains
  • Multi-tool rationalization and consolidation strategy
  • SLI/SLO-based monitoring and alert design
  • Service-centric observability and dependency modeling
  • AI/AIOps strategy and implementation patterns
  • Strong executive communication and consulting mindset

 

Expectations / Success Criteria

  • Deliver decision-ready outputs, not just analysis
  • Define clear tool roles and rationalization paths
  • Provide a practical, implementable target-state architecture
  • Ensure outputs are scalable, governed, and reusable
  • Drive confidence with stakeholders across technical and business teams

Skills To Be Evaluated On

  • Architecture
  • Technical Architecture
  • OpenTelemetry framework design
  • APM & Cloud monitoring platforms expertise
  • Data pipeline observability
  • ITSM integration & event correlation
  • AIOps & anomaly detection
  • Kubernetes & microservices monitoring
  • Alert optimization & noise reduction
  • SLI/SLO framework definition
  • Integration architecture & governance standards

 
  • Dice Id: 10509003
  • Position Id: 8937016