Enterprise Monitoring and Observability Architect

  • Manassas, VA
  • Posted 6 hours ago | Updated 6 hours ago

Overview

Hybrid
Depends on Experience
Contract - W2
Contract - Independent
Contract - 12 Month(s)

Skills

Enterprise Architecture
Architect
TOGAF
Monitoring
Observability
OpenTelemetry
Site Reliability Engineering (SRE)
RHEL
Monitoring Tools
Splunk
Grafana or Prometheus
Perl or Python

Job Details

Job Overview:

We are looking for an experienced Enterprise IT Monitoring and Observability Architect to join our direct client at their Manassas, VA location. This strategic role reports directly to the Head of Service & Data Architecture and is responsible for the technical design and evolution of real-time monitoring, alerting, and reporting solutions across the enterprise infrastructure and applications.

The successful candidate will work closely with infrastructure and application engineers, as well as operations teams, to continuously enhance the scope, quality, and effectiveness of monitoring solutions. This role plays a vital part in enabling efficient event management and delivering an optimal user experience.


Key Responsibilities:

  • Lead the detailed technical design and implementation of monitoring and observability solutions for enterprise IT systems, including networks, servers, storage, databases, and applications

  • Define and maintain standards, architectural patterns, and best practices for monitoring technologies across the organization

  • Develop and manage technical roadmaps and ensure alignment with organizational goals

  • Drive the integration of monitoring solutions with incident management, analytics, and reporting platforms

  • Partner with multi-disciplinary teams to deliver innovative services, tools, and applications that enhance operational efficiency

  • Improve the user experience and streamline event and incident management through enhanced observability

  • Produce high-quality technical documentation and ensure knowledge sharing across IT teams

  • Leverage modern monitoring tools, including OpenTelemetry, and support SRE principles

  • Apply design thinking and systems thinking methodologies to drive innovation and sustainability in monitoring strategies


Required Skills & Experience:

  • 4+ years of experience in Enterprise Architecture, with a focus on monitoring and observability

  • Proficiency in OpenTelemetry, including implementation and integration in distributed systems

  • Strong background in Site Reliability Engineering (SRE) practices and principles

  • Hands-on experience designing and deploying enterprise-wide monitoring platforms

  • Experience with Red Hat Enterprise Linux (RHEL), shell scripting, and automation using scripting languages (e.g., Python, Perl)

  • Familiarity with monitoring tools such as Prometheus, Grafana, Splunk, or similar

  • Deep understanding of IT infrastructure domains (network, servers, storage, databases)

  • Proven ability to develop and maintain architectural roadmaps and technical designs

  • Experience using TOGAF or similar enterprise architecture frameworks

  • Skilled in technical documentation and the communication of complex solutions to diverse audiences

  • Demonstrated expertise in applying Design Thinking and Systems Thinking methodologies


Nice to Have:

  • Prior experience in financial services or high-availability enterprise environments

  • Experience with CI/CD pipelines and familiarity with DevOps culture

  • Cloud monitoring or hybrid infrastructure monitoring experience


 


About TechWish