Senior Observability Engineer (ESS Platform SME)

McLean, VA, US • Posted 8 hours ago • Updated 8 minutes ago

Full Time

Part Time

On-site

Job Details

Skills

API
HTTP
Analytics
CPU
Network
Performance Monitoring
Budget
Trend Analysis
Collaboration
Leadership
Elasticsearch
Kibana
Microservices
Kubernetes
OCP
Continuous Integration
Continuous Delivery
Scripting
Python
Shell
Groovy
Soft Skills
Conflict Resolution
Problem Solving
Communication
Dashboard
Software Performance Management
Employee Self-service

Summary

Job Title: Senior Observability Engineer (ESS Platform SME)

Location: McLean, VA (onsite); in-person interview required

Job Type: C2C or W2

Role Overview:

We are seeking a highly experienced Senior Observability Engineer with deep expertise in ESS (Elastic Stack) to lead and accelerate the development of enterprise-grade observability capabilities across mission-critical applications.

This role requires a hands-on SME who can design, build, and scale observability dashboards, APM, tracing, and monitoring solutions exclusively within ESS. The candidate will play a key role in transforming current monitoring into a proactive, intelligent, and scalable observability ecosystem.

This is a high-impact, fast-paced engagement (target < 6 months) requiring ownership, technical depth, and execution excellence.

Key Responsibilities:

ESS Observability Architecture & Implementation

Design and implement end-to-end observability solutions using ESS (Elastic Stack).
Build a centralized observability layer covering all MF applications.
Ensure block-level aggregation with drill-down to:
- Application-level metrics
- APM traces
- Logs and events
- Service dependencies

Dashboard Engineering (Critical Priority)

Develop and scale a large backlog of ESS dashboards, including but not limited to:
- Cluster Health (OCP/K8s)
- API & APM Dashboards
- Service Health & Dependency Monitoring
- Pod Status / Restart / Scaling Metrics
- HTTP Status Analytics (200/400/500 trends)
- Transaction Processing Metrics
- Infra Metrics (CPU, Memory, Disk, Network)
- Synthetic Monitoring & Availability
Build intuitive, drill-down dashboards from MF → Block → Service → Application level.

APM, Tracing & Monitoring Expansion

Expand ESS-based:
- Application Performance Monitoring (APM)
- Distributed tracing
- Real User Monitoring (RUM)
- Synthetic monitoring
Enable end-to-end traceability across microservices.

Proactive Observability & Alerting

Design and implement smart alerting rules:
- Move from reactive to proactive detection
- Reduce noise, improve signal quality
Define SLOs, SLIs, and error budgets
Enhance anomaly detection and trend analysis

Collaboration & Leadership

Work closely with:
- EOT Observability Team
- Internal CDLs
- Application teams
Act as ESS Observability SME
Provide guidance, standards, and best practices

Required Skills & Experience:

Strong hands-on experience with ESS (Elastic Stack):
- Elasticsearch
- Logstash
- Kibana
- Beats / Elastic Agent
- Elastic APM
Proven experience building enterprise-scale observability dashboards in ESS
Deep understanding of:
- Microservices architecture
- Kubernetes / OpenShift (OCP)
Experience with:
- APM, distributed tracing, logging, metrics correlation
Ability to design multi-layer observability (infra → platform → app)

Strongly Preferred:

Experience with:
- Synthetic monitoring tools integrated with ESS
- Real User Monitoring (RUM)
- Service maps and dependency graphs
Knowledge of:
- CI/CD observability integration
- Alerting frameworks within Elastic
Scripting: Python / Shell / Groovy (nice to have)

Soft Skills:

Strong ownership mindset
Ability to work under aggressive timelines
Excellent problem-solving skills
Clear communication with technical and non-technical teams

Success Criteria (First 3-6 Months):

Deliver enterprise-grade ESS observability dashboards
Achieve full MF application visibility
Implement end-to-end APM + tracing coverage
Establish proactive alerting framework

Additional Notes:

Candidate must be an ESS expert; experience with alternative tools alone will not be sufficient.
This is a high-priority, business-critical role with immediate impact expectations.

Dice Id: 91112461
Position Id: OOJ - 3386-2387-1778088440