Observability Engineer

Irving, TX, US • Posted 1 day ago • Updated 5 hours ago
Full Time
On-site

Job Details

Skills

  • KPI
  • Writing
  • Root Cause Analysis
  • MEAN Stack
  • Management
  • Network
  • Onboarding
  • IT Operations
  • Reliability Engineering
  • Systems Engineering
  • Grafana
  • Splunk
  • SPL
  • Dashboard
  • SQL
  • OCP
  • Kubernetes
  • Linux
  • Microsoft Windows
  • Computer Networking
  • ServiceNow
  • Workflow
  • Capacity Management
  • Trend Analysis
  • Cloud Computing
  • Database
  • Middleware
  • Machine Learning (ML)
  • IT Service Management
  • Innovation
  • Collaboration
  • Recruiting
  • Artificial Intelligence
  • Privacy
  • Insurance
  • Finance
  • Professional Development
  • Training
  • Leadership
  • CompTIA
  • Customer Service
  • Career Counseling
  • SAP BASIS
  • Law
  • ADA
  • Apex
  • Oracle Application Express

Summary

Job#: 3034826

Job Description:
Observability Engineer

Location: Irving, Texas (Onsite)

Role Overview

We are seeking a Senior Observability Engineer to architect and implement end-to-end observability and alerting frameworks and to build the dashboards and alerts that rely on them. The ideal candidate will translate platform telemetry into actionable insights for engineering, operations, and leadership.

Key Responsibilities
  • Architect end-to-end observability frameworks spanning cloud platforms, on-premises infrastructure, networking, databases, middleware, and application layers.
  • Evaluate, select, and integrate observability tooling across the stack, establishing reference architectures.
  • Design and maintain standardized Grafana dashboards for infrastructure, platform, and workload health across OCP, AKS, and GKE.
  • Define golden signals and platform health KPIs aligned to availability, performance, capacity, and reliability.
  • Act as an advanced Splunk user, writing complex SPL for investigation, correlation, and root-cause analysis.
  • Correlate metrics, logs, and events across Grafana and Splunk to shorten mean time to resolution (MTTR).
  • Implement platform-specific observability patterns for OCP, AKS, and GKE.
  • Configure and manage ThousandEyes for synthetic monitoring and network intelligence.
  • Administer and tune BigPanda for AIOps-driven event correlation and noise reduction.
  • Design and maintain ServiceNow integrations for automated incident creation and alert enrichment.
  • Document observability standards, dashboard conventions, and onboarding patterns.
Required Qualifications

Experience:
  • 7+ years of total IT experience, including IT operations, site reliability engineering, systems engineering, or a related infrastructure discipline.
  • 5+ years of hands-on experience designing and implementing observability, monitoring, or telemetry solutions across distributed systems.
  • 5+ years of experience working with Kubernetes platforms in production environments.


Technical Skills:
  • Observability experience covering logs, events, traces, and enhanced monitoring.
  • Strong hands-on experience with Grafana for dashboarding, variables, alerts, and data sources.
  • Advanced proficiency with Splunk SPL, dashboards, and investigations.
  • Ability to work with multiple query languages (e.g., SQL, PromQL).
  • Deep understanding of Kubernetes internals and hands-on experience with OpenShift (OCP), AKS, and/or GKE.
  • Experience with Prometheus, OpenTelemetry, and Kubernetes exporters.
  • Solid understanding of OS-level monitoring (Linux/Windows) and networking fundamentals.
  • Experience with ThousandEyes, BigPanda, and ServiceNow ITSM workflows.

Preferred Qualifications
  • Experience designing SLOs/SLIs and reliability scorecards.
  • Familiarity with service mesh metrics (Istio / mTLS).
  • Experience with capacity planning and trend analysis using observability data.
  • Exposure to multi-cloud observability strategy and platform standardization.
  • Experience monitoring databases, message brokers, or middleware.
  • Familiarity with AIOps or ML-driven anomaly detection in observability pipelines.


Everforth Apex is a world-class IT services company that serves thousands of clients across the globe. When you join Everforth Apex, you become part of a team that values innovation, collaboration, and continuous learning. We offer quality career resources, training, certifications, development opportunities, and a comprehensive benefits package. Our commitment to excellence is reflected in many awards, including ClearlyRated's Best of Staffing in Talent Satisfaction in the United States and Great Place to Work in the United Kingdom and Mexico.

Everforth Apex uses a virtual recruiter as part of the application process. Click for more details. By applying for this job, you agree to receive calls, AI-generated calls, text messages, or emails from Everforth Apex, its affiliates, and contracted partners. Frequency varies for text messages. Message and data rates may apply. Carriers are not liable for delayed or undelivered messages. You can reply STOP to cancel and HELP for help. You can access our privacy policy at

Everforth Apex Benefits Overview: Everforth Apex offers a range of supplemental benefits, including medical, dental, vision, life, disability, and other insurance plans that offer an optional layer of financial protection. We offer an ESPP (employee stock purchase program) and a 401(k) program that typically allows you to contribute within 30 days of starting, with a company match after 12 months of tenure. Everforth Apex also offers an HSA (Health Savings Account on the HDHP plan), a SupportLinc Employee Assistance Program (EAP) with up to 8 free counseling sessions, a corporate discount savings program, and other discounts. In terms of professional development, Everforth Apex hosts an on-demand training program, provides access to certification prep and a library of technical and leadership courses/books/seminars once you have 6+ months of tenure, and offers certification discounts and other perks through associations that include CompTIA and IIBA. Everforth Apex has a dedicated customer service team for our Consultants that can address questions around benefits and other resources, as well as a certified Career Coach. You can also find a full list of our benefits, programs, support teams, and resources in our 'Welcome Packet', which an Everforth Apex team member can provide.

Everforth Apex Systems is an equal opportunity employer. We do not discriminate or allow discrimination on the basis of race, color, religion, creed, sex (including pregnancy, childbirth, breastfeeding, or related medical conditions), age, sexual orientation, gender identity, national origin, ancestry, citizenship, genetic information, registered domestic partner status, marital status, disability, status as a crime victim, protected veteran status, political affiliation, union membership, or any other characteristic protected by law. Everforth Apex will consider qualified applicants with criminal histories in a manner consistent with the requirements of applicable law.

If you require an accommodation under the Americans with Disabilities Act to participate in an interview with a virtual recruiter or to use our website for a search or application, please contact our Benefits Department at or . Please note that this contact information is strictly to be used for medical ADA accommodations and that no other inquiries will be answered.

UnitedHealthcare creates and publishes the Transparency in Coverage Machine-Readable Files on behalf of Everforth Apex Systems.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: apexsan
  • Position Id: BHJOB2374_3034826

Company Info

About Apex Systems

Part of the Commercial Segment of ASGN Incorporated, Apex Systems is a leading global technology services company that has specialized for over 25 years in customizable, industry-specific solutions that drive better results and transform businesses.

Delivering Value and Innovation

Apex Systems partners with global and Fortune 500 companies, leveraging cutting-edge technology through strategic alliances to drive businesses forward. These proven solutions and services, combined with our unique deployment model that builds qualified, industry-specific, fit-for-purpose teams, fulfill our clients’ digital visions and achieve results. Our agility and obsession with providing value enable us to support an ever-evolving digital world.
