Sr. Observability Engineer

Charlotte, NC, US • Posted 18 hours ago • Updated 18 hours ago
Full Time
On-site
$80,000 - $100,000/yr

Job Details

Skills

  • Observability
  • AppDynamics
  • Splunk

Summary

Job Title: Sr. Observability Engineer

Location: Charlotte, NC; Irving, TX; Iselin, NJ
Employment Type: Full-Time

About this Role

An observability engineer designs, implements, and maintains systems to monitor, analyze, and report on the health and performance of software applications and infrastructure, ensuring high availability, performance, and security. They are crucial in understanding complex IT systems and proactively addressing potential issues.

In this Role, You Will:

  • Designing and Implementing Observability Pipelines: Observability engineers create robust pipelines to collect, aggregate, and analyze data from various sources.
  • Monitoring and Alerting: They establish monitoring systems and alerts to detect anomalies and performance issues in real time.
  • Metric & Instrumentation Standards: They define common metric standards for every stage of the application lifecycle, as well as instrumentation and scripting standards, including alignment with OTel standards.
  • Data Analysis and Visualization: They analyze telemetry data (logs, metrics, traces) to gain insights into system behavior and identify trends.
  • Incident Response: They investigate and troubleshoot incidents, using observability data to understand the root cause and implement solutions.
  • Collaboration and Communication: They collaborate with development, SRE, and other teams to ensure observability practices are integrated into workflows and to share insights.
  • Staying Up-to-Date: They stay current with the latest trends in observability, logging, monitoring, and cloud technologies.
  • Documentation and Knowledge Sharing: They create comprehensive documentation for observability systems and processes and share knowledge with other teams.

Skills and Knowledge:

  • Strong understanding of distributed systems: They need to understand the complexities of modern architectures, including microservices, cloud-native environments, and hybrid infrastructure.
  • Proficiency in observability tools: They are familiar with tools for logging, metrics, and tracing, such as ELK Stack, Prometheus, Grafana, and distributed tracing systems.
  • Data analysis and visualization skills: They can analyze telemetry data to identify trends and patterns and create visualizations to communicate insights.
  • Scripting and automation: They can automate tasks and create scripts to manage observability infrastructure.
  • Problem-solving skills: They can diagnose and troubleshoot system issues using observability data.
  • Communication skills: They can effectively communicate technical information to both technical and non-technical audiences.
  • Experience with cloud platforms: They have experience with cloud platforms like AWS, Azure, and Google Cloud Platform.
  • Understanding of IT service management practices: They understand IT service management practices like change management, release management, incident management, and problem management.

Required Qualifications:

  • Demonstrated experience in observability: monitoring, analyzing, and reporting on the health and performance of software applications and infrastructure.

Desired Qualifications:

  • 8+ years of experience in observability, monitoring, and reliability engineering across large-scale enterprise or cloud-native environments.
  • Strong expertise in observability tools and platforms such as Prometheus, Grafana, ELK/OpenSearch, Splunk, Dynatrace, AppDynamics, or equivalent.
  • Hands-on experience designing and implementing observability pipelines for logs, metrics, and traces in distributed systems.
  • Deep understanding of OpenTelemetry (OTel), including instrumentation standards, collectors, exporters, and vendor-neutral telemetry architectures.
  • Strong analytical and troubleshooting skills, using telemetry data for incident investigation, root-cause analysis, and performance optimization.
  • Proficiency in scripting and automation (Python, Go, Bash/PowerShell) with strong collaboration skills to work across Dev, SRE, and Platform teams.

Work Environment & Benefits:

  • Hybrid Work Model: Combination of on-site and remote work, depending on business needs.
  • Collaborative Culture: Work closely with cross-functional teams, vendors, and senior leadership.
  • Professional Development: Access to training programs, certifications, and career advancement opportunities.
  • Global Impact: Support a mission-critical network infrastructure serving millions of customers worldwide.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 501494924
  • Position Id: 117792-5-1