Lead SRE

Hybrid in McKinney, TX, US • Posted 10 hours ago • Updated 8 hours ago
Contract Corp To Corp
Contract Independent
Contract W2
12 Months
No Travel Required
Hybrid
Depends on Experience

Job Details

Skills

  • AppDynamics
  • Amazon Web Services
  • Dynatrace
  • ServiceNow

Summary

Job Title: Lead Integration & Observability Specialist (SRE Lead)

Location: McKinney, TX (Hybrid role)

Client: NTT DATA / Globe Life Insurance

Job Summary:

We are seeking a Lead Integration & Observability Specialist to design, implement, and lead enterprise observability and reliability solutions, while supporting cloud-based integration platforms on AWS/Azure. The role focuses on monitoring, automation, and operational readiness of applications, APIs, data pipelines, and messaging systems.

This is a hands-on technical leadership role with mentoring and solution ownership responsibilities.

Key Responsibilities

  • Lead the implementation of enterprise observability for applications, APIs, services, batch jobs, and data pipelines.
  • Design and standardize monitoring, alerting, logging, metrics, and health checks across distributed systems.
  • Integrate observability platforms with incident management and automation tools to support proactive issue detection and remediation.
  • Support the reliability and availability of integration platforms built on AWS/Azure.
  • Perform advanced troubleshooting using logs, metrics, and traces to resolve production issues.
  • Define operational readiness standards and non-functional requirements.
  • Mentor engineers on observability best practices and platform usage.
  • Collaborate with product, support, and operations teams to improve service stability and delivery.

Required Skills (Mandatory)

  • 15+ years of overall IT experience
  • 7+ years of relevant experience in Observability / Monitoring / Reliability Engineering
  • Strong hands-on experience with enterprise observability tools, such as:
    • Instana, Dynatrace, AppDynamics, Prometheus, Grafana
  • Expertise in:
    • Monitoring and alerting design
    • Log management and analysis
    • Metrics and distributed tracing
    • Health checks and SLO/SLI concepts
  • Experience monitoring AWS/Azure workloads
  • Strong troubleshooting and incident analysis skills
  • Experience defining operational and non-functional requirements
  • Technical leadership and mentoring experience
  • Automation and ITSM integration (ServiceNow workflows, incident automation)
  • CI/CD and release management exposure
  • Cloud integration and messaging exposure
Dice Id: 90970970 • Position Id: 8973900