SRE with Strong Middleware Expertise

Overview

On Site

Depends on Experience

Accepts corp to corp applications

Contract - W2

Contract - Independent

Contract - 12 Month(s)

No Travel Required

Unable to Provide Sponsorship

Skills

Amazon Web Services

API

Terraform

Middleware

Kubernetes

Apache HTTP Server

Ansible

Dynatrace

WebLogic

Job Details

Position Title: SRE with Strong Middleware Expertise

Job Location: Plano, TX (Onsite)

Joining Mode: Long-Term Contract

Shift 1: 8:00 AM – 5:00 PM

Shift 2: 4:00 PM – 1:00 AM

Shift 3: 12:00 AM – 9:00 AM

Job Summary

We are seeking a Site Reliability Engineer (SRE) with strong Middleware expertise to design, operate, and continuously improve highly available, secure, and scalable enterprise platforms.

This role blends deep middleware operations (WebLogic, API gateways, Java platforms) with SRE principles such as automation, observability, SLIs/SLOs, error budgets, and incident reduction.

The ideal candidate will partner with application, infrastructure, security, and DevOps teams to ensure platform reliability while driving automation, standardization, and operational excellence.

Key Responsibilities

Reliability & SRE Practices

• Define, implement, and track SLIs, SLOs, and error budgets for middleware and platform services

• Drive MTTR reduction, availability improvements, and operational resilience

• Lead incident response, root cause analysis (RCA), and post-incident reviews

• Implement proactive monitoring and alerting to reduce noise and prevent outages

Middleware Platform Engineering

• Administer and support enterprise middleware platforms including:

o Oracle WebLogic, Apache, NGINX

o API Gateways (Apigee Edge / X)

o Java application servers and JVM-based services

• Perform patching, upgrades, configuration tuning, and capacity planning

• Manage certificates, keystores, trust stores, and TLS configurations

• Ensure platform security, compliance, and performance standards

Observability & Monitoring

• Design and maintain end-to-end observability using tools such as:

o Dynatrace, ELK/Kibana, Splunk (or equivalent)

• Build executive and operational dashboards for real-time health visibility

• Reduce alert fatigue through smart alerting, thresholds, and suppression

• Monitor JVM metrics and behavior, thread utilization, and API performance

Automation & Infrastructure Efficiency

• Develop automation and self-healing solutions using:

o Shell scripting, Python, Ansible, Terraform, or similar tools

• Automate routine operational tasks (restarts, validations, health checks)

• Enable CI/CD-friendly middleware deployments and configuration management

• Standardize environments across development, QA, and production

Cloud, Containers & Modern Platforms

• Support middleware workloads on:

o Kubernetes / OpenShift

o Public or hybrid cloud environments (AWS, Azure, Google Cloud Platform)

• Integrate platform reliability into containerized and microservices architectures

• Collaborate with DevOps teams on deployment pipelines and release strategies

Collaboration & Leadership

• Act as a reliability advisor to application and development teams

• Partner with Unix/Linux, Database, Network, and Security teams

• Provide mentoring, documentation, and best-practice guidance

• Participate in on-call rotations and production support leadership

Required Skills & Experience

Technical Skills

• 5+ years of experience in Middleware / Platform Operations / SRE

• Strong expertise in WebLogic, Java middleware, Apache/NGINX

• Hands-on experience with observability platforms (Dynatrace, ELK, Splunk)

• Solid understanding of Linux/Unix systems and networking fundamentals

• Experience with API platforms (Apigee preferred)

• Automation and scripting skills (Shell, Python, Ansible, Terraform)

• Experience with Kubernetes/OpenShift and containerized workloads

SRE & Operational Excellence

• Practical experience implementing SRE principles in production

• Strong troubleshooting skills (thread dumps, heap analysis, logs)

• Experience with incident management, RCA, and change management

• Ability to balance reliability against delivery velocity

Nice-to-Have

• Experience with cloud-native architectures and service meshes

• Knowledge of IAM / Security integrations (OAuth, SAML, mTLS)

• Exposure to CI/CD tools (Jenkins, GitHub Actions, GitLab CI)


About Mango Analytics