SRE Engineer

  • Phoenix, AZ
  • Posted 2 days ago | Updated 2 days ago

Overview

On Site
$120,000 - $125,000
Full Time

Skills

Site Reliability
Java
Kubernetes
Azure

Job Details

Job Title: Senior Site Reliability Engineer (SRE) Java / Kubernetes / Azure
Location: Phoenix, AZ (Day 1 Onsite)
Type: Full-Time
Experience Required: 14+ Years
Key Responsibilities
  • Provide senior-level SRE support, ensuring system reliability, availability, and operational excellence across all environments.
  • Develop and maintain services and automation scripts using Java as the primary programming language.
  • Build, deploy, and optimize workloads running on Kubernetes clusters (including multi-cluster and federated deployments).
  • Manage and enhance cloud infrastructure using Azure services and best practices.
  • Work with Linux/Unix systems and develop automation using Bash shell scripting.
  • Build automation and tooling using Python or Go.
  • Design, implement, and maintain CI/CD pipelines using GitLab CI/CD and Jenkins.
  • Support application streaming, event processing, and analytics using Kafka Stream Generator, ksqlDB, and Spark Streaming.
  • Work with service mesh technologies such as Istio, and understand Anthos Service Mesh.
  • Use VMware and other virtualization platforms for environment provisioning.
  • Provide robust incident support, root-cause analysis, and production issue resolution.
  • Implement eBPF-based observability and performance troubleshooting where applicable.
  • Develop and enhance monitoring and alerting systems using Splunk, Prometheus, Datadog, and Kiali.
  • Configure and manage NGINX Controller and Seesaw load balancing.
  • Use Terraform for infrastructure as code and Docker for containerization.
  • Manage Kubernetes storage using Portworx.
  • Automate repetitive operational tasks and contribute to platform stability and efficiency.
  • Provide support across all US time zones, including rotational shifts, weekends, and occasional 24/7 escalations.
Required Skills & Qualifications
  • Extensive experience in incident response, troubleshooting, performance engineering, and service reliability.
  • Ability to automate manual operational tasks.
  • Strong understanding of monitoring, alerting, and observability practices.
  • Java (Proficient): Must be hands-on in building, supporting, and optimizing Java-based systems and microservices.
  • Kubernetes (Hands-on): Deployment, autoscaling, federation, ingress, storage, service mesh, and cluster operations.
  • Azure (Highly Proficient): Strong experience across Azure compute, networking, storage, DevOps, and security features.
  • Proficiency in functional and logic programming languages: Haskell, OCaml, Prolog.
  • Knowledge of Linux/Unix internals and Bash scripting.
  • Strong experience with Python or Go.
  • VMware and virtualization technologies.
  • Kafka ecosystem tools: Kafka Stream Generator, ksqlDB, Spark Streaming.
  • Experience with Istio/Anthos Service Mesh.
  • Familiarity with eBPF for low-level observability.
  • Monitoring tools: Splunk, Prometheus, Datadog, Kiali.
  • Load balancing with NGINX Controller and Seesaw.
  • Docker and Terraform expertise.
  • Experience working with Portworx for Kubernetes storage.
Certification Required
Azure
Kubernetes

About Best Peers