Job Title: Senior Site Reliability Engineer (SRE) Lead
Location: Charlotte, NC / Edison, NJ / Columbus, OH (Local Candidates Only, 5 Days Onsite)
Interview: Face-to-Face Mandatory
Job Description:
Seeking a Senior SRE Lead with strong experience in platform reliability, observability, monitoring, automation, and cloud-native environments. The ideal candidate will drive platform health, incident management, performance optimization, and reliability initiatives across enterprise-scale distributed systems.
Mandatory Skills:
- Site Reliability Engineering (SRE)
- Splunk, ELK, Grafana, Prometheus, GCL
- APM Monitoring, Dashboard Creation & Alerting
- Incident Management & Production Support
- Kubernetes / OpenShift
- AWS or Azure
- Python & Shell Scripting (Bash)
- Ansible & YAML Playbooks
- Git, Jenkins, CI/CD
- Kafka / MQ Messaging
- Java, Spring Boot
- Oracle & MongoDB
- Microservices Architecture
- Linux/Unix Administration
Required Experience:
- 10+ years of Software Engineering experience
- 4+ years in SRE/Platform Reliability Engineering
- Experience supporting large-scale distributed systems
- Strong troubleshooting and performance tuning expertise
Preferred Skills:
Keywords: SRE, Site Reliability Engineer, Splunk, ELK, Grafana, Prometheus, GCL, Kubernetes, OpenShift, AWS, Azure, Python, Bash, Ansible, Jenkins, Git, Kafka, MQ, Oracle, MongoDB, Java, Spring Boot, Observability, Monitoring, Incident Management, Production Support, Platform Reliability.