Overview
On Site
Depends on Experience
Full Time
Accepts corp to corp applications
Skills
Site Reliability Engineering
SRE
AWS
Azure
GCP
Grafana
Prometheus
Splunk
ServiceNow
OpsRamp
OpenShift
Site Reliability
Job Details
Title: Senior SRE Engineer
Location: NYC (hybrid, 3 days onsite)
Position Overview:
We are seeking a highly skilled and experienced Senior Site Reliability Engineer (SRE) to lead our SRE team in ensuring the reliability, scalability, and performance of our production systems. The ideal candidate will have a strong background in cloud infrastructure, automation, and system monitoring, along with the leadership and communication skills to collaborate across teams and foster a culture of operational excellence.
Required Skills:
- Strong background in IT infrastructure, cloud platforms (AWS, Azure, Google Cloud Platform), and SRE practices.
- Experience in enterprise and application architecture.
- Proven experience in building APIs and backend services.
- Hands-on experience with the following tools:
  - Monitoring & Observability: Grafana, Prometheus, Splunk
  - ITSM & Operations: ServiceNow, OpsRamp
  - Project & Incident Tracking: JIRA
- Experience in building alerts, dashboards, and operational runbooks.
- Experience managing distributed systems and large-scale production environments.
- Strong leadership, communication, and problem-solving skills.
- Ability to quickly learn and adapt to new technologies and environments.
Preferred:
- Exposure to OpenShift and Azure cloud platforms.
- Certifications: SRE Foundation, ITIL, or relevant cloud certifications.