Senior SRE Engineer

Overview

Remote

On Site

USD 93,581.10 - 155,968.51 per year

Full Time

Skills

Scalability

Mentorship

Continuous Integration

Continuous Delivery

High Availability

Higher Education

Reliability Engineering

Systems Design

Management

Kibana

Dynatrace

Docker

Kubernetes

Orchestration

Python

Windows PowerShell

Incident Management

Communication

Collaboration

Analytical Skill

Conflict Resolution

Problem Solving

Terraform

Instrumentation

Cloud Computing

Amazon Web Services

Google Cloud

Google Cloud Platform

Microsoft Azure

DevOps

Adaptability

Job Details

Job Overview:

We are looking for a highly motivated Senior SRE Engineer with a strong background in observability to join our growing team. This role requires a seasoned professional to guide our team in building, scaling, and maintaining observability solutions that help ensure our systems and services are highly available, performant, and secure.

Responsibilities:

Lead the development and implementation of observability tools and practices across multiple platforms, including monitoring, logging, tracing, and alerting.
Work closely with product and engineering teams to define observability standards, goals, and best practices.
Design and optimize the architecture of observability infrastructure to provide clear insights into the health, performance, and scalability of services.
Troubleshoot and diagnose complex issues related to performance and availability, offering actionable insights and solutions.
Mentor and guide junior SREs on observability tools and practices, fostering a culture of reliability and proactive monitoring.
Manage incidents and post-incident reviews to continuously improve monitoring systems and practices.
Partner with DevOps, Software Engineers, and other stakeholders to ensure seamless integration of observability tools with CI/CD pipelines.
Implement and maintain high-availability monitoring and alerting systems.
Ensure automation of observability tooling to scale with the growth of systems and services.

Education and Experience Required:

Combined minimum of 6 years' higher education and/or work experience in systems design, management and/or architecture

5+ years of experience in Site Reliability Engineering, DevOps, system design and/or architecture, or similar roles.

3+ years of experience leading or managing observability initiatives.

Strong hands-on experience with monitoring tools like Kibana, Dynatrace, Datadog, or similar.

Solid understanding of observability concepts (metrics, logging, tracing, alerting) and frameworks (e.g., OpenTelemetry).

Experience with cloud environments such as AWS, Google Cloud, or Azure.

Familiarity with containerization (Docker, Kubernetes) and orchestration platforms.

Excellent problem-solving skills and ability to troubleshoot complex distributed systems.

Mid-level programming skills in Python, JSON, PowerShell, or other relevant languages.

Experience with incident response and post-mortem analysis.

Excellent communication and collaboration skills.

Advanced analytical, troubleshooting, and problem-solving skills.

Education and Experience Preferred:

Familiarity with infrastructure as code (Terraform, CloudFormation).

Login and enrollment instrumentation using SLO/SLI and measuring FCI and FSI.

Experience in building and maintaining distributed systems at scale.

Knowledge of security best practices in observability.

Certifications in Cloud (AWS, Google Cloud Platform, Azure), SRE or DevOps are a plus.

Process-oriented, logical thinker.

Strong knowledge of client/server and virtualization technologies.

Adaptable; able to learn quickly in a fast-paced environment.

M&T Bank is committed to fair, competitive, and market-informed pay for our employees. The pay range for this position is $93,581.10 - $155,968.51 Annual (USD). The successful candidate's particular combination of knowledge, skills, and experience will inform their specific compensation.

Location
Buffalo, New York, United States of America
