Site Reliability Engineer

Overview

On Site
Full Time

Skills

Proven track record of building observability solutions at scale
Advanced knowledge of AWS cloud architecture and services
Deep understanding of SLI/SLO/SLA concepts

Job Details

About Ascendion

Ascendion is a full-service digital engineering solutions company. We make and manage software platforms and products that power growth and deliver captivating experiences to consumers and employees. Our engineering, cloud, data, experience design, and talent solution capabilities accelerate transformation and impact for enterprise clients. Headquartered in New Jersey, our workforce of 6,000+ Ascenders delivers solutions from around the globe. Ascendion is built differently to engineer the next.

Ascendion | Engineering to elevate life

We have a culture built on opportunity, inclusion, and a spirit of partnership. Come, change the world with us:

  • Build the coolest tech for the world's leading brands
  • Solve complex problems and learn new skills
  • Experience the power of transforming digital engineering for Fortune 500 clients
  • Master your craft with leading training programs and hands-on experience

Experience a community of change makers!

Join a culture of high-performing innovators with endless ideas and a passion for tech. Our culture is the fabric of our company, and it is what makes us unique and diverse. The way we share ideas, learning, experiences, successes, and joy allows everyone to be their best at Ascendion.

About the Role:

Title: Site Reliability Engineer

Job Description:
  • We are looking for a Site Reliability Engineer (SRE) with deep expertise in designing and implementing full observability frameworks including telemetry, instrumentation, distributed tracing, and metrics pipelines.
  • This role is engineering-heavy and suited for someone who thrives in building scalable observability platforms rather than simply reacting to monitoring alerts.
  • The ideal candidate also brings strong experience in AWS cloud architecture and a builder's mindset to reliability engineering.
Key Responsibilities:
  • Architect and implement end-to-end observability platforms including logs, metrics, traces, and events across distributed systems.
  • Build and manage telemetry pipelines using open standards and open-source tools such as OpenTelemetry, Prometheus, and Grafana, alongside AWS-native tools (CloudWatch, X-Ray, etc.).
  • Embed observability as code into CI/CD pipelines and infrastructure provisioning tools.
  • Partner with application and platform teams to define and implement SLIs, SLOs, and error budgets as engineering primitives.
  • Develop and maintain custom instrumentation libraries to provide actionable insights across services.
  • Engineer reliable, self-service observability tooling to empower development teams.
  • Drive cloud-native observability patterns on AWS, optimizing for performance, scalability, and cost.
  • Actively participate in post-incident reviews to improve system design and observability strategy.
  • Collaborate with SRE, DevOps, and Platform teams to align reliability objectives with business goals.
Required Skills & Experience:
  • Proven track record of building observability solutions at scale (not just using tools).
  • Strong hands-on expertise with OpenTelemetry, Prometheus, Grafana, ELK, CloudWatch, X-Ray, etc.
  • Advanced knowledge of AWS cloud architecture and services.
  • Proficient in at least one modern programming language (e.g., Python, Go, Java).
  • Experience with IaC tools such as Terraform or CloudFormation.
  • Deep understanding of SLI/SLO/SLA concepts, service health indicators, and telemetry standards.
  • Familiarity with containerization and orchestration (Docker, Kubernetes).
  • Ability to build reusable components, SDKs, or libraries that enable observability at scale.
Preferred Qualifications:
  • AWS Certifications (DevOps Engineer, Solutions Architect, etc.).
  • Experience contributing to open-source observability tools.
  • Background in software engineering or platform reliability architecture.

Location: McLean, VA - Hybrid Onsite role

Salary Range: The salary for this position is between $120,000 and $140,000 annually. Factors that may affect pay within this range include geography/market, skills, education, experience, and other qualifications of the successful candidate.

Benefits: The Company offers the following benefits for this position, subject to applicable eligibility requirements:
  • Medical insurance
  • Dental insurance
  • Vision insurance
  • 401(k) retirement plan
  • Long-term disability insurance
  • Short-term disability insurance
  • 5 personal days accrued each calendar year. The paid time off benefits meet the paid sick and safe time laws that pertain to the city/state
  • 10-15 days of paid vacation time
  • 6 paid holidays and 1 floating holiday per calendar year
  • Ascendion Learning Management System

Want to change the world? Let us know.

Tell us about your experiences, education, and ambitions. Bring your knowledge, unique viewpoint, and creativity to the table. Let's talk!

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.