Job Details

Senior Site Reliability Engineer (SRE)

Location: United States (Remote)

About the Role

We are seeking a Senior Site Reliability Engineer (SRE) with proven experience in ensuring high availability, reliability, and performance across complex enterprise systems. The role centers on supporting a hybrid ecosystem involving SAP workloads, modern data pipelines, and AWS cloud infrastructure.

This is an individual contributor role designed for someone who thrives on ownership, understands mission-critical systems, and can bring stability and scalability to enterprise-grade environments.

Key Responsibilities

Ensure 24/7 uptime and operational continuity of modern data pipelines that integrate with cloud data warehouses and processing engines.
Build and maintain observability frameworks (monitoring, logging, alerting) using tools such as Prometheus, Grafana, or Datadog.
Lead incident response, root cause analysis (RCA), and post-mortem processes to maintain a culture of reliability and continuous improvement.
Optimize cloud infrastructure on AWS (including EC2, S3, RDS, Lambda, IAM) to meet performance and availability SLAs.
Implement and manage CI/CD pipelines and infrastructure automation using tools like Terraform, CloudFormation, or Ansible.
Collaborate with cross-functional teams (data, platform, security, and product) to enforce best practices in uptime, scaling, and system hardening.
Drive automation of reliability tasks, performance tuning, and cost optimization efforts across the stack.

Requirements

7+ years in an SRE, DevOps, or cloud infrastructure engineering role
Hands-on experience in designing and maintaining highly available systems
Strong expertise in AWS services and cloud-native architecture
Experience working with SAP systems (e.g., S/4HANA, ECC, or BW) in hybrid or cloud-based setups
Familiarity with modern data platforms and pipeline frameworks (e.g., Spark, Snowflake, or similar)
Proficiency in monitoring, alerting, and incident response in production environments
Experience with infrastructure as code (e.g., Terraform, CloudFormation)
Comfortable working independently with high accountability and ownership
Strong troubleshooting skills and a bias for automation and root cause resolution

Preferred Qualifications

Experience supporting high-volume, low-latency enterprise systems
Exposure to metadata-driven or low-code data transformation platforms
Familiarity with Kubernetes or containerized workloads
Understanding of enterprise-grade security and compliance requirements (e.g., SOC 2, HIPAA)

Why Join Us?

High-impact role with direct influence on platform reliability
Work at the intersection of SAP, cloud infrastructure, and modern data technologies
Autonomy, ownership, and opportunities to drive innovation
Collaborative culture with a focus on engineering excellence


About Bossini Technologies