Overview
- Remote
- Up to $60
- Contract - Independent
- Contract - W2
- Contract - 6 months
- Able to Provide Sponsorship
Skills
- SRE
- DevOps
- Cloud infrastructure engineering
Job Details
Senior Site Reliability Engineer (SRE)
Location: United States (Remote)
About the Role
We are seeking a Senior Site Reliability Engineer (SRE) with proven experience in ensuring high availability, reliability, and performance across complex enterprise systems. The role centers on supporting a hybrid ecosystem involving SAP workloads, modern data pipelines, and AWS cloud infrastructure.
This is an individual contributor role designed for someone who thrives on ownership, understands mission-critical systems, and can bring stability and scalability to enterprise-grade environments.
Key Responsibilities
- Ensure 24/7 uptime and operational continuity of modern data pipelines that integrate with cloud data warehouses and processing engines.
- Build and maintain observability frameworks (monitoring, logging, alerting) using tools such as Prometheus, Grafana, or Datadog.
- Lead incident response, root cause analysis (RCA), and post-mortem processes to maintain a culture of reliability and continuous improvement.
- Optimize cloud infrastructure on AWS (including EC2, S3, RDS, Lambda, and IAM) to meet performance and availability SLAs.
- Implement and manage CI/CD pipelines and infrastructure automation using tools like Terraform, CloudFormation, or Ansible.
- Collaborate with cross-functional teams (data, platform, security, and product) to enforce best practices in uptime, scaling, and system hardening.
- Drive automation of reliability tasks, performance tuning, and cost optimization efforts across the stack.
Requirements
- 7+ years of experience in an SRE, DevOps, or cloud infrastructure engineering role
- Hands-on experience in designing and maintaining highly available systems
- Strong expertise in AWS services and cloud-native architecture
- Experience working with SAP systems (e.g., S/4HANA, ECC, or BW) in hybrid or cloud-based setups
- Familiarity with modern data platforms and pipeline frameworks (e.g., Spark, Snowflake, or similar)
- Proficiency in monitoring, alerting, and incident response in production environments
- Experience with infrastructure as code (e.g., Terraform, CloudFormation)
- Comfortable working independently with high accountability and ownership
- Strong troubleshooting skills and a bias for automation and root cause resolution
Preferred Qualifications
- Experience supporting high-volume, low-latency enterprise systems
- Exposure to metadata-driven or low-code data transformation platforms
- Familiarity with Kubernetes or containerized workloads
- Understanding of enterprise-grade security and compliance requirements (e.g., SOC 2, HIPAA)
Why Join Us?
- High-impact role with direct influence on platform reliability
- Work at the intersection of SAP, cloud infrastructure, and modern data technologies
- Autonomy, ownership, and opportunities to drive innovation
- Collaborative culture with a focus on engineering excellence