Senior Site Reliability Engineer - Contract W-2


Clarkstech
Job Details
Skills
- Amazon DynamoDB
- Amazon S3
- Amazon EC2
- Apache Kafka
- Performance Tuning
- Kubernetes
- Docker
- DevOps
- Grafana
- Management
- Continuous Integration
- Continuous Improvement
- Splunk
Summary
Role Overview
We are seeking a highly skilled Senior Site Reliability Engineer (SRE) to design, build, and scale observability and reliability solutions across enterprise-grade distributed systems. This role focuses on improving system reliability, performance, and operational excellence through advanced telemetry, automation, and cloud-native best practices, with a strong emphasis on AWS environments.
Key Responsibilities
- Design, implement, and maintain end-to-end observability solutions including metrics, logging, and distributed tracing.
- Build and manage real-time monitoring dashboards and alerting systems using tools such as Datadog, Splunk, Prometheus, Grafana, or ELK.
- Develop and enforce Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets to ensure system reliability.
- Lead incident response efforts, including troubleshooting, root cause analysis, and rapid resolution of production issues.
- Drive reliability engineering practices including post-incident reviews and continuous improvement initiatives.
- Automate operational and monitoring workflows using Python, Bash, or Go.
- Develop self-healing systems and auto-remediation capabilities to reduce manual intervention.
- Collaborate closely with DevOps, Cloud, and Security teams to enhance CI/CD pipelines and infrastructure resilience.
- Optimize application and infrastructure performance, scalability, and availability in cloud environments.
- Champion reliability, observability, and operational best practices across engineering teams.
Required Qualifications
- 10+ years of experience in Site Reliability Engineering, DevOps, or Production Engineering roles.
- Strong expertise in observability and monitoring tools (Datadog, Splunk, Prometheus, Grafana, ELK stack).
- Hands-on experience managing incidents and participating in on-call rotations in production environments.
- Proficiency in Linux system administration, networking fundamentals, and performance tuning.
- Strong programming/scripting skills in Python, Bash, Go, or similar languages.
- Experience with containerization and orchestration tools such as Docker and Kubernetes.
- Proven experience designing and maintaining CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI, etc.).
- Solid understanding of distributed systems, high availability, and scalability patterns.
AWS & Cloud Expertise
- Hands-on experience with AWS services such as EC2, ECS/EKS, Lambda, S3, RDS, DynamoDB, and VPC networking.
- Experience with AWS-native observability tools including CloudWatch, X-Ray, and CloudTrail.
- Familiarity with infrastructure automation using AWS CloudFormation or Terraform.
- Experience implementing scalable, fault-tolerant architectures in AWS environments.
- Understanding of cost optimization and performance tuning in cloud-native systems.
Nice-to-Have Skills
- Experience with AIOps, anomaly detection, and predictive monitoring solutions.
- Knowledge of Infrastructure as Code (Terraform, Ansible, Pulumi).
- Exposure to security monitoring and compliance, including their integration with observability platforms.
- Experience with event-driven architectures and streaming platforms (e.g., Kafka).
- Familiarity with chaos engineering and resilience testing practices.
- Dice Id: 91165214
- Position Id: 8948332
- Posted 2 hours ago
Company Info
About Clarkstech
At ClarksTech, we are a renowned global IT consulting firm committed to collaborating with business and societal leaders to overcome their most critical challenges and seize their greatest opportunities. Our achievements are rooted in fostering deep collaboration and cultivating a global community of diverse, dedicated individuals.
We have highly skilled engineers with excellent technical knowledge and experience in using the latest software standards. We have built a large pool of knowledge that we apply to deliver solutions that meet clients' needs, expectations, and budgets.