Overview
Remote
$65 - $75
Contract - Independent
Contract - 12 Month(s)
No Travel Required
Skills
Site Reliability Engineer
Jenkins
Terraform
Kubernetes
AWS
Python
Job Details
Sr. Site Reliability Engineer
Duration: 6-12 Months
Location: Remote, US
The client is looking for a Sr. Site Reliability Engineer to ensure system reliability, scalability, and automation by implementing DevOps best practices and optimizing CI/CD pipelines in large-scale production environments.
What You'll Be Doing:
- Design, implement, and maintain scalable, reliable, and secure infrastructure supporting mission-critical applications and services.
- Lead end-to-end CI/CD pipeline automation using Jenkins, GitLab, Bitbucket, and GitHub Enterprise, ensuring seamless integration and delivery processes.
- Manage and optimize artifact repositories and security scanning with JFrog Artifactory and Xray.
- Collaborate with development, QA, and operations teams to ensure high availability, performance, and reliability of services.
- Drive continuous improvement initiatives, automate repetitive tasks, and mentor junior engineers.
- Provide after-hours support for production releases and participate in on-call rotations as needed.
- Document processes, runbooks, and architectural decisions to ensure knowledge sharing and operational transparency.
Required Skills You'll Bring:
- 10+ years of experience in DevOps, Site Reliability Engineering, or related roles in large-scale production environments.
- Proficiency with developer tools like Atlassian suite, Jenkins, GitLab, GitHub, SonarQube, and JFrog.
- Strong scripting abilities (Python, Bash, or similar) and Infrastructure as Code (Terraform, CloudFormation, Ansible) for automation and tooling.
- Experience with cloud platforms (AWS, Azure, or Google Cloud Platform).
- Solid understanding of containerization (Docker, Kubernetes) and microservices deployment.
Desired Skills You'll Bring:
- Bachelor's or master's degree in computer science, engineering, or equivalent job-related experience.
- AI experience (GitHub Copilot, OpenAI, MCP, agentic tooling) is preferred but not required.
- Strong analytical and problem-solving skills with a passion for automation, reliability, and continuous improvement.
- Ability to troubleshoot complex production issues and lead incident response efforts.