Cloud Infrastructure SRE - Full Time

Overview

On Site
Full Time

Skills

Automation
AWS
Cloud
Terraform
Ansible
SRE
IaC
Cloud Infrastructure
CloudFormation
Site Reliability
GCP or Azure

Job Details

Role: Cloud Infrastructure SRE
Locations: Alpharetta, GA / Berkeley Heights, NJ (Onsite from Day 1)
Job Type: Full Time
Required Skills:
SRE, Cloud, Automation
Job Description:
  • Design, build, and maintain highly available, scalable, and secure cloud infrastructure on platforms such as AWS, Google Cloud Platform, or Azure.
  • Develop and implement automation for provisioning, monitoring, scaling, and incident response using Infrastructure-as-Code tools (e.g., Terraform, CloudFormation, Ansible).
  • Monitor system reliability, capacity, and performance; proactively detect and address issues before they impact users.
  • Respond to production incidents, participate in on-call rotations, and lead post-incident reviews to drive root cause analysis and reliability improvements.
  • Collaborate with software engineering and security teams to ensure new services and features are production-ready and meet reliability standards.
  • Build and maintain tools for deployment, monitoring, and operations; automate manual processes to reduce toil.