Role: US - Site Reliability Engineer II
Duration: Long Term
Location: Alpharetta GA
Will the candidate have access to the development environment and / or Personally Identifiable Information (PII)? Yes
• Supplier will have to conduct in-person verification of candidate upon offer extension as part of onboarding compliance (see recent Supplier Communication)
We are seeking a skilled Site Reliability Engineer to join our team and help build, maintain, and scale our cloud-native infrastructure. You will work closely with development and operations teams to ensure our systems are reliable, scalable, and efficient. The ideal candidate is passionate about automation, observability, and infrastructure-as-code, and thrives in a collaborative, fast-paced environment.
Key Responsibilities
Design, implement, and manage cloud infrastructure on Azure using Terraform and Terragrunt.
Maintain and optimize Kubernetes clusters on Azure Kubernetes Service (AKS).
Build and manage CI/CD pipelines using GitHub Actions/Workflows and ArgoCD for GitOps deployments.
Enhance system reliability by implementing monitoring, alerting, and observability solutions with Grafana.
Automate operational tasks to reduce toil and improve team efficiency.
Participate in on-call rotations, incident response, and post-mortem analysis.
Collaborate with development teams to improve application performance, scalability, and resilience.
Implement and advocate for SRE best practices, including SLIs, SLOs, and error budgets.
Continuously improve system performance, cost efficiency, and security.
Required Skills & Qualifications
3+ years of experience in an SRE, DevOps, or cloud infrastructure role.
Strong experience with Azure cloud services and infrastructure.
Hands-on experience with java and Terraform and Terragrunt for infrastructure-as-code.
Proficiency with Kubernetes (preferably AKS), Databricks and container orchestration.
Experience with CI/CD tools, especially GitHub Workflows/Actions and ArgoCD.
Solid understanding of observability tools like Grafana (Prometheus, Loki, Tempo experience is a plus).
Education Requirements Bachelor's degree required, (Masters preferred)