AWS Cloud Site Reliability Engineer (AWS SRE)

Overview

On Site
$100,000 - $130,000
Full Time
No Travel Required

Skills

AWS
SRE
SageMaker
EC2
S3
Route 53
Load Balancer
ASG
ACM
RDS
AWS Batch
CloudWatch
Lambda
API Gateway
Step Functions
JFrog
SSO
Single Sign-On
Python
Bash
IaC
Glue
Lake Formation
Redshift
Git
GitHub
uDeploy
Jenkins
Linux

Job Details

Role :: AWS Cloud Site Reliability Engineer (AWS SRE)

Location :: Philadelphia PA

Type :: Full Time

Job Description

Must Have Technical/Functional Skills:

  • Design, implement, and manage scalable, secure, and resilient infrastructure on AWS.
  • Expertise in various AWS services (e.g., EC2, S3, Route 53, Load Balancer, ASG, ACM, RDS, AWS Batch, CloudWatch Logs & Metrics) with 5+ years of experience working with AWS infrastructure.
  • 5+ years of working experience with AWS serverless services (e.g., Lambda, API Gateway, Step Functions), data lake services (e.g., S3, Glue, Lake Formation, Redshift), and machine learning services (e.g., SageMaker).
  • 5+ years of working experience with container-based services (e.g., Kubernetes/EKS, ECR, Docker) managed using Helm charts.
  • Develop and maintain Infrastructure as Code (IaC) using Terraform, with 5+ years of working experience in Terraform.
  • Ensure best practices in version control using Git/GitHub.
  • Build and optimize CI/CD pipelines for efficient and reliable software delivery using Jenkins and uDeploy, and manage artifacts using JFrog, with 3+ years of working experience.
  • Manage and troubleshoot VPCs, networking configurations, and hybrid cloud connectivity.
  • Administer and support Linux and Windows-based systems in cloud environments.
  • Write and maintain automation scripts using Bash and Python.
  • Collaborate with enterprise architects to align infrastructure with business and technical goals.
  • Interpret and contribute to infrastructure diagrams and technical documentation.
  • Participate in incident response, root cause analysis, and on-call rotations.
  • Integrate and manage SSO (Single Sign-On) solutions for secure access control.
  • Collaborate with development, QA, and DevOps teams to ensure smooth deployment processes.

Roles & Responsibilities:

  • Design, implement, and maintain scalable and secure AWS cloud infrastructure to support a variety of applications and services.
  • Collaborate closely with cross-functional teams, including software engineers and DevOps professionals, to architect and deploy AWS solutions that meet project requirements.
  • Conduct regular performance monitoring and optimization of AWS resources to ensure cost efficiency and reliability.
  • Stay updated on the latest AWS services, features, and best practices, incorporating them into cloud architecture and development processes.
  • Implement and enforce AWS security measures, including IAM policies and access controls, to protect sensitive data and infrastructure.
  • Experience in code development in at least one programming language (preferably Python).
  • Strong experience with Infrastructure as Code (Terraform).
  • Strong experience with a configuration management tool (Ansible).
  • Extensive knowledge of Jenkins, CI/CD pipelines, and tools like uDeploy.
  • Experience working in Kubernetes/EKS.
  • Actively participate in code reviews, providing feedback on AWS Infrastructure as Code (IaC) and configurations to enhance system stability and security.
  • Troubleshoot and resolve AWS-related issues, ensuring uninterrupted operation of applications and services hosted on AWS.


About Stanley David and Associates