The Sr. SRE Engineer is responsible for all tasks related to the performance, stability, reliability, efficiency, and security of both the sites and general team operations. This responsibility also extends to how incidents are managed and operated.
Proactive relationship building and communication are essential in this role. This includes engagements with SROs, clients, and third parties to ensure continuous improvements in system architecture, deployments, automation, and configuration management.
Design and develop a complete end-to-end automation environment using configuration/auto-scaling tools.
Define standards for configuration, monitoring, reliability, scalability, performance optimization, and capacity planning of new infrastructure, focused on 99.9%+ uptime.
Respond to off-hours and weekend emergency alerts, alarms, and requests, in keeping with the team's on-call rotation schedule.
Work closely with Architects, Security Engineers, Product Managers, SRO, and other clients and partners of the SRE team to meet the needs of the organization and keep it competitive, from the infrastructure up to the highest level of applications.
Extensive Python and AWS experience is a must.
5+ years of hands-on experience as an individual contributor in a systems administration/development or DevOps role.
Experience supporting mission-critical platforms in both physical and virtualized environments, using CentOS, Red Hat, and Ubuntu.
Strong experience with configuration management systems such as Ansible (preferred), Puppet or Chef.
Experience designing, building and managing large scale infrastructure in AWS and Rackspace, including experience leveraging one or more coding languages for automation.
Proficiency in high-level languages such as Python (preferred), Ruby, or Java, and experience working on software projects in a collaborative environment such as Bitbucket or Git.
Strong knowledge and experience in automation.
13101 W Washington Blvd., Suite 246, Los Angeles, CA 90066