Sr. Site Reliability Engineer (SRE) (Sunnyvale, CA)
- We are seeking a hands-on, energetic, and seasoned site reliability engineering lead to serve as the primary person responsible for the overall health, availability, reliability, scalability, and capacity planning of our critical services.
- Ensure durability and operability of the services. The SRE Lead will be responsible for automating mundane and repetitive procedures.
- The SRE Lead will work closely with system engineers, network engineers, database administrators, and the information security team to achieve service level objectives.
- The SRE Lead will be responsible for establishing and monitoring service level indicators to maintain the overall health of the services.
- The SRE Lead will troubleshoot and triage production issues and escalate them to the engineering team for a permanent fix.
- Participate in the change management processes to ensure the durability and operability of the service.
- 5+ years of hands-on experience deploying and troubleshooting apps or services in a large-scale Linux/Unix environment.
- Establish change management discipline in roll-out and deployment of new product features.
- Experience developing automated solutions for deploying, monitoring, and logging our critical services.
- Experience using scripts or other tools to automate and reduce manual processes.
- Infrastructure knowledge of networks, load balancers, VMs, firewalls, security certificates, etc.
- Working knowledge of Oracle.
- Experience in troubleshooting and issue triaging.
- Working knowledge of source control software (SVN or Git).
- Proficient in solving a wide range of problems across multiple technologies.
- Ability to multi-task and manage tasks with varying priorities.
- Ability to work independently with minimal supervision.
- Excellent written and oral communication skills.
- Working knowledge of APM solutions such as AppDynamics or Dynatrace.
- Experience with containerized applications using Docker and Kubernetes.
- Understanding of configuration management systems such as Chef.