Site Reliability Engineer (SRE), O'Fallon, Missouri, 24-month contract
Overview:
The Client is looking for a Site Reliability Engineer who can help us solve problems, build our pipelines, and lead the Client in automation and best practices.
- Are you a born problem solver who loves to figure out how something works?
- Are you a geek who loves all things automation?
- Do you have a low tolerance for manual work and look to automate everything you can?
Business Operations is leading the DevOps transformation at Client through our tooling and by being an advocate for change and standards throughout the development, quality, release, and product organizations. We need team members with an appetite for change and pushing the boundaries of what can be done with automation. Experience in working across development, operations, and product teams to prioritize needs and to build relationships is a must.
Responsibilities:
- Engage in and improve the whole lifecycle of services from inception and design, through deployment, operation, and refinement.
- Analyze ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns.
- Support services before they go live through activities such as system design consulting, capacity planning and launch reviews.
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health with automated alerts.
- Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity.
- Practice sustainable incident response and detailed postmortems.
- Take a holistic approach to problem solving by connecting the dots during a production event across the various technology stacks that make up the platform, to optimize mean time to recovery.
- Work with a global team spread across tech hubs in multiple geographies and time zones.
- Share knowledge and mentor junior team members.
Qualifications:
- BS degree in Computer Science or related technical field involving coding (e.g., physics or mathematics), or equivalent practical experience.
- Experience with algorithms, data structures, scripting, pipeline management, and software design.
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
- Ability to help debug, optimize code, and automate routine tasks.
- We support many different stakeholders. Experience in dealing with difficult situations and making decisions with a sense of urgency is needed.
- Experience in one or more of the following is preferred: Python, Go, Bash scripting.
- Interest in designing, analyzing, and troubleshooting large-scale distributed systems.
- For work on our ops team, we need an engineer with experience in industry-standard CI/CD tools such as Git/Bitbucket, Jenkins, and Chef. Experience designing and implementing an effective and efficient CI/CD flow that gets code from dev to prod with high quality and minimal manual effort is required.
- Bash and Python scripting experience is required.
All About You:
- Must be high-energy, detail-oriented, and proactive
- Must have the ability to function independently under pressure
- Must have a high degree of initiative and self-motivation to drive results
- Excellent interpersonal and problem-solving skills
- Excellent written and verbal communication skills
- Expert knowledge of some of the following technologies: Directory Services, Authentication/Authorization, Access Provisioning, Public Key Infrastructure (PKI), Controls and compliance
- Knowledge of SOAP and REST web services