Automation Developer / Site Reliability Engineer

Overview

Remote
Contract - W2
Contract - Long-term

Skills

Python
Java
C++
Perl
Ruby
Go
C
Site Reliability Engineer
UNIX operating systems

Job Details

Job Title: Platform SRE Automation Developer / Site Reliability Engineer

Job Location: Remote in Mexico

Job Type: Full-time contract

Job Summary: This team's engineers support the growing consumer credit card business. The platform is built on a microservice architecture with a modern technology stack hosted in the AWS public cloud. It uses state-of-the-art development practices and tooling for the SDLC, along with observability tools such as Datadog, Prometheus, and Splunk. Our engineers are responsible for applying strong software development and engineering principles to support our primary mission, which is to safeguard the production environments. We perform all the functions of an SRE team: production support, architecture design (or redesign), automation, development and support of observability tooling, defining and monitoring SLOs, and incident management.


Note: We are not necessarily looking for people with previous SRE experience. Instead, we are looking for strong engineers with a developer-oriented focus who are willing to expand their areas of expertise.


How will you fulfil your potential?
Build and improve observability and alerting.
Apply your software engineering skills to automate away manual tasks and operational support toil.
Work closely with application developers on domain-based observability.
Support the upkeep of a production environment at four nines (99.99%) availability by monitoring availability and taking a holistic view of system health.
Create sustainable systems and services through automation and uplifts.
Drive the incident management process and support a blameless post-mortem culture. Command incidents through, at a minimum, mitigation.
Partner with development teams to improve services via rigorous testing and release procedures.
Participate in system design consulting, platform management, and capacity planning.
Participate in infrastructure sizing and optimization.
Work closely with the Card partner.
Basic Qualifications
Proficiency in Python (v3 preferred), along with one or more of the following: Go, C, C++, Java, Perl, Ruby, or shell scripting.
BS degree in Computer Science or a related technical field involving coding and/or systems engineering.
Experience with algorithms, data structures, and software design, and/or experience with UNIX operating system internals and/or networking.
Preferred Qualifications
Coding beyond simple scripts.
Experience with observability/monitoring platforms such as Prometheus, Grafana, and Splunk.
Experience with AWS.
Experience with distributed systems design, maintenance, and troubleshooting.
Hands-on experience with debugging and optimizing code, as well as automation.
Strong interpersonal skills, drive, and ownership.
Solving novel problems from first principles.
