Sapphire Digital is looking for a Site Reliability Engineer to expand our Engineering team. We follow an Agile methodology in small software teams to consistently deliver high-quality software. Our stack includes Ruby, Rails, Angular, TypeScript, Node, RabbitMQ, Solr, Postgres, Redis, Puppet, and Hubot. Our infrastructure is declared as code and provisioned on AWS. We offer mentorship and career guidance, a competitive salary, a remote-friendly workspace, unlimited vacation time, and continuing education support (conferences, books, online resources).
In this position, you'll be responsible for:
- Triaging and troubleshooting production issues related to our CareSelect product.
- Researching and implementing ways to automate the management of our infrastructure and reduce toil.
- Supporting deployments across our growing development, UAT, and production environments.
- Building out uptime, latency, and error monitoring for the CareSelect stack.
- Providing blameless postmortems for incidents.
- Taking part in the on-call rotation for production support.
You might be a good fit if you have:
- 2-4 years of software development experience.
- Experience supporting Linux systems hosted in a cloud environment. We’re using AWS (specifically EC2, CloudFormation, RDS, ElastiCache, and S3, to name a few).
- Experience with web programming languages (Ruby on Rails is a definite plus).
- Familiarity with Puppet.
- Excellent communication skills.
- A strong desire to understand complex systems and how to make them highly available.
- A collaborative spirit and enjoyment of working with a team to build things.
- A desire to continually improve, and an appreciation for giving and receiving constant, constructive feedback.