POSITION SUMMARY
The Health & Wellness organization has an open full-time position for a Senior Site Reliability Engineer/DevOps Engineer. Health & Wellness is one of America's leading retail healthcare organizations, with 2,300 pharmacies, 11 specialty pharmacies, and 200 clinics. Candidates should care deeply about the technical side of operations and about keeping everything running smoothly for over 22,000 healthcare professionals and 13 million customers. Our vision is to help people live healthier lives.
The Senior Site Reliability/DevOps Engineer will specialize in developing scalable methods for building, deploying, and supporting cloud, on-prem, and store-focused enterprise services and systems. The engineer will work closely with Software Engineers to deploy and operate solutions; automate and streamline processes; build and maintain tools for deployment and platform monitoring; and troubleshoot and resolve issues in development, test, and production environments. The engineer will also demonstrate the company's core values of respect, honesty, integrity, diversity, inclusion, and safety.
ESSENTIAL JOB FUNCTIONS
- Design and build infrastructure and systems that provide high levels of scalability, reliability, and performance for the stack, while balancing security, maintainability, and operational excellence
- Work with the engineering team to continuously implement and improve reliable and speedy build environments for DEV & QA; provide timely build status updates; automate as much as possible to improve efficiency and quality
- Promote innovation, outside-of-the-box thinking, teamwork, and self-organization
- Ensure traceability, observability, and retrievability of system behavior
- Build logging, monitoring, and alerting systems to identify bottlenecks and assist with debugging, analysis, and optimization in cloud, on-prem and store environments
- Experiment with and recommend new technologies that simplify or improve the stack
- Craft solid and clearly explained designs, playbooks, and documentation, for consumption by teammates and the larger engineering organization
- Participate in an off-hours on-call rotation, and perform periodic off-hours work during maintenance windows
- Must be able to perform the essential job functions of this position with or without reasonable accommodation
GENERAL SKILLS
Minimum
- Bachelor's degree in computer science or equivalent related experience (8+ years), with strong theoretical fundamentals (data structures, algorithms, lock-free data structures, multithreaded architectures, etc.)
- Any experience with always-on, high-volume web server stacks, Azure/Google Cloud Platform PaaS, Azure/Google networking, and provisioning native managed apps and CI/CD pipelines
- Any understanding of SSH, VPN, TCP/IP, DNS, HTTP(S), network routing and subnets
- 4+ years of experience in cloud SRE, DevOps, infrastructure, or a related field
- Proven knowledge of technology to support omnichannel experiences
- Knowledge of Linux architecture, security, administration, performance monitoring/tuning, troubleshooting, and production operations
- Fluent in shell scripting, with experience implementing automation and monitoring using shell scripts and related tools
- Proven knowledge of service-oriented architecture/Cloud
Desired
- Master's degree
- PhD in computer science, information systems, or a related field
- Any experience with CI/CD pipelines using tools such as Jenkins, Spinnaker, Azure DevOps, TeamCity, etc.
- Any experience with Azure DevOps services such as Pipelines, Test Plans, Artifacts, etc.
- Any experience with Nginx, HAProxy, Squid
- 1 year of experience managing system observability tools (ELK, PagerDuty, Datadog, New Relic, Azure Monitor, Grafana, etc.)
- 1 year of experience with technologies such as Kafka, RabbitMQ, SQS, Ansible, Terraform, Docker and Kubernetes, Jenkins, Spinnaker, Azure DevOps, TeamCity
- 2+ years of experience configuring and managing cloud infrastructure (AWS, Google Cloud Platform, Azure)
- 4+ years of experience designing or working on high-volume eCommerce applications
BONUS POINTS
- Microsoft Azure certification
- Experience in the retail or healthcare industries