Summary
Our organization builds and provides systems and infrastructure that fuel our core services. We are the foundation on which our software developers build the products that our customers love. We are looking for passionate and dedicated Site Reliability Engineers to continue our focus on providing our customers the highest-quality services experience. Our services have to scale globally, stay highly available, and "just work." If you love designing, engineering, and running systems and infrastructure that will help millions of customers, then this is the place for you!
In this unique role, you will play a crucial part in both maintaining our existing infrastructure and developing innovative solutions for the future. Approximately 20% of your time will be dedicated to technical operations, specifically assisting with the upgrade of our Red Hat Enterprise Linux (RHEL) 9 systems. The remaining 80% will focus on greenfield development: building a new, critical release tool that will streamline our software releases, improving efficiency and reliability. This is an exciting opportunity for a backend infrastructure / full-stack engineer who thrives on solving complex problems, has a passion for automation, and enjoys the challenge of building new systems from the ground up.
Key Responsibilities
20% Technical Operations (RHEL 9 Upgrade Support):
- Collaborate with the operations team to plan, test, and execute RHEL 9 upgrades across our infrastructure.
- Assist in identifying and resolving compatibility issues, performance bottlenecks, and other operational challenges during the upgrade process.
- Develop and implement automation scripts to streamline RHEL 9 deployment and configuration.
- Monitor system health and performance post-upgrade, proactively addressing any issues.
80% New Tool Development:
- Design, develop, and implement a new internal release tool using modern development practices and state-of-the-art release practices.
- Develop robust and scalable APIs for the new tool, enabling seamless integration with existing systems.
- Integrate the new tool with various internal and external services through API interactions.
- Write clean, well-documented, and maintainable code in Python.
- Participate in code reviews, ensuring high-quality and efficient solutions.
- Contribute to the full software development lifecycle, from requirements gathering and design to testing, deployment, and ongoing maintenance.
Required Skills and Experience
- Linux OS: Deep understanding and hands-on experience with Linux operating systems, particularly Red Hat Enterprise Linux (RHEL).
- Excellent Python Development: Proven expertise in Python for backend development, scripting, and automation, particularly with frameworks like Flask or FastAPI, supported by appropriate testing.
- REST Development: Strong experience designing, developing, and consuming RESTful APIs within a Python backend.
- API Development and Integrations: Demonstrated ability to build robust APIs and integrate with various third-party and internal systems.
- Strong problem-solving skills and a proactive approach to identifying and resolving issues.
- Excellent communication and collaboration skills, with the ability to work effectively in a team environment.
- Ability to prioritize tasks, manage time effectively, and work independently when required.
Preferred Qualifications (Nice to Haves)
- Front-end development experience (e.g., React, JavaScript).
- Release Management and CI/CD Processes: Solid understanding and practical experience with Continuous Integration/Continuous Delivery (CI/CD) pipelines (e.g., Spinnaker, Jenkins), version control (e.g., Git), and release management best practices.
- Familiarity with cloud platforms (e.g., AWS, Azure, Google Cloud Platform).
- Experience with containerization technologies (e.g., Docker, Kubernetes).
- Knowledge of observability tools (e.g., Prometheus, Grafana, ELK stack).
- Experience with monitoring and alerting systems.