JOB SUMMARY

The Reliability & Production Engineering (RPE) team is seeking an experienced SRE & Production Management professional to enhance operational efficiency and deliver best-in-class services. This role focuses on improving production management, system availability, observability, scalability, performance, and resilience through the application of sound monitoring and reliability engineering principles, and the adoption of modern technology and tooling. The position requires a passion for technology to enable efficient operations in a fast-paced environment and to contribute to services for clients across Business Units. This role is based in Montreal and requires onsite presence 3 days per week.

Key Responsibilities

- Work closely with engineering/development teams to design, build, and maintain systems.
- Troubleshoot issues across the entire technology stack: hardware, software, application, and network.
- Identify and drive opportunities to improve automation for platforms, including the scoping and creation of automation for deployment, management, and visibility of services.
- Proactively identify and address system reliability risks.
- Work alongside existing global and regional team members on a follow-the-sun basis.
- Represent the RPE organization in design reviews and operational readiness exercises for new and existing services.
- Provide production support services under the RPE organization.
- Develop automation and tooling to support SRE activities and achieve specific reliability and supportability goals (reduction of toil, monitoring and alerting efficiency, etc.) for in-scope systems and across the larger organization.

Required Qualifications

- 7-15 years of experience.
- Bachelor's degree in computer science or a related field.
- Proficiency with Linux.
- Strong experience in database scripting (stored procedures and compound SQL) and data analysis in PostgreSQL, DB2, Snowflake, or MongoDB.
- DB monitoring and performance tuning.
- Working experience in Python/Shell scripting.
- Troubleshooting skills (tracking trends, producing metrics and analysis).
- Strong verbal and written communication skills for interaction with global teams and customers.
- Flexibility to work in shifts and perform on-call responsibilities.
- Minimum of 3 days per week onsite, working from the office in Montreal.

Preferred Qualifications

- Experience in financial services/products, investment banking.
- Experience with advanced monitoring/alerting tools (Grafana, Loki, Prometheus, Elasticsearch, etc.).
- Knowledge of development tools such as Git/GitHub, Jenkins, etc.
- Agile/DevOps/SRE mindset and/or tooling.
- Understanding of cloud technology.

Education: Bachelor's Degree
- Dice Id: compun
- Position Id: AWADC5831811
- Posted 6 hours ago