Job Details
Hybrid: 3 days a week onsite
Potential to Convert
On-call during off hours, weekends, and holidays on a rotating schedule, 2 times a month
Bachelor's Degree in Computer Science Required
Systems Reliability Engineering (SRE) is a discipline focused on improving system service availability, observability, scalability, performance, and resilience across *** by applying sound software engineering principles and adopting the latest technology and tooling. We are growing SRE capabilities within our Reliability & Production Engineering (RPE) organization as part of the transformation of Morgan Stanley's Technology.
Your responsibilities will include, but not be limited to:
1. Working closely with engineering/development teams to design, build, and maintain systems.
2. Troubleshooting issues across the entire technology stack: hardware, software, application, and network.
3. Identifying and driving opportunities to improve automation for our platforms; scoping and creating automation for deployment, management, and visibility of our services.
4. Proactively identifying and addressing systems reliability risks.
5. Working alongside existing global and regional team members on a follow-the-sun basis.
6. Representing the RPE organization in design reviews and operational readiness exercises for new and existing services.
The RPE role provides production support services within the RPE organization. The role also requires developing automation and tooling to support SRE activities and achieve specific reliability and supportability goals (reduction of toil, monitoring and alerting efficiency, etc.) for in-scope systems and across the larger organization.
The following are the key skills needed for day-to-day work:
- Bachelor's degree in computer science or related field;
- Proficiency with Linux;
- Strong experience in database scripting (stored procedures and compound SQL) and data analysis in Sybase, DB2, Greenplum, etc.; database monitoring and performance tuning;
- Working experience in Python/Perl/Shell scripting;
- Troubleshooting skills (tracking trends, producing metrics and analysis);
- Strong verbal and written skills required to interact with global teams and customers;
- Flexibility to work in shifts and perform on-call responsibilities; working from the office (3 days per week minimum is the current policy).
Good to Have:
- Experience in financial services/products or investment banking;
- Experience with advanced monitoring/alerting tools (Splunk, AppDynamics, Elasticsearch, etc.);
- Knowledge of development tools such as Git, Jenkins, etc.;
- Agile/DevOps/SRE mindset and/or tooling;
- Understanding of cloud technology.