Senior Software Engineer/SRE - Automated Disaster Recovery

Overview

On Site

USD 160,000.00 - 240,000.00 per year

Full Time

Skills

Interfaces

Database

Testing

Management

Managed Services

Data Centers

Unit Testing

Continuous Integration and Development

Performance Tuning

SLA

Disaster Recovery

Python

TypeScript

Computer Science

Unix

Shell Scripting

TCP/IP

Computer Networking

OSI Model

Client/server

Continuous Integration

Continuous Delivery

Writing

CHAOS

Splunk

Grafana

Reporting

GitHub

JIRA

Product Ownership

Training

Life Insurance

Bloomberg

Job Details

Location
New York

Business Area
Engineering and CTO

Ref #
10045491

Description & Requirements

The Team: We are the Platform Database Services Disaster Recovery as a Service (DRaaS) SRE team, charged with administering end-to-end disaster recovery testing of Bloomberg's datacenters, covering the numerous services that support the applications making up Bloomberg's line of products. On any given day we're inventing, engineering, developing, building, coding, troubleshooting and maintaining a wide range of tools, monitors, frameworks, interfaces, protocols, solutions and best practices around Disaster Recovery. These components stitch together a robust suite of automated and self-healing systems that manage the services that the Platform Database Services SRE team provides to the rest of the firm.
What's in it for you:
You will be part of a team that works to help meet company-defined and regulatory Disaster Testing standards. You will manage and develop solutions that support various disaster recovery tools, building these applications to integrate the services they provide into the Bloomberg operational environment as well as Bloomberg products. This in-house tooling suite is required to test the clusters and managed services that reside in our datacenters and nodesites in an automated, scalable and self-driven fashion, complete with the accompanying metrics and transparency tools required by internal and external clients. Tooling is expected to be written with end-to-end unit testing and continuous integration to provide the highest level of stability.
We have product ownership and the classic SRE responsibilities, such as system tuning, performance analysis, and defining and following availability targets such as SLAs, SLOs and SLIs. We also have immediate access to the experts who design and code the Bloomberg-specific components, APIs and methods used by and supporting the disaster recovery infrastructure. You'll gain insight into, and access to, the lowest levels of how Bloomberg applications interact with each other and with their runtime environments, for the purposes of both in-depth troubleshooting and enhancing stability, reliability, performance and feature set.
You'll need to have:

4+ years of experience in Python and/or TypeScript
A degree in Computer Science, Engineering or similar field of study or equivalent work experience
5+ years of experience with Unix, Unix tools and shell scripting
Experience designing stable, long-lasting APIs
Deep understanding of TCP/IP networking and the OSI model
Experience designing and automating repeatable processes in a client/server modeled environment
Ability to build and maintain sophisticated, business-critical systems that are highly available, performant and scalable
Experience building monitors and alarms for system performance, status and stability
Experience with CI/CD systems and writing robust unit and system tests

We'd love to see:

Basic knowledge of the Rapid framework
Experience analyzing existing systems and identifying shortcomings with proven methods for improvement
Experience with Chaos Engineering
Experience with Splunk/Humio and Grafana or other metric-based reporting tools
Experience with GitHub and JIRA
Passion for product ownership

Salary Range = 160,000 - 240,000 USD Annually + Benefits + Bonus

The referenced salary range is based on the Company's good faith belief at the time of posting. Actual compensation may vary based on factors such as geographic location, work experience, market conditions, education/training and skill level.

We offer one of the most comprehensive and generous benefits plans available, along with a range of total rewards that may include merit increases, incentive compensation (exempt roles only), paid holidays, paid time off, medical, dental, vision, short- and long-term disability benefits, 401(k) + match, life insurance, and various wellness programs, among others. The Company does not provide benefits directly to contingent workers/contractors and interns.
