Site Reliability Engineer

Overview

Accepts corp to corp applications
Contract - 1 day(s)

Skills

Strong Windows and OpenStack experience

Job Details

Job Title: Site Reliability Engineer
Location: Florham Park, NJ - Hybrid 3 days onsite (Onsite day 1)



** Due to client requirements we need or candidates **

The interview will be held virtually on July 9th at 11:00 AM or 11:30 AM EST.

Project Details:
  • This individual will work with R&D teams under Product Development for Retirement Services.
  • They are building software to improve DevOps, ITOps and support processes for a platform as a service.
  • Strong Windows and OpenStack experience is required (7 years or more).
  • Candidate should balance hands-on troubleshooting with an understanding of potential problems in the OS, network, security, and database layers.

Responsibilities:
  • Work with R&D teams to understand the standards of Product Development and recommend changes that increase the stability of the products and applications.
  • Build software to improve DevOps, ITOps, and support processes that follow the "everything as code" model, such as infrastructure as code and platform as a service.
  • Perform safe, reliable deployments of all appropriate software artifacts across environments, from Development and Staging to Production.
  • Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
  • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
  • Create and maintain disaster recovery plans for the staging and production environments.
  • Analyze system problems, including root cause determination, and manage any needed recovery process to ensure quick restoration of service without loss of data.
  • Maintain a broad knowledge of state-of-the-art technology, equipment, and/or systems.
  • Understand RESTful services and use APIs to advance automation goals.
  • Maintain network and system security; understand security protocols and certificate management.

Experience/Skills:
  • Strong Windows and OpenStack experience
  • Ability to analyze and resolve problems in systems, networks, software, and APIs, with an understanding of where the relevant information can be found
  • Strong experience with Splunk and Dynatrace
  • Understanding of source/version control such as Git or Bitbucket
  • DevOps processes and tools such as Azure DevOps or Jenkins
  • Involvement with containerization, such as Docker or Kubernetes
  • CI/CD implementation expertise
  • Experience with IT automation in general, using tools like Ansible and coding in languages like Python, Groovy, PowerShell, or Bash
  • Windows and Linux OS knowledge preferred
  • Use of monitoring and logging tools such as Splunk, Dynatrace, or similar
  • Advanced English proficiency
  • Understanding of the Microsoft suite of development tools is a plus, including Visual Studio, IIS, MS SQL Server, and .NET

Must haves:
Windows, OpenStack, .NET application knowledge, experience writing Ansible scripts, Python scripting, and server scripting