Site Reliability Engineer - 5 days onsite, NoHo, NYC

Overview

On Site

150k - 250k

Full Time

Skills

Financial Services

Finance

Artificial Intelligence

IaaS

High Availability

Scalability

Collaboration

DevOps

Workflow

Incident Management

Database

Documentation

Standard Operating Procedure

Computer Science

Information Technology

Amazon Web Services

Microsoft Azure

Amazon EC2

Amazon S3

Virtual Private Cloud

Management

Kubernetes

Linux Administration

Shell Scripting

Terraform

Scripting

Bash

Python

Computer Networking

TCP/IP

Dragon NaturallySpeaking

DNS

Firewall

Continuous Delivery

Jenkins

GitLab

Continuous Integration

GitHub

Cloud Computing

Regulatory Compliance

SAP BASIS

Job Details

Site Reliability Engineer

This company is developing AI thought partners designed to enhance human intelligence and creativity, transforming how knowledge is created and shared in financial services. We're unapologetically ambitious and driven by a clear goal: to build the world's leading Financial AI company.

The company is located in NoHo, NYC, and the role is 5 days onsite.

What You Will Be Doing:

Cloud Infrastructure Management: Design, implement, and maintain robust cloud infrastructure on AWS and/or Azure to ensure high availability, scalability, and fault tolerance.
Monitoring & System Health: Leverage Datadog to build proactive monitoring and alerting systems, enabling rapid detection and resolution of performance issues.
Kubernetes & Container Management: Administer and optimize Kubernetes clusters, utilizing Helm for efficient package management and deployment automation.
Automation & Infrastructure as Code: Develop and maintain Infrastructure as Code (IaC) using Terraform; automate routine tasks with scripts written in Bash or Python.
Cross-Functional Collaboration: Partner with development and operations teams to foster a DevOps mindset, streamline CI/CD workflows, and implement best practices.
Incident Response & Troubleshooting: Diagnose and resolve complex issues across OS, networking, and database layers in cloud-based environments.
Documentation: Create and maintain thorough documentation of infrastructure configurations, standard operating procedures, and troubleshooting playbooks.

Required Skills & Experience:

Bachelor's degree in Computer Science, Information Technology, or a related discipline.

3-5 years of hands-on experience with AWS and/or Azure, including services such as EC2, S3, VPC, and Lambda.
2-3 years managing Kubernetes clusters in production environments.
2-3 years of experience using Helm for Kubernetes application deployments.
2-3 years working with monitoring platforms like Datadog.
3-5 years of experience in Linux system administration and shell scripting.
2-3 years of experience with Infrastructure as Code (Terraform preferred).

Strong scripting abilities in Bash and Python.
Solid understanding of networking concepts, including TCP/IP, DNS, firewalls, and load balancers.
Experience with CI/CD tools such as Jenkins, GitLab CI, or GitHub Actions.
Familiarity with cloud-native security practices and regulatory compliance standards.

Applicants must be currently authorized to work in the United States on a full-time basis now and in the future.
This position doesn't provide sponsorship.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.

About Motion Recruitment Partners, LLC