Site Reliability Engineer Jobs in NYC, NY

Site Reliability Engineer

General Dynamics

Remote or Aurora, Colorado, USA

Full-time

Basic Qualifications Bachelor's degree in Computer Science, a related field or equivalent experience is required plus a minimum of 5 years of relevant experience; or Master's degree plus 3 years of relevant experience. CLEARANCE REQUIREMENTS: Department of Defense TS/SCI security clearance is required at time of hire. Applicants selected will be subject to a U.S. Government security investigation and must meet eligibility requirements for access to classified information. Due to the nature of

ALTS - Lead SRE

JPMorgan Chase & Co.

Jersey City, New Jersey, USA

Full-time

Job Description We are seeking an experienced Lead Site Reliability Engineer (SRE) to manage and guide our team. The ideal candidate will have a strong foundation in SRE, DevOps, or infrastructure engineering, with leadership skills and the ability to drive team success in a fast-paced, dynamic environment. This role involves overseeing the team's execution, risk management, and strategic initiatives while fostering a collaborative and innovative culture. Key Responsibilities: Team Leadership

Site Reliability Engineer II

JPMorgan Chase & Co.

Jersey City, New Jersey, USA

Full-time

Job Description Play a key role in ensuring system reliability at one of the world's most iconic and largest financial institutions. As a Site Reliability Engineer II at JPMorgan Chase within the Enterprise Technology, Finance Technology team, you will use technology to solve business problems and leverage software engineering best practices as we strive towards excellence. This role often works independently to execute small to medium projects, but you'll also have the opportunity to collabor

Senior Site Reliability Engineer, Observability

MongoDB, Inc.

New York, New York, USA

Full-time

MongoDB's mission is to empower innovators to create, transform, and disrupt industries by unleashing the power of software and data. We enable organizations of all sizes to easily build, scale, and run modern applications by helping them modernize legacy workloads, embrace innovation, and unleash AI. Our industry-leading developer data platform, MongoDB Atlas, is the only globally distributed, multi-cloud database and is available in more than 115 regions across AWS, Google Cloud, and Microsoft

Senior Director of Site Reliability Engineering, Data Platforms

JPMorgan Chase & Co.

New York, New York, USA

Full-time

Job Description If you are a site reliability engineering leader ready to take the reins and drive impact, we've got an opportunity just for you. As a Senior Director of Site Reliability Engineering at JPMorgan Chase within the Enterprise Technology, AI/ML & Data Platforms division, you will play a crucial role at both business and firmwide levels. Your responsibilities will include inspiring your team and others to deliver robust and resilient products and services to our clients. You will be

Senior Lead Site Reliability Engineer

JPMorgan Chase & Co.

Jersey City, New Jersey, USA

Full-time

Job Description As a Senior Lead Site Reliability Engineer at JPMorgan Chase within the Infrastructure & Production Management sector of Consumer & Community Banking, you will be tasked with closely collaborating with stakeholders to establish non-functional requirements (NFRs) and set service availability targets for various applications and product lines. Your role will be crucial in incorporating these NFRs during the design and testing stages of product development, accurately evaluating cu

Senior Site Reliability Engineer, Atlas

MongoDB, Inc.

New York, New York, USA

Full-time

MongoDB's mission is to empower innovators to create, transform, and disrupt industries by unleashing the power of software and data. We enable organizations of all sizes to easily build, scale, and run modern applications by helping them modernize legacy workloads, embrace innovation, and unleash AI. Our industry-leading developer data platform, MongoDB Atlas, is the only globally distributed, multi-cloud database and is available in more than 115 regions across AWS, Google Cloud, and Microsoft

Site Reliability Engineer - USDS

TikTok

New York, New York, USA

Full-time

Location: New York. Employment Type: Regular. Job Code: A84289. Responsibilities: Site Reliability Engineering (SRE) at TikTok combines software and systems engineering to build and run large-scale, massively distributed, and fault-tolerant systems. In our team, you'll have the opportunity to manage the complex challenges of scale, while using expertise in coding, algorithms, complexity analysis, and large-scale system design. We embrace a culture of di

ClickHouse SRE, Data Platform - USDS

TikTok

New York, New York, USA

Full-time

Location: New York. Employment Type: Regular. Job Code: A33433. Responsibilities: Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed services and infrastructures. As a site reliability engineer in the data platform area, you will have the opportunity to manage the services and infrastructures in one of the largest data platforms in the world that directly supports the TikTok a

Data Ingestion SRE, Data Platform - USDS

TikTok

New York, New York, USA

Full-time

Location: New York. Employment Type: Regular. Job Code: A192260. Responsibilities: Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed services and infrastructures. As a site reliability engineer in the data platform area, you will have the opportunity to manage the services and infrastructures in one of the largest data platforms in the world that directly supports the TikTok a

Software Engineer, SRE, Payments - USDS

TikTok

New York, New York, USA

Full-time

Location: New York. Employment Type: Regular. Job Code: A173999. Responsibilities. Team Intro: The Global Payment team of the US Tech Service department of TikTok provides all-round payment solutions for the company's USA products, overseas commercialization, and the company's overseas travel and procurement, including channel access, product order design, user interaction, capital management, tax and exchange optimization, settlement reconciliation an

Site Reliability Engineer / NYC / On-site

Motion Recruitment Partners, LLC

New York, New York, USA

Full-time

This is an opportunity to join a fast-paced infrastructure team focused on scaling cloud-native systems that support complex AI and data workloads. This is a full-time role based in New York City, working with AWS, Kubernetes, Helm, Terraform, Datadog, and scripting in Bash and Python to ensure reliability, automation, and observability across systems. You'll be part of a cutting-edge environment operating at the intersection of fintech and AI, helping build platforms that power smarter financia

SRE (Linux / Golang Automation)

Bayside Solutions

Remote

Contract

Site Reliability Engineer (Linux / Golang Automation). W2 Contract. Salary Range: $124,800 - $145,600 per year. Location: Remote Role - PST. Job Summary: We require a Site Reliability Engineer with a strong background and experience supporting extensive virtualization and Linux compute platforms. Requirements and Qualifications: Experience automating with Golang; experience with Infrastructure as a Service orchestration tools (OpenStack, CloudStack, etc.); strong experience supporting Linux and

Site Reliability Engineer for CIAM

Barclays

Hanover, New Jersey, USA

Full-time

Job Description Purpose of the role To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent si

SRE / PLATFORM / DEVOPS ENGINEER

Motion Recruitment Partners, LLC

New York, New York, USA

Full-time

A prominent technology firm is seeking a Site Reliability Engineer/DevOps Engineer to join its innovative engineering team. This role involves designing, implementing, and automating modern cloud infrastructure solutions while collaborating with a skilled team. The position is based in Lower Manhattan, NYC, with a hybrid work model likely requiring 4 days onsite per week. Required Skills & Experience 3+ years of experience in DevOps, Site Reliability Engineering, or a similar role. Proficiency

Senior Site Reliability Engineer, NIM Factory

NVIDIA Corporation

Remote or Santa Clara, California, USA

Full-time

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent.

Senior Site Reliability Engineer

Akamai Technologies

Cambridge, England, United Kingdom

Full-time

Do you like collaborating across teams to solve complex problems? Do you have a passion for cutting-edge technologies and tackling system problems? Join our critical Nameserver SRE team. Our team is responsible for defining, measuring, publishing and optimizing key performance indicators of Akamai's nameserver platform. We take a holistic view of complex systems and identify the measures which matter most to customers. Partnering with multiple teams we address difficult problems that go beyond

Cloud Site Reliability Engineer - Azure/AWS (34084)

Myticas LLC

Remote or CA

Contract

Cloud Site Reliability Engineer - AWS & Azure Responsibilities Oversee the design and improvement of infrastructure using SRE best practices, including IaC, recovery automation, and systems that detect and resolve issues independently. Manage and fine-tune critical services across both cloud and on-prem environments: Kubernetes clusters, CI/CD pipelines, artifact registries, and custom workloads. Enhance observability through intelligent logging, metrics, tracing, and alerting. Ensuring systems

Principal Site Reliability Engineer, AI Infrastructure

NVIDIA Corporation

Remote or Santa Clara, California, USA

Full-time

NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you! NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for over 30 years. It's an outstanding legacy of innovation that's fueled by phenomenal technology and exceptional people. Today, we're tapping into the unlimited potenti

Principal Architect, Site Reliability Engineering - GeForce Now

NVIDIA Corporation

Remote or Santa Clara, California, USA

Full-time

NVIDIA is the world leader in accelerated computing, from gaming to data centers to AI and robotics. We are a team of trailblazers reinventing computing at the intersection of graphics, high-performance computing, and AI. If you're driven to tackle sophisticated challenges, push boundaries, and build technology that powers the future, NVIDIA is the place for you. We are looking for an expert and transformative Principal Architect for Site Reliability Engineering (SRE) to join our GeForce Now Engi