Reliability Engineer Jobs in San Francisco, CA

41 - 60 of 110 Jobs

Senior Site Reliability Engineer

2K

Novato, California, USA

Full-time

On-Call Requirement: Yes (Periodic Rotation) Who We Are 2K is headquartered in Novato, California and is a wholly owned label of Take-Two Interactive Software, Inc. (NASDAQ: TTWO). Founded in 2005, 2K Games is a global video game company, publishing titles developed by some of the most influential game development studios in the world. Our studios, responsible for developing 2K's portfolio of world-class games across multiple platforms, include Visual Concepts, Firaxis, Hangar 13, Cat

Lead Fullstack Engineer/Site Reliability Engineer

Salesforce

San Francisco, California, USA

Full-time

To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts. Job Category Software Engineering Job Details About Salesforce We're Salesforce, the Customer Company, inspiring the future of business with AI + Data + CRM. Leading with our core values, we help companies across every industry blaze new trails and connect with customers in a whole new way. And, we empower you to be a Trailblazer, too - driving you

Staff Site Reliability Engineer, Fleetnet

Tesla Motors

Remote or Palo Alto, California, USA

Full-time

We are a product-focused global team creating the next generation of server-side infrastructure and code to support the growing suite of Tesla products and services. We are looking for seasoned SREs with domain expertise in areas related to developing infrastructure as a service, Kubernetes, GitOps, K8s Operator development, and platform security. The Fleetnet SRE team is part of the Vehicle Software division and is embedded with our backend application, data platform and navigation development

Senior Site Reliability Engineer

General Motors

Remote

Full-time

Job Description Develop and design software applications for a driverless technology company. Duties may include: Build out and improve observability systems, tools and the related codebase. Contribute code, perform code reviews, and create technical designs that improve performance and reliability of observability systems using software and systems engineering skills. Partner with other Software Engineering teams to better understand use-cases and guide the engineers to use the existing tools eff

Technical Lead, Site Reliability Engineer, Fleetnet

Tesla Motors

Remote or Palo Alto, California, USA

Full-time

We are a small team of experts focused on creating the next-generation server-side infrastructure for Tesla. We're the invisible link connecting every Tesla product, whether it's vehicles, robots, robotaxis, chargers or even mobile apps, to bring customers the best user experience possible. We're looking for a strong, hands-on technical leader with domain expertise in one or more of: containers, public clouds, or private clouds. Today, over 10 million Tesla users rely on our services to safely and

Principal Site Reliability Engineer, Datastores

Cisco Systems, Inc.

San Francisco, California, USA

Full-time

Who We Are Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network - even the ones they don't own. Powered by AI and an unmatched set of cloud, internet and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues - before they impact end-user experiences. ThousandEyes is deeply integrated across the entire Cisco technology portfolio and

Site Reliability Engineer

LiveRamp

San Francisco, California, USA

Full-time

LiveRamp is the data collaboration platform of choice for the world's most innovative companies. A groundbreaking leader in consumer privacy, data ethics, and foundational identity, LiveRamp is setting the new standard for building a connected customer view with unmatched clarity and context while protecting precious brand and consumer trust. LiveRamp offers complete flexibility to collaborate wherever data lives to support the widest range of data collaboration use cases - within organizations, b

Senior Site Reliability Engineer, ML Platforms

NVIDIA Corporation

Remote or Santa Clara, California, USA

Full-time

Are you passionate about building and maintaining large-scale production systems that support advanced data science and machine learning applications? Do you want to join a team at the heart of NVIDIA's data-driven decision-making culture? If so, we have a great opportunity for you! NVIDIA is seeking a Senior Site Reliability Engineer (SRE) for the Data Science & ML Platform(s) team. The role involves designing, building, and maintaining services that enable real-time data analytics, streaming,

Staff Site Reliability Engineer - Federal - 3rd Shift

ServiceNow, Inc.

Remote or San Diego, California, USA

Full-time

Company Description It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today - ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500. Our intelligent cloud-based platform seamlessly connects people, systems, and processes to empower organizations to find smarter, faster, and better ways to work. But thi

Staff Site Reliability Engineer, Cell Software

Tesla Motors

Remote or Fremont, California, USA

Full-time

Tesla is re-thinking how batteries are made from the ground up. We're designing new factories, new equipment, new processes and new software to rapidly scale battery manufacturing, globally. The primary bottleneck to Tesla's future expansion (and the transition to sustainable transport and energy storage) is our ability to produce and procure batteries - that's why we're innovating in-house, with our collection of world-class engineers, to redefine the industry. Software, data and automation all

Sr. Site Reliability Engineer, Compute SRE

Roblox

San Mateo, California, USA

Full-time

Every day, tens of millions of people come to Roblox to explore, create, play, learn, and connect with friends in 3D immersive digital experiences - all created by our global community of developers and creators. At Roblox, we're building the tools and platform that empower our community to bring any experience that they can imagine to life. Our vision is to reimagine the way people come together, from anywhere in the world, and on any device. We're on a mission to connect a billion people with op

Site Reliability Engineer DevOps | REMOTE (ship required)

Oracle Corporation

Remote

Full-time

Job Description Are you a creative person who loves a challenge? Solve the complex puzzles you've been dreaming of as our Engineer. If you have a passion for innovation in tech, we want you on our team! Thrive in this crucial automation role. Oracle is a technology leader that's changing how the world does business. We're looking for an experienced and self-motivated person. We appreciate you taking the time to review the list of qualifications and to apply for the position. Come and join us!

Senior Site Reliability Engineer / DevOps - REMOTE (ship required)

Oracle Corporation

Remote

Full-time

Job Description Are you a creative person who loves a challenge? Solve the complex puzzles you've been dreaming of as our Engineer. If you have a passion for innovation in tech, we want you on our team! Thrive in this crucial automation role. Oracle is a technology leader that's changing how the world does business. We're looking for an experienced and self-motivated person. We appreciate you taking the time to review the list of qualifications and to apply for the position. Come and join us!

SITE RELIABILITY ENGINEER

Mindbank Consulting Group

Remote

Full-time

JOB DESCRIPTION: As a Site Reliability Engineer (SRE) on our team, you will play a critical part in ensuring the reliability, scalability, and performance of our systems and services. You will work closely with cross-functional teams, including engineering, cloud infrastructure, and security, to deliver resilient, observable, and compliant solutions in AWS GovCloud and commercial cloud environments. This role requires applying consultative and technical expertise to support cloud initiatives with the

Staff Site Reliability Engineer, Incident and Disaster

Dropbox Inc

Remote

Full-time

Dropbox is a Virtual First company. For this role, we are hiring in Zones 2 and 3. Please refer to our Compensation section below to see what neighborhoods fall under each Zone. Role Description The Incident and Disaster Team aims to reduce Customer pain by speeding up incident response through standardized incident management processes and tooling as well as through incident prevention strategies such as disaster readiness, chaos testing, safer tooling, stronger controls, automated conformanc

Site Reliability Engineer

AE Business Solutions

Remote

Full-time

AEBS is seeking a Site Reliability Engineer to take on a fully-remote, contract position! The Site Reliability Engineer must be able to work central time zone hours throughout this engagement. *No C2C inquiries, please. Ideal Skills/Background: 3+ years of experience implementing Infrastructure-as-Code (IaC) with Terraform; 3+ years of Python development experience; 3+ years of experience in designing/developing/operating applications in the Azure cloud; 2+ years of experience building and running

Senior Platform Engineer - 100% remote from anywhere in the US

Calance

Remote or New York, New York, USA

Full-time

Position Summary: Our client is seeking a highly skilled and experienced Senior Platform Developer II to join their team. This pivotal role will be instrumental in building, scaling, and maintaining the robust and secure infrastructure that powers our mission-critical platform. You will be a force multiplier, enabling our development teams to deliver features faster and more reliably. You will champion infrastructure-as-code principles, contribute code to platform scalability, drive automation acr

Site Reliability Engineer

Madison-Davis, LLC

Remote

Contract

Role: Drive the technical implementation of monitoring and alerting strategies across enterprise-scale applications and infrastructure. Collaborate directly with development teams to ensure each new initiative includes the correct telemetry, log tagging, and alert payloads. Act as a liaison to Level 2 and Level 3 support teams to maintain and enhance monitoring dashboards used by the enterprise command center (EMC). Standardize alert formats to ensure consistency with SRE policies and support downs

Internship, Site Reliability Engineer, Applications Engineering (Fall 2025)

Tesla Motors

Fremont, California, USA

Full-time

Consider before submitting an application: This position is expected to start around September 2025 and continue through the Fall term (approximately December 2025) or into Spring 2026 if available and there is an opportunity to do so. We ask for a minimum of 12 weeks, full-time and on-site, for most internships. Our internship program is for students who are actively enrolled in an academic program. Entry-level candidates seeking employment after graduation and not returning to school should a

Senior Site Reliability Engineer

GlobalLogic Inc.

Remote

Full-time

Job Description: Design, deploy, and scale our Prometheus architecture to handle 100+ million active series and beyond. Deploy and operate large, high-performance Elasticsearch clusters holding 2000+ TB of data. Deploy and grow high-throughput data pipelines built on Kafka, handling hundreds of thousands of events per second. Design and build an alerting system that allows engineering teams to construct alerts from multiple data sources and alerting workflows. Write libraries and APIs that give engin