Lead Site Reliability Engineer Jobs in San Francisco, CA

21 - 40 of 44 Jobs

Staff Site Reliability Engineer

Block Inc

California, USA

Full-time

Block is one company built from many blocks, all united by the same purpose of economic empowerment. The blocks that form our foundational teams - People, Finance, Counsel, Hardware, Information Security, Platform Infrastructure Engineering, and more - provide support and guidance at the corporate level. They work across business groups and around the globe, spanning time zones and disciplines to develop inclusive People policies, forecast finances, give legal counsel, safeguard systems, nurture

Senior Site Reliability Engineer - DGX Cloud

NVIDIA Corporation

Remote or Santa Clara, California, USA

Full-time

Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large-scale production systems with high efficiency and availability using the combination of software and systems engineering practices. This is a highly specialized discipline that demands knowledge across different systems, networking, coding, database, capacity management, continuous delivery and deployment and open source cloud enabling technologies like Kubernetes and OpenStack. SRE at N

Sr. Site Reliability Engineer, Energy Software

Tesla Motors

Palo Alto, California, USA

Full-time

Tesla is looking for a Site Reliability Engineer to build, enhance, and scale the infrastructure that underpins our Energy IoT applications. These applications provide real-time monitoring, optimization, and control for Tesla's industry-leading energy products, including Powerwall, Megapack, Solar Roof, Supercharger, Wall Connector, Autobidder, and Virtual Power Plants. We are a high-impact team that values curiosity, learning, mentorship, open discourse, and making disciplined decisions by weig

Principal Site Reliability Engineer - Storage

Akamai Technologies

Cambridge, England, United Kingdom

Full-time

Do you enjoy collaborating with teams to solve complex challenges? Do you have a passion for cutting-edge technologies and tackling distributed system problems? Join our highly skilled Storage Team! We design, deploy, and manage applications and infrastructure that support Akamai's internal and customer-facing cloud storage platforms. We do this while maintaining Akamai's mission to make life better for billions of people, billions of times a day. Partner with the best: In this role, you'll

Staff Site Reliability Engineer, Cell Software

Tesla Motors

Remote or Austin, Texas, USA

Full-time

Tesla is re-thinking how batteries are made from the ground up. We're designing new factories, new equipment, new processes and new software to rapidly scale battery manufacturing, globally. The primary bottleneck to Tesla's future expansion (and the transition to sustainable transport and energy storage) is our ability to produce and procure batteries - that's why we're innovating in-house, with our collection of world-class engineers, to redefine the industry. Software, data and automation all

Senior Site Reliability Engineer - Azure - Remote

UnitedHealth Group

Remote or Eden Prairie, Minnesota, USA

Full-time

For those who want to invent the future of health care, here's your opportunity. We're going beyond basic care to health programs integrated across the entire continuum of care. Join us to start Caring. Connecting. Growing together. The Sr Site Reliability Engineer will architect, develop, and maintain Optum Serve's cloud environment in both the commercial and government cloud. The role will work closely with software engineers, architects, and DevOps engineers to architect and maintain a secur

Sr. Site Reliability Engineer: Splunk Cloud Services

Splunk Inc.

Colorado, USA

Full-time

Join us as we pursue our exciting vision to make machine data accessible, usable and valuable to everyone. We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers. At Splunk, we're committed to our work, customers, having fun and most significantly to each other's success. Learn more about Splunk careers and how you can become a part of our

Senior Site Reliability Engineer - Observability (FedRAMP IL5)

Splunk Inc.

North Carolina, USA

Full-time

Join us as we pursue our ground-breaking vision to make machine data accessible, usable, and valuable to everyone. We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers. At Splunk, we are committed to our work, customers, having fun, and most significantly to each other's success. The Splunk Observability Cloud provides full-fidelity monitoring and fixing across infrastructure, applications, and user inter

Senior Site Reliability Engineer, Observability, FedRAMP

Splunk Inc.

California, USA

Full-time

Splunk, a Cisco company, is building a safer and more resilient digital world with an end-to-end full stack platform made for a hybrid, multi-cloud world. Leading enterprises use our unified security and observability platform to keep their digital systems secure and reliable. Our customers love our technology, but it's our caring employees that make Splunk stand out as an amazing career destination. No matter where in the world or what level of the organization, we approach our wor

Senior Azure SRE

Kforce Technology Staffing

Remote or Tampa, Florida, USA

Contract

RESPONSIBILITIES: Kforce has a client in Tampa, FL that is seeking a highly skilled Senior Infrastructure Engineer to drive the design, automation, and optimization of cloud infrastructure supporting the firm's core technologies and applications. Acting as a key technical expert, you'll ensure our platforms are scalable, resilient, and aligned with strategic IT initiatives. Responsibilities: * Design and automate infrastructure management to improve system reliability, scalability, and performa

Sr. Site Reliability Engineer, Bare Metal, Infrastructure

Tesla Motors

Remote or Austin, Texas, USA

Full-time

Tesla cloud as a service seeks a high-impact Site Reliability Engineer (SRE) to support our bare-metal provisioning platform at scale. You'll provide direct support to internal customers, resolve complex provisioning issues, and escalate systemic problems to engineering. Your focus: ensuring reliable, automated delivery of bare-metal infrastructure using Kubernetes, Metal , and industry-standard tooling across diverse hardware from Supermicro, HPE, and Dell. Responsibilities: Provide frontline s

Sr. Linux Site Reliability Engineer, IT Manufacturing Site Reliability Engineering

Tesla Motors

Fremont, California, USA

Full-time

We are seeking an enthusiastic SRE to join our dynamic IT Manufacturing Site Reliability Engineering (ITMFG-SRE) team at Tesla. Our team is responsible for building and managing an ecosystem of applications and platforms essential to manufacturing. As a Linux SRE, this role requires experience with hardware, software, networking, and automation to implement scalable solutions for manufacturing sites globally. You'll play a key role in maintaining, optimizing and scaling our infrastructure to sup

Senior Site Reliability Engineer-FedRAMP (FULLY REMOTE)

Splunk Inc.

California, USA

Full-time

Join us as we pursue our ground-breaking vision to make machine data accessible, usable, and valuable to everyone. We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers. At Splunk, we are committed to our work, customers, having fun, and most significantly to each other's success. The Splunk Observability Cloud provides full-fidelity monitoring and fixing across infrastructure, applications, and user interf

Principal Site Reliability Engineer (Safety) - Nashville, TN Hybrid

Oracle Corporation

Remote

Full-time

We offer unique opportunities for smart, hands-on engineers with the expertise and passion to solve difficult architecture, engineering, and process problems. Our customers run their businesses on our cloud, and our mission is to provide them with the most secure cloud services. Our ideal candidate is a site reliability or devops engineer with expertise and passion in finding and improving how services are deployed and operated. If this is you, joining Oracle Cloud Infrastructur

Sr. DevOps/Site Reliability Engineer

MTW Recruit

Remote

Full-time

No 3rd party inquiries will be processed. This is a 100% remote role in the Eastern Standard Time zone, with preference for EST and CST zones. Seeking a talented DevOps/Site Reliability Engineer (SRE) with expertise in Kubernetes, Terraform, Azure, and observability tools like DataDog to deploy and manage scalable, reliable infrastructure. Requirements: Minimum of 7 years of DevOps experience, preferably with SRE background. Terraform and Kubernetes are absolutely required. Strong problem-solving, communicat

Senior Site Reliability Engineer with Kubernetes - W2 - Remote in EST hours (Posted by SAM)

Global Force USA

Remote

Contract

Requirements: 4+ years of experience working within a cloud engineer/SRE role. Expert knowledge of a cloud service provider. Expert knowledge and hands-on production experience in Kubernetes (bare metal or managed) cluster setup and management required. Experience with infrastructure as code (IaC) tools like Terraform, Pulumi. Experience with Kubernetes deployment tools like Helm, ArgoCD, Flux. Strong awareness of networking and internet protocols. Understanding of identity and access management (IAM). Ex

Sr. Spclst, Cloud Engineering

Merck & Company Inc

Remote or Rahway, New Jersey, USA

Full-time

We are looking for an experienced and enthusiastic Senior Site Reliability Engineer to join our Agile Planning Product Team. As part of the DevXOps Product Line, you will enable product teams to deliver value faster to the business by improving platform services that accelerate agile development projects. You will design scalable solutions, create CI/CD pipelines, and implement automation to enhance reliability and efficiency. As a Senior Reliability Engineer, you will: Become

AI/ML Site Reliability Engineer (SRE)

Lockheed Martin Corporation

Remote or King of Prussia, Pennsylvania, USA

Full-time

Space is a critical domain, connecting our technologies, our security and our humanity. While others view space as a destination, we see it as a realm of possibilities, where we can do more - we can innovate, invest, inspire and integrate our capabilities to transform the future. At Lockheed Martin Space, we aim to harness the full potential of space to cultivate innovation, reduce costs, and push the boundaries of what technology can achieve. We're creating future-ready solutio

Principal Application Engineer (SRE)

DISCOVER

Remote or Riverwoods, Illinois, USA

Full-time

Discover. A brighter future. With us, you'll do meaningful work from Day 1. Our collaborative culture is built on three core behaviors: We Play to Win, We Get Better Every Day & We Succeed Together. And we mean it - we want you to grow and make a difference at one of the world's leading digital banking and payments companies. We value what makes you unique so that you have an opportunity to shine. Come build your future, while being the reason millions of people find a brighter financial future

Sr Staff Software Engineer, Reliability Engineering

Airbnb

Remote

Full-time

Airbnb was born in 2007 when two hosts welcomed three guests to their San Francisco home, and has since grown to over 5 million hosts who have welcomed over 2 billion guest arrivals in almost every country across the globe. Every day, hosts offer unique stays and experiences that make it possible for guests to connect with communities in a more authentic way. The Community You Will Join: We are a community based on connection and belonging - a community that was born in 2007 when two hosts welc