Lead Site Reliability Engineer Jobs in San Francisco, CA

Refine Results
1 - 20 of 43 Jobs

Lead Site Reliability Engineer II, Production Engineering

Cisco Systems, Inc.

San Francisco, California, USA

Full-time

Who We Are Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network - even the ones they don't own. Powered by AI and an unmatched set of cloud, internet and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues - before they impact end- user experiences. ThousandEyes is deeply integrated across the entire Cisco technology portfolio and

Lead Site Reliability Engineer - Remote

UnitedHealth Group

Remote or Minnetonka, Minnesota, USA

Full-time

Optum is a global organization that delivers care, aided by technology to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by diversity and inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us adv

Lead Site Reliability Engineer, Observability - Remote

Cisco Systems, Inc.

Remote

Full-time

Application window is open until further notice. The Meraki cloud supports millions of customer devices from 10 data centers around the world. Meraki's customer base has grown by a factor of 2-3 every year, serving billions of HTTP requests per day globally. Our customers depend on our products to run their critical infrastructure of network switches, security appliances, wireless APs and security cameras. As SREs at Meraki, we are responsible for building and growing the cloud that supports t

Lead Site Reliability Engineer, Network - Remote

Cisco Systems, Inc.

Remote

Full-time

Application window is open until further notice. The Meraki cloud supports millions of customer devices from 10 data centers around the world. Meraki's customer base has grown by a factor of 2-3 every year, serving billions of HTTP requests per day globally. Our customers depend on our products to run their critical infrastructure of network switches, security appliances, wireless APs and security cameras. As SREs at Meraki, we are responsible for building and growing the cloud that supports t

Senior Lead Site Reliability Engineer - Remote

Akamai Technologies

Cambridge, England, United Kingdom

Full-time

Would you enjoy improving stability and safety of one of the largest global networks? \n Would you enjoy hands-on network operations work on a global scale to improve our operational efficiency? \n Join our Platform Security Engineering Team \n The Platform Security Engineering team is a group of engineers that support and secure Akamai's global network and Linode cloud systems. Our systems provide data security, server integrity, network access, and secure communications infrastructure. This is

Senior Dev Operations Engineer SRE

Buxton Consulting

Remote

Contract

Senior Dev Operations Engineer SRE Remote (Pleasanton, CA) 12+ Months Top 3 Must Haves Experience setting up alerts / alarms / notifications in AWS cloud. CloudWatch / Dynatrace Experience with AWS solutions using AWS services including Kafka, ECS, EKS. Experience with IaC (Infrastructure as code) CDK or Terraform. Thanks and Regards, Ajeet Singh Buxton Consulting 2010 Crow Canyon Place STE 100 San Ramon, CA 94583 Direct: Email:

Senior Site Reliability Engineer

Salesforce

San Francisco, California, USA

Full-time

To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts. Job Category Software Engineering Job Details About Salesforce We're Salesforce, the Customer Company, inspiring the future of business with AI+ Data +CRM. Leading with our core values, we help companies across every industry blaze new trails and connect with customers in a whole new way. And, we empower you to be a Trailblazer, too - driving you

Senior Site Reliability Engineer, Infrastructure

Cisco Systems, Inc.

San Francisco, California, USA

Full-time

Who We Are Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network - even the ones they don't own. Powered by AI and an unmatched set of cloud, internet and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues - before they impact end- user experiences. ThousandEyes is deeply integrated across the entire Cisco technology portfolio and

Senior Site Reliability Engineer

Circles Inc.

Remote or San Francisco, California, USA

Full-time

Circle is a financial technology company at the epicenter of the emerging internet of money, where value can finally travel like other digital data - globally, nearly instantly and less expensively than legacy settlement systems. This ground-breaking new internet layer opens up previously unimaginable possibilities for payments, commerce and markets that can help raise global economic prosperity and enhance inclusion. Our infrastructure - including USDC, a blockchain-based dollar - helps busines

Senior SRE

LiveRamp

San Francisco, California, USA

Full-time

LiveRamp is the data collaboration platform of choice for the world's most innovative companies. A groundbreaking leader in consumer privacy, data ethics, and foundational identity, LiveRamp is setting the new standard for building a connected customer view with unmatched clarity and context while protecting precious brand and consumer trust. LiveRamp offers complete flexibility to collaborate wherever data lives to support the widest range of data collaboration use cases-within organizations, b

Principal Site Reliability Engineer, Datastores

Cisco Systems, Inc.

San Francisco, California, USA

Full-time

Who We Are Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network - even the ones they don't own. Powered by AI and an unmatched set of cloud, internet and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues - before they impact end- user experiences. ThousandEyes is deeply integrated across the entire Cisco technology portfolio and

Senior Site Reliability Engineer, Test Platform- REMOTE

Cisco Systems, Inc.

Remote or San Francisco, California, USA

Full-time

At Cisco Meraki, we create magic through the energy and passion of our employees, who shape our dynamic community and empower us to solve problems for our customers. This magic unfolds when technology becomes intuitive, functions as intended, and when every individual is valued. By providing our employees with the autonomy to make an impact, we strive to fulfill our mission of simplifying technology so our customers can focus on what matters most to them-whether it's their students, patients, cu

Staff Site Reliability Engineer, Cell Software

Tesla Motors

Remote or Fremont, California, USA

Full-time

Tesla is re-thinking how batteries are made from the ground up. We're designing new factories, new equipment, new processes and new software to rapidly scale battery manufacturing, globally. The primary bottleneck to Tesla's future expansion (and the transition to sustainable transport and energy storage) is our ability to produce and procure batteries - that's why we're innovating in-house, with our collection of world-class engineers, to redefine the industry. Software, data and automation all

Sr. Site Reliability Engineer, Compute SRE

Roblox

San Mateo, California, USA

Full-time

Every day, tens of millions of people come to Roblox to explore, create, play, learn, and connect with friends in 3D immersive digital experiences- all created by our global community of developers and creators. At Roblox, we're building the tools and platform that empower our community to bring any experience that they can imagine to life. Our vision is to reimagine the way people come together, from anywhere in the world, and on any device.We're on a mission to connect a billion people with op

Senior Staff Site Reliability Engineer - CDN

NVIDIA Corporation

Remote or Santa Clara, California, USA

Full-time

NVIDIA has been redefining computer graphics, PC gaming, and accelerated computing for more than 25 years. Our legacy of innovation is driven by great technology-and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. As an NVIDIAN, y

Staff Site Reliability Engineer, Fleetnet

Tesla Motors

Remote or Palo Alto, California, USA

Full-time

We are a product focused global team creating the next-generation of server-side infrastructure and code to support the growing suite of Tesla products and services. We are looking for seasoned SREs with domain expertise in areas related to developing infrastructure as a service, Kubernetes, Gitops, K8s Operator development, and platform security. The Fleetnet SRE team is part of the Vehicle Software division and is embedded with our backend application, data platform and navigation development

Staff Site Reliability Engineer, AI Platform

Tesla Motors

Palo Alto, California, USA

Full-time

As a Site Reliability Engineer (SRE) for the AI Platform team, you will manage bleeding-edge bare-metal servers for Tesla's advanced generative AI platform. You will be responsible for the imaging, configuration management, observability, security, and scalability of these systems. You'll also manage the model benchmarks and their outputs. You should have a focus on automating anything required of this AI platform team and use various platforms to make it as easy as possible for the software eng

Sr. Site Reliability Engineer, Dojo

Tesla Motors

Palo Alto, California, USA

Full-time

We are seeking an experienced Site Reliability Engineer (SRE) to join our team responsible for ensuring the reliability, performance of our Dojo cluster infrastructure. The successful candidate will be responsible for providing exceptional customer response and support, managing third-party systems, and collaborating with various teams to ensure seamless operations. If you have a passion for troubleshooting, automation, and collaboration, we encourage you to apply. Responsibilities Respond to c

Sr. Site Reliability Engineer, Integration Tools

Tesla Motors

Palo Alto, California, USA

Full-time

The Integration Platforms team develops and operates critical technology to support our ever-expanding customer fleet from prototype to production. As an SRE on this team, you will ensure the reliability, scalability, and performance of our on-vehicle, desktop-based, and web-based systems, collaborating closely with software engineers to design, build, and operate these systems across multiple regions. Join us and you will work alongside world-class software and data engineers on some of the new

Senior SRE Engineer

M&T BANK CORPORATION

Remote or Buffalo, New York, USA

Full-time

Job Overview: We are looking for a highly motivated SR SRE Engineer with a strong background in Observability to join our growing team. This role requires a seasoned professional to guide our team in building, scaling, and maintaining observability solutions that help ensure our systems and services are highly available, performant, and secure. Responsibilities: Lead the development and implementation of observability tools and practices across multiple platforms, including monitoring, logging