Lead Site Reliability Engineer Jobs in California

21 - 40 of 71 Jobs

Senior Manager, Site Reliability Engineering

Motion Recruitment Partners, LLC

California, USA

Full-time

Job Description: This financial software SaaS company has become the leading provider of cloud software and is looking for a Senior SRE Manager to join their team! You will be hands-on, but will also oversee and grow a small team of SREs. Required Skills & Experience: Google Cloud Platform, Azure, or AWS; Terraform / Ansible; CI/CD tools; Python, Go, or Java. What You Will Be Doing: Tech Breakdown: 100% Linux. Daily Responsibilities: 50% hands-on, 50% management. Design, deploy, build, and support their cl

Senior Site Reliability Engineer

Motion Recruitment Partners, LLC

California, USA

Full-time

Job Description: This OC-based IT hardware company is looking to bring multiple SREs onto their team. This role will be geofencing a data center, modernizing their AWS and EKS, and much more! Required Skills & Experience: AWS, Kubernetes, Python, CI/CD tools. Desired Skills & Experience: Kibana, Grafana. What You Will Be Doing: Tech Breakdown: 100% Red Hat Linux. Daily Responsibilities: 100% hands-on. The Offer: 150-180k. You will receive the following benefits: 100% covered Medical Insurance, 100% cov

Senior SRE Engineer

M&T BANK CORPORATION

Remote or Buffalo, New York, USA

Full-time

Job Overview: We are looking for a highly motivated SR SRE Engineer with a strong background in Observability to join our growing team. This role requires a seasoned professional to guide our team in building, scaling, and maintaining observability solutions that help ensure our systems and services are highly available, performant, and secure. Responsibilities: Lead the development and implementation of observability tools and practices across multiple platforms, including monitoring, logging

Senior Site Reliability Engineer, Infrastructure

Cisco Systems, Inc.

San Francisco, California, USA

Full-time

Who We Are: Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network - even the ones they don't own. Powered by AI and an unmatched set of cloud, internet and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues - before they impact end-user experiences. ThousandEyes is deeply integrated across the entire Cisco technology portfolio and

Sr Site Reliability Engineer (SASE)

Palo Alto Networks

Santa Clara, California, USA

Full-time

Company Description. Our Mission: At Palo Alto Networks, everything starts and ends with our mission: being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we're looking for innovators who are as committed to shaping the future of cybersecurity as we are. Who We Are: We take our mission of

Senior Site Reliability Engineer

Circles Inc.

Remote or San Francisco, California, USA

Full-time

Circle is a financial technology company at the epicenter of the emerging internet of money, where value can finally travel like other digital data - globally, nearly instantly and less expensively than legacy settlement systems. This ground-breaking new internet layer opens up previously unimaginable possibilities for payments, commerce and markets that can help raise global economic prosperity and enhance inclusion. Our infrastructure - including USDC, a blockchain-based dollar - helps busines

Principal Site Reliability Engineer - Storage

Akamai Technologies

Cambridge, England, United Kingdom

Full-time

Do you enjoy collaborating with teams to solve complex challenges? Do you have a passion for cutting-edge technologies and tackling distributed system problems? Join our highly skilled Storage Team! We design, deploy, and manage applications and infrastructure that support Akamai's internal and customer-facing cloud storage platforms. We do this while maintaining Akamai's mission to make life better for billions of people, billions of times a day. Partner with the best: In this role, you'll

Senior Site Reliability Engineer

NVIDIA Corporation

Santa Clara, California, USA

Full-time

Join our team in Santa Clara, CA, USA as a Senior Site Reliability Engineer. At NVIDIA, you'll be part of the team shaping the future of computing and guaranteeing the smooth operation of our brand-new technologies. Our mission is to leverage AI's power to build outstanding and pioneering solutions that have a significant impact on the world. What you'll be doing: Own the solutions you build, collaborating with cross-functional teams to successfully implement them. Collaborate with various teams

Senior Staff Site Reliability Engineer - CDN

NVIDIA Corporation

Remote or Santa Clara, California, USA

Full-time

NVIDIA has been redefining computer graphics, PC gaming, and accelerated computing for more than 25 years. Our legacy of innovation is driven by great technology - and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. As an NVIDIAN, y

Staff Site Reliability Engineer, Cell Software

Tesla Motors

Remote or Austin, Texas, USA

Full-time

Tesla is re-thinking how batteries are made from the ground up. We're designing new factories, new equipment, new processes and new software to rapidly scale battery manufacturing, globally. The primary bottleneck to Tesla's future expansion (and the transition to sustainable transport and energy storage) is our ability to produce and procure batteries - that's why we're innovating in-house, with our collection of world-class engineers, to redefine the industry. Software, data and automation all

Staff Site Reliability Engineer, Cell Software

Tesla Motors

Remote or Fremont, California, USA

Full-time

Tesla is re-thinking how batteries are made from the ground up. We're designing new factories, new equipment, new processes and new software to rapidly scale battery manufacturing, globally. The primary bottleneck to Tesla's future expansion (and the transition to sustainable transport and energy storage) is our ability to produce and procure batteries - that's why we're innovating in-house, with our collection of world-class engineers, to redefine the industry. Software, data and automation all

Senior Site Reliability Engineer - Azure - Remote

UnitedHealth Group

Remote or Eden Prairie, Minnesota, USA

Full-time

For those who want to invent the future of health care, here's your opportunity. We're going beyond basic care to health programs integrated across the entire continuum of care. Join us to start Caring. Connecting. Growing together. The Sr Site Reliability Engineer will architect, develop, and maintain Optum Serve's cloud environment in both the commercial and government cloud. The role will work closely with software engineers, architects, and DevOps engineers to architect and maintain a secur

Senior SRE

LiveRamp

San Francisco, California, USA

Full-time

LiveRamp is the data collaboration platform of choice for the world's most innovative companies. A groundbreaking leader in consumer privacy, data ethics, and foundational identity, LiveRamp is setting the new standard for building a connected customer view with unmatched clarity and context while protecting precious brand and consumer trust. LiveRamp offers complete flexibility to collaborate wherever data lives to support the widest range of data collaboration use cases - within organizations, b

Principal Site Reliability Engineer, Datastores

Cisco Systems, Inc.

San Francisco, California, USA

Full-time

Who We Are: Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network - even the ones they don't own. Powered by AI and an unmatched set of cloud, internet and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues - before they impact end-user experiences. ThousandEyes is deeply integrated across the entire Cisco technology portfolio and

Sr. Site Reliability Engineer: Splunk Cloud Services

Splunk Inc.

Colorado, USA

Full-time

Job Description: Join us as we pursue our exciting vision to make machine data accessible, usable and valuable to everyone. We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers. At Splunk, we're committed to our work, customers, having fun and most significantly to each other's success. Learn more about Splunk careers and how you can become a part of our

Senior Site Reliability Engineer, HPC and LSF

NVIDIA Corporation

Santa Clara, California, USA

Full-time

NVIDIA is the leader in AI, machine learning and datacenter acceleration. NVIDIA is expanding that leadership into datacenter networking with Ethernet switches, NICs and DPUs. NVIDIA has continuously reinvented itself over two decades. Our invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI - the next era of computing. NVIDIA is a "learning machine" th

Staff Site Reliability Engineer, Fleetnet

Tesla Motors

Remote or Palo Alto, California, USA

Full-time

We are a product-focused global team creating the next generation of server-side infrastructure and code to support the growing suite of Tesla products and services. We are looking for seasoned SREs with domain expertise in areas related to developing infrastructure as a service, Kubernetes, GitOps, K8s Operator development, and platform security. The Fleetnet SRE team is part of the Vehicle Software division and is embedded with our backend application, data platform and navigation development

Principal Site Reliability Engineer (WildFire Cloud Infrastructure)

Palo Alto Networks

Santa Clara, California, USA

Full-time

Company Description. Our Mission: At Palo Alto Networks, everything starts and ends with our mission: being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we're looking for innovators who are as committed to shaping the future of cybersecurity as we are. Who We Are: We take our mission of

Sr Site Reliability Engineer (App Service Team)

Palo Alto Networks

Santa Clara, California, USA

Full-time

Company Description. Our Mission: At Palo Alto Networks, everything starts and ends with our mission: being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we're looking for innovators who are as committed to shaping the future of cybersecurity as we are. Who We Are: We take our mission of

Staff Site Reliability Engineer, AI Platform

Tesla Motors

Palo Alto, California, USA

Full-time

As a Site Reliability Engineer (SRE) for the AI Platform team, you will manage bleeding-edge bare-metal servers for Tesla's advanced generative AI platform. You will be responsible for the imaging, configuration management, observability, security, and scalability of these systems. You'll also manage the model benchmarks and their outputs. You should have a focus on automating anything required of this AI platform team and use various platforms to make it as easy as possible for the software eng