Site Reliability Engineer Jobs in San Francisco, CA

21 - 40 of 157 Jobs

Principal Site Reliability Engineer, Datastores

Cisco Systems, Inc.

San Francisco, California, USA

Full-time

Who We Are Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network - even the ones they don't own. Powered by AI and an unmatched set of cloud, internet and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues - before they impact end-user experiences. ThousandEyes is deeply integrated across the entire Cisco technology portfolio and

SRE (Linux / Golang Automation)

Bayside Solutions

Remote

Contract

Site Reliability Engineer (Linux / Golang Automation) W2 Contract Salary Range: $124,800 - $145,600 per year Location: Remote Role - PST Job Summary: We require a Site Reliability Engineer with a strong background and experience supporting extensive virtualization and Linux compute platforms. Requirements and Qualifications: Experience automating with Golang Experience with Infrastructure as a Service orchestration tools (OpenStack, CloudStack, etc.) Strong experience supporting Linux and

Senior Site Reliability Engineer, ML Platforms

NVIDIA Corporation

Remote or Santa Clara, California, USA

Full-time

Are you passionate about building and maintaining large-scale production systems that support advanced data science and machine learning applications? Do you want to join a team at the heart of NVIDIA's data-driven decision-making culture? If so, we have a great opportunity for you! NVIDIA is seeking a Senior Site Reliability Engineer (SRE) for the Data Science & ML Platform(s) team. The role involves designing, building, and maintaining services that enable real-time data analytics, streaming,

Internship, Site Reliability Engineer, Applications Engineering (Fall 2025)

Tesla Motors

Fremont, California, USA

Full-time

Consider before submitting an application: This position is expected to start around September 2025 and continue through the Fall term (approximately December 2025) or into Spring 2026 if available and there is an opportunity to do so. We ask for a minimum of 12 weeks, full-time and on-site, for most internships. Our internship program is for students who are actively enrolled in an academic program. Entry-level candidates seeking employment after graduation and not returning to school should a

Sr. Site Reliability Engineer, Compute SRE

Roblox

San Mateo, California, USA

Full-time

Every day, tens of millions of people come to Roblox to explore, create, play, learn, and connect with friends in 3D immersive digital experiences, all created by our global community of developers and creators. At Roblox, we're building the tools and platform that empower our community to bring any experience that they can imagine to life. Our vision is to reimagine the way people come together, from anywhere in the world, and on any device. We're on a mission to connect a billion people with op

Staff Site Reliability Engineer, Cell Software

Tesla Motors

Remote or Fremont, California, USA

Full-time

Tesla is re-thinking how batteries are made from the ground up. We're designing new factories, new equipment, new processes and new software to rapidly scale battery manufacturing, globally. The primary bottleneck to Tesla's future expansion (and the transition to sustainable transport and energy storage) is our ability to produce and procure batteries - that's why we're innovating in-house, with our collection of world-class engineers, to redefine the industry. Software, data and automation all

Sr. Site Reliability Engineer, Dojo

Tesla Motors

Palo Alto, California, USA

Full-time

We are seeking an experienced Site Reliability Engineer (SRE) to join our team responsible for ensuring the reliability and performance of our Dojo cluster infrastructure. The successful candidate will be responsible for providing exceptional customer response and support, managing third-party systems, and collaborating with various teams to ensure seamless operations. If you have a passion for troubleshooting, automation, and collaboration, we encourage you to apply. Responsibilities Respond to c

Lead Observability Engineer Sumo Logic

VLink Inc

Remote

Contract, Third Party

VLink is a leading global provider of software engineering services with next-gen technologies and best-in-class talent. With offices in 7+ countries from North America and Europe to APAC, and expansion plans in the Middle East, VLink has helped SMBs and large enterprises achieve their business goals, and gained the trust of Fortune-250 companies. VLink is 'Great Place to Work Certified' and has been a consistent winner as Best Places to Work in CT. Trust, collaboration, and accountability are the th

Principal Site Reliability Engineer

Akamai Technologies

Cambridge, England, United Kingdom

Full-time

Do you enjoy collaborating with teams to solve complex challenges? Do you have a passion for cutting edge technologies? Join our Compute Team! Our team designs, develops, and manages applications and infrastructure that support Akamai's Compute products and services. We do this while maintaining Akamai's mission at the forefront of what we do: make life better for billions of people, billions of times a day. Partner with the best As a Principal Site Reliability Engineer in the Virtualizatio

Senior Site Reliability Engineer

Akamai Technologies

Cambridge, England, United Kingdom

Full-time

Are you excited by the prospect of working with innovative security products? Do you enjoy creating innovative and strategic solutions to solve complex problems? Join Guardicore (now Akamai Enterprise Security Group)! Guardicore (now Akamai Enterprise Security Group) is changing the way organizations protect their data centers and clouds. Our team boasts some of the most talented and experienced cyber security and data center. We're always looking for new people to inspire us and make us bett

SRE (Site Reliability Engineering) - Remote

iPeople Infosystems LLC

Remote

Contract

Job Description: Production support expertise with SRE Observability experience: Proactive issue identification using observability tools. Skills in using different monitoring & observability tools to track system performance. Production support activities including proactive identification of issues leveraging observability tools, correlating inputs from various dashboards & tools to drive resolution. Experience in swiftly identifying probable failure points through the analysis of multiple inputs

Site Reliability Engineer (Temp to Perm)

Leidos

Remote or Brownsville, Texas, USA

Full-time

This position will require up to 75% travel. Come put your Site Reliability Engineer (SRE) skills into action! Leidos has openings for talented SREs to join our team and work real-time hands-on fielding challenges and develop reusable solutions that support our customers in any environment. You will have the opportunity to contribute to the design requirements and implementation of improvements that accelerate the secure delivery, implementation, and sustainment of software in the field. You wi

Mainframe Site Reliability Engineer

Fynbosys Inc

Remote

Full-time

A Mainframe Site Reliability Engineer (SRE) applies software engineering principles to mainframe operations to enhance system reliability, scalability, and efficiency. Acting as a bridge between development and operations, the mainframe SRE focuses on automation, proactive monitoring, incident response, and performance optimization of mission-critical mainframe systems. Key responsibilities typically include: Automating repetitive operational tasks to reduce manual intervention and human error; Enh

Senior Site Reliability Engineer

McKesson Corporation

Remote or Columbus, Ohio, USA

Full-time

McKesson is an impact-driven, Fortune 10 company that touches virtually every aspect of healthcare. We are known for delivering insights, products, and services that make quality care more accessible and affordable. Here, we focus on the health, happiness, and well-being of you and those we serve - we care. What you do at McKesson matters. We foster a culture where you can grow, make an impact, and are empowered to bring new ideas. Together, we thrive as we shape the future of health for patien

Senior Site Reliability Engineer - Remote

Akamai Technologies

Cambridge, England, United Kingdom

Full-time

Do you have a passion for cutting edge technologies and tackling system problems? Are you a self-starting professional who thrives in a fast-paced environment? Join our critical CPS SRE team! We ensure that infrastructure services have world-class reliability and uptime. Site Reliability Engineers (SREs) are the driving force that keeps the system running smoothly and helps identify any bottlenecks before they become issues. We focus on optimizing services, building infrastructure, and eliminat

Director, Site Reliability Engineering

Walmart Inc.

Remote or Bentonville, Arkansas, USA

Full-time

Position Summary What you'll do Are you passionate about pioneering cutting-edge technology leveraging GenAI and big data to revolutionize Walmart's customer service experiences? Do you dream of working on innovative systems that make a significant impact on hundreds of millions of customers across the globe? We are seeking a visionary and hands-on Director of Site Reliability Engineering (SRE) to lead and scale a world-class SRE organization. This leader will be responsible for building a hig

Senior Site Reliability Engineer

General Motors

Remote

Full-time

Job Description Develop and design software applications for a driverless technology company. Duties may include: Build out and improve observability systems, tools and the related codebase. Contribute code, perform code reviews, and create technical designs that improve performance and reliability of observability systems using software and systems engineering skills. Partner with other Software Engineering teams to better understand use-cases and guide the engineers to use the existing tools eff

Senior Site Reliability Engineer - (Institutional)

Coinbase

Remote

Full-time

Ready to be pushed beyond what you think you're capable of? At Coinbase, our mission is to increase economic freedom in the world. It's a massive, ambitious opportunity that demands the best of us, every day, as we build the emerging onchain platform - and with it, the future global financial system. To achieve our mission, we're seeking a very specific candidate. We want someone who is passionate about our mission and who believes in the power of crypto and blockchain technology to update the

Director Site Reliability Engineering

Akamai Technologies

Cambridge, England, United Kingdom

Full-time

Do you enjoy collaborating with teams to solve complex challenges? Do you have a passion for cutting edge technologies and tackling system problems? Join our highly skilled Network SRE team! We build and operate the Network infrastructure powering Akamai's global cloud platform. Our mission is to deliver reliable, scalable, and performant systems that enable customers to run critical workloads with confidence. As part of this team, you'll help ensure reliability at scale, maintaining the avail

Site Reliability Engineer (need strong Python coding skills)

Artech, LLC

Remote

Contract

Do something big and innovative! Stretch your creative muscles and work on big issues. Since 1989, we have developed technology environments, applications, and tools by providing experienced teams to implement, enhance, and maintain our clients' essential systems and applications. Come join the Scalence team! Job Title: Site Reliability Engineer Location: 100% REMOTE (PST work hours) Duration: 6-12+ months Pay Rate: $60 - $65/hr. Job Description You will play a pivotal role in ensuring the