Lead Site Reliability Engineer Jobs in San Jose, CA

1 - 20 of 68 Jobs

Technical Lead, Site Reliability Engineer, Fleetnet

Tesla Motors

Remote or Palo Alto, California, USA

Full-time

We are a small team of experts focused on creating the next-generation server-side infrastructure for Tesla. We're the invisible link connecting every Tesla product, whether it's vehicles, robots, robotaxis, chargers or even mobile apps, to bring customers the best user experience possible. We're looking for a strong, hands-on technical leader with domain expertise in one or more of: containers, public clouds, or private clouds. Today, over 10 million Tesla users rely on our services to safely and

Site Reliability Engineer/Lead Site Reliability Engineer/Sr. Site Reliability Engineer/Senior Site Reliability Engineer/SRE Consultant/SRE Engineer/Lead SRE Engineer/Lead SRE Consultant

Orpine.com

Remote

Full-time

Job Title: Site Reliability Engineer Location: Remote Work Duration: Full Time Job Description: We are seeking a skilled Site Reliability Engineer (SRE) with strong expertise in Linux systems and Cloud platforms (Google Cloud Platform and AWS). In this role, you will ensure high availability, scalability, and performance of cloud-based infrastructure and services. Key Responsibilities: Manage and monitor Linux-based systems in cloud environments. Design and implement infrastructure solutions usi

Tech lead Site Reliability Engineer, Edge - USDS

TikTok

San Jose, California, USA

Full-time

Location : San Jose Employment Type : Regular Job Code : A104498 Responsibilities Site Reliability Engineering combines software and system engineering with system operations to build and run large-scale, massively distributed infrastructure. Our Edge SREs ensure infrastructure services are reliable, fault-tolerant, efficiently scalable and cost-effective. We dive deep into the stack, including network, hardware, OS, and applications, to quickly resol

Tech Lead, SRE - Recommendation Infrastructure

TikTok

San Jose, California, USA

Full-time

Location : San Jose Employment Type : Regular Job Code : A206446 Responsibilities Our Recommendation Infrastructure Team is responsible for building up and optimizing the architecture for our recommendation system to provide the most stable and best experience for our TikTok users. SREs in our team keep the systems up and running with the highest level of availability, and create highly automated systems and pipelines. What You'll Do Engage in and imp

Sr Implementation Lead, SRE (CoP)

Northern Trust

Remote or Chicago, Illinois, USA

Full-time

About Northern Trust: Northern Trust, a Fortune 500 company, is a globally recognized, award-winning financial institution that has been in continuous operation since 1889. Northern Trust is proud to provide innovative financial services and guidance to the world's most successful individuals, families, and institutions by remaining true to our enduring principles of service, expertise, and integrity. With more than 130 years of financial experience and over 22,000 partners, we serve the world'

Lead Site Reliability Engineer

Centene Corporation

Missouri, USA

Full-time

You could be the one who changes everything for our 28 million members by using technology to improve health outcomes around the world. As a diversified, national organization, Centene's technology professionals have access to competitive benefits including a fresh perspective on workplace flexibility. Position Purpose: We are seeking a highly skilled and experienced M365 Lead Site Reliability Engineer to join our team. The ideal candidate will be responsible for developing and creating monitor

Lead Site Reliability Engineer, Observability - Remote

Cisco Systems, Inc.

Remote

Full-time

Application window is open until further notice. The Meraki cloud supports millions of customer devices from 10 data centers around the world. Meraki's customer base has grown by a factor of 2-3 every year, serving billions of HTTP requests per day globally. Our customers depend on our products to run their critical infrastructure of network switches, security appliances, wireless APs and security cameras. As SREs at Meraki, we are responsible for building and growing the cloud that supports t

Senior Lead Site Reliability Engineer - Remote

Akamai Technologies

Cambridge, England, United Kingdom

Full-time

Would you enjoy improving stability and safety of one of the largest global networks? Would you enjoy hands-on network operations work on a global scale to improve our operational efficiency? Join our Platform Security Engineering Team. The Platform Security Engineering team is a group of engineers that support and secure Akamai's global network and Linode cloud systems. Our systems provide data security, server integrity, network access, and secure communications infrastructure. This is

SRE Engineer (L3 Support)

Stanley David and Associates

San Jose, California, USA

Full-time

Role :: SRE Engineer (L3 Support) Location :: San Jose, CA / RTP, NC Type :: Fulltime Job Description Must Have Technical/Functional Skills SRE, NetApp Storage, Linux Certified, Kubernetes Certified, DevOps, Docker, etc. Roles & Responsibilities Experienced Senior SRE working on Kubernetes, On-Premises experience. Candidate should work independently with little guidance from the leads. Experience in working with AWS. Experience in DB technologies in PostGres and MongoDB. Experience in working with th

Tech Lead Machine Learning Ops Engineer, Global SRE

TikTok

San Jose, California, USA

Full-time

Location : San Jose Employment Type : Regular Job Code : A181865 Responsibilities MLOps - Global SRE team is responsible for the stability of machine learning systems under the Global Monetization Products and Technology organization, to ensure the stable and efficient operations of machine learning models from data preparation, development, training, deployment, serving and so on. Responsibilities 1) Responsible for setting SLOs of online machine lea

Sr. DevOps/Site Reliability Engineer (SRE)

JKV International

Mountain View, California, USA

Contract

Job Title: Sr. DevOps/Site Reliability Engineer (SRE) Location: Mountain View, CA (Onsite) Position Type: Fulltime | Independent | H1B Transfer Interview Process: Final In-Person (F2F) Interview Required About the Role: We are looking for a passionate and experienced Sr. DevOps/Site Reliability Engineer (SRE) to join our dynamic Platform Engineering team. You will work on cutting-edge cloud platforms like Azure, AWS, or Google Cloud Platform, leveraging state-of-the-art CI/CD tools to support modern

Sr. Site Reliability Engineer - W2 only

Nasscomm, Inc.

Remote

Contract

Site Reliability Engineer - At least 6+ years of experience defining and implementing Monitoring solutions - alerts, Telemetry, and instrumentation for on-premises and cloud platforms for large enterprises - Site Reliability Engineer will be playing a key role in building Observability and Resilience capabilities on cloud platform (Azure). Responsibilities of the SRE will be: - Build and configure alerts, tracing, telemetry, and instrumentation required for Infrastructure Monitoring and Applicati

Sr. Site Reliability Engineer - U.S. Citizen - This role sits within Optum Serves Technology Product organization

Widescope Consulting and Contracting Services

Remote

Full-time

Job Title: Sr. Site Reliability Engineer Location: Headquarters / Telecommute Classification (HR only): Exempt Non-Exempt Reports To (Title): COO Widescope Consulting and Contracting JOB SUMMARY The statements below are not intended to be all-inclusive of the duties and responsibilities of the position. Based on leadership decisions and business needs, all other duties as assigned will be expected for each position. Grafana Widescope Consulting and Contracting is proud to serve our nation's mi

Site Reliability Engineer

iPeople Infosystems LLC

San Jose, California, USA

Full-time

Job Description: Experienced Senior SRE working on Kubernetes, On-Premises experience. Candidate should work independently with little guidance from the leads. Experience in working with AWS. Experience in DB technologies in PostGres and MongoDB. Experience in working with the DevOps support model. Demonstrate a basic understanding of design principles and best practices in software development. Apply problem-solving skills to troubleshoot and resolve development issues. Understand and translate busine

Senior Site Reliability Engineer, ML Platforms

NVIDIA Corporation

Remote or Santa Clara, California, USA

Full-time

Are you passionate about building and maintaining large-scale production systems that support advanced data science and machine learning applications? Do you want to join a team at the heart of NVIDIA's data-driven decision-making culture? If so, we have a great opportunity for you! NVIDIA is seeking a Senior Site Reliability Engineer (SRE) for the Data Science & ML Platform(s) team. The role involves designing, building, and maintaining services that enable real-time data analytics, streaming,

Principal Site Reliability Engineer

JPMorgan Chase & Co.

Palo Alto, California, USA

Full-time

Job Description Join a globally recognized financial organization and advance your profession to new heights by contributing to revolutionary projects. You've discovered the perfect environment to have a major impact. As a Principal Site Reliability Engineer at JPMorgan Chase within the Enterprise Technology, AI/ML & Data Platforms division, you will utilize your expertise to create innovative solutions that improve critical incident management and streamline the software development lifecycle

Principal Site Reliability Engineer (DLP)

Palo Alto Networks

Santa Clara, California, USA

Full-time

Company Description Our Mission At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we're looking for innovators who are as committed to shaping the future of cybersecurity as we are. Who We Are We take our mission of

Senior Site Reliability Engineer, Global E-Commerce

TikTok

San Jose, California, USA

Full-time

Location : San Jose Employment Type : Regular Job Code : Y7166 Responsibilities Global e-commerce is a content e-commerce business with international short video product as the carrier. It is committed to becoming the first choice for users to discover and purchase good products with affordable prices. Global e-commerce business team hopes to provide users with more tailored and efficient consumption experience, enabling merchants to receive reliable

Sr. Site Reliability Engineer

Adobe Systems

San Jose, California, USA

Full-time

Our Company Changing the world through digital experiences is what Adobe's all about. We give everyone, from emerging artists to global brands, everything they need to design and deliver exceptional digital experiences! We're passionate about empowering people to create beautiful and powerful images, videos, and apps, and transform how companies interact with customers across every screen. We're on a mission to hire the very best and are committed to creating exceptional employee experiences wher

Principal Site Reliability Engineer (Prisma Access)

Palo Alto Networks

Santa Clara, California, USA

Full-time

Company Description Our Mission At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we're looking for innovators who are as committed to shaping the future of cybersecurity as we are. Who We Are We take our mission of