Site Reliability Engineer Jobs in San Jose, CA

1 - 20 of 250 Jobs

Site Reliability Engineer

Peritus Inc.

San Jose, California, USA

Full-time

- Experience with Site Reliability Engineering (SRE) concepts and practices
- Strong understanding of monitoring/observability tools (e.g., Grafana)
- Debugging experience across Java and React-based applications
- Strong troubleshooting and incident management skills
- Kubernetes not required

SRE Engineer (L3 Support)

Stanley David and Associates

San Jose, California, USA

Full-time

Role: SRE Engineer (L3 Support). Location: San Jose, CA / RTP, NC. Type: Full-time. Job Description. Must-Have Technical/Functional Skills: SRE, NetApp Storage, Linux Certified, Kubernetes Certified, DevOps, Docker, etc. Roles & Responsibilities: Experienced Senior SRE working on Kubernetes, with on-premises experience. Candidate should work independently with little guidance from the leads. Experience in working with AWS. Experience in DB technologies in Postgres and MongoDB. Experience in working with th

Site Reliability Engineer

TekVivid

Cupertino, California, USA

Third Party, Contract

Key Qualifications: 4+ years of running services in a large-scale *nix environment. Understanding of SRE principles and goals, along with good on-call experience. Experience with and understanding of scaling, capacity planning, and disaster recovery. Fast learner with excellent analytical problem-solving and communication skills. The ability to design, author, and release code in any language (Python, Java would be a plus). Deep understanding and experience in administration & usage of Apache Druid at scale. Deep

Site Reliability Engineer Druid

Veear

Cupertino, California, USA

Contract

Mid-level infrastructure experience with Druid on AWS. Not looking for Linux admin experience; instead, more Kubernetes and EKS experience. Proficient with Python. On call once every 6 weeks.

Sr. DevOps/Site Reliability Engineer (SRE)

JKV International

Mountain View, California, USA

Contract

Job Title: Sr. DevOps/Site Reliability Engineer (SRE). Location: Mountain View, CA (Onsite). Position Type: Full-time | Independent | H1B Transfer. Interview Process: Final In-Person (F2F) Interview Required. About the Role: We are looking for a passionate and experienced Sr. DevOps/Site Reliability Engineer (SRE) to join our dynamic Platform Engineering team. You will work on cutting-edge cloud platforms like Azure, AWS, or Google Cloud Platform, leveraging state-of-the-art CI/CD tools to support modern

Site Reliability Engineer

iPeople Infosystems LLC

Remote

Contract

Role: Site Reliability Engineer. Location: 100% Remote. Type: Contract Position. Job description: Production support expertise with SRE observability experience: proactive issue identification using observability tools. Skills in using different monitoring & observability tools to track system performance. Production support activities including proactive identification of issues leveraging observability tools, correlating inputs from various dashboards & tools to drive resolution. Experience in swiftl

Site Reliability Engineer

Fortinet

Sunnyvale, California, USA

Full-time

Job Description At Fortinet, we strive to provide a supportive, collaborative environment where people are empowered to do the best work of their careers. Our team members enjoy solving complex problems, and obsess over getting the details right. We love what we do and are proud of our work to secure clouds and container environments for thousands of b2b customers worldwide. Our team is growing, and we are looking for engineers with passion for automation. You will help support the Lacework p

Software Engineer Graduate (Global SRE) - 2025 Start (BS/MS)

TikTok

San Jose, California, USA

Full-time

Location: San Jose. Employment Type: Regular. Job Code: A233972. Responsibilities: The monetization technology team works on building and running large-scale, globally distributed, fault-tolerant ads systems. SREs keep the systems up and running with the highest level of availability, ensuring our users have the best experience possible. We are looking for talented individuals to join our team in 2025. As a graduate, you will get unparalleled opportuni

Lead Observability Engineer Sumo Logic & SRE Location :Remote

NeoTech Solutions

US

Third Party, Contract

Role: Lead Observability Engineer, Sumo Logic & SRE. Location: Remote. Hire type: Contract. JD: Experience: 10+ years (with 3+ years in Sumo Logic & cloud-native observability). Job Summary: We are seeking a highly skilled Lead Observability Engineer to lead a critical implementation of Sumo Logic for a client migrating from Dynatrace. This role requires deep expertise in Sumo Logic, Site Reliability Engineering (SRE) practices, and Kubernetes (EKS) observability. The ideal candidate will de

Site Reliability Engineer

Madison-Davis, LLC

Remote

Contract

Role: Drive the technical implementation of monitoring and alerting strategies across enterprise-scale applications and infrastructure. Collaborate directly with development teams to ensure each new initiative includes the correct telemetry, log tagging, and alert payloads. Act as a liaison to Level 2 and Level 3 support teams to maintain and enhance monitoring dashboards used by the enterprise command center (EMC). Standardize alert formats to ensure consistency with SRE policies and support downs

Site Reliability Engineer

Zachary Piper Solutions, LLC

Remote

Full-time

Piper Companies is seeking a Remote Site Reliability Engineer to join a leading cybersecurity and cloud consulting firm. The Site Reliability Engineer will play a key role in building and maintaining secure, scalable infrastructure while supporting automation, compliance, and operational excellence across client environments. Responsibilities of the Site Reliability Engineer include: Develop and deploy automation scripts, tooling, and infrastructure to meet client needs. Manage patching processes

Site Reliability Engineer, Compute - USDS

TikTok

San Jose, California, USA

Full-time

Location: San Jose. Employment Type: Regular. Job Code: A94176. Responsibilities: Site Reliability Engineering (SRE) at TikTok combines software and systems engineering to build and run large-scale, massively distributed, and fault-tolerant systems. In our team, you'll have the opportunity to manage the complex challenges of scale, while using expertise in coding, algorithms, complexity analysis, and large-scale system design. We embrace a culture of di

Principal Site Reliability Engineer

AJ Consulting Group, LLC

San Jose, California, USA

Contract

Title: Principal Site Reliability Engineer - Cloud Security & Trust (SPIFFE/SPIRE). Location: San Jose, CA (Onsite). Duration: 6+ Months. Rate: $60/hr on W2. VISA: U.S. Citizens, s due to legal or government contract requirements. Tax Term: W2. JD: Architect a cloud compute platform based on SPIFFE. Perform SRE roles including deployment, capacity management, observability, and performance tuning. Collaborate with our Security Architecture team to define attestation for a variety of workloads spanni

Senior Site Reliability Engineer, ML Platforms

NVIDIA Corporation

Remote or Santa Clara, California, USA

Full-time

Are you passionate about building and maintaining large-scale production systems that support advanced data science and machine learning applications? Do you want to join a team at the heart of NVIDIA's data-driven decision-making culture? If so, we have a great opportunity for you! NVIDIA is seeking a Senior Site Reliability Engineer (SRE) for the Data Science & ML Platform(s) team. The role involves designing, building, and maintaining services that enable real-time data analytics, streaming,

Principal Site Reliability Engineer (Prisma Access)

PaloAlto Networks

Santa Clara, California, USA

Full-time

Company Description Our Mission At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we're looking for innovators who are as committed to shaping the future of cybersecurity as we are. Who We Are We take our mission of

ClickHouse SRE, Data Platform -USDS

TikTok

San Jose, California, USA

Full-time

Location: San Jose. Employment Type: Regular. Job Code: A56552. Responsibilities: Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed services and infrastructures. As a site reliability engineer in the data platform area, you will have the opportunity to manage the services and infrastructures in one of the largest data platforms in the world that directly supports the TikTok a

Site Reliability Engineer, CapCut - TikTok USDS

TikTok

San Jose, California, USA

Full-time

Location: San Jose. Employment Type: Regular. Job Code: A248541A. Responsibilities: About The Team: CapCut is an all-in-one video editing app that empowers creators to express themselves and transform videos into creative masterpieces. In addition to its basic features, such as video editing, text, stickers, filters, colors and music, CapCut offers free advanced features, including keyframe animation, smooth slow-motion effects, chroma key, Picture-in-

Principal Site Reliability Engineer Cloud Identity & Trust (SPIFFE/SPIRE)

Pinnacle Software Solutions

San Jose, California, USA

Contract

Job Title: Cloud Compute Platform Architect. Experience Level: Mid-Senior. San Jose, CA. Job Summary: We are seeking an experienced Cloud Compute Platform Architect to design and manage a cutting-edge cloud compute platform based on SPIFFE. The ideal candidate will perform Site Reliability Engineering (SRE) responsibilities, including deployment, capacity management, observability, and performance tuning. You will collaborate closely with our Security Architecture team to define attestation methods fo

Devops engineer 6 months Contract - Mountain View, CA 94043

Mindsource Inc

Mountain View, California, USA

Contract

Title: DevOps Engineer. Location: Mountain View, CA 94043. Duration: 6-Month Contract. Summary: Seeking a seasoned DevOps Engineer to manage and migrate Kubernetes clusters (Rancher to AWS EKS) for a high-availability eCommerce platform. Role involves CI/CD optimization, IaC, observability, and security compliance (e.g., PCI-DSS). Key Responsibilities: Design/manage Rancher-based Kubernetes clusters; lead migration to AWS EKS; implement/maintain CI/CD pipelines and IaC (Terraform, Helm); monitor perfo

SRE / Python Developer

Artech, LLC

Remote

Contract

Summary: Our organization builds and provides systems and infrastructure that fuel our core services. We are the foundation on which our software developers build the products that our customers love. We are looking for passionate and dedicated Site Reliability Engineers to continue our focus on providing our customers the highest quality Services experience. Our services have to scale globally, stay highly available, and "just work." If you love designing, engineering and running systems and inf