Site Reliability Engineer Jobs in San Jose, CA

1 - 20 of 258 Jobs

Site Reliability Engineer

PsiQuantum

Palo Alto, California, USA

Full-time

Quantum computing holds the promise of humanity's mastery over the natural world, but only if we can build a real quantum computer. PsiQuantum is on a mission to build the first real, useful quantum computers, capable of delivering the world-changing applications that the technology has long promised. We know that means we will need to build a system with roughly 1 million qubits that supports fault tolerant error correction within a scalable architecture, and a data center footprint. By harnes

Site Reliability Engineer

Kforce Technology Staffing

Remote or Redwood City, California, USA

Contract

RESPONSIBILITIES: Kforce has a client that is seeking a Reliability Engineer in Redwood City, CA. The client is looking for a consultant who can help our client with the following deliverables: * High-Level Deliverables for Contract Duration with Estimated Sizing * (Small) Remediate all known policy violations within existing Docker build images * Build image: 2 high, 4 medium, 32 low severity issues related to dependencies * Requires knowledge of Docker, tangential comfort with Rust software

Site Reliability Engineer

Hexaware Technologies, Inc

Remote

Full-time

Position: Site Reliability Engineer (100% remote). Hiring: Contract / Full-time. Overview: SRE with deep expertise in Azure cloud-native observability implementation, Datadog, and production incident management, plus strong Azure cloud-native tech stack performance-tuning skills to optimize system reliability, scalability, and efficiency. Required Skills and Qualifications: Strong experience with Azure cloud-native observability tools (e.g., Azure Monitor, Log Analytics, Application Insights

Sr. DevOps/Site Reliability Engineer (SRE)

JKV International

Mountain View, California, USA

Contract

Job Title: Sr. DevOps/Site Reliability Engineer (SRE) Location: Mountain View, CA (Onsite) Position Type: Fulltime | Independent | H1B Transfer Interview Process: Final In-Person (F2F) Interview Required About the Role: We are looking for a passionate and experienced Sr. DevOps/Site Reliability Engineer (SRE) to join our dynamic Platform Engineering team. You will work on cutting-edge cloud platforms like Azure, AWS, or Google Cloud Platform, leveraging state-of-the-art CI/CD tools to support modern

Software Engineer Graduate (Global SRE) - 2025 Start (BS/MS)

TikTok

San Jose, California, USA

Full-time

Location: San Jose Employment Type: Regular Job Code: A233972 Responsibilities: The monetization technology team works on building and running large-scale, globally distributed, fault-tolerant ads systems. SREs keep the systems up and running with the highest level of availability, ensuring our users have the best experience possible. We are looking for talented individuals to join our team in 2025. As a graduate, you will get unparalleled opportuni

Site Reliability Engineer only w2

Symphony Corporation

Remote

Contract

Site Reliability Engineer, 6 months, remote, W-2 only. The client is looking for a site reliability engineer.

Site Reliability Engineer - W2 Role

Boston Associate Software Systems

Remote

Contract

10+ years' experience in information technology and/or professional services, with emphasis on subject matter expertise. At least 4 years of experience in a Site Reliability Engineering or equivalent role. Strong track record of delivering projects of demonstrable complexity and scale. Experience with data visualization and monitoring tools such as Splunk, Grafana, Dynatrace, Datadog, New Relic, Oracle Enterprise Manager, etc. Experience with telemetry frameworks and tools: (including but not limit

SRE Architect

Alpha Silicon

US

Full-time

Roles & Responsibilities: 18+ years of Development and Operations experience in building and running applications in production that have uptime over 99%. Related experience and/or training, or an equivalent combination of education and experience. 8+ years of experience as an SRE Architect running large Reliability & Observability Programs for large, complex infrastructure deployments / distributed systems for major Banking customers. Has a keen eye for industry trends, tries out newer tools/infra

SITE RELIABILITY ENGINEER

Mindbank Consulting Group

Remote

Full-time

JOB DESCRIPTION: As a Site Reliability Engineer (SRE) on our team, you will play a critical part in ensuring the reliability, scalability, and performance of our systems and services. You will work closely with cross-functional teams, including engineering, cloud infrastructure, and security, to deliver resilient, observable, and compliant solutions in AWS GovCloud and commercial cloud environments. This role requires applying consultative and technical expertise to support cloud initiatives with the

Principal Technical Architect, Site Reliability Engineering

Randstad Digital

Remote or Chicago, Illinois, USA

Full-time

Job summary: You will be responsible for building a purposeful, proactive, and sustainable approach to reliability on a foundation of SRE principles. You will partner with multiple support teams, architects, developers, and other stakeholders to develop common tools and guidance and drive adoption of key reliability engineering practices in support of large-scale and mission-critical services. Through your deep SRE knowledge and history of implementation, you will have open, candid conversatio

Site Reliability Engineer (Amdocs)

Highbrow

Remote

Full-time

Key Responsibilities Design, build, and maintain scalable, reliable, and secure infrastructure across production and staging environments. Automate operational tasks and processes using code (Python, Go, Bash, etc.). Drive infrastructure as code (IaC) practices using tools like Terraform, Ansible, or similar. Monitor, troubleshoot, and improve system availability, latency, and performance. Collaborate closely with development, QA, and product teams to design scalable system architecture. Conduct

Site Reliability Engineer

AE Business Solutions

Remote

Full-time

AEBS is seeking a Site Reliability Engineer to take on a fully remote contract position. The Site Reliability Engineer must be able to work Central Time Zone hours throughout this engagement. No C2C inquiries, please. Ideal Skills/Background: 3+ years of experience implementing Infrastructure-as-Code (IaC) with Terraform; 3+ years of Python development experience; 3+ years of experience in designing/developing/operating applications in the Azure cloud; 2+ years of experience building and running

Principal Site Reliability Engineer - Enterprise AI Platform

NVIDIA Corporation

Santa Clara, California, USA

Full-time

NVIDIA has been redefining computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology-and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. As

Senior Staff Machine Learning Engineer - DevOps/Site Reliability Engineer

ServiceNow, Inc.

Santa Clara, California, USA

Full-time

Company Description It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today - ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500. Our intelligent cloud-based platform seamlessly connects people, systems, and processes to empower organizations to find smarter, faster, and better ways to work. But thi

Site Reliability Engineer-Fully Remote

Elite Technical

Remote

Contract

Our direct client, a major defense contracting company, is seeking a Site Reliability Engineer for a contract-to-hire engagement. Must be able to obtain TS/SCI (active TS is preferred). Requires a Bachelor's degree in a STEM field and 5+ years of job-related experience, or a Master's degree plus 3 years of job-related experience. Experience monitoring large-scale systems and using automation to triage emerging issues. Experience with Prometheus (preferred) and/or Grafana and Splunk. Experience au

Senior Site Reliability Engineer, AI Infrastructure

NVIDIA Corporation

Santa Clara, California, USA

Full-time

NVIDIA has continuously reinvented itself over two decades. Our invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI - the next era of computing. NVIDIA is a "learning machine" that constantly evolves by adapting to new opportunities that are hard to solve, that only we can tackle, and that matter to the world. This is our life's work, to amplify human

Senior DevOps and SRE Engineer

NVIDIA Corporation

Santa Clara, California, USA

Full-time

NVIDIA is seeking a passionate, motivated and technical Kubernetes Architect/Engineer to join its multifaceted and fast-paced Infrastructure, Planning and Processes organization where you will be working as a Principal DevOps & SRE Engineer to support the design and implementation of Kubernetes solutions for the company's Cloud Platform. The position will be part of a fast-paced crew that develops and maintains sophisticated build & test environments for a multitude of hardware platforms both N

Senior Staff Site Reliability Engineer (Cortex Observability)

Palo Alto Networks

Santa Clara, California, USA

Full-time

Company Description Our Mission At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we're looking for innovators who are as committed to shaping the future of cybersecurity as we are. Who We Are We take our mission of

Principal Site Reliability Engineer (Cortex Cloud Security Posture Management)

Palo Alto Networks

Santa Clara, California, USA

Full-time

Company Description Our Mission At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we're looking for innovators who are as committed to shaping the future of cybersecurity as we are. Who We Are We take our mission of

Senior Site Reliability Engineer (Cortex Cloud Security Posture Management)

Palo Alto Networks

Santa Clara, California, USA

Full-time

Company Description Our Mission At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we're looking for innovators who are as committed to shaping the future of cybersecurity as we are. Who We Are We take our mission of