Site Reliability Engineer Jobs in San Jose, CA

1 - 20 of 260 Jobs

SRE Engineer (L3 Support)

Stanley David and Associates

San Jose, California, USA

Full-time

Role: SRE Engineer (L3 Support). Location: San Jose, CA / RTP, NC. Type: Full-time. Job Description: Must-Have Technical/Functional Skills: SRE, NetApp Storage, Linux Certified, Kubernetes Certified, DevOps, Docker, etc. Roles & Responsibilities: Experienced Senior SRE working on Kubernetes, on-premises experience. Candidate should work independently with little guidance from the leads. Experience in working with AWS. Experience in DB technologies in Postgres and MongoDB. Experience in working with th

Site Reliability Engineer

Kforce Technology Staffing

Remote or Redwood City, California, USA

Contract

RESPONSIBILITIES: Kforce has a client that is seeking a Reliability Engineer in Redwood City, CA. The client is looking for a consultant who can help our client with the following deliverables: * High-Level Deliverables for Contract Duration with Estimated Sizing * (Small) Remediate all known policy violations within existing Docker build images * Build image: 2 high, 4 medium, 32 low severity issues related to dependencies * Requires knowledge of Docker, tangential comfort with Rust software

Site Reliability Engineer-Fully Remote

Elite Technical

Remote

Contract

Our direct client, a major defense contracting company, is seeking a Site Reliability Engineer for a contract-to-hire engagement. Must be able to obtain TS/SCI (active TS is preferred). Requires a Bachelor's degree in a STEM field and 5+ years of job-related experience, or a Master's degree plus 3 years of job-related experience. Experience monitoring large-scale systems and using automation to triage emerging issues. Experience with Prometheus (preferred) and/or Grafana and Splunk. Experience au

Principal Technical Architect, Site Reliability Engineering

Randstad Digital

Remote or Chicago, Illinois, USA

Full-time

job summary: You will be responsible for building a purposeful, proactive, and sustainable approach to reliability on a foundation of SRE principles. You will partner with multiple support teams, architects, developers, and other stakeholders to develop common tools and guidance and drive adoption of key reliability engineering practices in support of large-scale and mission-critical services. Through your deep SRE knowledge and history of implementation, you will have open, candid conversatio

Site Reliability Engineer (SRE) - Clearable/ Cleared Citizens Only+ STEM - Fully Remote - Contract to Hire Position

Elite Technical

Remote

Contract

Site Reliability Engineer (SRE) - Remote. Our client, a leading federal defense contractor, is seeking a Site Reliability Engineer (SRE) responsible for maintaining survivability and reliability of mission-critical resources. The SRE will monitor high-priority systems and automate recovery mechanisms to ensure they remain operational for the warfighter. Responsibilities: - Ensuring Uptime of Critical Systems (Incident Response / Triage) - Monitor and Troubleshoot Enterprise Services (Prometheus

Site Reliability Engineer - W2 Role

Boston Associate Software Systems

Remote

Contract

10+ years' experience in information technology and/or professional services, with emphasis on subject matter expertise. At least 4 years of experience in a Site Reliability Engineering or equivalent role. Strong track record of delivering projects of demonstrable complexity and scale. Experience with data visualization and monitoring tools such as Splunk, Grafana, Dynatrace, Datadog, New Relic, Oracle Enterprise Manager, etc. Experience with telemetry frameworks and tools: (including but not limit

Azure SRE

VDart, Inc.

US

Contract, Third Party

Job Title: Azure SRE Location: REMOTE Duration: Contract Term: 12+ months Job Description: Experience Desired: 12+ Years. Experience with securing/building Azure cloud environments. Proficient in at least one scripting language (Python, Node.js, etc.). Security administration in Azure. Developing & securing serverless applications. Infrastructure as code tools (Terraform, CloudFormation, etc.). Command line experience (Bash, PowerShell, AWS-CLI). The System Reliability Engineer (SRE) guides and

IAM Engineer

World Wide Technology

Menlo Park, California, USA

Contract

Title: SRE/IAM Engineer Location: Hybrid - Alpharetta, NYC, or Menlo CA (3 days onsite) Duration/Type of Job: 12+ months Job Summary: - Will accept candidates local to Alpharetta, NYC, or Menlo CA (if candidates are local to Alpharetta, their 2nd round interview will be onsite) - There is an on-call rotation aspect to this role. IAM skillset is most important, then basic SRE knowledge after that. Must have basic command line programming and scripting experience. Experience with Ping suite of products

SITE RELIABILITY ENGINEER

Mindbank Consulting Group

Remote

Full-time

JOB DESCRIPTION: As a Site Reliability Engineer (SRE) on our team, you will play a critical part in ensuring the reliability, scalability, and performance of our systems and services. You will work closely with cross-functional teams, including engineering, cloud infrastructure, and security, to deliver resilient, observable, and compliant solutions in AWS GovCloud and commercial cloud environments. This role requires applying consultative and technical expertise to support cloud initiatives with the

SRE Architect

Alpha Silicon

US

Full-time

Roles & Responsibilities: 18+ years of Development and Operations experience in building and running applications in production that has uptime over 99%. Related experience and/or training; or equivalent combination of education and experience 8+ years of experience as a SRE Architect in running large Reliability & Observability Programs for large, complex infrastructure deployments / distributed systems for major Banking customers. Has a keen eye for industry trends, tries out newer tools/infra

Site Reliability Engineer (Amdocs)

Highbrow

Remote

Full-time

Key Responsibilities Design, build, and maintain scalable, reliable, and secure infrastructure across production and staging environments. Automate operational tasks and processes using code (Python, Go, Bash, etc.). Drive infrastructure as code (IaC) practices using tools like Terraform, Ansible, or similar. Monitor, troubleshoot, and improve system availability, latency, and performance. Collaborate closely with development, QA, and product teams to design scalable system architecture. Conduct

Site Reliability Engineer

AE Business Solutions

Remote

Full-time

AEBS is seeking a Site Reliability Engineer to take on a fully-remote, contract position! The Site Reliability Engineer must be able to work central time zone hours throughout this engagement. *No C2C inquiries, please Ideal Skills/Background: 3+ years of experience implementing Infrastructure-as-Code (IAC) with Terraform 3+ years of Python development experience 3+ years of experience in designing/developing/operating applications in the Azure cloud 2+ years of experience building and running

Site Reliability Engineer only w2

Symphony Corporation

Remote

Contract

Site Reliability Engineer 6 Months Remote only W-2 The client is looking for a site reliability engineer.

Sr. DevOps/Site Reliability Engineer (SRE)

JKV International

Mountain View, California, USA

Contract

Job Title: Sr. DevOps/Site Reliability Engineer (SRE) Location: Mountain View, CA (Onsite) Position Type: Fulltime | Independent | H1B Transfer Interview Process: Final In-Person (F2F) Interview Required About the Role: We are looking for a passionate and experienced Sr. DevOps/Site Reliability Engineer (SRE) to join our dynamic Platform Engineering team. You will work on cutting-edge cloud platforms like Azure, AWS, or Google Cloud Platform, leveraging state-of-the-art CI/CD tools to support modern

Sr. Site Reliability Engineer - W2 only

Nasscomm, Inc.

Remote

Contract

Site Reliability Engineer - 6+ years of experience defining and implementing monitoring solutions (alerts, telemetry, and instrumentation) for on-premises and cloud platforms for large enterprises - Site Reliability Engineer will play a key role in building Observability and Resilience capabilities on cloud platform (Azure). Responsibilities of the SRE will be: - Build and configure alerts, tracing, telemetry, and instrumentation required for Infrastructure Monitoring and Applicati

Site Reliability Engineer

Madison-Davis, LLC

Remote

Contract

Role: Drive the technical implementation of monitoring and alerting strategies across enterprise-scale applications and infrastructure. Collaborate directly with development teams to ensure each new initiative includes the correct telemetry, log tagging, and alert payloads. Act as a liaison to Level 2 and Level 3 support teams to maintain and enhance monitoring dashboards used by the enterprise command center (EMC). Standardize alert formats to ensure consistency with SRE policies and support downs

Site Reliability Engineer

PsiQuantum

Palo Alto, California, USA

Full-time

Quantum computing holds the promise of humanity's mastery over the natural world, but only if we can build a real quantum computer. PsiQuantum is on a mission to build the first real, useful quantum computers, capable of delivering the world-changing applications that the technology has long promised. We know that means we will need to build a system with roughly 1 million qubits that supports fault tolerant error correction within a scalable architecture, and a data center footprint. By harnes

Software Engineer Graduate (Global SRE) - 2025 Start (BS/MS)

TikTok

San Jose, California, USA

Full-time

Location: San Jose Employment Type: Regular Job Code: A233972 Responsibilities: The monetization technology team works on building and running large-scale, globally distributed, fault-tolerant ads systems. SREs keep the systems up and running with the highest level of availability, ensuring our users have the best experience possible. We are looking for talented individuals to join our team in 2025. As a graduate, you will get unparalleled opportuni

Senior DevOps and SRE Engineer

NVIDIA Corporation

Santa Clara, California, USA

Full-time

NVIDIA is seeking a passionate, motivated and technical Kubernetes Architect/Engineer to join its multifaceted and fast-paced Infrastructure, Planning and Processes organization where you will be working as a Principal DevOps & SRE Engineer to support the design and implementation of Kubernetes solutions for the company's Cloud Platform. The position will be part of a fast-paced crew that develops and maintains sophisticated build & test environments for a multitude of hardware platforms both N

Senior Staff Site Reliability Engineer (Cortex Observability)

Palo Alto Networks

Santa Clara, California, USA

Full-time

Company Description Our Mission At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we're looking for innovators who are as committed to shaping the future of cybersecurity as we are. Who We Are We take our mission of