Kubernetes Infrastructure Solution Architect Jobs in San Francisco, CA


Staff Linux DevOps Engineer / AWS / Remote

Motion Recruitment Partners, LLC

Remote or Phoenix, Arizona, USA

Full-time

A company in the medical research industry is currently looking for a Staff Linux DevOps Engineer to add to their growing team. This engineer will help with the design, implementation and maintenance of Linux infrastructure both on-prem and in AWS. They will also help implement further optimization, assist in cloud migrations and help manage compliance initiatives. Required Skills & Experience: 8+ years of experience in a Systems, DevOps or SRE focused position; 8+ years of experience workin

Senior AI Infrastructure Engineer - DGX Cloud

NVIDIA Corporation

Remote or Santa Clara, California, USA

Full-time

AI Infrastructure Engineers at NVIDIA ensure that our internal and external facing GPU cloud services run with maximum reliability and uptime as promised to users, while enabling developers to make changes to the existing system through careful preparation and planning, keeping an eye on capacity, latency and performance. We are looking for engineers with a strong background in computer science fundamentals who are interested in building tooling, reporting, automation, and AI

SRE Engineer

Synergis

Remote

Full-time

Job Title: SRE Engineer. Job Location: Remote. Type: Direct Hire. * Status Required * Must have prior experience in a SaaS-based software company and a startup or small-company environment. Synergis' client is a software organization focused on an AI-powered, unified platform for data discovery, observability, and governance. The Site Reliability Engineer will design and implement automations on their Cloud Infrastructure. SRE Engineer Background and Scope: Ensure the organization has security policies

Fullstack Software Engineer, Machine Learning Platform, AI Infrastructure

Tesla Motors

Palo Alto, California, USA

Full-time

As a Software Engineer within the Autopilot AI Infrastructure team, you will work on reinforcing, optimizing, and scaling our infrastructure components supporting AI research activities for Autopilot and Optimus. At the core of our autonomy capabilities are neural networks that the research team is designing to train on very large amounts of data, across large-scale GPU clusters and our supercomputer Dojo. Robustly training these models at scale and in the shortest amount of time is critical

Backend Software Engineer, Machine Learning Platform, AI Infrastructure

Tesla Motors

Palo Alto, California, USA

Full-time

As a Software Engineer within the Autopilot AI Infrastructure team, you will work on reinforcing, optimizing, and scaling our infrastructure components supporting AI research activities for Autopilot and Optimus. At the core of our autonomy capabilities are neural networks that the research team is designing to train on very large amounts of data, across large-scale GPU clusters and our supercomputer Dojo. Robustly training these models at scale and in the shortest amount of time is critica

Remote AWS Data Engineer - 1 year W2 contract

Irvine Technology Corporation (ITC)

Remote or New Jersey, USA

Contract

Senior Data Engineer (W2 Only, Remote, CST/EST Hours) Location: Remote (Work from Home) Schedule: CST/EST business hours Type: 1-Year W2 Contract Pay Rate: $70-$80/hr (W2 Only, No C2C) Overview: We are seeking a highly skilled Senior Data Engineer to support a long-term project focused on building a data consolidation tool that integrates diverse datasets from multiple systems into a centralized Snowflake data lake. This role emphasizes query optimization, efficient data flow, and ensuring fast,

Senior DevOps Engineer

Oracle Corporation

Redwood City, California, USA

Full-time

Job Description: Our team, Fusion Release Engineering (FRE), is a central portfolio team for Fusion Applications Development that specializes in automation frameworks and continuous integration between Fusion Applications, Middleware and RDBMS. We specialize in SaaS and Cloud automation as well as server-side abstraction frameworks. We are seeking dynamic people to join this high-profile, agile development team to own the planning, execution, and delivery of customer implementations of Oracle Clou

Lead Software Engineering- Middleware Reliability Engineering

Visa Inc.

Foster City, California, USA

Full-time

Company Description Visa is a world leader in payments and technology, with over 259 billion payments transactions flowing safely between consumers, merchants, financial institutions, and government entities in more than 200 countries and territories each year. Our mission is to connect the world through the most innovative, convenient, reliable, and secure payments network, enabling individuals, businesses, and economies to thrive while driven by a common purpose - to uplift everyone, everywhe

Sr. Network Observability Engineer SME

NetPace

Remote

Full-time

Remote Role (PST) (SME-Level) Senior Network Observability Engineer Key Responsibilities: Design and deploy scalable network observability frameworks for multi/hybrid-cloud environments (Azure, Google Cloud Platform, OCI) using Grafana, Prometheus, OpenTelemetry, and cloud-native tools. Implement custom dashboards, alerts, and log analytics for network performance metrics (latency, packet drops, BGP routing health, throughput) and security telemetry (firewall logs, flow logs, IDS/IPS). Integrate ob

Observability Software Engineer, AI Infrastructure

Tesla Motors

Fremont, California, USA

Full-time

As a member of Tesla's "Insane Visibility" team, you will design, implement & maintain end-to-end observability across our AI Infrastructure stack and develop the framework to benchmark performance & processing of pipelines. You'll be responsible for building dashboards, alerts & monitoring necessary for Autopilot & AI teams to address observability issues in our FSD, Robotaxi & Optimus applications, ensuring these programs run smoothly throughout the full infrastructure stack. Responsibilities

Head of DevOps- 100% Remote

Motion Recruitment Partners, LLC

Remote or Arlington, Virginia, USA

Full-time

Head of DevOps (Hands On) We're seeking a Head of DevOps to help shape and manage our deployment infrastructure across multiple environments, including classified, air-gapped networks. As our first dedicated DevOps hire, you'll have a unique opportunity to build and influence our infrastructure strategy from the ground up, supporting the secure and efficient deployment of LLMs and other large-scale models. The candidate must have an Active U.S. DoD Security Clearance (minimum Secret). The company

Staff Software Engineer, Automation, Vehicle Ownership Applications

Tesla Motors

Fremont, California, USA

Full-time

We are looking for an experienced software engineer with experience building frameworks to test products programmatically and building tools to monitor health. This position will support ownership-related applications, which include Supercharging, Robotaxi, Tesla roaming initiatives, upgrades, subscriptions, Tesla Electric, Tesla external APIs, AI agents and other enterprise workflows. This is a backend orchestration team focused on building distributed systems to power customer facing interfaces inc

Platform Engineer - Specialist

Ascension Health

Remote

Full-time

Details Department: Data Delivery and Governance. Schedule: Full Time, Monday - Friday, 8-5pm CT. Location: Remote. Benefits: Paid time off (PTO); Various health insurance options & wellness plans; Retirement benefits including employer match plans; Long-term & short-term disability; Employee assistance programs (EAP); Parental leave & adoption assistance; Tuition reimbursement; Ways to give back to your community. Benefit options and eligibility vary by position. Compensation varies based on factors includin

Senior Infrastructure Engineer

U-Haul International, Inc

Remote or Phoenix, Arizona, USA

Full-time

Location: 2727 N Central Ave, Phoenix, Arizona 85004 United States of America U-Haul is seeking an experienced Senior Infrastructure Engineer with a strong background in Kubernetes, Automation, Linux, Scripting, and Load Balancing. Primary Responsibilities: Take ownership of assigned projects and actively drive them to completion. Leverage new and existing technologies to help Teams achieve their automation goals. Proactively evaluate the security of new and existing solutions. Work with secur

Senior Backend Software Engineer

Oracle Corporation

Redwood City, California, USA

Full-time

Job Description Fusion Applications (FA) is Oracle's leading SaaS offering of several critical business applications like Enterprise Resource Planning (ERP), Human Capital Management (HCM), Customer Relationship Management (CRM) and many more. The enterprise grade application suite serves as one of the focal points of Oracle's business value. While Fusion Applications has been a huge success, the basic architecture is based on an on-premise, Fusion Middleware stack that has not changed since its i

AWS DevSecOps Engineer

Dynanet Corporation

Remote

Full-time

Position Details: Job Title: AWS DevSecOps Engineer Job Type: Full-time Location: Remote, DC Dynanet Corporation Overview: Dynanet started with a focus on IT infrastructure and operations, helping organizations enhance their networks and overcome the limitations of 1990s technology. From strengthening communication channels to introducing innovative ways to collaborate and share information, Dynanet played a crucial role in shaping the early stages of digital transformation. The company's effort

Staff Infrastructure Engineer (Data Acquisition & Observability)

PsiQuantum

Palo Alto, California, USA

Full-time

Quantum computing holds the promise of humanity's mastery over the natural world, but only if we can build a real quantum computer. PsiQuantum is on a mission to build the first real, useful quantum computers, capable of delivering the world-changing applications that the technology has long promised. We know that means we will need to build a system with roughly 1 million qubits that supports fault tolerant error correction within a scalable architecture, and a data center footprint. By harnes

Principal Backend Software Engineer-Pipeline Infrastructure (FULLY REMOTE Bay Area, Seattle)

Splunk Inc.

Remote or San Jose, California, USA

Full-time

Description Splunk, a Cisco company, is here to build a safer and more resilient digital world. The world's leading enterprises use our unified security and observability platform to keep their digital systems secure and reliable. While customers love our technology, it's our people that make Splunk stand out as an amazing career destination and why we've won so many awards as a best place to work. If you become a Splunker, we want your whole, authentic self, what we call your "million data poi

MLOps Engineer

QualiTest

Remote

Full-time

Qualitest is seeking a skilled MLOps Engineer. This remote position offers an exciting opportunity to work on cutting-edge machine learning operations, model deployment, and cloud infrastructure management. You will play a key role in building, deploying, and maintaining scalable ML pipelines and production environments. Key Responsibilities: Design, develop, and maintain robust ML pipelines using tools such as Airflow, MLflow, and DVC. Manage containerized applications with Kubernetes and Docker

Relativity Server Administrator

Ekcel Technologies Inc

Remote

Full-time

Role: Relativity and L3 Infrastructure Specialist Job Description: Overview: This role requires a strong combination of hands-on expertise in Relativity (Rel) Servers and a deep understanding of infrastructure components. The candidate should have a proven track record in managing and troubleshooting Relativity environments while also possessing advanced knowledge of infrastructure elements such as load balancers, IIS, firewalls, and data flow management. This is a Level 3 (L3) role, requiring