Site Reliability Engineer Jobs

1 - 20 of 996 Jobs

Site Reliability Engineer

Cynet Systems

Phoenix, Arizona, USA

Contract

Job Description: We are seeking a highly skilled Senior Java Developer with over 5 years of experience to join our team. The ideal candidate will have a strong background in Java/J2EE, REST API, Redis, and microservice architecture, along with expertise in Spring frameworks and Spring Boot. Experience with automation, scripting, and monitoring tools is also essential for this role. Requirement/Must Have: 5+ years of experience in Java/J2EE development. Strong proficiency in REST API developme

Site Reliability Engineer

Avance Consulting

Bellevue, Washington, USA

Full-time

Job Description Required Qualification: At least 4 years of Information Technology experience. SRE Mindset in Production support: Proactive issue identification using observability tools. Skilled in using different monitoring & observability tools to track system performance Incident commander: Ability to diagnose complex issues and actively drive incident calls working with technical, product SMEs, and Tier 2 SREs. Experience in Splunk (including Splunk APM and Splunk O11y), AppDynamics, E

Site Reliability Engineer

Purple Hires

Chicago, Illinois, USA

Contract

Position: Site Reliability Engineer Location: Plano, TX / Chandler, AZ / Chicago, IL - Hybrid (3 days onsite) Duration: Long Term Job Summary: Primary skills: OpenShift, Rancher Kubernetes (RKE), Python and Shell Scripting, Linux, and Azure Cloud. Responsibilities: Responsible for reliability and support of Container Platform on-prem and external clouds (Azure/AWS/Google). Monitor and troubleshoot Container platform (OpenShift), Rancher (RKE) and Azure (AKS) environment performance issues, connectivit

Site Reliability Engineer

Purple Hires

Plano, Texas, USA

Contract

Position: Site Reliability Engineer Location: Plano, TX / Chandler, AZ / Chicago, IL - Hybrid (3 days onsite) Duration: Long Term Job Summary: Primary skills: OpenShift, Rancher Kubernetes (RKE), Python and Shell Scripting, Linux, and Azure Cloud. Responsibilities: Responsible for reliability and support of Container Platform on-prem and external clouds (Azure/AWS/Google). Monitor and troubleshoot Container platform (OpenShift), Rancher (RKE) and Azure (AKS) environment performance issues, connectivit

Site Reliability Engineer

Tandym Tech

New York, New York, USA

Contract

A recognized media services organization in California is currently seeking a new Site Reliability Engineer (SRE) to maintain and improve an existing system focused on linear channel delivery whilst working on a new system to modernize this process. Responsibilities: Supporting the Streaming Engineers and providing hands-on Site Reliability Engineering support to manage production observability, incident response and assisting teams with DevOps processes Managing infrastructure as code and leve

Site Reliability Engineer

Anduril Industries

Seattle, Washington, USA

Full-time

Anduril Industries is a defense technology company with a mission to transform U.S. and allied military capabilities with advanced technology. By bringing the expertise, technology, and business model of the 21st century's most innovative companies to the defense industry, Anduril is changing how military systems are designed, built and sold. Anduril's family of systems is powered by Lattice OS, an AI-powered operating system that turns thousands of data streams into a realtime, 3D command and c

Site Reliability Engineer

iTvorks Inc

Reston, Virginia, USA

Contract

Job Title: Site Reliability Engineer Location: Reston, VA Duration: 24 Months Overall years of experience: 8+ years of related experience in their specific area with experience leading teams on projects with similar scope and complexity. Certifications: AWS Solutions Architect, Agile Certified Practitioner (ACP), or relevant cloud certifications. Job Description: We are seeking a highly skilled and experienced Site Reliability Engineer (SRE) to join our team. The ideal candidate will have a stro

Site Reliability Engineer

Anduril Industries

Costa Mesa, California, USA

Full-time

Anduril Industries is a defense technology company with a mission to transform U.S. and allied military capabilities with advanced technology. By bringing the expertise, technology, and business model of the 21st century's most innovative companies to the defense industry, Anduril is changing how military systems are designed, built and sold. Anduril's family of systems is powered by Lattice OS, an AI-powered operating system that turns thousands of data streams into a realtime, 3D command and c

SRE

Galent

Omaha, Nebraska, USA

Full-time

SRE Location : Omaha, Nebraska (High Priority) Local preferred or Near by States . Job Summary Seasoned Site Reliability Engineer (SRE) with 5+ years of experience in supporting complex, large-scale distributed systems. Highly skilled in managing production failures, conducting root cause analysis, and driving effective remediation. Strong communicator with expertise in ing, monitoring, and release management, complemented by automation proficiency and a keen ability to learn quickly. This role

Site Reliability Engineer

Peritus Inc.

San Jose, California, USA

Full-time

-Experience with Site Reliability Engineering (SRE) concepts and practices -Strong understanding of monitoring/observability tools (e.g., Grafana) -Debugging experience across Java and React-based applications -Strong troubleshooting and incident management skills -Kubernetes not required

Site Reliability Engineer

Compsciprep LLC

Owings Mills, Maryland, USA

Contract

Site Reliability Engineer. Location: Owings Mills, MD (must be onsite 2 days). Duration: 6 months (with possible extension). Site Reliability Engineer: 5+ years of Site Reliability Engineering experience; 3+ years of Amazon Web Services (AWS) platform experience; strong experience with monitoring and alerting tools such as Prometheus, Grafana, New Relic

SRE

Fynbosys Inc

Texas City, Texas, USA

Full-time

For SRE they need basic system monitoring, Ansible Scripting, Azure, Cloud operating Network, Python, basic understanding of the cloud. Job Description: We are seeking a dedicated Site Reliability Engineer II to join our team. In this role, you will be responsible for ensuring the reliability, availability, and performance of our systems and services. You will work closely with cross-functional teams to implement best practices and drive improvements in our infrastructure. Responsibilities- Moni

Data Center Site Reliability Engineer (SRE)

ARROWCORE GROUP

Memphis, Tennessee, USA

Full-time

Title: Data Center Site Reliability Engineer (SRE) Location: Memphis, TN (Onsite) Duration: FTE About the Role We are seeking a Data Center Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of large-scale data center infrastructure supporting advanced AI workloads. In this role, you will collaborate with cross-functional teams to automate operations, enhance observability, and maintain high availability for distributed systems. This is a hands-on technical p

SRE Engineer (L3 Support)

Stanley David and Associates

San Jose, California, USA

Full-time

Role: SRE Engineer (L3 Support) Location: San Jose, CA / RTP, NC Type: Full-time Job Description Must Have Technical/Functional Skills: SRE, NetApp Storage, Linux Certified, Kubernetes Certified, DevOps, Docker, etc. Roles & Responsibilities: Experienced Senior SRE working on Kubernetes, on-premises experience. Candidate should work independently with little guidance from the leads. Experience in working with AWS. Experience in DB technologies in Postgres and MongoDB. Experience in working with th

Site Reliability Engineer

ITECS

Frisco, Texas, USA

Third Party

Position: Site Reliability Engineer Number of positions: 7 Location: FRISCO, TX (Onsite only) Any Visa will work. Required skills Monitors the T-Cloud Platform using the T-Cloud Observability tooling, based on ServiceNow. Resolves incidents across the SDN (Cisco ACI), the CaaS layer (Red Hat OpenShift) and the Telco applications (first application is Mavenir IMS). Monitors the automation and resolves any issues with these automations (if possible). Oversees incident resolution process Escalat

Site Reliability Engineer

MDMS Recruiting

Florham Park, New Jersey, USA

Contract

W2 ONLY | No C2C / No Corp to Corp HYBRID IN FLORHAM PARK, NJ (MUST BE ONSITE 3 DAYS A WEEK) Our client is seeking a Site Reliability Engineer (SRE) to work with teams to understand the standards of product development and recommend changes to increase stability of the products and applications. The SRE will build software to improve DevOps, IT Ops, and support processes which support code models - Infrastructure as code and Platform as a service. Experience and skills needed: Experience working

SRE Engineer

Empower Professionals

Fort Mill, South Carolina, USA

Contract, Third Party

Role: SRE Locations: Fort Mill, SC (Fully Onsite) Duration: 12+ Months Contract Responsibilities: Design and architect systems that are highly available, scalable, and reliable through collaboration with cross-functional teams. Lead incident response efforts during system outages or performance degradations, coordinating with various teams to quickly diagnose issues and implement solutions. Develop and refine incident management processes. Provide mentorship and guidance to help develop technical skil

Site Reliability Engineer

Epis Data Inc

Florham Park, New Jersey, USA

Full-time

Role: Site Reliability Engineer Location: Florham Park, NJ - Hybrid 3 days onsite (Onsite Day 1) Experience: 7+ years Rate: $60/hr. on C2C Client: ADP **Due to client requirements, we need or candidates.** The interview will be virtual on July 11 AM or 11:30 AM EST Experience/Skills: Strong Windows and OpenStack experience Ability to analyze and resolve problems in systems, networks, software, and APIs; understanding where all sources of information can come from. Strong experience with Splunk and Dy

SRE

Ajace Inc

Reston, Virginia, USA

Full-time

Key Responsibilities: 1. Cloud Infrastructure & Automation: Design, implement, and manage cloud-based infrastructure using platforms like AWS, Azure, or Google Cloud Platform. Utilize Infrastructure-as-Code (IaC) tools such as Terraform, CloudFormation, and Ansible to automate deployments and configurations. Create robust automation targeted at anomaly detection, toil reduction, recovery processes, and self-healing mechanisms, and optimize cloud costs. 2. DevSecOps & CI/CD: Deep understanding of

Site Reliability Engineer

Pyramid Consulting, Inc.

Westlake, Texas, USA

Contract

Immediate need for a talented Site Reliability Engineer. This is a 15+ month contract opportunity with long-term potential and is located in Westlake, TX (Hybrid). Please review the job description below and contact me ASAP if you are interested. Job ID: 25-75208 Pay Range: $70 - $73/hour. Employee benefits include, but are not limited to, health insurance (medical, dental, vision), 401(k) plan, and paid sick leave (depending on work location). Key Requirements and Technology Experience: Key Sk