Site Reliability Engineer

Austin, TX, US • Posted 10 hours ago • Updated 10 hours ago
Full Time
On-site
Depends on Experience

Job Details

Skills

  • Continuous Improvement
  • Innovation
  • Adaptability
  • Teamwork
  • Cost-benefit Analysis
  • Data Entry
  • Forms
  • Reporting
  • Specification Gathering
  • Scheduling
  • Software Engineering
  • Collaboration
  • Capacity Management
  • Scalability
  • Effective Communication
  • Systems Engineering
  • DevOps
  • Reliability Engineering
  • Linux
  • Unix
  • Scripting
  • Python
  • Java
  • Bash
  • Amazon Web Services
  • Google Cloud
  • Google Cloud Platform
  • Cloud Computing
  • Orchestration
  • Docker
  • Kubernetes
  • Service Level
  • Budget
  • Incident Management
  • Root Cause Analysis
  • Workflow
  • PASS
  • Articulate
  • Communication
  • Analytical Skill
  • Conflict Resolution
  • Problem Solving
  • Attention To Detail
  • Multitasking
  • Management
  • Streaming
  • Grafana
  • Splunk
  • CHAOS
  • Testing
  • Documentation
  • Dashboard
  • Health Care
  • Insurance
  • Productivity
  • System Integration Testing
  • Professional Development
  • Regulatory Compliance
  • Law

Summary

Overview

JOB TITLE:

Site Reliability Engineer

CAYUSE COMPANY:

Cayuse Civil Services, LLC

LOCATION:

Hybrid in Austin, TX

SALARY:

$108,160.00-$153,920.00

EMPLOYEE TYPE:

Full-Time Salary Exempt

TRAVEL:

No

RELOCATION:

No



Employment in this role is conditional upon successful execution of the contract by the client.



The Work

The Site Reliability Engineer (SRE) is responsible for ensuring the reliability, availability, scalability, and performance of the organization's production systems. This role combines software engineering and systems engineering practices to automate and improve infrastructure operations, reduce manual work, and enable rapid response to incidents. The SRE partners with development, operations, and business teams to drive continuous improvement, implement resilient systems, and meet well-defined service level objectives (SLOs).

This position aligns with Cayuse's core values of Innovation, Excellence, Collaboration, Adaptability, and Integrity by fostering technical solutions that meet customer needs, promoting teamwork, and prioritizing quality in deliverables.

Responsibilities

  • Understand business objectives and operational challenges, and identify alternative technical solutions.
  • Perform studies and cost/benefit analyses to evaluate potential solutions.
  • Analyze user requirements, operational procedures, and workflow problems to identify opportunities for automation or improvement of computer systems.
  • Consult with personnel from different departments to understand current procedures, identify issues, and gather specific input and output requirements (e.g., data entry forms, reporting formats).
  • Write detailed descriptions of user needs, desired program functions, and the steps required to develop or modify computer programs.
  • Review computer system capabilities, technical specifications, and scheduling limitations to assess the feasibility of requested program changes.
  • Ensure the reliability, availability, performance, and scalability of production systems using software engineering practices.
  • Collaborate closely with development teams to design, build, and maintain resilient, observable, and automated platforms that meet defined service level objectives (SLOs).
  • Develop and implement automation tools to streamline manual and repetitive operational tasks.
  • Document processes, workflows, and system configurations to support ongoing operations and future enhancements.
  • Continuously monitor production systems, proactively addressing incidents and performance issues.
  • Participate in capacity planning and ongoing improvements to system resilience and scalability.
  • Maintain effective communication with executive management, business stakeholders, and cross-functional technical teams.
  • Stay current with emerging site reliability engineering practices, tools, and technologies.
  • Other duties as assigned.


Qualifications

Here's What You Need
  • 8 years of experience in systems engineering, DevOps, or site reliability engineering roles.
  • 8 years of experience with Linux/Unix systems and system internals.
  • 8 years of experience programming or scripting in one or more languages (e.g., Python, Go, Java, Bash).
  • 8 years of experience designing and operating highly available, distributed systems.
  • 8 years of experience with cloud platforms (such as AWS or Google Cloud Platform) and cloud-native services.
  • 8 years of experience with containerization and orchestration (e.g., Docker, Kubernetes).
  • 8 years of experience applying monitoring, alerting, and logging concepts.
  • 8 years of experience defining and managing Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets.
  • 8 years of experience with incident management, root cause analysis (RCA), and postmortems.
  • 8 years of experience integrating security and compliance into operational workflows.
  • Must be able to pass a background check. Additional background checks may be required by projects and/or clients at any time during employment.

Minimum Skills:
  • Exceptional interpersonal skills with the ability to communicate in a clear, professional, and articulate manner.
  • Exceptional verbal and written communication skills.
  • Excellent organizational, analytical, and problem-solving skills with high-level attention to detail.
  • Ability to analyze systems and procedures.
  • Strong multitasking skills with the ability to manage multiple design streams across concurrent work efforts.
  • Must be self-motivated and able to work well independently as well as on a multi-functional team.
  • Ability to handle sensitive and confidential information appropriately.



Desired Qualifications:
  • 4 years of familiarity with observability tools such as Prometheus, Grafana, Application Insights, Datadog, or Splunk.
  • 4 years of experience operating 24x7 production environments, including participation in on-call rotations.
  • 4 years of experience with chaos engineering and resiliency testing.
  • 4 years of experience with feature flags, canary deployments, and progressive delivery strategies.
  • 4 years of experience creating runbooks, dashboards, and operational standards, with strong documentation skills.


Our Commitment to You / Overview of Benefits
  • Medical, Dental and Vision Insurance; Wellness Program
  • Flexible Spending Accounts (Healthcare, Dependent Care, Commuter)
  • Short-Term and Long-Term Disability options
  • Basic Life and AD&D Insurance (Company Provided)
  • Voluntary Life and AD&D options
  • 401(k) Retirement Savings Plan with matching after one year
  • Paid Time Off



Reports to: Program Manager

Working Conditions
  • Professional office environment, with the ability to work onsite in the main office.
  • Must reside in the Austin area.
  • Must be physically and mentally able to perform duties for extended periods of time.
  • Ability to use a computer and other office productivity tools with sufficient speed to meet the demands of this position.
  • Must be able to establish a productive and professional workspace.
  • Must be able to sit for long periods of time looking at a computer screen.
  • May be asked to work a flexible schedule which may include holidays.
  • May be asked to travel for business or professional development purposes.
  • May be asked to work hours outside of normal business hours.
  • Travel costs, per diem, and other related expenses must be pre-approved in compliance with State of Texas travel guidelines.

Other Duties: Please note this job description is not designed to cover or contain a comprehensive list of activities, duties or responsibilities that are required of the employee for this job. Duties, responsibilities, and activities may change at any time with or without notice.

Cayuse is an Equal Opportunity Employer. All employment decisions are based on merit, qualifications, skills, and abilities. All qualified applicants will receive consideration for employment in accordance with any applicable federal, state, or local law.

Pay Range

USD $108,160.00 - USD $153,920.00 /Yr.
  • Dice Id: 91099930
  • Position Id: 104327

Company Info

About Cayuse Holdings, LLC

Originally founded in 2006, Cayuse Holdings today comprises a family of companies headquartered near Pendleton, Oregon, with offices in the Washington, D.C. metro area and Honolulu, HI. Tribally owned by the Confederated Tribes of the Umatilla Indian Reservation (CTUIR), Cayuse Holdings is a 100% Indian Owned Economic Enterprise and a foremost provider of responsible sourcing/certified diversity solutions for commercial, government, and tribal clients.

Our Mission

To become the #1 American Indian-owned commercial, government, and tribal contractor, providing trusted value for our clients and reliable, rewarding careers for our employees, and contributing to the growth of the CTUIR.

 

Our Vision

Grow the Company, Grow the People!

Šapásttawaxt kutkutpamá, Šapásttawaxt natítayt
