RESPONSIBILITIES:
Kforce has a client that is seeking a Site Reliability Engineer (SRE) - Observability Focus in Greenwood Village, CO.
Overview:
We are looking for a steady-handed, systems-minded Site Reliability Engineer who thrives on making complex platforms more reliable, measurable, and predictable. This role partners closely with Software Engineering, Architecture, Operations, and Customer-facing teams to improve system health across a broad and evolving technology landscape.
You will own the observability lens - monitoring, alerting, deep-dive analysis, and reliability insights - helping teams stay within error budgets and proactively address customer-impacting issues. This is not a heavy feature-coding role. If your passion is writing production applications all day, this may not be the right fit. If you enjoy understanding how everything fits together and ensuring it runs smoothly at scale, you'll feel right at home.
Duties Include:
* Drive continuous improvement of system reliability across distributed systems
* Design and maintain monitoring, alerting, and observability strategies
* Perform deep-dive investigations into system behavior and performance anomalies
* Partner with Engineering, Architecture, Operations, and Front-End teams to identify root causes and recommend improvements
* Review and approve code from a reliability and observability standpoint (without owning the feature implementation)
* Maintain and support existing software solutions from an SRE perspective
* Analyze upstream and downstream dependencies including APIs, data stores, CI/CD pipelines, and client systems
* Provide guidance to teams on operating within established error budgets
REQUIREMENTS:
Technical Foundations:
* Strong proficiency in Python (primary scripting language for this role)
* Solid understanding of coding syntax, functions, and object-oriented programming patterns
Hands-on experience with:
* Docker (docker images, container runtimes, containerized environments)
* Kubernetes
* Linux shell
* YAML
* Experience working in cloud-native environments, particularly AWS
Strong understanding of:
* REST APIs
* System dependencies (upstream/downstream integrations)
* CI/CD pipelines
* Observability workflows and data flows
Observability & Monitoring:
* Experience with Datadog or Splunk (experience with both is a strong differentiator)
* Ability to query and analyze logs (including Splunk dedup and advanced queries)
* Familiarity with OpenSearch
* Understanding of how observability signals (logs, metrics, traces) flow through distributed systems
Source Control & DevOps:
* Experience with Git and GitLab
* Working knowledge of modern development workflows and code review processes
Nice to Have:
* Bash scripting
* Infrastructure as Code tools such as Terraform and Ansible
* Deeper familiarity with design patterns and common engineering frameworks
What Makes Someone Successful Here:
* High learnability and intellectual curiosity
* Strong systems thinking - able to understand complex architectures across multiple technologies
* Comfort collaborating across development, operations, architecture, and customer-facing teams
* Analytical mindset with patience for deep investigation
* A steady pace - someone who wants to build sustainable reliability practices rather than chase constant feature delivery
This is a role for someone who enjoys understanding how systems behave under real-world conditions and making them better over time. You'll influence reliability without carrying the burden of feature ownership - a critical role that keeps the engine running smoothly while others are adding horsepower.
The pay range is the lowest to highest compensation we reasonably in good faith believe we would pay at posting for this role. We may ultimately pay more or less than this range. Employee pay is based on factors like relevant education, qualifications, certifications, experience, skills, seniority, location, performance, union contract and business needs. This range may be modified in the future.
We offer comprehensive benefits including medical/dental/vision insurance, HSA, FSA, 401(k), and life, disability, and AD&D insurance to eligible employees. Salaried personnel receive paid time off. Hourly employees are not eligible for paid time off unless required by law. Hourly employees on a Service Contract Act project are eligible for paid sick leave.
Note: Pay is not considered compensation until it is earned, vested and determinable. The amount and availability of any compensation remains in Kforce's sole discretion unless and until paid and may be modified in its discretion consistent with the law.
This job is not eligible for bonuses, incentives or commissions.
Kforce is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, pregnancy, sexual orientation, gender identity, national origin, age, protected veteran status, or disability status.
By clicking “Apply Today” you agree to receive calls, AI-generated calls, text messages or emails from Kforce, its affiliates, and service providers. Note that if you choose to communicate with Kforce via text messaging the frequency may vary, and message and data rates may apply. Carriers are not liable for delayed or undelivered messages. You will always have the right to cease communicating via text by using key words such as STOP.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
- Dice Id: kforcecx
- Position Id: ITTVT2169241