Why UKG:
At UKG, the work you do matters. The code you ship, the decisions you make, and the care you show a customer all add up to real impact. Today, tens of millions of workers start and end their days with our workforce operating platform. Helping people get paid, grow in their careers, and shape the future of their industries. That's what we do.
We never stop learning. We never stop challenging the norm. We push for better, and we celebrate the wins along the way. Here, you'll get flexibility that's real, benefits you can count on, and a team that succeeds together. Because at UKG, your work matters, and so do you.
About the Team
Staff Site Reliability Engineers (SREs) at UKG are senior individual contributors who play a critical role in ensuring the reliability, scalability, and performance of our services. They bring a breadth of knowledge across service delivery and apply software engineering principles to operational challenges.
In this role, you will keep production systems reliable, available, and performant by applying software engineering practices to operations. You will proactively monitor system health, manage risk through SLOs and error budgets, lead incident response, and enable safe, rapid change while balancing reliability and delivery velocity.
Staff SREs are passionate about learning and evolving with modern technologies. They strive to innovate and relentlessly pursue an excellent customer experience, with an "automate everything" mindset that enables services to be delivered with speed, consistency, and high availability.
This is a senior individual contributor role, focused on technical leadership, influence, and reliability impact.
About the Role and Job Responsibilities
- Engage in and improve the lifecycle of services from conception to end-of-life, including system design reviews, capacity planning, and production readiness.
- Define and implement standards and best practices for system architecture, service delivery, reliability, and automation, including the definition and monitoring of service health indicators (latency, traffic, error rates, and resource saturation), service level objectives (SLOs), and the use of error budgets to guide operational and delivery decisions.
- Support service, product, and engineering teams by providing common tooling and frameworks to increase availability and improve incident detection and response.
- Improve system performance, availability, and efficiency through automation, process refinement, post-incident reviews, and in-depth configuration analysis.
- Collaborate closely with engineering teams across the organization to deliver and operate reliable services.
- Increase operational efficiency, effectiveness, and service quality by treating operational challenges as software engineering problems (reducing toil).
- Guide junior team members and serve as a champion for Site Reliability Engineering best practices.
- Actively participate in incident response, including on-call rotations and post-incident reviews, collaborating with engineering teams to restore service and reduce recurrence.
- Partner with stakeholders to influence and help drive the best possible technical and business outcomes.
Required Qualifications
- 5+ years of hands-on experience in software engineering, systems engineering, or cloud-based environments.
- 5+ years of experience working with public cloud platforms (e.g., Google Cloud Platform (preferred), AWS, or Azure).
- 5+ years of experience configuring, operating, and maintaining applications and/or systems infrastructure in a large-scale, customer-facing environment.
- Demonstrated understanding of observability best practices, including metric generation and collection, log aggregation pipelines, time-series databases, and distributed tracing.
- Experience coding in one or more higher-level programming languages (e.g., Python, Java, or C++).
- Strong working knowledge of Linux systems, including troubleshooting, performance analysis, and scripting in production environments.
- Experience with GitHub Actions and modern CI/CD practices.
- Experience building operational dashboards and alerts using observability tools such as Splunk or Grafana.
- Excellent communication and collaboration skills, with experience mentoring and guiding engineers.
Preferred Qualifications
- Experience with distributed system design and architecture.
- Hands-on experience with cloud-native applications and containerization technologies (Kubernetes, containers).
- Experience with infrastructure-as-code and configuration management tools (e.g., Terraform, Ansible).
- Experience operating production workloads in Google Cloud Platform (GCP).
- Solid grounding in at least two of the following areas: Computer Science fundamentals, Cloud Architecture, Security, or Network Design.
Work location: this position is on site 3 days a week in Lowell, MA.
Company Overview:
UKG is the Workforce Operating Platform that puts workforce understanding to work. With the world's largest collection of workforce insights and people-first AI, our ability to reveal unseen ways to build trust, amplify productivity, and empower talent is unmatched. It's this expertise that equips our customers with the intelligence to solve any challenge in any industry, because great organizations know their workforce is their competitive edge. Learn more at ukg.com.
Equal Opportunity Employer
UKG is an equal opportunity employer. We evaluate qualified applicants without regard to race, color, disability, religion, sex, age, national origin, veteran status, genetic information, and other legally protected categories.
UKG participates in E-Verify.
It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.
Disability Accommodation in the Application and Interview Process
For individuals with disabilities that need additional assistance at any point in the application and interview process, please email
The pay range for this position is $129,500.00 to $186,100.00. The actual base pay offered may vary depending on skills, experience, job-related knowledge and work location. In addition to base pay, employees may be eligible to participate in a performance-based bonus plan and to receive restricted stock unit awards as part of total compensation. Learn more about UKG's benefits and rewards at