Job#: 3024858
Job Description: Site Reliability Engineer III
Location: Chandler, Arizona (Hybrid)
Employment Type: Contract
Duration: 12 months
Role Overview
This position is responsible for building and supporting an internal, on-premises Cloud Container Platform and for managing its reliability. The role involves monitoring, troubleshooting, and performing deep analysis of systemic reliability issues. The engineer will partner with various teams to implement fixes, improve automation, and ensure the platform's resilience and security.
Key Responsibilities
- Monitor and troubleshoot performance, connectivity, and security issues for container platforms including OpenShift, Rancher, and VMware Kubernetes Service (VKS/TKG).
- Perform deep dives into systemic reliability issues, manage incidents and problems, and conduct blameless root cause analyses (RCA).
- Identify, analyze, and resolve infrastructure vulnerabilities and application deployment issues.
- Manage application onboarding and provide troubleshooting support throughout the application lifecycle.
- Identify and drive opportunities to improve automation, reduce operational toil, and enhance operational excellence.
- Collaborate with risk and compliance teams to implement controls and remediate vulnerabilities.
- Ensure platform resiliency during implementation and partner with engineering teams to resolve resiliency problems.
- Act as a key stakeholder in the design of cloud services, working with architecture and product teams.
- Participate in a 24x7 on-call rotation that uses a follow-the-sun model.
- Adhere to a hybrid work schedule with a minimum of three days per week on-site.
Required Qualifications
Education: A Bachelor's or Master's degree in Computer Science or a related technical field, or equivalent practical experience.
Experience: A minimum of five years of hands-on experience supporting Kubernetes or OpenShift Container platforms. Experience working in a highly available, multi-datacenter environment is required.
Technical Skills:
- Proficiency in Python, Ansible, Golang, and shell scripting.
- Strong experience with compute, storage, network, and security services.
- Advanced knowledge of Linux OS, DNS, DHCP, Kerberos, and Windows Authentication.
- Experience with CI/CD tools such as Git/Jenkins and GitOps deployment methodologies.
- Experience with monitoring tools like Prometheus, Splunk, Dynatrace, or Sysdig.
- Experience with IAM infrastructure, including Active Directory and SSO solutions like Ping Identity.
- Familiarity with container security, vulnerability remediation, OpenShift virtualization on VMware ESX, and container networking.
Preferred Qualifications
- Certifications in Kubernetes, OpenShift, Rancher, or Terraform are a plus.
- Experience with Terraform, ArgoCD, Tekton, and Knative technologies.
- Knowledge of various container runtimes and familiarity with the operator deployment pattern.
- Understanding of cost and inventory management principles.
Compensation & Benefits
The anticipated pay range for this position is $65.00 to $70.24 per hour. Please note that the final pay rate will be determined by a variety of factors, including the candidate's experience, qualifications, and location. A benefits package is available to eligible employees.
We are an equal opportunity employer and welcome applications from all qualified candidates regardless of race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.
Apex uses a virtual recruiter as part of the application process.
If you have visited our website in search of information on employment opportunities or to apply for a position, and you require an accommodation in using our website for a search or application, please contact our Benefits Department.
Apex Systems is a world-class IT services company that serves thousands of clients across the globe. When you join Apex, you become part of a team that values innovation, collaboration, and continuous learning. We offer quality career resources, training, certifications, development opportunities, and a comprehensive benefits package. Our commitment to excellence is reflected in many awards, including ClearlyRated's Best of Staffing in Talent Satisfaction in the United States and Great Place to Work in the United Kingdom and Mexico.
Apex Benefits Overview: Apex offers a range of supplemental benefits, including medical, dental, vision, life, disability, and other insurance plans that offer an optional layer of financial protection. We offer an ESPP (employee stock purchase program) and a 401(k) program, which typically allows you to contribute within 30 days of starting, with a company match after 12 months of tenure. Apex also offers an HSA (Health Savings Account on the HDHP plan), a SupportLinc Employee Assistance Program (EAP) with up to 8 free counseling sessions, a corporate discount savings program, and other discounts. In terms of professional development, Apex hosts an on-demand training program, provides access to certification prep and a library of technical and leadership courses/books/seminars once you have 6+ months of tenure, and offers certification discounts and other perks through associations that include CompTIA and IIBA. Apex has a dedicated customer service team for our Consultants that can address questions around benefits and other resources, as well as a certified Career Coach. You can also access a full list of our benefits, programs, support teams, and resources in our 'Welcome Packet', which an Apex team member can provide.