Role: Site Reliability Engineer on W2
Location: Hybrid - Chandler, AZ (On-Site 3x/Week)
Duration: Eighteen-Month Contract
The Site Reliability Engineer (SRE) Lead is responsible for ensuring the reliability, scalability, performance, and security of enterprise Linux-based systems. This role combines deep technical expertise in Linux administration with leadership in automation, observability, incident management, and infrastructure engineering. The SRE Lead drives operational excellence by implementing best practices, improving system resilience, and mentoring engineering teams.
Platform Engineering Operations
Lead the administration, monitoring, and performance tuning of Oracle Enterprise Linux (OEL) environments in a large-scale enterprise ecosystem.
Oversee the design, build, and lifecycle management of Linux servers, including storage, virtualization, and associated infrastructure.
Manage high availability (HA) configurations, clustering, and load-balanced environments to minimize downtime.
Required Qualifications
5+ years of experience in Linux system administration in enterprise environments.
Strong expertise in Oracle Enterprise Linux (OEL) systems.
Proven experience in high availability systems, virtualization, and storage management.
Hands-on experience with automation and configuration management tools (Ansible preferred).
Proficiency in at least one scripting/programming language (Bash or Python preferred).
Strong experience with infrastructure troubleshooting, performance tuning, and incident management.
Solid understanding of enterprise infrastructure (compute, storage, network).
Excellent analytical, problem-solving, and organizational skills.
Strong communication and collaboration skills in a global team environment.