Job Description
Role: Technology Engineer Lead (SRE)
Location: Phoenix, AZ (2 days a week in-office)
Shift: Mon-Fri, 7:00 am - 3:30 pm ET
Duration: 6 months
We are seeking a Technology Engineer Lead with expertise in Site Reliability Engineering (SRE) principles and a strong operational background. The ideal candidate will excel in critical incident management, possess hands-on experience with monitoring tools like BigPanda, Dynatrace, and TrueSight, and demonstrate a continuous improvement mindset.
Key Responsibilities
System Monitoring & Incident Management
- Monitor systems and infrastructure to ensure they meet required operational and performance levels.
- Collaborate with SRE professionals to troubleshoot and resolve incidents and to consult on issues.
- Engage in rotational on-call responsibilities for critical incident management.
- Conduct post-incident reviews to identify improvements and prevent recurrence.
Automation & Process Optimization
- Develop automation solutions to reduce manual tasks and operational noise.
- Enhance processes, tools, and services to create sustainable, reliable systems.
Data Analysis & Performance Tuning
- Gather and analyze metrics from tools and logs to diagnose and resolve system/application issues.
- Anticipate potential problems and collaborate on preventive solutions.
Collaboration & Knowledge Sharing
- Work closely with application and infrastructure support engineers to improve reliability.
- Document processes and procedures for testing and support requirements.
Technical Skills & Expertise
Server Management
- Proficient in Linux and Windows administration, troubleshooting, patching, and scripting (PowerShell, Bash).
Converged Solutions
- Hands-on experience with VCE/UCP, VMware (v6+), platform/network connectivity, and threat remediation.
Storage Systems
- Expertise in CIFS/NFS, DPA reporting, Avamar, Data Domain administration, and scripting.
Middleware Management
- Experience with WebSphere, Apache, IIS, WebLogic, Tomcat on Linux and Windows.
Mainframes & Networking
- Knowledge of JCL, CICS, Sysplex, the OSI model, and networking protocols (Network+ certification preferred).
Workflow & Collaboration Tools
- Proficiency in ServiceNow, TrueSight, Jira, and Confluence.
ITSM Processes
- Skilled in ITIL fundamentals (Problem, Change, Release, Event, Incident Management).
- Familiarity with Lean operations analytics methodologies.
Soft Skills
- Strong communication and interpersonal skills for effective stakeholder collaboration.
- Ability to prioritize incidents in a high-volume environment.
- Self-motivated and capable of handling multiple projects with minimal supervision.
- Problem resolution skills for managing critical production incidents.
- Adaptability to quickly learn new technologies and processes.
Education & Experience
- Bachelor's degree in Engineering, Computer Science, or a related field (or equivalent experience).
- Minimum of 2 years of experience supporting a large enterprise center.