What working at Hexaware offers:
Hexaware is a dynamic and innovative IT organization committed to delivering cutting-edge solutions to our clients worldwide. We pride ourselves on fostering a collaborative and inclusive work environment where every team member is valued and empowered to succeed.
Hexaware provides access to a vast array of tools that enhance, revolutionize, and advance your professional profile. We complete the circle with excellent growth opportunities, chances to collaborate with highly visible customers and work alongside bright minds, and a healthy work-life balance.
With an ever-expanding portfolio of capabilities, we delve deep to identify the source of our motivation. Although technology is at the core of our solutions, it is still the people and their passion that fuel Hexaware's commitment to creating smiles.
At Hexaware, we encourage you to challenge yourself to achieve your full potential and propel your growth. We trust and empower you to disrupt the status quo and innovate for a better future. We encourage an open and inspiring culture that fosters learning and brings talented, passionate, and caring people together.
We are always interested in, and want to support, the professional and personal you. We offer a wide array of programs to help you expand your skills and supercharge your career. We help you discover your passion, the driving force that makes you smile, innovate, create, and make a difference every day.
What would you do?
Job Description:
Position: Site Reliability Engineering (SRE) Lead
Location: Fort Mill, SC (hybrid, 3 days onsite per week)
Summary
A senior technical leader responsible for owning reliability strategy, leading an SRE team, and ensuring the operational health, scalability, and availability of services. The role combines hands-on engineering, automation, and people leadership to drive reliability across the organization.
Core responsibilities
Strategy & process
- Define SRE strategy, process frameworks, standards, and best practices.
- Establish SLIs, SLOs, and error budget policies; embed reliability into the SDLC.
- Promote a culture of service ownership and maintain strong cross-team feedback loops.
Reliability & capacity
- Oversee monitoring and maintenance to meet SLAs and uptime targets.
- Drive capacity planning and forecasting to ensure performance at scale.
- Use data and metrics to prioritize reliability investments and tradeoffs.
Automation & tooling
- Lead automation efforts to eliminate operational toil and streamline runbooks.
- Oversee Infrastructure as Code practices (for example Terraform, CloudFormation) and configuration management.
- Improve CI/CD pipelines to enable safer, faster releases.
Incident & change management
- Lead incident response and communications during outages.
- Conduct blameless postmortems and ensure corrective actions are executed.
- Govern change control to ensure safe, tested production deployments.
Collaboration & communication
- Partner with engineering, architecture, and product teams to bake reliability into designs and roadmaps.
- Translate technical issues and tradeoffs for technical and nontechnical stakeholders.
Team leadership
- Hire, mentor, and develop SRE engineers; set team goals and a roadmap.
- Lead calmly and effectively under pressure during critical incidents and drive customer-focused decisions.
Qualifications & skills
Technical
- Proven SRE/DevOps/infrastructure experience (typically 6+ years) with leadership experience (about 2-3 years).
- Strong cloud experience (AWS preferred), containerization (Docker), and orchestration (Kubernetes).
- Expertise with IaC and automation tools (Terraform, CloudFormation, Ansible, or similar).
- Proficient in scripting and programming for automation (Python, Bash, or similar).
- Deep experience with monitoring and observability tooling (Prometheus, Grafana, ELK Stack, Splunk, Datadog, etc.).
Leadership & soft skills
- Strong people leadership and coaching skills with proven stakeholder communication.
- Excellent problem-solving, analytical thinking, and adaptability.
- Strategic mindset balancing engineering excellence with business priorities.
Deliverables
- A measurable reliability roadmap aligned to business goals.
- Reduced operational toil through automation and improved runbooks.
- Clear SLIs and SLOs, with established error budget governance.
- A high-performing SRE team with documented processes for incident and change management.
Equal Opportunities Employer:
Hexaware Technologies is an equal opportunity employer. We are dedicated to providing a work environment free from discrimination and harassment. All employment decisions at Hexaware are based on business needs, job requirements, and individual qualifications. We do not discriminate based on race (including colour, nationality, and ethnic or national origin), religion or belief, sex, age, disability, marital status, sexual orientation, parental status, gender reassignment, or any other status protected by law. We encourage candidates of all backgrounds to apply.
Find out more at Hexaware.com.