Site Reliability Lead Engineer - Hybrid - Independent Candidate
Hybrid in Rock Hill, SC, US • Posted 7 hours ago • Updated 7 hours ago

Visionary Innovative Technology Solutions
Job Details
Skills
- Cloud Computing
- Ansible
- DevOps
- Change Management
- Kubernetes
- Python
Summary
A senior technical leader responsible for owning a reliability strategy, leading an SRE team, and ensuring the operational health, scalability, and availability of services. Combines hands-on engineering, automation, and people leadership to drive reliability across the organization.
Core responsibilities
Strategy & process
- Define SRE strategy, process frameworks, standards, and best practices.
- Establish SLIs, SLOs, and error budget policies; embed reliability into the SDLC.
- Promote a culture of service ownership and maintain strong cross-team feedback loops.
Reliability & capacity
- Oversee monitoring and maintenance to meet SLAs and uptime targets.
- Drive capacity planning and forecasting to ensure performance at scale.
- Use data and metrics to prioritize reliability investments and tradeoffs.
Automation & tooling
- Lead automation efforts to eliminate operational toil and streamline runbooks.
- Oversee Infrastructure as Code practices (for example Terraform, CloudFormation) and configuration management.
- Improve CI/CD pipelines to enable safer, faster releases.
Incident & change management
- Lead incident response and communications during outages.
- Conduct blameless postmortems and ensure corrective actions are executed.
- Govern change control to ensure safe, tested production deployments.
Collaboration & communication
- Partner with engineering, architecture, and product teams to bake reliability into designs and roadmaps.
- Translate technical issues and tradeoffs for technical and nontechnical stakeholders.
Team leadership
- Hire, mentor, and develop SRE engineers; set team goals and a roadmap.
- Lead calmly and effectively under pressure during critical incidents and drive customer-focused decisions.
Qualifications & skills
Technical
- Proven SRE/DevOps/infrastructure experience (typically 6+ years) with leadership experience (about 2-3 years).
- Strong cloud experience (AWS preferred), containerization (Docker), and orchestration (Kubernetes).
- Expertise with IaC and automation tools (Terraform, CloudFormation, Ansible, or similar).
- Proficient in scripting and programming for automation (Python, Bash, or similar).
- Deep experience with monitoring and observability tooling (Prometheus, Grafana, ELK Stack, Splunk, Datadog, etc.).
Leadership & soft skills
- Strong people leadership and coaching skills with proven stakeholder communication.
- Excellent problem-solving, analytical thinking, and adaptability.
- Strategic mindset balancing engineering excellence with business priorities.
Deliverables
- A measurable reliability roadmap aligned to business goals.
- Reduced operational toil through automation and improved runbooks.
- Clear SLIs, SLOs, and established error budget governance.
- A high-performing SRE team with documented processes for incident and change management.
- Dice Id: 91020323
- Position Id: 8876151
Company Info
VITS provides staffing and recruitment services, along with technology consulting, to more than 50 clients globally. Our skilled, expert professionals help clients manage varying skill needs, skills gaps, and changing staffing needs to meet project deadlines. VITS staff augmentation services provide skilled resources who assist clients in developing, maintaining, managing, and supporting their applications. Our vigorous pursuit of excellence in hiring, delivery model, work ethics, and approach has enabled us to become a highly trusted and preferred recruitment solution provider.