Job Details
Title: Site Reliability Engineering (SRE) Lead OR Principal DevOps Engineer
Duration: 12+ Months
Location: Frisco, TX (Onsite)
Minimum of 10 years of experience in a relevant area.
Team Leadership: Strong ability to mentor and manage teams using collaborative platforms like Jira, Teams, and Confluence. Excellent communication and collaboration skills.
System Design and Architecture: Expertise in designing scalable and reliable systems using tools like Kubernetes, Docker, and cloud services (AWS, Azure, Google Cloud Platform). Experience with Kafka, Cassandra, and other infrastructure tools.
Familiarity with middleware technologies such as Kafka, APIs, and Microservices architecture.
Incident Management: Proficiency in managing incidents using tools like PagerDuty and xMatters, and in conducting effective post-mortems.
Monitoring and Analytics: Experience with monitoring tools such as Splunk, AppDynamics, and Grafana.
Nice-to-have skills