Hi,
Please find the detailed job description for the role below. If you are interested and feel it's a good fit, kindly share your updated resume.
Title – Resiliency and Recovery Engineer - Tech Lead
Location – Charlotte, NC (5x/week onsite)
Description:
Own end‑to‑end application reliability, availability, and performance for client‑critical systems.
Define and govern SLIs, SLOs, and error budgets aligned with business and regulatory expectations.
Lead production support and incident management, acting as Incident Commander for P1/P2 issues.
Ensure robust monitoring, alerting, logging, and observability across application landscapes.
Drive automation and self‑healing to reduce manual toil and improve operational efficiency.
Partner with development and DevOps teams to embed SRE practices into CI/CD and release pipelines.
Oversee change and release readiness, ensuring risk‑based production deployments.
Provide on‑site client leadership, serving as the primary SRE point of contact and trusted advisor.
Conduct and govern post‑incident reviews (RCA/PIR) and ensure preventive actions are implemented.
Ensure compliance with security, audit, and regulatory controls relevant to the client environment.
Lead and mentor onshore and offshore SRE/support teams, ensuring SLA adherence and skill uplift.
Report operational KPIs, reliability trends, and improvement roadmaps to client and internal leadership.
SYSMIND LLC is an Equal Employment Opportunity employer. All qualified applicants will receive consideration for employment without any discrimination. We promote and support a diverse workforce at all levels in the company. All job offers are contingent upon satisfactory completion of a background check and reference checks. Additionally, passing a drug test may be required. All contractors intending to work on SYSMIND's W2 are "at will" employees.