Job Details
Role: Disaster Recovery Engineer
We're seeking a highly skilled Disaster Recovery Engineer with deep experience in designing and implementing resilient infrastructure across cloud and hybrid environments. This role requires both strategic thinking and hands-on technical expertise to ensure high availability, failover readiness, and rapid recovery in the event of service disruption.
Key Responsibilities:
Architect and automate disaster recovery solutions across AWS, Azure, and Google Cloud Platform
Design and implement failover strategies (including non-DNS-based approaches)
Build and manage infrastructure as code using Terraform, Helm, and Ansible
Develop and maintain monitoring systems that go beyond basic health checks (e.g., behavioral checks, blue dye testing)
Manage API caching strategies (hot, warm, cold) and system synchronization
Support CI/CD pipelines using tools like Jenkins and GitHub
Lead on-prem to cloud migrations, including backup data center connectivity and point-in-time recovery
Work with bare metal setups, EC2 routing, and cloud-integrated networking (Cisco/Juniper)
Ideal Candidate Profile:
Proven experience with high availability systems and disaster recovery as a core competency
Strong understanding of load balancing, DNS strategies, and cloud-native failover
Ability to explain complex recovery scenarios and infrastructure decisions clearly
Hands-on experience with OpenShift, dedicated fiber interconnects, and backup redundancy models