Must have
- Terraform Modules
- Data Recovery
- Python
- Ansible
• 7-10 years of experience in infrastructure engineering, IT operations, or a security-adjacent technical role.
• Experience at a financial institution (bank, broker-dealer, asset manager, insurance firm, or equivalent regulated entity), OR in a similarly regulated environment (healthcare, utilities, government).
• Exposure to disaster recovery or backup operations — including participating in DR tests, managing backup jobs, or executing restore procedures.
• Familiarity with at least one enterprise backup or replication platform: Cohesity, Rubrik, Veeam, Zerto, Commvault, or NetBackup.
• Basic scripting ability in Python, Bash, or PowerShell; comfort running and modifying existing scripts.
• Understanding of core networking concepts (VLANs, firewall rules, DNS, routing) relevant to isolated environment configuration.
• Strong documentation habits; ability to write clear, accurate technical procedures and test records.
• Awareness of regulatory frameworks and guidance such as FFIEC, NIST CSF, or NYDFS requirements as they apply to technology and resilience.
Good to have / desired skills
• Direct participation in an Isolated Recovery Environment (IRE) or clean room recovery exercise, even in a supporting capacity.
• Exposure to regulatory examinations or audit walkthroughs in a technology or cybersecurity context.
• Familiarity with infrastructure-as-code (IaC) tooling: Terraform, Ansible, or equivalent configuration management platforms.
• Coursework or self-study in cybersecurity, resilience engineering, or cloud infrastructure.
• Certifications (in progress or completed): CompTIA Security+, CySA+, AWS/Azure fundamentals, or vendor backup platform training.
• Experience with ticketing, change management, and ITSM workflows (ServiceNow or equivalent).
• Exposure to ransomware response or cyber incident response tabletop exercises.
Roles and responsibilities
Isolated & Clean Room Recovery Support
- Assist in maintaining and operating the Isolated Recovery Environment (IRE) and clean room infrastructure under senior engineer guidance.
- Execute assigned steps in recovery runbooks during tabletop exercises, simulation drills, and full recovery tests.
- Document recovery test procedures, results, and deviations; flag anomalies to senior team members for triage.
- Support forensic validation tasks within the IRE, including integrity checks and configuration comparisons prior to production re-entry.
- Learn and apply clean room protocols, including network isolation verification and identity access controls during recovery scenarios.
Backup Platform Operations
- Perform day-to-day operational tasks across enterprise backup and replication platforms (e.g., Cohesity, Rubrik, Veeam, Zerto, or equivalent).
- Monitor backup job health, investigate failures, and escalate persistent issues with documented findings.
- Assist in configuring and validating backup policies, retention schedules, and replication targets for critical systems.
- Support testing of restore procedures for servers, databases, and applications; record RTO/RPO outcomes against targets.
Regulatory Documentation & Audit Support
- Assist in preparing evidence packages, control narratives, and test result documentation for regulatory examinations and internal audits.
- Maintain organized records of recovery test logs, exercise outcomes, and remediation tracking in line with regulatory standards (FFIEC, NIST CSF, NYDFS).
- Participate in walkthroughs and working sessions with regulators or internal audit teams alongside senior engineers; develop familiarity with examination processes.
- Support mapping of regulatory guidance to recovery engineering controls under direction of senior staff.
Recovery Engineering & Automation
- Execute scripted recovery automation tasks (Python, Bash, PowerShell) and assist in maintaining IaC-driven recovery environment configurations (Terraform, Ansible).
- Contribute to runbook updates and playbook maintenance as procedures evolve.
- Participate in post-exercise after-action reviews (AARs); contribute observations and help track findings to closure.
- Collaborate with infrastructure, application, and database teams to understand system dependencies relevant to recovery sequencing.