Develop and lead the operational readiness strategy for the new Fraud product under the Customer's direction and oversight, ensuring successful transitions to production
Develop and enforce readiness checklists, including monitoring, logging, alerting, failover strategies, SLIs/SLOs, and runbooks
Partner with AppOpps site reliability engineers (SREs), product engineers, and fraud subject matter experts (SMEs) to proactively validate observability dashboards, runbooks, and alerting
Conduct Operational Readiness Reviews (ORRs), chaos drills, and production simulations prior to launch
Drive the creation of incident response playbooks specific to fraud platform services
Support the integration of operational readiness with other dependent applications (Real Time Rails)
Collaborate with platform and cloud teams to ensure infrastructure-as-code, deployment patterns, and rollback strategies align with fraud system criticality
Promote a culture of continuous improvement and learning through Failure Isolation & Recovery of Errors (FIRE) postmortems and readiness retrospectives
Facilitate operational automation that reduces toil through tooling, self-healing scripts, and shift-left reliability practices
Define and track KPIs (Customer Journey Success) that measure operational maturity and launch health
Ensure the operational framework for the Fraud product is aligned with the Customer's broader product operations, promoting standardization, improving issue attribution, and enabling level 1 support to accurately triage and route incidents to the right teams
Play a key role in embedding the Fraud product into the Customer's Central Command Center model, ensuring unified visibility, incident flow, and team accountability
Oversee knowledge management deliverables, ensuring required operational documentation (runbooks, escalation paths, FAQs, service definitions, etc.) is developed by technical writers in partnership with SMEs and available prior to launch