Location: JERSEY CITY, NJ (Hybrid)
Primary Skills: Production Support (L3), Database/Network/Server Administration, Automation (Python/Shell)
Secondary Skills: Platform Support, Security & Compliance, Modernization, Microservices, Loan IQ (plus)
REQUIRED_SKILL:
Support and execute complex software upgrades across multi‑tier environments including application servers, databases, container platforms, web services, and middleware; experience with Loan IQ is a plus, and familiarity with microservices architecture is advantageous.
Drive ongoing stability and scalability improvements across interconnected applications, infrastructure components, and batch processing systems.
Demonstrate a thorough understanding of connectivity using NDM/Connect:Direct, Secure+ configurations, and secure file transfer flows; experience with certificate management and ID/credential handling using secure vaults is required.
Perform triage, monitoring, and performance analysis for recurring jobs and batch workflows; participate in rotation support cycles until these tasks are automated.
Participate in operational on‑call rotations and support scheduled weekend releases when required.
Identify repetitive operational tasks and implement automation using Python, scripting frameworks, or AI‑based tools.
Maintain and troubleshoot enterprise application environments spanning servers, databases, interfaces, automation platforms, and messaging components.
Execute environment refreshes, replication efforts, and controlled data lifecycle tasks such as archival and purging.
Manage access controls, credential vaulting, and password rotation processes for supported systems.
Support audit, compliance, and risk‑related activities including patching cycles, vulnerability remediation, and required evidence collection.
Contribute to modernization, optimization, and engineering‑driven improvements across the platform.
Build lightweight tooling—scripts, dashboards, or internal utilities—to reduce manual intervention and enhance observability.
Participate in cross‑functional projects such as rewrites, migrations, internal tooling enhancements, and platform upgrades.
Improve the reliability of recurring jobs and workflows by enhancing monitoring, alerting, automation, and recovery capabilities.
Take preventive measures against recurring issues, document new problems and the guardrails needed to prevent recurrence, and share knowledge with team members through regular knowledge‑transfer sessions.