Mainframe Engineer
Remote Work + Travel (Atlanta)
Position Overview: Engineering projects and Tier 2/3 operations for mainframe z/OS (z/VSE). Provide engineering support for moves, adds, changes, and upgrades to existing mainframe LPARs, and ticket/incident support for operations, as required.
Main Job Responsibilities: You will work closely with business stakeholders to support the following key activities:
Ensure adherence to enterprise architecture standards, technical roadmaps, and platform governance for z/VSE Mainframe environments.
Follow established change and release management practices, including planning, approvals, implementation, and post-change validation.
Complete service requests assigned to z/VSE Mainframe, ensuring accurate execution, documentation, and closure within SLAs.
Provide advanced troubleshooting for incidents assigned to z/VSE Mainframe, including deep-dive diagnostics and resolution coordination.
Deliver incident support for z/VSE Mainframe OS and vendor products running on z/VSE, ensuring operational stability and rapid recovery.
Act as an escalation point for z/VSE Mainframe support, providing technical leadership during high-severity events and complex outages.
Plan and execute z/VSE operating system upgrades and vendor software upgrades with minimal downtime and controlled risk.
Perform conversions and upgrades of z/VSE Assembler-written exits and exits written in other languages, ensuring compatibility with platform changes and vendor updates.
Implement and validate configuration changes to z/VSE systems, including system parameters, product configurations, and performance-related tuning.
Support Disaster Recovery (DR) testing for z/VSE systems, including runbook execution, issue remediation, and post-test reporting.
Provide 24x7 production support through on-call rotation, including incident response, escalations, and off-hours change support as required.
Document procedures, configurations, troubleshooting steps, and support processes for internal knowledge sharing and operational continuity.
Required experience
z/VSE mainframe operations/support: 8+ years of hands-on production support for z/VSE operating system (OS) environments.
Incident, problem, and service request execution: Proven experience owning incidents end-to-end (triage, service restoration, RCA inputs) and fulfilling service requests in an IT service management (ITSM) process.
Advanced troubleshooting: Demonstrated ability to diagnose complex z/VSE issues (performance, storage, jobs/batch, subsystems, vendor products) and drive resolution under SLA pressure.
Escalation leadership: Experience acting as an escalation point, coordinating with application teams, infrastructure teams, and vendor support.
Change/release management: Experience executing changes using formal change and release practices (risk/impact, backout plans, approvals, maintenance windows).
Installations/upgrades: Delivered z/VSE OS upgrades and vendor software upgrades (planning, implementation, testing, post-change validation).
Configuration management: Implemented and validated system configuration changes in z/VSE (with proper documentation and controls).
Assembler/language exit upgrades: Experience maintaining or upgrading z/VSE assembler-written exits (and/or other language exits), including conversions tied to OS/vendor upgrades.
Disaster recovery (DR) support: Participated in DR tests (runbooks, cutover/failover steps, validation, issue triage).
On-call: Prior experience supporting a 24x7 on-call rotation for critical infrastructure.
Required technical skills
Deep z/VSE OS administration: System IPL/maintenance concepts, logs/diagnostics, operational controls, and production hygiene.
Troubleshooting toolset: Ability to interpret system messages/logs/dumps (as applicable), job/batch behavior, storage/performance symptoms, and vendor product diagnostics.
Vendor product support on z/VSE: Competency supporting the vendor products and components commonly deployed alongside z/VSE.
ITSM fluency: Strong working knowledge of incident/change/problem workflows and disciplined ticket documentation.
Release hygiene: Maintenance window execution, validation steps, backout planning, and stakeholder communications.
DR readiness: Runbook-driven execution, environment validation, and gap identification for DR procedures.
Documentation: Ability to create/update runbooks, upgrade checklists, and operational procedures.