Mainframe SRE (Site Reliability Engineer)

Overview

Remote
Depends on Experience
Contract - Independent
Contract - W2
Contract - 12 Month(s)

Skills

z/OS
SRE
COBOL
Banking
Hogan

Job Details

Site Reliability Engineer (SRE) with Mainframe Technologies

The Site Reliability Engineer will ensure the reliability and performance of mainframe applications. Responsibilities include developing automation tools, monitoring systems, conducting root cause analysis, and collaborating with teams to enhance system resilience.

About the Role:

We are seeking a Site Reliability Engineer (SRE) with deep expertise in mainframe technologies such as COBOL and JCL to support and enhance our z/OS banking systems. The role is responsible for the reliability, high availability, scalability, stability, and performance of mission-critical mainframe applications and systems. It blends software engineering with systems engineering to automate operations, improve system resilience, and drive continuous improvement.

Responsibilities:

  • Act as a hybrid engineer, balancing responsibilities across software development and site reliability engineering.
  • Design, develop, and maintain Mainframe applications using COBOL, JCL, SORT, and related technologies.
  • Monitor, maintain, and improve the reliability and performance of mainframe systems and applications.
  • Develop and implement automation tools for system health checks, incident response, and performance tuning to streamline operations and reduce manual intervention.
  • Collaborate with cross-functional teams (development, infrastructure, and operations) to design and develop proactive engineering solutions that enhance system scalability, reliability, and resilience.
  • Monitor and improve system performance, availability, and scalability across card management platforms.
  • Perform root cause analysis of production incidents and drive postmortem processes.
  • Drive incident response and continuous improvement initiatives, participate in on-call rotations, and ensure rapid incident resolution.

The qualified candidate will have the following experience/skills:

  • 7 to 10 years of experience working as a Site Reliability Engineer (SRE) with deep expertise in mainframe-based technologies.
  • Solid experience with COBOL, JCL (HP JCL knowledge is an added advantage), VSAM, File-AID or InSync, Xpediter, Endevor, etc.
  • Experience with IMS DB/DC and DB2 is good to have.
  • Ability to apply software development practices to operational challenges.
  • Experience with banking systems.

Nice to have experience/skills include:

  • Experience with Hogan Core Banking.
  • Certifications in SRE, ITIL, or Mainframe technologies.
  • Proven experience in Site Reliability Engineering, including automation, monitoring, and performance tuning.
  • Experience with observability tools and incident management frameworks.
  • Excellent problem-solving and analytical skills with a pragmatic engineering approach.
  • Proficiency in automation and scripting tools.
  • Experience with monitoring tools (e.g., Splunk, IBM OMEGAMON).

This is a 100% remote consulting opportunity. For immediate consideration and additional information, please reply with your resume.

 
