Overview
On Site
Full Time
Skills
Finance
Internet
SAP BASIS
Linux
Cloud Computing
Application Support
Web Portals
Web Services
Bridging
Turnover
Requirements Elicitation
Estimating
Design Documentation
Change Management
FTS
Open Systems
Root Cause Analysis
Collaboration
Risk Management
Auditing
Management
IT Operations
Database
IT Risk
Disaster Recovery
Recovery
Job Details
Job Title: Site Reliability Engineer
Assignment Type: ~12-month contract to hire
Location: Berkeley Heights, NJ
Employment Type: W2 only, no C2C
Summary of Position:
The SRE candidate will assume a key role in supporting the day-to-day operations of the Digital group within the Card Services organization of Client. The candidate is responsible for providing on-call service continuity and escalated support for Web Portal, Web Services, Microservices, and other assigned production applications.
The candidate will work closely with development and the technology groups to monitor and triage problems identified by different departments of the organization. This is a high-performance culture, and the candidate must demonstrate the ability to work efficiently and quickly on financial Web Portals. You will be measured on your ability to meet individual and departmental objectives and to reduce service-affecting issues.
Responsibilities:
Provide 24x7 support of production Internet applications on a rotating basis.
Hands-on understanding of Linux systems
Good understanding of cloud concepts
Serve as the point of escalation for application support, diagnosing and resolving complex customer issues in accessing the Portal and Web Services environments
Drive Open Systems severity crisis technical bridges and/or management bridges as required, and leverage experience and organizational knowledge to reduce MTTR
Review turnover paperwork to ensure that it is complete prior to production installs
Participate in the requirements gathering process, representing the production environments, to ensure that all operational aspects are identified and documented. Provide all tasks and detailed estimates to project managers, and review and approve design documentation to ensure understanding of the business logic changes and the technical solution being implemented
Work with Change Management/Release Managers to review proposed change events for production
Work with FTS and Open Systems development to perform project code installations with assistance from the development and business groups. Validate successful implementations or fallbacks.
Document install defects and assign severity to the problems that occurred. After a fallback, perform a post-mortem root cause analysis (RCA)
Direct incident recovery and lead cross-functional teams in collaborating on identified issues
Identify and implement improvements to incident recovery, incident engagement, and incident communications
Perform trending and analysis of problems; anticipate problems and develop risk mitigation plans
Participate in internal and external audits, as required by management
Ensure monitoring alerts and system events are assessed, prioritized, and worked aggressively
Escalate issues to the technology, operations, and/or vendor(s) where appropriate
Ensure database/application controls and procedures remain compliant with Corporate IT risk
Support Disaster Recovery tests and live recovery for all production environments
Work with Card Services architects to validate and design enterprise solutions and application monitoring tools