Lead Java Production Support (Splunk & New Relic Mandatory)

Overview

Remote
Full Time

Skills

Splunk
New Relic

Job Details

Area(s) of responsibility

We are seeking a highly experienced and technically proficient Senior Production Support Engineer to join our dynamic technology team in the USA. The ideal candidate will have 7-10 years of hands-on experience in supporting high-volume, critical applications, with deep expertise across our core technology stack and a solid understanding of the financial services domain.

Key Responsibilities

Incident Management & Resolution: Act as the primary point of contact for high-priority production incidents. Drive timely resolution, perform root cause analysis (RCA), and implement preventive measures to minimize future occurrences.
Application Monitoring & Health: Proactively monitor the health, performance, and capacity of production applications using monitoring tools such as Splunk and New Relic. Develop and maintain dashboards, alerts, and runbooks.
Change Management: Evaluate, approve, and oversee production changes, adhering strictly to Change Management protocols to ensure stability and minimize risk. Participate in release and deployment activities.
Performance Optimization: Identify performance bottlenecks in application code and infrastructure (Java, Database, Cache) and collaborate with development teams to implement fixes and efficiency improvements.
System Maintenance: Perform regular system maintenance, health checks, and capacity planning for application infrastructure running on AWS and Pivotal Cloud Foundry (PCF).
Documentation & Knowledge Sharing: Create and maintain comprehensive support documentation, knowledge base articles, and troubleshooting guides.
On-Call Support: Participate in an on-call rotation to provide 24/7 support for critical production systems.

Required Technical Skills & Experience (7-10 Years)
Core Programming: Java (deep proficiency): 7+ years
Frameworks: Spring Boot (microservices architecture): Extensive
Frontend: React (Understanding of application flow): Proficient
Cloud/PaaS: AWS, Pivotal Cloud Foundry (PCF): Strong
Database: MySQL (Querying, optimization, troubleshooting): Strong
Caching/Messaging: Redis, Cache Management principles: Expert
Monitoring/Logging: Splunk, New Relic (developing queries, dashboards, alerts): Expert
Process: Incident Management, Change Management (ITIL framework knowledge is a plus): Strong
