Job Details
Job Title: Application Support Engineer III
Job Category: Application Development
Location: 100% Remote
Duration: 15+ Months (Extendable)
Must Have
- AWS
- Containers
- Java
- Java Spring Boot
- Linux
- Messaging/Eventing (MQ + Kafka)
- Networking
- SiteScope
- Splunk
- SQL
JOB SUMMARY
Senior Application Support En
JOB DESCRIPTION
RTP Application Support Engineer III (Onshore: Technical Lead and Sr TechOps)
Role Summary:
Serves as the technical lead for RTP Rail TechOps, overseeing incident response, mentoring offshore engineers, and driving continuous improvement across monitoring, tooling, and documentation. Acts as a strategic liaison between TechOps and development teams supporting RTP services.
Key Responsibilities:
Leadership & Strategy
- Build deep expertise in RTP integrations and transaction flows.
- Establish cross-functional relationships across payment rail teams.
- Own escalation decisions and handoffs from TechOps to development.
- Mentor offshore engineers.
- Lead postmortems, RCA reviews, and playbook updates.
Incident Monitoring & First Response
- Monitor SiteScope and Splunk alerts for system health and transaction anomalies.
- Perform initial triage using documented playbooks.
- Escalate critical issues to on-call development and leadership.
Troubleshooting & Analysis
- Investigate API errors, transaction failures, and infrastructure issues using Splunk, database queries, and admin tools.
- Run SQL queries on RTP-related databases (Aurora or equivalent) to validate data integrity.
Ticket Management & Shift Coverage
- Log and manage incidents in ServiceNow and document them in Confluence.
- Ensure smooth shift handovers with detailed documentation.
- Own EST to IST shift transitions and knowledge continuity.
Technical Leadership
- Lead L2 troubleshooting for RTP services and integrations.
- Evaluate the customer impact of unplanned dependency outages and planned maintenance.
- Collaborate with SRE and DevOps teams to debug containerized services (e.g., AWS ECS/Fargate).
- Drive RCA and postmortem processes for critical incidents.
- Maintain and improve technical documentation and playbooks.
Tooling & Automation
- Enhance Splunk alerting and dashboard visibility.
- Collaborate with SRE/DevOps to improve SiteScope and Splunk alerts and configurations.
- Develop automation scripts for monitoring.
- Define and maintain SLO-based alerting thresholds.
On-Call & Reliability
- Participate in 24x7 on-call rotation.
- Partner with development teams to improve CI/CD reliability and observability.
Key Skills:
- Java/Spring Boot, REST API debugging, AWS (Aurora, ECS, Lambda)
- Splunk, SiteScope
- Grafana, Snowflake
- Kafka, MQ, Networking, SQL
- Linux, Oracle
- Mainframe, COBOL, CICS, DB2
- ServiceNow, Confluence