Job Details
Job Title: Site Reliability Engineer (AWS)
Location: Jersey City, NJ (Primary) | Open to MMK, TX, BOS
Interview: 1 round with Manager and Developer (behavioral/technical)
Key Must-Have Skills
Strong knowledge of Tax and/or Corporate Actions
Experience with Disaster Recovery Testing
Hands-on experience with AWS (EKS, Lambda, DynamoDB, Kafka, API Gateway, SQS, EC2, CloudWatch, IAM)
Java/Spring Boot microservices development and support
Excellent communication skills (must be a solid communicator)
Responsibilities
Support and enhance production systems, ensuring stability, reliability, and scalability.
Monitor and analyze system metrics to improve performance and operational efficiency.
Contribute to automation initiatives and reliability improvements.
Collaborate with product and engineering teams to strengthen resilience and stability.
Participate in disaster recovery and business continuity planning/testing.
Troubleshoot and support Spring Boot microservices, containerized applications (Docker, Kubernetes), and messaging/event-driven architecture (EDA) solutions (Kafka, MQ, SNS).
Leverage CI/CD pipelines and tools (GitHub, Jenkins, Stash, Artifactory, Terraform).
Soft Skills
Strong communicator and collaborator
Analytical and detail-oriented mindset
Problem solver with ownership mentality