A face-to-face (F2F) interview is mandatory at one of the following locations: Dallas, TX; Seattle, WA; Cary, NC; Iselin, NJ; or Columbus, OH (any location). |
Job description: |
• 7+ years of Software Engineering experience |
• 4+ years of experience in Site Reliability Engineering teams with continued focus on improving Platform health |
• Familiar with Agile or other rapid application development practices |
• Hands-on expertise in building dashboards using APM tools. |
• Experience with distributed (multi-tiered) systems, algorithms, relational databases, and NoSQL databases. |
• Knowledge of and exposure to caching tools (Redis, Memcached) or messaging tools such as MQ and Kafka. |
• Must have working knowledge of APM tools such as Splunk, GCL, ELK, Grafana, and Prometheus. |
• Able to create dashboards using GCL/Splunk/ELK and set up alerts. |
• Working knowledge of CI/CD is a plus: source control such as Git, and continuous integration and release tools such as Jenkins and UCD. |
• Ability to work with engineering teams across the ecosystem, such as Security, Networking, and Infrastructure, on challenges that can impact platform health and resiliency. |
• Shell scripting and DevOps tools such as Ansible, with good knowledge of YAML for writing playbooks. |
• Experience with distributed storage technologies such as NFS, as well as dynamic resource management frameworks such as PCF, Kubernetes/OpenShift, AWS, or Azure. |
• Tech stack: Java/J2EE (Spring, Spring Boot), Python, shell scripting, Kafka, Oracle, MongoDB, etc. |
• A proactive approach to spotting problems, areas for improvement, and performance bottlenecks. |
| |