Job Title: Cloud DevOps Engineer / SRE – Systems Engineer
Work Location: Piscataway, NJ – Hybrid Working Model (1 or 2 Days per Week ONSITE)
Duration: Long Term Contract
Must Have Skills (Most Important):
- Experience in Linux, Python and Ansible.
- Experience designing solutions and collaborating with senior designers/architects.
- Four or more years of experience with AWS – provisioning, operating, and managing AWS environments.
- Monitoring and auditing systems experience.
- Experience working in an Agile/Scrum environment.
- Experience with IT DevOps tooling and CI/CD practices.
- AWS Certified SysOps Administrator Associate & Linux Admin certification.
Even better if you have one or more of the following:
- Experience developing for high-performance, high-availability environments.
- Ability to troubleshoot production problems, perform root cause analysis (RCA), and automate fixes.
- Three or more years of experience managing DevOps products such as GitLab, Jira, Jenkins, and Artifactory.
- Excellent communication and presentation skills.
- Technology & Cloud certification.
- AWS Certified DevOps Engineer Professional certification.
- Kubernetes cluster management experience (installing, configuring, securing, upgrading, scaling, managing, monitoring).
- Operational knowledge of AWS EKS, ECS, Fargate and Lambda services.
- Solid experience with SRE practices.
- Two or more years of experience as a Linux administrator in a systems operations role.
You will contribute to creating, operating, and supporting Cloud and DevOps platform-based products and services for mission- and business-critical applications. Under the mentorship of the PO and PMs, you will work as a lead team member on various products. Responsibilities include:
- Provisioning, deploying, and configuring AWS applications per the solution architecture using CI/CD automation.
- Mentoring junior developers and supporting them in their day-to-day work.
- Providing input to the AWS solution architecture and driving its implementation with the team.
- Creating automation to manage and monitor AWS environments for cost overruns, security vulnerabilities and governance requirements.
- Troubleshooting production issues and providing automated fixes, where possible, to reduce manual burden.
- Creating and managing centralized logging and monitoring infrastructure on AWS.