Job Details
- Collaborates with a diverse set of engineers, architects, and teams to design, develop, test, and implement secure, robust, highly available and scalable solutions for Client's External Cloud Platform
- Collaborates with other software engineers and teams to design and implement deployment approaches using highly scalable, automated, continuous integration and continuous delivery pipelines.
- Responsible for all aspects of reliability; collaborates with technical experts, key stakeholders, and team members to resolve complex problems, owning each issue until it is certain not to recur.
- Has a deep understanding of SRE practices, service level indicators, and service level objectives, and proactively uses them to resolve issues before they impact customers.
location: Jersey City, New Jersey
job type: Contract
salary: $63.68 - $73.68 per hour
work hours: 8am to 5pm
education: Bachelor's
responsibilities:
- Gather, analyze, synthesize, and develop visualizations and reporting from large, diverse data sets in service of continuous improvement of the platform.
- Implement infrastructure, configuration, and network as code for the applications and platforms in your remit.
- Identify opportunities to eliminate toil and automate the triage of issues to improve overall operational stability.
- Collaborate with others to identify, analyze, and resolve platform vulnerabilities.
- Proactively promote the adoption of site reliability engineering best practices within the team and organization.
- Participate in 24x7 on-call coverage under a follow-the-sun model and perform blameless postmortems (RCAs) as needed.
qualifications:
- 15 years of combined experience in SRE, software development, or infrastructure engineering (10 years with an advanced degree in Computer Science or a related technical field).
- 7+ years of hands-on experience building and maintaining cloud platforms on a major cloud service provider.
- Strong experience in implementing, monitoring, and maintaining a highly scalable and resilient Data Services platform on Amazon Web Services.
- Strong experience with monitoring tools such as Grafana, Prometheus, Splunk, or Dynatrace, as well as cloud-native tools such as AWS CloudWatch and CloudTrail, and Azure Monitor and Log Analytics.
- Proficiency in implementing, monitoring, and maintaining a Databricks, RDS, or OpenAI platform.
skills:
- Proficient in at least one programming language such as Python, Java/Spring Boot, or .NET; 5+ years of applied experience in Python/Java.
- Proficiency in implementing CI/CD pipelines with tools such as Git and Jenkins; familiarity with a GitOps model.
- Strong experience working with a complex IAM infrastructure, including Active Directory, Azure AD Connect, Azure AD, and SSO solutions such as PingIdentity or Okta.
- Proficiency in creating automation using Python, Terraform, or Ansible.
- Proficiency in implementing, monitoring, and maintaining a Databricks, CosmosDB, or OpenAI platform.
- Experience in implementing, monitoring, and maintaining a highly scalable and resilient enterprise platform on Microsoft Azure using native services related to compute, storage, networking, security, and observability.
- Experience with compute and containerization technologies such as EC2, EKS, Fargate, OpenShift, or Kubernetes.
- Understanding of cost management, inventory management, and the FinOps model.
Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.
At Randstad Digital, we welcome people of all abilities and want to ensure that our hiring and interview process meets the needs of all applicants. If you require a reasonable accommodation to make your application or interview experience a great one, please contact
Pay offered to a successful candidate will be based on several factors including the candidate's education, work experience, work location, specific job duties, certifications, etc. In addition, Randstad Digital offers a comprehensive benefits package, including: medical, prescription, dental, vision, AD&D, and life insurance offerings, short-term disability, and a 401K plan (all benefits are based on eligibility).
This posting is open for thirty (30) days.