Overview
Remote
$40 - $45
Full Time
Skills
Ansible
C++
Golang
HPC
IBM
GitLab
Jenkins
Kubernetes
Linux
Operating Systems
Programming Languages
Python
Terraform
Job Details
Title: AI Site Reliability Engineer
Location: 100% Remote
Any visa is fine
Requirements include:
- Experience deploying and administering NVIDIA DGX or equivalent high-performance computing (HPC) clusters (e.g., Cray, HPE, IBM).
- 5+ years administering and supporting Linux-based operating systems.
- Experience writing code in general-purpose programming languages such as Python, Golang, and C/C++, and using Git and CI/CD systems (e.g., GitLab, GitHub Actions, Jenkins).
- Experience deploying enterprise-grade Kubernetes clusters (Red Hat OpenShift preferred) and/or Google Anthos.
- Sophisticated knowledge of Kubernetes, Docker, Terraform, Ansible, Jenkins, GitOps, Git, and Linux.
- Experience across the full software development lifecycle (design, development, testing, packaging, and deployment) using Python or Golang.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.