Overview
Remote
Depends on Experience
Contract - Independent
Contract - W2
Contract - 1 Year(s)
Skills
HPC
High-Performance Computing
AWS
DevOps
GitHub
PBS
Portable Batch System
Python
RHEL
Rocky Linux 8
CI/CD
EC2
CodeDeploy
Job Details
HPC DevOps Engineer - SaltStack | AWS | PBS
100% Remote
Contract
Direct Client
Skills:
- SaltStack (must-have skill)
- GitHub
- AWS
- DevOps
- PBS (Portable Batch System) & Python
Schedule: Flexible, first shift. Must be available for emergencies.
Evening and weekend hours only in an emergency.
Interview: Depends on the total number of candidates; likely one round.
Support and enhance our High-Performance Computing (HPC) environment in collaboration with IT and R&D stakeholders.
This role involves managing and optimizing supercomputing infrastructure, ensuring reliable system administration, and driving automation and cloud integration efforts.
Key responsibilities include:
- Administering and maintaining HPC systems, including accelerated nodes and node imaging, with a strong focus on RHEL 8 and Rocky Linux 8 environments.
- Supporting HPC job submission workflows and workload managers, preferably PBS, to ensure efficient resource utilization.
- Installing and configuring scientific and engineering software applications and managing environment modules.
- Designing and implementing networking and enterprise storage solutions to support high-throughput computing workloads.
- Utilizing configuration and automation tools such as SaltStack, Packer, and GitHub Actions to streamline system provisioning and CI/CD workflows.
- Managing cloud-based HPC resources using AWS services including EC2, FSx, EFS, CloudFormation, Route 53, and DevOps tools like CodeBuild and CodeDeploy.
- Developing and maintaining scripts in Bash and Python to automate tasks and improve system reliability.
- Applying best practices in version control and code deployment using GitHub.
Skills:
- High-Performance Computing (HPC) & System Administration
  - Strong background in supercomputing and HPC system architecture
  - Experience with accelerated nodes and node imaging
  - Proficient in system administration for RHEL 8 and Rocky Linux 8
  - Skilled in HPC job submission workflows and workload managers (preferably PBS)
  - Hands-on experience with software application installation and environment configuration
  - In-depth knowledge of networking and enterprise storage solutions
- Configuration & Automation Tools
  - SaltStack for configuration management
  - Packer for image creation and automation
  - GitHub Actions for CI/CD workflows
- Cloud Computing (AWS)
  - Proficient with core AWS services: EC2, CloudFormation, FSx, EFS, Route 53
  - Experience with AWS DevOps tools: CodeBuild and CodeDeploy
  - Solid understanding of AWS networking and infrastructure best practices
- Scripting & Version Control
  - Strong scripting skills with Bash and Python
  - Experience managing GitHub repositories with best practices for code deployment and version control
Education:
Bachelor's degree from an accredited university or equivalent professional experience in a related field.