Google Cloud Platform Cloud - Devops Engineer (HPC) - Canada Remote
Remote • Posted 18 hours ago • Updated 1 hour ago

Merican Inc
Job Details
Skills
- Big Data
- Python
- scripting
- Terraform
- shell scripting
- Linux
- gcp
- Docker
- NETWORKING
- PERFORMANCE TUNING
- Problem-Solving
- Deployment
- Continuous Integration/Delivery
- DEV OPS
- Migrations
- Operations
- Data Management
- Resource Utilization
- Compute Engine
- JOB Scheduling
- Maintenance
- ONYX
- VMS
- High Performance Computing
- File Systems
- Virtual Machine
- MPI
- VM
- HPC
- Scheduling
Summary
Job Title: Google Cloud Platform Cloud - Devops Engineer (HPC)
Location: Canada (Remote)
Google Cloud Platform DevOps HPC Engineer - Onyx
About the Role
As a Senior DevOps-HPC Engineer, you will join a dynamic Engineering team in a high-energy and collaborative environment. This role is ideal for a seasoned HPC engineer with deep expertise in SLURM, Linux, and cloud migrations, who thrives on leading complex projects, designing robust architectures, and implementing high-performance solutions in Google Cloud.
Responsibilities:
- Lead the migration of on-premises SLURM-based HPC clusters to Google Cloud Platform.
- Design, implement, and manage scalable and secure HPC infrastructure solutions on Google Cloud Platform.
- Optimize SLURM configurations and workflows to ensure efficient use of cloud resources.
- Manage and optimize HPC environments, focusing on workload scheduling, job efficiency, and scaling SLURM clusters.
- Automate cluster deployment, configuration, and maintenance tasks using scripting languages (Python, Bash) and automation tools (Ansible, Terraform).
- Integrate the HPC software stack using tools such as Spack for dependency management and straightforward installation of HPC libraries and applications.
- Deploy, manage, and troubleshoot applications using MPI, OpenMP, and other parallel computing frameworks on Google Cloud Platform instances.
- Collaborate with engineering teams, support teams, and stakeholders to ensure smooth migration and ongoing operation of HPC workloads.
- Provide expert-level support for performance tuning, job scheduling, and cluster resource optimization.
- Stay current with emerging HPC technologies and Google Cloud Platform services to continually improve HPC cluster performance and cost efficiency.
Requirements:
Basics:
- Minimum 5 years of experience with HPC environments, including SLURM workload manager, MPI, and other HPC-related software.
- Extensive hands-on experience managing Linux-based systems, including performance tuning and troubleshooting in an HPC context.
- Proven experience migrating and managing SLURM clusters in cloud environments, preferably Google Cloud Platform.
- Proficiency with automation tools such as Ansible and Terraform for cluster deployment and management.
- Experience with Spack for managing and deploying HPC software stacks.
- Strong scripting skills in Python, Bash, or similar languages for automating cluster operations.
- In-depth knowledge of Google Cloud Platform services relevant to HPC, such as Compute Engine (GCE), Cloud Storage, and VPC networking.
- Strong problem-solving skills with a focus on optimizing HPC workloads and resource utilization.
Recommended:
- Google Cloud Professional DevOps Engineer or similar Google Cloud Platform certifications.
- Familiarity with Google Cloud Platform offerings relevant to HPC, such as Preemptible VMs and HPC VM images, and with related cost-optimization strategies.
- Experience with performance profiling and debugging tools for HPC applications.
- Advanced knowledge of HPC data management strategies, including parallel file systems and data transfer tools.
- Understanding of container technologies (e.g., Singularity, Docker) specifically within HPC contexts.
- Experience with Spark or other big data tools in an HPC environment is a plus.
- Dice Id: RTX1ca091
- Position Id: 2026-2005
Company Info
With a diverse group of skilled and committed consultants, technology companies, businesses, and digital experts, we offer subject expertise, a distinctive client service approach, and a best-in-class global delivery model suited to our clients' business demands. We implement future-oriented solutions for our clients by investing in people, solutions, technologies, competencies, and infrastructure.
Our full-service offering and flexible delivery options, which balance cost, quality, effort, and risk, are focused on addressing our customers' future challenges. Our worldwide delivery model lets customers select the supply plan that suits them best.