Senior Azure SRE Consultant_ONSITE@Seattle, WA

Overview

On Site
Depends on Experience
Contract - W2
Contract - Independent

Skills

Azure SRE
HPC
MPI
Linux
Microsoft Windows
Microsoft Operating Systems
Microsoft Azure
Orchestration

Job Details

Role : Senior Azure SRE Consultant

Location : Seattle, WA

Onsite

Duration : Long Term Contract

Job Description

We are seeking a highly skilled Service Engineer having 7 to 10 Years of experience with strong infrastructure expertise and a deep understanding of High-Performance Computing (HPC) environments. The ideal candidate will have proven experience in managing complex systems, driving third-party application integration, and ensuring seamless operations across distributed computing platforms. This role requires exceptional problem-solving ability, strong communication skills, and a proactive approach to onboarding and supporting HPC workloads.

Required Skills & Qualifications

  • Development Experience: Minimum 3 years in software development (PowerShell, Azure Bash, Go or Python).
  • Administration Experience: 7+ years in system administration with a focus on infrastructure in PLM and HPC environments.
  • Very Strong Communication (Written and Oral), Collaboration, driving skills.
  • Proactively managing the backlog and bringing efficiencies through solution design and development.

Preferred Attributes

  • Passion for HPC technologies and infrastructure optimization.
  • Ability to learn and adapt to new tools and frameworks quickly.
  • Strong analytical and problem-solving mindset.
  • Collaborative approach with cross-functional teams.

Key Responsibilities

  • Drive integration and operational support for third-party applications on HPC and PLM infrastructure.
  • Manage and maintain Windows Server and Linux OS environments with VM and VMSS.
  • Configure and optimize HPC schedulers such as Windows HPC Pack, OpenPBS, Slurm, or similar.
  • Support distributed computing workloads, including MPI-based applications.
  • Troubleshoot and optimize networking components, focusing on TCP/IP, name resolution, and RDMA.
  • Implement and manage Azure Cycle Cloud and Azure Batch for HPC workload orchestration.
  • Collaborate with application owners to understand infrastructure dependencies and optimize performance.
  • Ensure compliance with security and operational best practices across all systems
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.