HPC Support Engineer

Charlottesville, VA, US • Posted 6 hours ago • Updated 6 hours ago
Full Time
On-site

Job Details

Skills

  • High Performance Computing
  • Remote Direct Memory Access
  • InfiniBand
  • Computational Science
  • GRID
  • Workflow
  • CPU
  • Scheduling
  • Red Hat Enterprise Linux
  • Parallel Computing
  • Programming Languages
  • C
  • C++
  • Fortran
  • Computer Networking
  • GNU Compiler Collection
  • LLVM
  • Resource Allocation
  • Writing
  • Science
  • Security Clearance
  • Command-line Interface
  • Linux
  • Scripting
  • Bash
  • Python
  • Management
  • HPC
  • MPI
  • OpenMP
  • GPU
  • CUDA
  • Research
  • DoD
  • Internal Communications
  • IC
  • Integrated Circuit
  • Information Technology
  • Systems Engineering
  • FOCUS

Summary

Job ID: 2610673

Location: Charlottesville, VA, US

Date Posted: 2026-03-26

Category: Engineering and Sciences

Subcategory: Systems Engineer

Schedule: Full-Time

Shift: Day Job

Travel: No

Minimum Clearance Required: Top Secret

Clearance Level Must Be Able to Obtain: TS/SCI

Potential for Remote Work: On-site

Description

SAIC is looking for a highly qualified HPC Support Engineer to support the Army's Golden Dome initiative. The engineer will support users executing workloads within Linux-based High Performance Computing (HPC) cluster environments used for distributed compute workloads, simulation environments, and GPU-enabled processing.

The environment will include:
  • multi-node Linux compute clusters
  • workload scheduling platforms such as Slurm or PBS
  • distributed parallel compute workloads utilizing MPI or OpenMP
  • GPU-enabled compute resources supporting CUDA-based processing
  • high-performance networking technologies including RDMA / InfiniBand

The system will be used to support scientific computing, simulation workloads, and other distributed compute operations within a secure research environment.

Candidates should be comfortable working within cluster-scale computing environments where performance, scheduler configuration, and distributed workload execution are critical operational factors.

The HPC Support Engineer will assist users executing computational workloads within HPC cluster environments.

The role focuses on:
  • supporting distributed compute workloads
  • troubleshooting job execution issues
  • assisting users with scheduler job submission scripts
  • identifying workload performance bottlenecks
  • supporting GPU-enabled workloads
  • promoting efficient cluster utilization and HPC best practices


Candidates should have experience working with distributed compute workloads and Linux-based HPC environments.

Core Technical Capabilities

Candidates should demonstrate capability in most of the following areas.

HPC Workload Execution

Experience supporting execution of distributed workloads on HPC cluster platforms.

Candidates should understand how compute workloads interact with cluster schedulers, compute nodes, and distributed resources.

Workload Scheduling Platforms

Experience executing and troubleshooting workloads using schedulers such as:
  • Slurm
  • PBS / PBS Pro
  • Torque
  • Grid Engine


Candidates should understand job submission workflows and resource allocation concepts such as CPU, memory, and GPU scheduling.

Candidates should be comfortable reading and troubleshooting scheduler job submission scripts used to execute distributed workloads.

Linux Systems Usage

Strong Linux experience including:
  • command-line system usage
  • execution of compute workloads within Linux environments
  • troubleshooting application execution issues

Experience with RHEL-based environments is preferred.

Distributed Compute Workloads

Experience supporting distributed workloads utilizing parallel computing frameworks such as:
  • MPI
  • OpenMP


Experience supporting the compilation and execution of scientific or engineering applications within Linux HPC environments.

Familiarity with common HPC programming languages, including:
  • C/C++
  • Fortran


Candidates should understand how compiled applications interact with scheduler configuration, compute resources, cluster networking, and distributed runtime environments.

Experience troubleshooting application build or runtime issues related to compiler configuration, library dependencies, or MPI environments is desirable.

Familiarity with common HPC compiler toolchains such as GCC, Intel, or LLVM-based compilers is desirable.

GPU Compute Workloads

Experience executing or supporting workloads utilizing GPU-enabled compute environments and CUDA frameworks is desirable.

Performance Troubleshooting

Ability to identify issues affecting workload execution including:
  • inefficient resource allocation
  • scheduler configuration issues
  • application execution failures
  • distributed compute performance bottlenecks


Automation and Operational Tooling

Experience writing scripts or tooling using languages such as:
  • Bash
  • Python

Automation experience supporting workload execution or operational tasks is beneficial.

Qualifications

Candidates must meet the following requirements:
  • Bachelor's degree in a science or technology field; 4 additional years of experience may be substituted for the degree
  • 8+ years of experience is required
  • Minimum 5 years of experience working in Linux environments supporting distributed compute workloads or HPC cluster platforms
  • An active Top Secret clearance is required; an active TS/SCI clearance must be obtained before beginning work
  • 100% onsite support in Charlottesville, VA
  • Experience executing or troubleshooting workloads using HPC workload schedulers such as Slurm, PBS, Torque, or similar systems
  • Experience using command-line Linux environments
  • Experience with scripting or automation tools (Bash, Python, or similar)
  • Ability to obtain required DoD 8140 (8570) IAT Level II certification
  • Candidates must have direct experience working with HPC or distributed compute workloads.

Candidates with the following experience are strongly preferred:
  • Experience supporting HPC cluster environments used for distributed compute workloads
  • Experience executing or troubleshooting MPI or OpenMP workloads
  • Experience supporting GPU-enabled workloads and CUDA frameworks
  • Experience supporting scientific or engineering compute applications
  • Experience supporting research, laboratory, or mission computing environments
  • Experience supporting systems within DoD/DoW or IC environments



Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 10111346
  • Position Id: 2610673

Company Info

About SAIC

SAIC® is a premier Fortune 500 mission integrator focused on advancing the power of technology and innovation to serve and protect our world. Our robust portfolio of offerings across the defense, space, civilian and intelligence markets includes secure high-end solutions in mission IT, enterprise IT, engineering services and professional services. We integrate emerging technology, rapidly and securely, into mission critical operations that modernize and enable critical national imperatives.

We are approximately 24,000 strong; driven by mission, united by purpose, and inspired by opportunities. Headquartered in Reston, Virginia, SAIC has annual revenues of approximately $7.5 billion. For more information, visit saic.com. For ongoing news, please visit our newsroom.
