Overview
On Site
$60 - $80
Contract - W2
Contract - Independent
Contract - 6 Month(s)
Skills
Amazon Web Services
Bash
Cabling
HPC
Hardware Support
GPU
Lenovo
Linux
Linux Administration
Technical Support
Red Hat Enterprise Linux
InfiniBand
Google Cloud Platform
High Performance Computing
Storage
Good Clinical Practice
System Administration
Microsoft Azure
IT Infrastructure
Computer Hardware
Cloud Computing
Backup Administration
Ansible
Job Details
Our client is seeking a Senior IT Systems Administrator / Site Operations Engineer to serve as the primary technical presence supporting high-performance computing (HPC) infrastructure at its facility in Skokie, IL. This individual will play a hands-on role in the setup, maintenance, and ongoing support of a new HPC environment built on Lenovo hardware, InfiniBand networking, and Linux-based systems.
The ideal candidate brings strong Linux system administration experience in enterprise or HPC environments, confidence working with physical server infrastructure, and a proactive, self-driven approach to IT support.
Key Responsibilities:
Serve as the on-site technical point of contact for the HPC infrastructure, including compute, GPU, head, and worker nodes.
Perform hands-on setup, configuration, and maintenance of servers, storage arrays, and networking equipment (including physical cabling).
Work closely with Lenovo's hardware support team and internal stakeholders to ensure optimal performance and uptime.
Support general IT operations for the site, including desktop, network, and basic server troubleshooting.
Act as a proactive technology advocate for the institute, including interactions with executive leadership when needed.
Monitor and support a large storage array used for HPC backups.
Assist in scaling the environment as research or compute demands grow.
Technical Environment:
Lenovo compute infrastructure (1 head node, 10 worker nodes, 2 GPU nodes)
Storage array for HPC data and backups
InfiniBand interconnects for high-speed data transfer
Linux (preferred: RHEL/CentOS/Ubuntu) in an HPC context
Physical data center/server room cabling and rack management
Qualifications:
9+ years of experience in systems administration, IT infrastructure, or SiteOps roles.
Strong Linux administration skills with experience in HPC or enterprise compute environments.
Comfortable with physical server work (rack & stack, cabling, power, cooling).
Solid understanding of enterprise networking and storage, with hands-on experience with switches, cabling, and storage devices.
Familiarity with NUMA architecture and its performance implications is a strong plus.
Prior exposure to InfiniBand networking or other high-speed interconnects preferred (training available).
Experience with high-performance workloads, whether on-premises or cloud-based (AWS/Google Cloud Platform/Azure).
Self-starter with the ability to manage IT incidents independently.
Comfortable engaging with senior leadership and cross-functional teams.
Lab or research environment exposure is helpful but not required.
Nice to Have:
Exposure to Lenovo HPC systems
Experience with job schedulers such as Slurm or PBS
Scripting experience in Bash or Python for automation tasks
Familiarity with configuration management tools (Ansible, Puppet, etc.)