HPC Consultant

Apex Systems
Systems, IT, Consultant, Linux, Interface, Manager, Python, Networking, Graphics, Development, Software, Programming
Full Time

Job Description

Apex Systems, the nation's second-largest IT staffing firm, has an immediate opportunity for an HPC Consultant to support one of our top clients remotely. Please find the details below.

If interested please email your resume and best phone number to Diana McDermott at for consideration

HPC Consultant Overview

Start Date: ASAP

Contract Length: Long term rolling contract

Location: 100% Remote

Pay: Negotiable based on experience

Responsibilities:

What you'll achieve:

• The prospective candidate should possess experience with HPC service delivery. Primary experience should be within the HPC domain, specifically with head and compute node clusters, InfiniBand and Omni-Path fabrics, Ethernet networking, SUSE Linux Enterprise Server, GPUs, and related HPC technologies, tools, and software products. The role involves integrating and testing varied software applications to verify system performance and stability, and developing and maintaining automation of system configuration using scripting languages and REST APIs.

You will:

• Be a part of the integration of hardware and software for HPC service delivery

• Interface and provide support to end users and application focals

• Escalate break/fix troubleshooting and issue resolution

• Contribute to improving delivery quality and optimization

Essential Requirements:

• Knowledge of HPC systems, architecture, deployment, and operation

• Experience with batch schedulers including PBSPro

• Familiarity with the Linux operating system (SLES)

Desirable Requirements:

• InfiniBand and Omni-Path fabric expertise and knowledge of MPI

• Experience with cluster management software including Bright Cluster Manager

• Experience with Python and Ansible for automating system configurations

• Knowledge of remote graphics technologies including NICE DCV

Day to Day:

• Provide support to end users and application focals for all issues encountered in the HPC service

• Troubleshoot and resolve HPC hardware issues

• Maintain operational documentation and Ansible automation playbooks

• Maintain configuration of Bright Cluster Manager, Altair PBSPro, NICE DCV, and SLES

• Linux Systems Administration - familiar with SUSE Linux, good general system practices, awareness of ramifications of changes at scale, operating system tuning

• Networking - familiar with InfiniBand and Omni-Path hardware/switches and associated software stacks (OFED, Subnet Managers) and tools for debugging fabric issues

• Parallel file systems - Panasas storage management and troubleshooting, Samba configurations, NFS

• Accelerators - GPUs and NVIDIA toolsets, awareness of workload interaction and NVLink/RDMA software setup

• PBS Professional - familiar with queue and qmgr configuration, job maintenance, queueing system startup, shutdown, and backup, and Python hook scripting

• Remote Graphics - familiar with remote graphics concepts in non-virtualized and virtualized environments (NICE DCV), and operating system setup for X11/desktops

• Development Software - ability to use compilers from different vendors (open-source, Intel, NVIDIA/PGI) to compile codes from source, build dependencies, and build maintainable distributions with Environment Modules

• Parallel Software Stacks - familiar with MPI concepts and debugging with different vendor stacks (Open MPI/Intel MPI)

• Python - familiar with building modules, using pip, and maintaining multiple Python distributions with virtualenv, from both source and vendor-provided distributions

• Programming - able to craft solutions based on engineering code/workflow requirements that may require extension of resource manager, operating system, or software packages

EEO Employer

Apex Systems is an equal opportunity employer. We do not discriminate or allow discrimination on the basis of race, color, religion, creed, sex (including pregnancy, childbirth, breastfeeding, or related medical conditions), age, sexual orientation, gender identity, national origin, ancestry, citizenship, genetic information, registered domestic partner status, marital status, disability, status as a crime victim, protected veteran status, political affiliation, union membership, or any other characteristic protected by law. Apex will consider qualified applicants with criminal histories in a manner consistent with the requirements of applicable law. If you have visited our website in search of information on employment opportunities or to apply for a position, and you require an accommodation in using our website for a search or application, please contact our Employee Services Department at or .


Company Information

Apex Systems is a world-class technology services business that incorporates industry insights and experience to deliver solutions that fulfill our clients’ digital visions. We provide a continuum of service from workforce mobilization and modern enterprise solutions to digital innovation to drive better results and bring more value to our clients. Apex transforms our customers with modern enterprise solutions tailored to the industries we serve. Apex has a presence in over 70 markets across the US, Canada, and Mexico. Apex is a segment of ASGN Inc. (NYSE: ASGN).
Dice Id : apexsan
Position Id : BHJOB2374_1194995
Originally Posted : 2 months ago
