High Performance Computing (HPC) System Administrator

CentOS, Red Hat Linux, Configuration management, InfiniBand, Network management, HPC, Linux administration, System administration, IBM GPFS, NFS, Linux, Puppet, Cloud, Data storage, Ansible, Job scheduling, Distributed computing, Systems engineering, High performance computing
Full Time
Work from home not available; travel not required

Job Description

RedLine Performance Solutions (RedLine) has been in the HPC solutions engineering services business for 21 years and deliberately sets a high bar of excellence for new hires. This enables RedLine to accomplish what other firms cannot and promotes a high level of staff retention. We offer services ranging from full life cycle HPC systems engineering to remote managed services to HPC program analysis.

RedLine is looking for High Performance Computing (HPC) System Administrators to join us in Phoenix, AZ and Manassas, VA. This position will work on the National Oceanic and Atmospheric Administration's (NOAA) Weather and Climate Operational Supercomputing System II (WCOSS II). The HPC System Administrator will be an experienced individual with a strong background in Linux, HPC, configuration management, systems automation, and networking.

United States citizenship and the ability to obtain a Public Trust security clearance are mandatory requirements for this position. The positions are full-time and onsite in either Phoenix, AZ or Manassas, VA. Preference is for local candidates, but we will consider relocation as well. This full-time position offers a full benefits package including paid time off, 401(k) match, and health care benefits.

Job Responsibilities:

  • Work with systems staff to enhance configuration management infrastructure
  • Evaluate performance impacts of planned operating system changes
  • Update and expand existing systems monitoring capabilities
  • Develop automation tools for cluster administration
  • Participate in resource optimization and in configuring job scheduling software and policies
  • Provide technical support to researchers using HPC resources, troubleshoot problems and develop appropriate computational strategies
  • Consult and collaborate with scientist coworkers to determine the best system configurations for applications

Other Requirements:

  • Minimum of 5 years of Red Hat or CentOS Linux system administrator experience in an HPC environment
  • Experience with batch systems such as SLURM or PBS
  • Experience managing parallel and cluster file systems such as NFS, GPFS, or Lustre
  • Network management experience, including in an HPC context (e.g., InfiniBand, Omni-Path)
  • Demonstrated ability to configure, deploy and manage a major system area such as batch system, network, data storage, backup system, database system, or distributed computing
  • Ability to provide leadership and technical expertise to improve HPC cluster performance and resiliency
  • Ability to work both independently and as part of the team; flexibility in dealing with assignments and in working on several projects simultaneously
  • Ability to communicate effectively with people of diverse backgrounds and levels of computer knowledge

Preferred Skills:

  • Prior experience with configuration management tools, such as Ansible and/or Puppet
  • Experience integrating applications with cloud provider software stack
  • Experience presenting and/or teaching
Dice Id : 10115922
Position Id : 6336465
Originally Posted : 2 months ago