Senior HPC Linux System Administrator

Overview

On Site

Compensation information provided in the description

Full Time

Skills

Service Operations

Public Health

Scientific Research

Data Analysis

Security Clearance

Adobe Flash

Storage

Amazon S3

Resource Management

VMware vSphere

Provisioning

Computer Hardware

Software Implementation

System Monitoring

Performance Tuning

Project Planning

Status Reports

Configuration Management

Puppet

Ansible

Collaboration

Technical Support

Workflow

Systems Architecture

Technical Drafting

Optimization

Team Leadership

Access Control

Regulatory Compliance

HIPAA

Backup

Disaster Recovery

Computer Science

System Administration

High Performance Computing

Linux

Performance Monitoring

Computational Science

Soft Skills

Conflict Resolution

Problem Solving

Communication

Bioinformatics

Network

Routers

Switches

Incident Management

Security Architecture

Patch Management

Vulnerability Management

Risk Management

Information Assurance

Security Analysis

Authorization

Documentation

VMware

Virtual Machines

Leadership

Mentorship

Computer Cluster Management

HPC

Bash

Python

Scripting

Docker

Kubernetes

Research

Infrastructure Architecture

Red Hat Certified Engineer

Red Hat Linux

Red Hat Certified Architect

Computer Networking

TCP/IP

UDP

HTTP

DHCP

DNS

Dragon NaturallySpeaking

Network Design

Management

LAN

WAN

Virtual Private Network

Migration

Amazon Web Services

Microsoft Azure

Cloud Computing

Recruiting

Market Analysis

Law

Job Details

Job Description

Description

The Public Health and Human Services Operation of Leidos is seeking a Senior HPC Linux System Administrator to lead a team of system administrator professionals in managing a high-performance computing (HPC) infrastructure used by public health researchers and scientists. This senior-level position requires extensive Linux expertise combined with a deep understanding of the specialized hardware, software, and networking required for scientific research and large-scale data analysis.

Candidate MUST:

be located in the Atlanta, GA area for partial onsite work

be a US Citizen with the ability to obtain a Public Trust Clearance

The candidate provides secure, always-on infrastructure services that give researchers access to customer-sponsored data hosted on premises and in the cloud, along with secure access to high-performance computing resources for scientific research.

High-performance computing infrastructure management: Deploy, administer, and monitor HPC clusters. Manage multiple petabytes of data using Pure Storage flash storage and AWS S3 Glacier.
Software and resource management: Install, maintain, and upgrade scientific software, libraries, and batch schedulers such as GridEngine and Slurm. The role also involves developing effective processes and solutions for sharing resources across multiple research teams.
VMware: Manage the VMware vSphere Foundation for virtual server provisioning, deployment, and configuration, as well as hardware and software implementation and maintenance.
System operations: System monitoring, routine and ad hoc security patch management, troubleshooting, and performance tuning.
Project planning and coordination: Advise the customer and Project Manager in designing and documenting technical solutions. Support infrastructure projects from planning through coordinating team activities, executing planned activities, and providing status updates. Communicate and work collaboratively with internal and client team members across the program, and provide technical counsel and alternative designs, solutions, or processes to leadership.
Automation and scripting: Lead automation efforts to streamline system management tasks using scripting languages (Bash, Python) and configuration management tools (Puppet, Ansible).
Research collaboration: Work closely with scientists, bioinformatics developers, and principal investigators to understand their computational needs and translate scientific goals into technical configurations. This includes providing technical support to help optimize workflows.
System architecture and deployment: Lead the technical design, integration, and optimization of on-site HPC and cloud resources.
Mentorship and team coordination: Guide and mentor other system administrators on best practices for system administration and troubleshooting. Some roles involve managing a team of system administrators.
Security and compliance: Implement robust security measures, manage access controls, and design architectures that meet compliance standards such as HIPAA or NIST. Support the Security Assessment and Authorization (SA&A) process.
Disaster recovery and monitoring: Design and implement backup and disaster recovery plans. Integrate monitoring and alerting systems to ensure system availability and reliability.

REQUIRED EDUCATION AND EXPERIENCE

A Bachelor's degree in computer science or a related field, plus 10 years of System Administration experience.
Extensive experience (7+ years) designing and operating high-performance computing (HPC) infrastructure.
Linux expertise: Mastery of Linux systems and administration, including troubleshooting, security, performance monitoring, and various distributions (e.g., Red Hat, Ubuntu) to support scientific computing.
Soft skills: Strong problem-solving and communication skills are critical for collaborating with customers, bioinformatics developers, and researchers, and for leading a team. Experience working with a team to introduce and integrate new technologies and processes into existing production environments.

Network: Proficiency in working with applicable network devices, including routers, switches, gateways, and hubs.

Security: Develop the infrastructure deliverables, continuous diagnostics and mitigation, threat mitigation and incident response, security architecture support, critical infrastructure protection, patch management, vulnerability management, risk management, information assurance, and Security Assessment and Authorization (SA&A) documentation.
VMware: Experience managing VM infrastructure.
Leadership: Proven leadership in planning and coordinating infrastructure support activities and in leading and mentoring system administrators.
HPC and cluster management: Proven experience with HPC clusters, job schedulers (Slurm), and high-speed networking (10/40/100Gb)
Other technical skills: Proficiency in Bash and Python scripting for automation is essential. Experience with cloud technologies (hybrid-cloud integration) and container environments (e.g., Docker, Singularity, Kubernetes).

DESIRED QUALIFICATIONS:

A Master's degree in IT, engineering, or another relevant field.
Experience working at a federal government agency or a research organization.
Large scale infrastructure design and implementation project experience
Red Hat Certified Engineer (RHCE), Red Hat Certified Architect (RHCA), or equivalent certifications.
Experience with computer networking protocols including, but not limited to TCP, IP, UDP, HTTP, DHCP, and DNS. Understanding of network design and management - LAN, WAN, and VPN.
Experience optimizing cloud utilization patterns and supporting development, validation, operations, and security, including migration from an on-premises model to a hybrid model.
AWS or Azure Cloud engineer certification

If you're looking for comfort, keep scrolling. At Leidos, we outthink, outbuild, and outpace the status quo because the mission demands it. We're not hiring followers. We're recruiting the ones who disrupt, provoke, and refuse to fail. Step 10 is ancient history. We're already at step 30 and moving faster than anyone else dares.

Original Posting: September 29, 2025

For U.S. Positions: While subject to change based on business needs, Leidos reasonably anticipates that this job requisition will remain open for at least 3 days with an anticipated close date of no earlier than 3 days after the original posting date as listed above.

Pay Range: $89,700.00 - $162,150.00

The Leidos pay range for this job level is a general guideline only and not a guarantee of compensation or salary. Additional factors considered in extending an offer include (but are not limited to) responsibilities of the job, education, experience, knowledge, skills, and abilities, as well as internal equity, alignment with market data, applicable bargaining agreement (if any), or other law.

#Remote

#Featuredjob

