Senior HPC Linux System Administrator

  • Leidos
  • Decatur, GA
  • Posted 11 hours ago | Updated 11 hours ago

Overview

On Site
Compensation information provided in the description
Full Time

Skills

Service Operations
Public Health
Scientific Research
Data Analysis
Security Clearance
Storage
Amazon S3
Resource Management
VMware vSphere
Provisioning
Computer Hardware
Software Implementation
System Monitoring
Performance Tuning
Project Planning
Status Reports
Configuration Management
Puppet
Ansible
Collaboration
Technical Support
Workflow
Systems Architecture
Technical Drafting
Optimization
Team Leadership
Access Control
Regulatory Compliance
HIPAA
Backup
Disaster Recovery
Computer Science
System Administration
High Performance Computing
Linux
Performance Monitoring
Computational Science
Soft Skills
Conflict Resolution
Problem Solving
Communication
Bioinformatics
Network
Routers
Switches
Incident Management
Security Architecture
Patch Management
Vulnerability Management
Risk Management
Information Assurance
Security Analysis
Authorization
Documentation
VMware
Virtual Machines
Leadership
Mentorship
Computer Cluster Management
HPC
Bash
Python
Scripting
Docker
Kubernetes
Research
Infrastructure Architecture
Red Hat Certified Engineer
Red Hat Linux
Red Hat Certified Architect
Computer Networking
TCP/IP
UDP
HTTP
DHCP
DNS
Network Design
Management
LAN
WAN
Virtual Private Network
Migration
Amazon Web Services
Microsoft Azure
Cloud Computing
Recruiting
Market Analysis
Law

Job Details

Job Description

Description

The Public Health and Human Services Operation of Leidos is seeking a Senior HPC Linux System Administrator to lead a team of system administrator professionals in managing a high-performance computing (HPC) infrastructure used by public health researchers and scientists. This senior-level position requires extensive Linux expertise combined with a deep understanding of the specialized hardware, software, and networking required for scientific research and large-scale data analysis.

Candidate MUST:

be located in the Atlanta, GA area for partial onsite work

be a US Citizen with the ability to obtain a Public Trust Clearance

The candidate provides secure, always-on infrastructure services that give researchers access to customer-sponsored data hosted on premises and in the cloud, as well as secure access to high-performance computing resources for scientific research.

  • High-performance computing infrastructure management: Deploy, administer, and monitor HPC clusters. Manage multiple petabytes of data using Pure Storage flash storage and AWS S3 Glacier.
  • Software and resource management: Install, maintain, and upgrade scientific software, libraries, and batch schedulers such as GridEngine and Slurm. The role also involves developing effective processes and solutions for sharing resources across multiple research teams.
  • VMware: Manage the VMware vSphere Foundation for virtual server provisioning, deployment, and configuration, as well as hardware and software implementation and maintenance.
  • System operations: System monitoring, routine and ad hoc security patch management, troubleshooting, and performance tuning.
  • Project planning and coordination: Advise the customer and Project Manager on designing and documenting technical solutions. Support infrastructure projects from planning through coordinating team activities, executing planned activities, and providing status updates. Communicate and work collaboratively with internal and client team members across the program, and provide technical counsel and/or alternative designs, solutions, and processes to leadership.
  • Automation and scripting: Lead automation efforts to streamline system management tasks using scripting languages (Bash, Python) and configuration management tools (Puppet, Ansible).
  • Research collaboration: Work closely with scientists, bioinformatics developers, and principal investigators to understand their computational needs and translate scientific goals into technical configurations. This includes providing technical support to help optimize workflows.
  • System architecture and deployment: Lead the technical design, integration, and optimization of on-site HPC and cloud resources.
  • Mentorship and team coordination: Guide and mentor other system administrators on best practices for system administration and troubleshooting. Some roles involve managing a team of system administrators.
  • Security and compliance: Implement robust security measures, manage access controls, and design architectures that meet compliance standards such as HIPAA or NIST. Support the Security Assessment and Authorization (SA&A) process.
  • Disaster recovery and monitoring: Design and implement backup and disaster recovery plans. Integrate monitoring and alerting systems to ensure system availability and reliability.

REQUIRED EDUCATION AND EXPERIENCE

  • A Bachelor's degree in computer science or a related field, plus 10 years of System Administration experience.
  • Extensive experience (7+ years) designing and operating high-performance computing (HPC) infrastructure.
  • Linux expertise: Mastery of Linux systems and administration, including troubleshooting, security, performance monitoring, and various distributions (e.g., Red Hat, Ubuntu) to support scientific computing.
  • Soft skills: Strong problem-solving and communication skills are critical for collaborating with customers, bioinformatics developers, and researchers, and for leading a team. Experience working with a team to introduce and integrate new technologies and processes into existing production environments.
  • Network: Proficiency in working with applicable network devices, including routers, switches, gateways, and hubs.
  • Security: Ability to develop infrastructure deliverables covering continuous diagnostics and mitigation, threat mitigation and incident response, security architecture support, critical infrastructure protection, patch management, vulnerability management, risk management, information assurance, and Security Assessment and Authorization (SA&A) documentation.
  • VMware: Experience managing VM infrastructure.
  • Leadership: Proven leadership in planning and coordinating infrastructure support activities and in leading and mentoring system administrators.
  • HPC and cluster management: Proven experience with HPC clusters, job schedulers (Slurm), and high-speed networking (10/40/100 Gb).
  • Other technical skills: Proficiency in Bash and Python scripting for automation is essential. Experience with cloud technologies (hybrid-cloud integration) and container environments (e.g., Docker, Singularity, Kubernetes).

DESIRED QUALIFICATIONS:

  • A Master's degree in IT, engineering, or another relevant field.
  • Experience working at a federal government agency or a research organization.
  • Experience with large-scale infrastructure design and implementation projects.
  • Red Hat Certified Engineer (RHCE), Red Hat Certified Architect (RHCA), or equivalent certifications.
  • Experience with computer networking protocols including, but not limited to, TCP, IP, UDP, HTTP, DHCP, and DNS. Understanding of network design and management for LAN, WAN, and VPN.
  • Experience optimizing cloud utilization patterns and supporting development, validation, operations, and security, including migration from an on-premises model to a hybrid model.
  • AWS or Azure cloud engineer certification.

If you're looking for comfort, keep scrolling. At Leidos, we outthink, outbuild, and outpace the status quo because the mission demands it. We're not hiring followers. We're recruiting the ones who disrupt, provoke, and refuse to fail. Step 10 is ancient history. We're already at step 30 and moving faster than anyone else dares.

Original Posting: September 29, 2025

For U.S. Positions: While subject to change based on business needs, Leidos reasonably anticipates that this job requisition will remain open for at least 3 days with an anticipated close date of no earlier than 3 days after the original posting date as listed above.

Pay Range: $89,700.00 - $162,150.00

The Leidos pay range for this job level is a general guideline only and not a guarantee of compensation or salary. Additional factors considered in extending an offer include (but are not limited to) responsibilities of the job, education, experience, knowledge, skills, and abilities, as well as internal equity, alignment with market data, applicable bargaining agreement (if any), or other law.
