Platform & HPC Data Engineer

Overview

On Site
USD 107,000.00 - 229,000.00 per year
Full Time

Skills

DoD
Security Clearance
Data Engineering
Scalability
Organizational Skills
Auditing
Workflow Optimization
Data Flow
Resource Allocation
Performance Metrics
Data Governance
Encryption
Access Control
Scripting
Python
Bash
Perl
System Monitoring
Data Integration
Collaboration
Data Storage
Reporting
Documentation
Data Management
Computer Science
Information Technology
Management
File Systems
Workflow
Distributed File System
IBM Spectrum Scale (GPFS)
HPC
NetApp
Dell EMC
Performance Tuning
Capacity Management
Metadata Management
Regulatory Compliance
Computer Networking
InfiniBand
Remote Direct Memory Access
Storage
Optimization
rsync
NFS
Continuous Integration

Job Details

Location: Dranesville, VA
Salary: $107,000.00 - $229,000.00 USD annually
Description: Our client is currently seeking a Platform & HPC Data Engineer with an active DoD TS/SCI security clearance.

This job will have the following responsibilities:
  • Platform and HPC Data Engineering: Design and implement data management systems and architectures for HPC platforms, focusing on optimizing data flow, storage, and access in large-scale computing environments.
  • File System Management: Oversee the configuration, maintenance, and optimization of distributed file systems (e.g., Lustre, IBM Spectrum Scale (GPFS), NFS) and storage solutions used in HPC environments to ensure efficient performance, scalability, and reliability.
  • Data Labeling and Tagging: Implement and manage metadata-driven systems for data labeling/tagging. This includes the development of strategies for classifying, indexing, and organizing datasets to enhance data discoverability, access control, and auditing.
  • Storage Appliance Configuration: Configure and maintain various storage appliances (e.g., NetApp, Dell EMC, HPE) and integrated storage solutions. Ensure that storage devices are optimized for performance, capacity, and availability within the HPC ecosystem.
  • Data Integration and Workflow Optimization: Integrate data storage and management systems with HPC clusters, ensuring seamless data flow between compute nodes and storage appliances. Optimize data pipelines to support high-throughput workloads and minimize bottlenecks in I/O performance.
  • Performance Tuning: Monitor and improve the performance of storage systems, focusing on I/O throughput, latency, and efficient resource allocation. Use performance metrics to guide optimizations across storage appliances and file systems.
  • Security and Compliance: Implement security best practices for data access, protection, and management, ensuring compliance with government regulations and internal data governance policies. Configure encryption, access control, and secure data sharing methods.
  • Automation and Scripting: Develop and maintain automation scripts (e.g., using Python, Bash, or Perl) to streamline storage configurations, data labeling/tagging, and system monitoring tasks. Automate processes related to data integration and HPC platform management.
  • Collaboration and Support: Work closely with data scientists, HPC administrators, software developers, and other technical staff to support ongoing projects. Provide expertise in troubleshooting data storage issues and ensuring optimal system performance.
  • Documentation and Reporting: Maintain thorough documentation for storage configurations, file system setups, data labeling/tagging procedures, and performance optimization strategies. Provide regular reports on system health, data management processes, and any improvements made.

Qualifications & Requirements:
  • Education: Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field. A Master's degree or higher is a plus.
  • 7+ years of experience in managing data infrastructure in HPC environments, with expertise in file systems, storage appliances, and data workflows.
  • Hands-on experience with distributed file systems, including Lustre, IBM Spectrum Scale (GPFS), NFS, and others commonly used in HPC settings.
  • Proven experience with storage appliance configuration (e.g., NetApp, Dell EMC, HPE, or similar systems), including performance tuning, capacity management, and reliability.
  • Strong experience in implementing data labeling/tagging systems, metadata management, and structuring large datasets for efficient access and compliance.
  • Knowledge of high-performance networking protocols (e.g., InfiniBand, RDMA) and their role in data transfer and storage optimization.
  • Familiarity with data access protocols like GridFTP, rsync, and NFS for large-scale data transfer.
  • Active DoD TS/SCI clearance with CI Polygraph required; the polygraph may be obtained after start.

Contact:

This job and many more are available through The Judge Group. Please apply with us today!

About Judge Group, Inc.