SAIC is seeking an Information Systems Architect to join a team that administers next-generation high-performance computing (HPC) architectures and develops innovative solutions for operations and efficient utilization.
On any given day, you may be called on to:
- Participate in all aspects of the HPC system lifecycle including facility integration, standup, acceptance testing, performance benchmarking, operational support, and reclamation.
- Maintain all system aspects of security, networks, filesystems, system software installation, and user support.
- Plan patches and upgrades.
- Troubleshoot and replace defective components.
- Determine and proactively mitigate the impact of system changes and improvements, including non-standard research access and software, on the security and stability of the platforms and on the user community.
- Participate in the development of new operational methodologies and support infrastructure to enable efficient operation of multiple, concurrent, leading-edge and prototype HPC clusters.
- Support research and development staff to deliver functional platforms for pre-production systems.
- Work with the HPC Monitoring Team to deploy monitoring solutions on the platforms and use them to understand performance.
- Learn new technologies, processes, and system software in an unstructured environment.
•Bachelor's degree in Computer Science, Computer Engineering, Information Systems Engineering (CIS/MIS), or other significant STEM discipline and five or more years of significant IT experience, OR nine or more years of relevant IT experience with achievements that demonstrate the knowledge, skills, and ability to perform the duties of the job in lieu of a degree.
•5 years of experience with Linux/Unix operating systems, including hardware setup, installation, upgrades, and troubleshooting.
•Experience handling large numbers of systems (20-50+).
•Experience answering technical support queries from a community of end users.
•A strong teammate capable of handling multiple duties with good customer focus and excellent oral and interpersonal skills.
•Ability to obtain and maintain a DOE Q security clearance.
•Proven understanding of and experience with installing clusters (initial racking, cabling, and diagnosing hardware issues).
•Advanced scripting experience (Shell, Python, Perl, or any other system-level scripting).
•Basic understanding of networking concepts and infrastructure, such as firewalls, routing, bonding, and VLANs.
•Knowledge of and experience with security and authentication components, such as SSH, Kerberos, LDAP, SSL, nmap, public and private key encryption, and other third-party security products.
•Ability to implement and review system performance monitoring, determine optimization opportunities, and set new resource/capacity requirements.
•Experience with Linux containers and related technologies (Docker, Kubernetes, etc.).
•Experience with storage administration, Fibre Channel SAN, LUN provisioning, NFS filesystem management, and related processes and technologies.
•Understanding of HPC scheduling software (e.g., LSF, Moab, or Slurm).
•Understanding of HPC parallel filesystems (e.g., GPFS, Lustre).
•Understanding of HPC interconnects (e.g., InfiniBand, Omni-Path, high-speed Ethernet).
•Current DOE Q Security Clearance