HPC - Linux Systems Engineer

Overview

Remote
On Site
Full Time

Skills

High performance computing
Google Cloud
Linux administration
Attention to detail
Machine Learning (ML)
Data Analysis
Cloud computing
Red Hat Enterprise Linux
Server hardware
Problem solving
Data storage
Access control
HPC
Linux
Training
Artificial intelligence
GPU
Storage
Computer networking
Design
Computer hardware
Management
Automation
Scripting
Ansible
Python
Bash
CPU
Scheduling
Docker
Communication
Collaboration
IBM GPFS
InfiniBand
Metrics
Visualization
Grafana
Authentication
ADFS
LDAP
Kerberos
Writing
Git

Job Details

Job Description

Ford Motor Company's High Performance Computing environment provides Ford's global user base with the capability to run complex, resource-intensive computational tasks, including CAE analysis, ADAS simulations, graphics rendering, machine learning model training, and data analytics. As a member of this team, the selected candidate will be responsible for the majority of Ford's on-premises and cloud-based AI/ML infrastructure. This includes GPU systems, parallel storage, high-speed networking, and several cloud services such as Vertex AI on Google Cloud Platform.

RESPONSIBILITIES

Participate in the design and implementation of HPC-based hardware and software solutions for AI/ML and ADAS customers

Assist in the management and engineering of our Red Hat Enterprise Linux load as it pertains to AI/ML systems

Develop and support automation and scripting (Ansible, Python, Bash)

Assist development teams and internal customers with environment-related questions and issues

Monitor and troubleshoot system failures (occasional off-hours work)

QUALIFICATIONS

Bachelor's Degree or equivalent experience

Significant experience with Linux system administration

Strong understanding of server hardware and networking (e.g. CPU / GPU architecture)

Ability to write scripts and automations in languages such as Python and Bash

Basic understanding of batch scheduling concepts

Comfortable with containerization (Docker, Buildah, Podman)

Excellent problem-solving and troubleshooting skills

Good communication and collaboration skills

Attention to detail and ability to work independently

Desired Skills:

Experience with Google Cloud Platform (or similar)

Experience with large-scale storage systems (Lustre, GPFS, etc.) and InfiniBand networking

Experience with metrics collection and visualization with tools such as Grafana, Prometheus and OpenTSDB

Familiarity with authentication and access control systems (ADFS, LDAP, Kerberos)

Experience writing Ansible automation

Ability to use Git

What you'll receive in return:

As part of the Ford family, you'll enjoy excellent compensation and a comprehensive benefits package that includes generous PTO, retirement, savings, and stock investment plans, incentive compensation, and much more. You'll also experience exciting opportunities for professional and personal growth and recognition.

Candidates for positions with Ford Motor Company must be legally authorized to work in the United States. Verification of employment eligibility will be required at the time of hire. Visa sponsorship will not be available for this position.

We are an Equal Opportunity Employer committed to a culturally diverse workforce. All qualified applicants will receive consideration for employment without regard to race, religion, color, age, sex, national origin, sexual orientation, gender identity, disability status, or protected veteran status.

For information on Ford's salary and benefits, please visit:

Please see our Company Profile.

About Ford Motor Company