Advisor - GPU Platform Engineering

Overview

On Site
USD 135,000.00 - 213,400.00 per year
Full Time

Skills

FOCUS
Data Science
Operational Excellence
Hosting
IaaS
Management
Clustering
Cloud Computing
Agile
Linux Administration
Server Administration
Spectrum
File Systems
Scripting
High Performance Computing
HPC
GPU
Storage
Weka
Scheduling
Orchestration
Kubernetes
LSF
Computer Networking
Ethernet
Docker
PyTorch
JAX
Artificial Intelligence
Machine Learning (ML)
Workflow
Data Processing
Training
Scripting Language
Bash
Python
Computer Science
Information Technology
Linux
Clinical Trials
Productivity
Computer Hardware
IT Strategy
SAP BASIS
Chinese
Japanese
Leadership
Network
Health Care
Life Insurance
Promotions

Job Details

At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life-changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first. We're looking for people who are determined to make life better for people around the world.

Come help us unlock the power of AI- and HPC-based GPU and accelerated compute infrastructure!

The Cloud and Connectivity organization is seeking experts and leaders in AI, High-Performance Computing (HPC), and Nvidia DGX server management. This role will also focus on Spectrum X networking technologies and Weka storage integration to support cutting-edge AI/ML workloads.

What You'll Be Doing

You will drive the engineering and operations of advanced Linux platforms supporting AI and HPC workloads, manage Nvidia DGX systems using Mission Control, Base Command, and Run:AI, and optimize Spectrum X networking and WEKA storage for AI/ML applications. You will play a crucial role in boosting productivity for our Advanced Intelligence and Data Science teams by implementing advancements across our AI/HPC infrastructure tooling and operational excellence.

You will work in our Infrastructure Hosting Platform area, leading the strategy, engineering, and development of advanced Linux computing capabilities for AI/ML. Additionally, you will consult with our senior Linux platform engineer in directing the global Linux strategy for on-premises private cloud and public IaaS Linux services.

How You'll Succeed
  • Be Bold - You will bring high learning agility and infrastructure availability and reliability engineering skills to help us enable the Lilly technology strategy, identify tech opportunities, and accelerate our cloud journey.
  • Be Fast - You will accelerate initiatives in areas such as AI/ML acceleration, infrastructure AIOps automation, HPC management, and infrastructure as code to enable critical business projects.
  • Be Proactive - You will have groundbreaking opportunities to build secure, resilient, and reliable hybrid cloud services using proactive, predictive, and automated capabilities.
  • Be Your Best - You will learn about new technologies, including AI/ML-based HPC, large-scale GPU clustering, infrastructure as code, enterprise-scale hyperscale cloud providers, and agile ways of working, and you will bring a willingness to become an expert.

What You Should Bring
  • Expertise in Linux system administration, HPC environments, and Nvidia DGX server management. Experience with Spectrum X networking and parallel file systems is essential. Strong scripting skills and familiarity with containerization and automation tools are highly valued.
  • 6+ years of demonstrated experience in AI/ML and HPC workloads and infrastructure.
  • Hands-on experience in using or operating High Performance Computing (HPC) grade infrastructure, as well as in-depth knowledge of accelerated computing (e.g., GPU), storage (e.g., Weka), scheduling and orchestration (e.g., Slurm, Kubernetes, LSF), high-speed networking (e.g., Ultra Ethernet, RoCE), and container technologies (e.g., Docker).
  • Passion for continual learning and keeping abreast of new technologies and effective approaches in the AI/ML infrastructure field.
  • Expertise in running and optimizing large-scale distributed training workloads using PyTorch (DDP, FSDP), NeMo, or JAX, along with a deep understanding of AI/ML workflows, encompassing data processing, model training, and inference pipelines.
  • Some proficiency in at least one scripting language such as Bash, Python, or equivalent.

Basic Qualifications
  • Bachelor's degree in Computer Science, Information Technology, or a related technical field.
  • 10+ years' experience as a Linux OS/Platform Engineer.
  • Demonstrated experience leading a large-scale global infrastructure project.

Additional Information:

Hybrid role located in Indianapolis, IN (relocation required)

<5% travel

Organization Overview

Lilly IT builds and maintains capabilities using cutting-edge technologies, like most prominent tech companies. What differentiates Lilly IT is that we redefine what's possible through tech to advance our purpose - creating medicines that make life better for people around the world - through efforts like data-driven drug discovery and connected clinical trials. We hire the best technology professionals from a variety of backgrounds, so they can bring an assortment of knowledge, skills, and diverse thinking to deliver innovative solutions in every area of our business.

The Global Information and Services Tech team is at the forefront of digitalization to enable and advance the entire company, with increased productivity and best-in-class customer experiences. This team provides a robust and sustainable infrastructure of hardware, software, and services that is critical to enabling our global workforce and business to operate and transform. As leaders in technology who understand business requirements and challenges, this team defines and leads the overall company technology strategy.

Lilly is dedicated to helping individuals with disabilities to actively engage in the workforce, ensuring equal opportunities when vying for positions. If you require accommodation to submit a resume for a position at Lilly, please complete the accommodation request form for further assistance. Please note this is for individuals to request an accommodation as part of the application process and any other correspondence will not receive a response.

Lilly is proud to be an EEO Employer and does not discriminate on the basis of age, race, color, religion, gender identity, sex, gender expression, sexual orientation, genetic information, ancestry, national origin, protected veteran status, disability, or any other legally protected status.

Our employee resource groups (ERGs) offer strong support networks for their members and are open to all employees. Our current groups include: Africa, Middle East, Central Asia Network, Black Employees at Lilly, Chinese Culture Network, Japanese International Leadership Network (JILN), Lilly India Network, Organization of Latinx at Lilly (OLA), PRIDE (LGBTQ+ Allies), Veterans Leadership Network (VLN), Women's Initiative for Leading at Lilly (WILL), enAble (for people with disabilities).

Actual compensation will depend on a candidate's education, experience, skills, and geographic location. The anticipated wage for this position is $135,000 - $213,400.

Full-time equivalent employees also will be eligible for a company bonus (depending, in part, on company and individual performance). In addition, Lilly offers a comprehensive benefit program to eligible employees, including eligibility to participate in a company-sponsored 401(k); pension; vacation benefits; eligibility for medical, dental, vision and prescription drug benefits; flexible benefits (e.g., healthcare and/or dependent day care flexible spending accounts); life insurance and death benefits; certain time off and leave of absence benefits; and well-being benefits (e.g., employee assistance program, fitness benefits, and employee clubs and activities). Lilly reserves the right to amend, modify, or terminate its compensation and benefit programs in its sole discretion and Lilly's compensation practices and guidelines will apply regarding the details of any promotion or transfer of Lilly employees.

#WeAreLilly