IT Staff Systems Engineer (HPC)

Remote in San Jose, CA, US • Posted 12 hours ago • Updated 12 hours ago
Contract Independent
Contract Corp To Corp
Contract W2
On-site
$60 - $70/hr

Job Details

Skills

  • IBM LSF
  • SLURM
  • RTM
  • Python
  • HPC

Summary

Role: IT Staff Systems Engineer (HPC)

Location: San Jose, CA / Austin, TX

Must haves

  • Experience with LSF/SLURM
  • Linux
  • Excellent communication and problem-solving skills

Nice to Have

  • Python
  • Cloud
  • Hands-on technical experience managing IBM LSF/SLURM and RTM, and scripting in Python, shell, Perl, etc., in a farm environment; knowledge of LSF/SLURM spanning farm to cloud is highly desirable.
  • Solid understanding and proven operational experience with compute farms, job submission/management technologies, cloud, and associated management tools.
  • Proven experience working directly with R&D software development teams to collaboratively develop solutions to optimize their working environment (Direct EDA experience desired)

Responsibilities

  • Supporting multiple geographical locations to serve user communities across sites in North America, Europe, and Asia.
  • Focusing on improving R&D productivity and committing to customer success.
  • Driving the overall operational strategy for internal High-Performance Compute (HPC) farms in all Cadence locations.
  • Developing and executing the three-year compute roadmap and planning annual capacity growth for the on-premises server farm in San Jose.
  • Operating, managing, and enhancing the internal compute farm and associated cloud (AWS).
  • Maintaining, monitoring, and reporting on the compute farm, and improving its efficiency.

Requirements

  • 8+ years of technical experience architecting, managing, and improving a compute farm environment running Linux.
  • At least 5 years of direct hands-on experience in a global or regional compute farm and/or hybrid cloud environment consisting of 1,000 or more servers, with some remote direct reports.
  • At least 3 years working in a global group, coordinating support, strategies, projects, and operations across multiple geographies in a team-oriented approach.
  • Extensive technical experience managing IBM LSF and RTM, and scripting in Python, shell, Perl, etc., in a farm environment; knowledge of LSF spanning farm to cloud is highly desirable.
  • Solid understanding and proven operational experience with compute farms, job submission/management technologies, cloud, and associated management tools.
  • Proven experience working directly with R&D software development teams to collaboratively develop solutions to optimize their working environment (Direct EDA experience desired)
  • Proven experience in capacity and performance management: optimizing performance, ensuring adequate capacity, working with R&D to optimize their workloads, and developing and maintaining key performance indicators.
  • A proven process focus, shown through documentation, change management, incident management, and problem-resolution activities.
  • Dice Id: 10200946b
  • Position Id: 8939556