InfiniBand Jobs

Network Engineer with InfiniBand Expertise

iTvorks Inc

Remote

Contract

Role: Network Engineer with InfiniBand Expertise. Exp: 8+. Rate: $65/hr on C2C or $45-50/hr on W2. Remote. Direct client. JD: Experience tuning and optimizing HPC systems and high-performance computing. Familiar with NCCL (software for optimizing InfiniBand networks). Familiar with standing up these systems and acceptance testing them to ensure they are working. GPU/AI infrastructure. Most important: strong InfiniBand expertise as a specific discipline, with experience in partition keys (pkeys) and Subnet Manager

Design Verification Engineer

CSM Technologies Inc

Remote

Third Party, Contract

Title: Design Verification Engineer. Location: Austin, TX (Remote). Duration: Long-term contract. Job Description: Verification of high-performance packet-based data transfer protocols (Ethernet/InfiniBand/PCIe); configuration and usage of high-speed SERDES; configuration and usage of die-to-die interconnect; configuration and usage of ARM processors; JTAG/SBUS functionality; injecting error stimulus (malformed, truncated packets, etc.)

Senior High Performance Computing (HPC) Administrator

Edify Technologies, Inc.

New York, New York, USA

Full-time, Third Party

Job Title: Senior HPC Administrator. Job Location: New York City, NY (Remote/Hybrid). Duration: Full-time with client. About Edify Technologies: Transforming Businesses with Innovative Digital Solutions! Headquartered in Naperville, IL, we are a dynamic team with over two decades of industry expertise, dedicated to delivering robust business solutions, staff augmentation, and a comprehensive range of application and web services. As a former recipient of INC. Magazine's prestigious '5000 Fastest Gr

Software Engineer, Machine Learning Compute

OpenAI

San Francisco, California, USA

Full-time

About the Team The Applied Engineering team works across research, engineering, product, and design to bring OpenAI's technology to consumers and businesses. You'll join the team responsible for running the infrastructure that supports the models backing ChatGPT and the API. The systems we support include inference Kubernetes clusters, GPU health, InfiniBand performance, node lifecycle, and more. We seek to learn from deployment and distribute the benefits of AI, while ensuring that this power

Engineering Manager, AI Inference Systems

OpenAI

San Francisco, California, USA

Full-time

About the Team The Applied AI team safely brings OpenAI's technology to the world. We released ChatGPT, Plugins, DALL·E, and the APIs for GPT-4, GPT-3, embeddings, and fine-tuning. We also operate inference infrastructure at scale. There's a lot more on the immediate horizon. We seek to learn from deployment and distribute the benefits of AI, while ensuring that this powerful tool is used responsibly and safely. Safety is more important to us than unfettered growth. We serve end-users directl

Sr. Scientific Compute Engineer

Zachary Piper Solutions, LLC

Philadelphia, Pennsylvania, USA

Full-time

Piper Companies is seeking a Sr. Scientific Compute Engineer to join a client of ours that is a world-renowned academic medical center in Philadelphia, with hospitals ranked among the nation's top hospitals and #1 in Philadelphia. This is a contract-to-hire role with a 6-month expectation. Location: Fully remote, local preferred. Responsibilities: This person would be focused on HPC (high-performance computing). Manage Linux-based clustered computing environment (RedHat, CentOS preferred). Workin

HPC Systems Administrator

Zachary Piper Solutions, LLC

Remote

Full-time

Piper Companies is seeking an HPC Systems Administrator with an academic medical center in Philadelphia for a contract-to-hire, remote opportunity. Responsibilities for the HPC Systems Administrator: Focused on HPC (high-performance computing). Manage Linux-based clustered computing environment (RedHat, CentOS preferred). Working with an underlying storage system running GPFS. Upgrade incoming projects, understand testing, preparation, understand interdependencies, and other variables across env

Senior Systems Analyst

Leidos

Dayton, Ohio, USA

Full-time

Description: Leidos is seeking an experienced Senior Systems Analyst to join our team supporting the Department of Defense (DoD) High Performance Computing (HPC) Modernization Program (DoD HPCMP) and the U.S. Air Force Research Laboratory (AFRL) DoD Supercomputing Resource Center (DSRC) located at Wright-Patterson AFB, OH. The Senior Systems Analyst will work with numerous AFRL customers in support of mission-critical research activities on the HPC systems. Day-to-day responsibilities inc

AIX Systems Administrator

Synergis

Atlanta, Georgia, USA

Contract

Senior AIX System Administrator. 12-Month Contract | Onsite as needed (Atlanta). Synergis client has engaged us in a search for a Senior AIX System Administrator. This individual will become a member of the Distributed Systems group within the Information Technology Department. The candidate's primary responsibility will be to manage the AIX server infrastructure as part of a team of 6 AIX administrators. Senior AIX System Administrator | Background and Scope: Managing a number of assigned serve

Senior Manager, Product

ADT LLC

Boca Raton, Florida, USA

Full-time

Company Overview: ADT has been in the business of helping save lives since 1874. As the #1 smart home security provider in the U.S., we help protect and connect families, businesses and larger commercial customers every day. Our continuous innovation, advanced technology and strategic partnerships deliver products and services that help protect life and valuables, whether at home, your business or on the go. And as times change, so do we. Above all, our mission is clear: we help save lives for a

TS/SCI HPC Systems Engineer

ClearBridge Technology Group

Arlington, Virginia, USA

Contract

Our client, located in Arlington, VA, is currently in need of a TS/SCI cleared HPC Systems Engineer for a 3-month contract. The consultant will work onsite in support of HPC hardware configuration, management and maintenance. The consultant will primarily be focused on the underlying hardware vs the software it runs. Responsibilities: HPC hardware configuration, management and maintenance; SID documentation; troubleshooting issues; addressing hardware failures; addressing tickets; validation of har

Software Engineer, Model Inference

OpenAI

San Francisco, California, USA

Full-time

About the Team Our team brings OpenAI's most capable technology to the world through our products. Most recently, we released ChatGPT, GPT-4, the Whisper API, and DALL-E. We empower consumers and developers alike to use and access our state-of-the-art AI models, allowing them to do things that they've never been able to before. Across all product lines, we ensure that these powerful tools are used responsibly. This is a key part of OpenAI's path towards safely deploying broadly beneficial Arti

Software Engineer, Hardware Health

OpenAI

San Francisco, California, USA

Full-time

About the Team OpenAI's Hardware Health team oversees all hardware health related aspects of our custom-built hyperscale supercomputers. The team is responsible for maximizing the available supercomputing capacity for research and ensuring that our researchers are minimally impacted by hardware faults. The hardware health team is being incubated inside OpenAI's Research team, which operates at the far edge of all available innovations in AI - doing the engineering and research required to trai

Quantitative Developer

Jobot

Chicago, Illinois, USA

Full-time

Chicago / Hybrid + Top Trading Security SaaS Company. This Jobot Job is hosted by: Ryan Kilroy. Are you a fit? Easy Apply now by clicking the "Apply Now" button and sending us your resume. Salary: $150,000 - $220,000 per year. A bit about us: Our product simply is our tech and our strategies. We do our very best to create, code, and implement robust trading strategies in the financial markets. We cannot be successful without all three and this is the never-ending challenge, to stay ahead of the

Principal Network Developer - Cluster Networks GPU

Oracle Corporation

Santa Clara, California, USA

Full-time

Job Description Supports the design, deployment, and operations of a large-scale global Oracle cloud computing environment (Oracle Cloud Infrastructure - OCI). Primarily focused on development and support of network fabric and systems through a combination of a deep level understanding of networking at the protocol level coupled with programming skills to support the intensive automation required to operate a production environment. As OCI is a cloud-based network with a global footprint, this

Principal Network Developer - Cluster Networks GPU

Oracle Corporation

Santa Clara, California, USA

Full-time

Job Description Supports the design, deployment, and operations of a large-scale global Oracle cloud computing environment (Oracle Cloud Infrastructure - OCI). Primarily focused on development and support of network fabric and systems through a combination of a deep level understanding of networking at the protocol level coupled with programming skills to support the intensive automation required to operate a production environment. As OCI is a cloud-based network with a global footprint, this

Principal Network Developer - Cluster Networks GPU

Oracle Corporation

Santa Clara, California, USA

Full-time

Job Description Supports the design, deployment, and operations of a large-scale global Oracle cloud computing environment (Oracle Cloud Infrastructure - OCI). Primarily focused on development and support of network fabric and systems through a combination of a deep level understanding of networking at the protocol level coupled with programming skills to support the intensive automation required to operate a production environment. As OCI is a cloud-based network with a global footprint, this

Senior Principal Engineer, Runtime

SambaNova Systems

Palo Alto, California, USA

Full-time

Working at SambaNova: The Runtime team at SambaNova is a seasoned engineering team with a proven track record of delivering cutting-edge system software solutions for AI and machine learning applications in the enterprise & commercial landscape. We handle all phases of software infrastructure to enable the higher level apps, including: OS interface/integration; data model manipulation for scaling; networking/communication intra- and inter-node; orchestration of partitioned workloads; error monitoring

Senior Signal Integrity Engineer - Georgetown, TX

AirBorn Inc.

Georgetown, Texas, USA

Full-time

SUMMARY: Designs, develops, and coordinates the design and production of high-speed multigigabit connectors, test boards, and laboratory testing from inception to completion by performing the following duties. ESSENTIAL DUTIES AND RESPONSIBILITIES include the following: Other duties may be assigned. Reviews project instructions and blueprints to ascertain test specifications, procedures, objectives, test equipment, nature of technical problem, and possible solutions such as part redesign, subs

Sr. Inbound / Outbound Coordinator (Night Shift)

Nestle USA

Breinigsville, Pennsylvania, USA

Full-time

Foods people love. Brands people trust. And a career that nourishes your future like no other. If you're driven by the passion to do something meaningful that changes lives, Nestlé is the place for you. Nestlé USA is one of seven operating companies that make up Nestlé's presence in the United States. We're in 97% of American homes, and as the leading food and beverage company, our goals are to continue to deliver quality food and beverage products, strengthen our local communities, and reduce our