Job Details
Position Summary:
As an InfiniBand Network Engineer, you will be instrumental in designing, implementing, and optimizing high-performance fabric architectures for our data center and infrastructure projects. Your expertise in InfiniBand technologies will ensure that our network infrastructure meets the demands of our cutting-edge applications and services.
Key Responsibilities:
- Design, architect, and implement distributed InfiniBand networks for high-performance computing (HPC) and data center environments.
- Collaborate with cross-functional teams to understand project requirements and develop fabric architectures that meet performance, scalability, and reliability goals.
- Configure and deploy InfiniBand switches, routers, adapters, and other network components, ensuring seamless integration with existing infrastructure.
- Optimize network performance through tuning, troubleshooting, and monitoring, identifying and resolving bottlenecks and other performance issues.
- Develop automation scripts and tools to streamline network provisioning, configuration management, and monitoring tasks.
- Stay current with industry trends, emerging technologies, and best practices related to InfiniBand networking, HPC, and distributed computing.
- Provide technical guidance and support to other team members, sharing knowledge and best practices to enhance team capabilities.
Basic Qualifications:
- Bachelor's degree in Computer Science, Electrical Engineering, or a related field. Master's degree preferred.
- 3+ years of professional experience in designing, implementing, and optimizing distributed InfiniBand networks.
- Hands-on experience with InfiniBand fabric design, deployment, and troubleshooting.
- Experience with small- or large-scale, low-latency, high-bandwidth fabrics common in GPU clusters and HPC environments.
- Strong understanding of InfiniBand architectures, protocols, and technologies, including experience with switch configurations, routing, and subnet management.
- Proficiency in network design and troubleshooting, with experience in network performance analysis and optimization.
- Experience with high-performance computing (HPC) environments, distributed computing, and parallel processing technologies.
- Strong scripting and automation skills, with proficiency in Python, Bash, or similar scripting languages.
- Excellent communication and collaboration skills, with the ability to work effectively in a cross-functional team environment.
- Certifications such as InfiniBand Certified Associate (IBCA) or InfiniBand Certified Professional (IBCP) are a plus.
Please note that Applied Digital is currently unable to sponsor new applicants for employment authorization or provide immigration-related support for this position. This includes, but is not limited to, visa categories such as H-1B, F-1 OPT, F-1 STEM OPT, F-1 CPT, J-1, TN, E-2, E-3, L-1, O-1, and any Employment Authorization Documents (EADs) or other work authorizations that require employer sponsorship.