Senior Kubernetes Engineer - AI GPU Cluster Infrastructure


GlobalLogic Inc.
Job Details
Skills
- DevOps
- Kubernetes
- RDMA/RoCEv2
- NCCL
- Deployment
- CI/CD & Automation
- Linux
- Artificial Intelligence
- GPU
- Remote Direct Memory Access
- Computer Cluster Management
Summary
Job Description:
- 5+ years of experience with Kubernetes in production environments
- Experience managing Kubernetes clusters for GPU workloads
- Strong knowledge of Kubernetes networking and cluster architecture
- Experience configuring RDMA and RoCEv2 networking for Kubernetes clusters
- Hands-on experience with Kubernetes CLI tools and cluster troubleshooting
Optional:
- Experience with NVIDIA NCCL and/or GPU ecosystem tools
- Experience with AI infrastructure environments and automation
Job Responsibilities:
- Deploy, configure, and manage Kubernetes clusters for AI workloads
- Ensure proper configuration of RoCEv2 and RDMA networking within Kubernetes clusters
- Troubleshoot node failures and ensure cluster stability and recovery
- Support performance testing and optimization of AI workloads running on Kubernetes
- Work closely with infrastructure, compute, and networking teams
- Implement best practices for GPU cluster management and reliability
Must have:
- Kubernetes
- Linux administration
- RDMA/RoCEv2
- Troubleshooting
Nice to have:
- NCCL
- Automation/deployment automation
Education: Bachelor’s or Master’s degree in Computer Science, Computer or Electrical Engineering, Mathematics, or a related field.
GlobalLogic estimates the starting pay range for this remote role to be $135,000 to $140,000, which reflects base salary only. This pay range is provided as a good-faith estimate, and the amount offered may be higher or lower. GlobalLogic takes many factors into consideration in making an offer, including candidate qualifications, work experience, operational needs, travel and onsite requirements, internal peer equity, prevailing wage, responsibilities, and other market and business considerations.
- Dice Id: RTL65472
- Position Id: 8907341
- Posted 2 hours ago
Company Info
The leader in software R&D services, GlobalLogic has created a network of global innovation hubs throughout the US, India, Ukraine, China and Argentina that connects clients with 3,000 of the brightest and most innovative software minds through an award-winning platform (GlobalLogic Velocity) for distributed Agile R&D.
GlobalLogic leverages its proven Agile tools and processes, as well as a decade of experience building thousands of market-leading products, to provide clients with a full range of lifecycle services, including advisory, ideation, customer research, engineering, QA/IVT, maintenance & support, and product line management. The company has ongoing partnerships with more than 150 clients in markets such as Digital Media, Electronics, Finance, Healthcare, Infrastructure Software, Retail and Telecom.
If you are a California resident, more details on how we process your personal information can be found in the CCPA Recruitment Privacy Notice (https://www.globallogic.com/privacy/ccpa-recruitment-privacy-notice/).
Top Rank
- Ranked an Inc. 500 company
- Recognized as a top global employer since 2005
- Has global innovation hubs in the U.S., India, Ukraine, China and Argentina