Role: Product Manager, GPU & LLM Infrastructure
Location: Minneapolis, MN / Charlotte, NC / Irving, TX (Hybrid)
Visa: H1B, L2 EAD
Need Passport Number
About This Role
Enterprise AI Platform: GPU & LLM Infrastructure Product Manager
You will define and lead the product strategy for an enterprise-scale LLM/SLM inference GPU platform. In this role, you will partner closely with GPU hardware and platform engineering teams to translate customer needs and business objectives into a clear, prioritized roadmap with measurable outcomes.
You will own capabilities across high-performance model inferencing, GPU orchestration, and platform services, including vLLM, NVIDIA/Run:AI, and Red Hat OpenShift AI. The role also encompasses API productization, observability and evaluation, reliability and SLOs, and compliant end-to-end lifecycle management to enable secure, scalable, and enterprise-ready AI solutions.
In This Role, You Will
- Lead a team to identify, strategize, and execute highly complex Artificial Intelligence initiatives that span a line of business
- Recommend business strategy and deliver Artificial Intelligence-enabled solutions to solve business challenges
- Define and prioritize use cases, obtain the required resources, and ensure the solutions deliver the intended benefits
- Leverage Artificial Intelligence expertise to evaluate technological readiness and resources required to execute the proposed solutions
- Make decisions to drive the implementation of Artificial Intelligence initiatives and programs while serving multiple stakeholders
- Resolve issues which may arise during development or implementation
- Collaborate and consult with peers, colleagues and managers to resolve issues and achieve goals
Required Qualifications:
- 5+ years of Artificial Intelligence Solutions experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
- 2+ years of hands-on experience with cloud platforms such as Google Cloud Platform or Azure, and container orchestration technologies including Docker and Kubernetes/OpenShift
Desired Qualifications:
- 2+ years of experience working on platform or ML/AI infrastructure products within regulated environments
- 2+ years of proven success owning an API or platform with accountability for SLAs/SLOs, including versioning and deprecation strategies, change management, and reliability outcomes
- Strong communication skills, with the ability to influence senior stakeholders and clearly explain complex technical concepts to diverse audiences
- Working knowledge of LLM/SLM inference stacks, including vLLM, Triton, and TensorRT-LLM, as well as batching strategies, KV cache management, quantization techniques (e.g., FP8, INT4), and evaluation frameworks, sufficient to make informed product trade-offs with engineering teams
- Familiarity with GPU and platform fundamentals, such as modern GPU architectures (e.g., H100/H200), MIG and NCCL, GPU orchestration tools (NVIDIA/Run:AI), and Kubernetes/OpenShift AI administration and admission control patterns
- Experience building developer-centric platforms, including APIs, SDKs, and structured release and governance processes
- Hands-on experience with observability and evaluation for GenAI systems, including dashboards, tracing, alerting, and safety and quality metrics
- Demonstrated strength in stakeholder management, partnering effectively across Risk, Security, Architecture, and line-of-business application teams