Job Details
Design and maintain GPU capacity models, forecasts, and dashboards to support planning and operational decisions.
Track current GPU compute usage, forecast future demand, and proactively identify shortfalls or surpluses.
Collaborate with infrastructure, data science, and engineering teams to understand compute requirements.
Develop reporting frameworks and metrics for real-time and historical demand/capacity analysis.
Optimize GPU resource allocation and improve planning efficiency through data-driven strategies.
Provide insights into utilization trends and drive recommendations for procurement or reallocation.
Partner with finance, procurement, and ops teams to support GPU resource budgeting and cost analysis.
Support strategic planning efforts related to scaling GPU compute infrastructure.
Required Qualifications:
1+ years of hands-on experience in demand and capacity planning, specifically in compute/GPU environments.
Experience with capacity modeling and with forecasting tools and techniques.
Strong understanding of GPU compute workloads (e.g., ML training, inference, high-performance computing).
Proficiency in Excel, SQL, and visualization/reporting tools (e.g., Power BI, Tableau).
Familiarity with cloud platforms (AWS, Azure, Google Cloud Platform) and/or on-prem GPU infrastructure.
Excellent analytical, communication, and cross-functional collaboration skills.