Infrastructure Architect

San Jose, CA, US • Posted 2 hours ago • Updated 2 hours ago
Contract W2
On-site
$70 - $77/hr
Job Details

Skills

  • Architectural Design
  • Virtual Machines
  • System On A Chip
  • NVIDIA GPU
  • Data Center
  • Nodes

Summary

Role Overview

We are looking for a Principal Infrastructure Architect to join our IT PMO organization and lead the design, orchestration, and lifecycle management of our next-generation GPU Farm and AI Factory environments. The role is unusually broad: it requires a deep understanding of high-performance AI compute stacks alongside disciplined management of physical data center assets and their long-term operational health. You will bridge the gap between R&D engineering requirements and the physical realities of global data center operations.

Key Responsibilities

  1. AI & GPU Infrastructure Design (GPU Farm / AI Factory)
  • Lead the architectural design and refinement of the Nutanix GPU-as-a-Service (GPUaaS) platform, ensuring a seamless experience for internal R&D, QA, and Sales teams.
  • Provide technical leadership on key initiatives such as Nutanix Validated Designs (NVD) for the AI Factory, incorporating NVIDIA MGX/HGX architectures and high-density Cisco nodes (e.g., UCS 845A).
  • Architect the Management Cluster control plane (NKP, Prism Central, NuDeploy) to ensure it is decoupled from GPU compute nodes for maximum efficiency.
  • Implement policy-driven placement of workloads across on-prem and cloud-burst environments.
  2. Data Center Asset & Lifecycle Management
  • Design a centralized Data Center Asset Inventory system that provides real-time visibility into all hardware assets, including CPUs, GPUs, Virtual Machines, and networking.
  • Develop a comprehensive Hardware Lifecycle Management strategy, including procurement forecasting, "rack and stack" operationalization, and decommissioning of legacy systems (G3/G4/G5).
  • Lead "Tiger Team" initiatives to navigate supply chain constraints, ensuring critical release milestones are not delayed by hardware shortages.
  • Enforce strict Security Standards for Data Center hardware provisioning.
  • Implement network segmentation for all critical applications.
  • Ensure all infrastructure meets SOC 2 and ISO 27001 compliance objectives while maintaining low-latency performance.
  3. Special Projects
  • Provide the required architecture and designs during the project intake process. Review all demands and guide teams toward the right architecture before they become approved projects.
  • Partner with the security team to provide guidelines for upcoming projects.
  • Serve as lead architect on special projects.

Required Qualifications

  • Bachelor's degree in Information Technology, Business, or a related field.
  • 5+ years of experience in Data Center projects in an enterprise environment.
  • Knowledge of Cisco, Dell, HPE, and Supermicro hardware.
  • Hardware Expertise: Deep knowledge of Cisco hardware, NVIDIA GPU architectures (H100, B200, RTX 6000 Pro), and high-speed interconnects (RoCE v2, InfiniBand).
  • Infrastructure Mastery: Extensive knowledge and experience with Data Center infrastructure.
  • Management Tools: Proficiency with asset management and automation tools (NetBox, ServiceNow, Terraform, or OpenTofu).
  • Lifecycle Management & Capacity Planning: Experience in Data Center lifecycle management, hardware capacity planning, decommissioning, defragmentation, and building complex financial showback models for shared infrastructure.
  • AI/ML Ops: Proven expertise in Kubernetes (NKP preferred) and NVIDIA AI Enterprise stacks (GPU Operator, DCGM, Triton, vLLM).

Preferred Qualifications

  • Experience as an architect managing massive-scale data center environments (1,000+ nodes).
  • Knowledge of Nutanix Cloud Infrastructure (NCI), AHV, and Prism Central.
  • Strong background in MLOps and automated pipeline integration (Kubeflow/MLflow).
  • Dice Id: 90989446
  • Position Id: 8964894