Job Role: Solution Architect (PowerEdge Rack / Tower)
Location: Childress, TX
Duration: 12 months
Description:
Must-have skills: PowerEdge Rack/Tower experience, NVIDIA certifications
Nice-to-have skills: PowerEdge XE server experience, NVIDIA QR Switches
Deep hands-on experience with GPU deployment, configuration, and multi-node testing using NVIDIA Base Command Manager
Proficiency with benchmarking tools: HPL, STREAM, NCCL, RCCL, MxP, OSU Microbenchmarks
Red Hat certification (RHCSA/RHCE) or 7+ years of relevant experience with Red Hat distributions
Experience with GenAI/HPC networking (InfiniBand and/or RoCE)
Experience working in Linux-based parallel computing environments at scale
Strong customer-facing and communication skills
Desirable Requirements:
Bachelor's degree
NVIDIA certifications (NCA, NCE, DGX)
Experience with NVIDIA UFM, InfiniBand, and Spectrum-X fabrics
Exposure to hybrid cloud or GPU cloud environments
Experience with GPU observability/performance profiling tools
Code Upgrade
o Perform cluster-level code upgrades as per approved versions and compatibility guidelines.
iDRAC Management
o Configuration, access validation, and health checks of iDRAC.
o Troubleshooting and lifecycle management support.
Firmware Updates
o Update server, BIOS, NIC, storage, and related firmware.
o Ensure version alignment and post-update validation.
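As an illustration of what "version alignment" can look like in practice, the sketch below compares per-node firmware versions against an approved baseline and reports any drift. The component names, version strings, node names, and the find_drift helper are all hypothetical placeholders, not values taken from this role or from any Dell tool.

```python
from typing import Dict, List, Tuple

# Hypothetical approved baseline: component name -> approved version string.
# These versions are placeholders for illustration only.
APPROVED_BASELINE: Dict[str, str] = {
    "BIOS": "2.1.0",
    "iDRAC": "7.10.30.00",
    "NIC": "22.5.7",
}


def find_drift(
    baseline: Dict[str, str],
    inventory: Dict[str, Dict[str, str]],
) -> List[Tuple[str, str, str, str]]:
    """Return (node, component, found, approved) for every mismatch.

    A component absent from a node's inventory is reported as "missing".
    """
    drift = []
    for node in sorted(inventory):
        versions = inventory[node]
        for component in sorted(baseline):
            approved = baseline[component]
            found = versions.get(component, "missing")
            if found != approved:
                drift.append((node, component, found, approved))
    return drift


if __name__ == "__main__":
    # Hypothetical inventory, e.g. collected after an update window.
    sample_inventory = {
        "node01": {"BIOS": "2.1.0", "iDRAC": "7.10.30.00", "NIC": "22.5.7"},
        "node02": {"BIOS": "2.0.4", "iDRAC": "7.10.30.00"},
    }
    for node, component, found, approved in find_drift(
        APPROVED_BASELINE, sample_inventory
    ):
        print(f"{node}: {component} is {found}, approved is {approved}")
```

Running the same check before and after an update gives a simple post-update validation record: an empty result means every node matches the baseline.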
Redfish
o Overview and usage of Redfish APIs.
o Customization and automation using Redfish for system management and monitoring.
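For candidates less familiar with Redfish, the following is a minimal sketch of the kind of monitoring automation this covers: reading a server's ComputerSystem resource over HTTPS and reducing it to a short health summary. It assumes HTTP Basic authentication and the system path that iDRAC commonly exposes (/redfish/v1/Systems/System.Embedded.1). The host name, credentials, and helper names are hypothetical, and a real deployment would also need proper TLS certificate handling.

```python
import base64
import json
import urllib.request
from typing import Any, Dict, Optional

# Assumed system resource path; other BMCs may use a different system ID.
SYSTEM_PATH = "/redfish/v1/Systems/System.Embedded.1"


def build_request(
    host: str, user: str, password: str, path: str = SYSTEM_PATH
) -> urllib.request.Request:
    """Build an HTTPS GET request with a Basic auth header."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"https://{host}{path}",
        headers={
            "Authorization": f"Basic {token}",
            "Accept": "application/json",
        },
    )


def summarize_system(payload: Dict[str, Any]) -> Dict[str, str]:
    """Pick a few standard ComputerSystem properties out of the JSON body."""
    status = payload.get("Status") or {}
    return {
        "model": payload.get("Model", "unknown"),
        "power": payload.get("PowerState", "unknown"),
        "health": status.get("Health", "unknown"),
        "bios": payload.get("BiosVersion", "unknown"),
    }


def fetch_summary(
    host: str, user: str, password: str, context: Optional[Any] = None
) -> Dict[str, str]:
    """Query the BMC and return the summary (requires network access)."""
    request = build_request(host, user, password)
    with urllib.request.urlopen(request, context=context, timeout=10) as resp:
        return summarize_system(json.load(resp))


if __name__ == "__main__":
    # Offline demonstration with a hypothetical response body.
    sample = {
        "Model": "PowerEdge R760",
        "PowerState": "On",
        "Status": {"Health": "OK", "State": "Enabled"},
        "BiosVersion": "2.1.0",
    }
    print(summarize_system(sample))
```

Keeping the parsing step separate from the network call makes the summary logic testable without a live BMC, and the same pattern extends to other resources such as the firmware inventory under the UpdateService.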
BlueField
o Configuration and management of BlueField DPUs.