Data Center Technician

Hybrid in Dallas, TX, US • Posted 2 hours ago • Updated 13 minutes ago
Contract Independent
Contract W2
6 Months
Hybrid
Depends on Experience

Job Details

Skills

  • Systems Administrator
  • Systems Engineer
  • Data Center
  • CPU
  • GPU
  • Datacenter
  • HPE
  • Dell
  • CI/CD

Summary

Operations-focused Systems Administrator / Systems Engineer role (in practice, closer to a Data Center Technician) supporting a large-scale bare-metal server environment (~17,000 servers), with a heavy emphasis on CPU and GPU compute availability. The role centers on reliability, automation, and operational excellence: digging into systems and pipelines when things break, and improving them so they break less often. This is not hands-on data center work.

What you'll be doing

  • Administer and support large-scale bare-metal server infrastructure, primarily HPE and Dell platforms
  • Perform server break/fix troubleshooting including hardware faults, firmware/BIOS/BMC issues, POST failures, degraded components, and system instability
  • Manage server lifecycle operations: onboarding, provisioning, firmware updates, BIOS/BMC configuration, and hardware refresh kits
  • Own incident response and break/fix workflows while maintaining 98%+ compute availability SLAs
  • Work cross-functionally with Data Center and Networking teams during hardware incidents, including ticket creation, repair coordination, and log collection
  • Interface directly with HPE and Dell vendors: gathering diagnostics, sending logs, driving RMAs, and tracking issues through resolution
  • Support and troubleshoot CI/CD and automation pipelines used for server provisioning, configuration, and lifecycle management
  • Dig into automation code and workflows (Ansible, scripts, pipelines) when jobs fail to understand root cause and unblock deployments
  • Identify recurring operational issues and contribute to process improvements, runbooks, and reliability enhancements
  • Help manage and reduce the operations backlog, prioritizing fixes, cleanup, and automation improvements

Must Have:

  • Hands-on experience supporting HPE and Dell servers in production, including break/fix and hardware incident troubleshooting
  • Experience with HPE iLO, Dell iDRAC, and related BMC environments
  • Strong understanding of server hardware components (CPU, GPU, memory, disks, NICs, power) and common failure modes
  • Experience troubleshooting automation and CI/CD pipelines that manage infrastructure (not just running them, but fixing them when they fail)
  • Operational mindset with experience owning incidents, SLAs, backlog items, and process improvements
  • Automation experience with Ansible, Bash, Jenkins, or similar tooling
  • Exposure to GPU-dense, HPC, or high-performance compute environments
  • Experience improving runbooks, reducing toil, and scaling operations through automation

  • Dice Id: 90859492
  • Position Id: 9006016
Contact the job poster
Kumar Sai

Recruiter @ SumasEdge Corporation