Data Center Engineer

Santa Clara, CA, US • Posted 6 hours ago • Updated 6 hours ago
Full Time
On-site
USD $50.00 - 60.00 per hour

Job Details

Skills

  • Backbone.js
  • Operational Excellence
  • Incident Management
  • High Availability
  • UPS
  • Distribution
  • Adobe AIR
  • Computer Networking
  • Asset Management
  • Preventive Maintenance
  • Network
  • Continuous Improvement
  • Workflow
  • Reporting
  • Network Security
  • Capacity Management
  • Access Control
  • System On A Chip
  • ISO/IEC 27001:2005
  • Management
  • Hardware Troubleshooting
  • Organizational Skills
  • Communication
  • ITIL
  • Change Management
  • GPU
  • Artificial Intelligence
  • HPC
  • Leadership
  • Regulatory Compliance
  • Auditing
  • Scripting
  • Python
  • Bash
  • Operational Efficiency
  • Taxes
  • Life Insurance
  • Partnership
  • Collaboration
  • Business Transformation
  • Law

Summary

Description
NVIDIA is seeking a highly experienced Data Center Operations Engineering Lead to serve as the on-site operational owner for a critical data center location. This role is the operational backbone of the site, responsible for ensuring infrastructure reliability, uptime, compliance, and readiness to support production workloads across NVIDIA's rapidly growing global data center footprint.
This is a hands-on, high-impact role for a senior engineer who thrives in mission-critical environments, owns issues end-to-end, and drives operational excellence through strong technical judgment, disciplined processes, and cross-functional leadership. You will act as the primary on-site authority and escalation point while partnering with centrally managed engineering, facilities, network, security, and capacity planning teams.
Being able to track and report on areas for continuous improvement is key for the data center to continue to progress.
Key Responsibilities
Data Center Operations & Incident Management
  • Own day-to-day operational health of the assigned data center site.
  • Serve as the primary on-site escalation point for operational, infrastructure, and facilities issues.
  • Lead incident response, triage, escalation, and resolution to maintain high availability and uptime.
  • Coordinate with internal teams, vendors, colocation providers, and Facilities Operations Centers (FOC) during incidents and maintenance events.
Infrastructure Readiness & Reliability
  • Ensure infrastructure readiness for new site turn-ups, expansions, and post go-live stabilization.
  • Inherit newly built lab or data center environments after buildout and transition them to steady-state operations.
  • Govern infrastructure changes including installs, upgrades, retrofits, and decommissions with appropriate change management and rollback planning.
  • Maintain deep operational knowledge of critical systems: power distribution, cooling (air and liquid), networking, space, and rack density.
Preventative Maintenance, Capacity & Asset Management
  • Manage and track preventative maintenance schedules for power, cooling, network, and compute infrastructure.
  • Monitor and manage site capacity (power, cooling, space, racks) and identify constraints and risks.
  • Maintain accurate asset inventories and track lifecycle from deployment through decommissioning using DCIM tools.
Process Excellence, Metrics & Continuous Improvement
  • Develop, document, and continuously improve SOPs, runbooks, escalation workflows, and site readiness checklists.
  • Lead ITIL-aligned change management and operational governance processes.
  • Track and report site-level operational metrics; analyze trends to drive reliability and service improvements.
  • Identify opportunities to automate operational tasks and improve tooling and visibility.
Cross-Functional Leadership, Security & Compliance
  • Act as the local liaison between facilities, engineering, networking, security, capacity planning, and compliance teams.
  • Ensure physical and logical access controls are enforced and compliant.
  • Maintain audit readiness and support compliance efforts (e.g., SOC 2, ISO 27001, safety and regulatory certifications).
  • Manage relationships with vendors, service providers, and colocation partners, including SLAs and contracts.
Skills
Data center, data center operations, data center maintenance, Hardware troubleshooting, Troubleshooting, Infrastructure, cooling systems, Power, PDU, data center mgr, Data Center Facilities, Rack and stack
Top Skills Details
Data center, data center operations, data center maintenance, Hardware troubleshooting, Troubleshooting, Infrastructure, cooling systems, Power, PDU
Additional Skills & Qualifications
Additional Skills
Strong operational judgment, prioritization, and organizational skills.
Excellent written and verbal communication skills, including executive-level incident communication.
Ability to operate independently on-site while collaborating with distributed teams and off-site managers.
Experience with ITIL frameworks, change management, vendor SLAs, and compliance standards.
Ways to Stand Out
Experience supporting high-density or liquid-cooled GPU, AI, or HPC environments.
Prior ownership or leadership of data center compliance audits.
Scripting or automation experience (Python, Bash, etc.) to improve operational efficiency.
Experience Level
Intermediate Level
Job Type & Location
This is a Contract position based out of Santa Clara, CA.
Pay and Benefits
The pay range for this position is $50.00 - $60.00/hr.
Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to specific elections, plan, or program terms. If eligible, the benefits available for this temporary role may include the following:
Medical, dental & vision
Critical Illness, Accident, and Hospital
401(k) Retirement Plan - Pre-tax and Roth post-tax contributions available
Life Insurance (Voluntary Life & AD&D for the employee and dependents)
Short and long-term disability
Health Spending Account (HSA)
Transportation benefits
Employee Assistance Program
Time Off/Leave (PTO, Vacation or Sick Leave)
Workplace Type
This is a fully onsite position in Santa Clara, CA.
Application Deadline
This position is anticipated to close on May 11, 2026.
About TEKsystems:
We're partners in transformation. We help clients activate ideas and solutions to take advantage of a new world of opportunity. We are a team of 80,000 strong, working with over 6,000 clients, including 80% of the Fortune 500, across North America, Europe and Asia. As an industry leader in Full-Stack Technology Services, Talent Services, and real-world application, we work with progressive leaders to drive change. That's the power of true partnership. TEKsystems is an Allegis Group company.

The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information or any characteristic protected by law.

About TEKsystems and TEKsystems Global Services

We're a leading provider of business and technology services. We accelerate business transformation for our customers. Our expertise in strategy, design, execution and operations unlocks business value through a range of solutions. We're a team of 80,000 strong, working with over 6,000 customers, including 80% of the Fortune 500 across North America, Europe and Asia, who partner with us for our scale, full-stack capabilities and speed. We're strategic thinkers, hands-on collaborators, helping customers capitalize on change and master the momentum of technology. We're building tomorrow by delivering business outcomes and making positive impacts in our global communities. TEKsystems and TEKsystems Global Services are Allegis Group companies. Learn more at TEKsystems.com.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 101054TS
  • Position Id: JP-005988927

Company Info

About TEKsystems c/o Allegis Group

We're partners in transformation. We help clients activate ideas and solutions to take advantage of a new world of opportunity. We are a team of 80,000 strong, working with over 6,000 clients, including 80% of the Fortune 500, across North America, Europe and Asia. As an industry leader in strategy, implementation and talent, we work with progressive leaders who drive change. That's the power of true partnership. TEKsystems is an Allegis Group company.
