Data Center Operations Engineer

California City, CA, US • Posted 1 day ago • Updated 1 day ago

Contract W2

On-site

Depends on Experience

Job Details

Skills

Adapter
Artificial Intelligence
Bash
Cabling
Collaboration
Command-line Interface
Communication
Computer Hardware
Computer Networking
Documentation
FTP
Firmware
GPU
HPC
Hardware Installation
High Availability
ICMP
Incident Management
InfiniBand
Linux
Linux Administration
Management
Microsoft Windows
Migration
Network
OSI Model
Optical Fiber
Organizational Skills
RAID
Repair
Routers
SLA
SMTP
Scripting
Server Hardware
Servers
Storage
Switches
TCP
TCP/IP
TFTP
Testing
UDP
Video

Summary

Position: Data Center Operations Engineer

Location: California City, CA, USA
Duration: 12+ Months (Contract)
Interview: Video Interview
Visa: Open (As per client requirement)

Job Description:

We are seeking a Data Center Operations Engineer with strong hands-on experience supporting enterprise data center infrastructure, Linux systems, GPU server deployments, and InfiniBand networking. The ideal candidate will have expertise in installing, configuring, troubleshooting, and maintaining data center hardware and infrastructure while supporting HPC/AI environments and ensuring high availability of critical systems.

This role requires excellent troubleshooting skills, experience with GPU cluster deployments, InfiniBand fabrics, Linux administration, networking, and data center operations. The engineer will work closely with infrastructure, operations, and engineering teams to support deployments, maintenance activities, and continuous operational improvements.

Required Skills:

5+ years of experience in Data Center Operations or Infrastructure Engineering.
Strong hands-on experience with Linux system administration, troubleshooting, and performance validation.
Experience with Linux command-line utilities and Bash/Shell scripting.
Hands-on experience deploying and configuring GPU servers in clustered environments.
Experience with GPU cluster bring-up, driver installation, and system-level configuration.
Strong knowledge of InfiniBand networking, including switch configuration, subnet management, and troubleshooting.
Experience performing end-to-end GPU testing in InfiniBand-based clusters.
Solid understanding of networking fundamentals, including TCP/IP, OSI Model, ARP, ICMP, TCP, UDP, SMTP, FTP, and TFTP.
Experience installing, configuring, and troubleshooting routers, switches, and terminal servers.
Hands-on experience with server hardware installation, including rack and stack, cabling, and firmware upgrades, and with components such as CPUs, memory, HDDs, RAID controllers, and NICs.
Experience with fiber and copper cabling, IP networking, and SAN infrastructure.
Experience supporting data center deployments, migrations, hardware refreshes, and expansion projects.
Experience using monitoring and alerting tools to identify and resolve infrastructure issues.
Experience working with ticketing systems while meeting SLA requirements.
Strong documentation skills for operational procedures, system configurations, and technical runbooks.
Excellent troubleshooting, communication, and organizational skills.
Ability to work in a fast-paced production environment and participate in on-call rotations.

Preferred Skills:

Experience supporting HPC, AI, or large-scale GPU environments.
Experience with NVIDIA GPU platforms and Mellanox/InfiniBand technologies.
Experience with data center monitoring solutions.
Experience supporting large-scale data center build-outs and infrastructure refresh programs.
Familiarity with automation or scripting for operational tasks.

Responsibilities:

Provide operational support for data center deployments, maintenance, and repair activities.
Install, configure, test, and maintain Linux servers and GPU infrastructure.
Deploy, configure, and validate GPU servers and clustered environments.
Perform InfiniBand fabric bring-up, switch configuration, subnet management, and troubleshooting.
Install and maintain server hardware, including CPUs, memory, storage, RAID components, and network adapters.
Configure and troubleshoot routers, switches, terminal servers, and out-of-band management devices.
Perform daily health checks of Linux systems, networking, and infrastructure components.
Support data center build-outs, hardware refreshes, migrations, and expansion projects.
Coordinate with vendors for hardware installation, diagnostics, replacement, and warranty support.
Monitor infrastructure using monitoring and alerting tools, ensuring timely incident resolution.
Maintain operational documentation, technical procedures, and runbooks.
Participate in incident response, maintenance windows, and on-call support rotations.
Collaborate with cross-functional global teams to ensure reliable, secure, and scalable infrastructure operations.

Dice Id: 90773860
Position Id: 9011916

Contact the job poster

Saipriya Yethirajula

Recruiter @ NMK Global Inc.
