AI Infrastructure Platform Engineer (Locals to NC)

Charlotte, NC, US • Posted 6 hours ago • Updated 3 hours ago
Contract Independent
Contract Corp To Corp
Contract W2
No Travel Required
Able to Sponsor
On-site
Depends on Experience

Job Details

Skills

  • infrastructure engineering
  • Linux
  • CI/CD
  • AI/ML
  • Kubernetes
  • container
  • Generative AI
  • Predictive AI
  • GPU
  • AIOps

Summary

Role: AI Infrastructure Platform Engineer

Location: Charlotte, NC

Local candidates only.

In This Role, You Will
- Lead complex infrastructure initiatives supporting Generative AI and Predictive AI platforms from design to production operations.
- Serve as a technical lead for platforms supporting AI/ML model training, inference, and batch workloads.
- Design, build, deploy, and operate OpenShift-based container platforms optimized for high-performance GPU workloads.
- Build, support, and operate scalable GPU SuperPod architectures with large multi-node GPU clusters.
- Own monitoring, alerting, and observability using Grafana, Splunk, and enterprise telemetry tools.
- Define SLIs/SLOs and build actionable alerts to proactively detect performance, capacity, and resiliency risks.
- Build AI- and agent-based automation tools for self-healing, scaling, diagnostics, and incident remediation.
- Apply AIOps techniques to reduce alert fatigue and improve platform reliability.
- Lead production incident analysis and ensure operational rigor and root-cause prevention.
- Mentor engineers and influence stakeholders across a geographically distributed organization.

Required Qualifications
- 5+ years of infrastructure engineering experience.
- 5+ years troubleshooting complex end-to-end architectures (including CI/CD pipelines).
- 5+ years Linux systems experience.
- 4+ years supporting AI/ML platforms.
- 4+ years of Kubernetes/container platform experience, including production support.

Desired Qualifications
- Experience with Generative AI and Predictive AI platforms.
- Hands-on GPU platform operations including scheduling, quota, and performance tuning.
- Experience with OpenShift in GPU-enabled, multi-tenant environments.
- Experience designing or operating GPU SuperPods.
- Deep experience with observability using Grafana, Splunk, and custom telemetry pipelines.
- Experience building AI- or agent-driven automation tooling (AIOps).
- Hands-on experience supporting AI/ML workloads on Google Cloud Platform and Azure, including GPU-backed services and managed AI infrastructure.
- Experience operating hybrid or multi-cloud AI platforms, with an understanding of cloud-native services, networking, identity, and cost optimization for Generative and Predictive AI.
- Strong experience monitoring AI signals such as inference latency and GPU utilization.
- Experience with BCP/DR, resiliency, and highly available architectures.

Job Expectations
- Participation in a 24x7 on-call rotation.
- Ownership for production stability, platform health, and customer outcomes.
- Operation in regulated enterprise environments with a strong risk and control focus.

  • Dice Id: 90970970
  • Position Id: 8963480