Senior Terraform Engineer - Azure-Centric (ML & GenAI Focus)

Overview

Hybrid
60 - 65
Contract - W2
Contract - Independent
Contract - 6 Month(s)
No Travel Required
Unable to Provide Sponsorship

Skills

Terraform
Azure
Machine Learning
GenAI
Cloud Architecture
DevOps
MLOps
Azure AI Studio
Azure Databricks
Azure Kubernetes Service (AKS)
Azure Machine Learning (Azure ML)
AWS/GCP integrations

Job Details

Location: Charlotte, NC
Employment Type: Contract 
Permanent Residents and

We are seeking a highly skilled Senior Terraform Engineer with deep expertise in Azure services to join our Enterprise AI Platform team. This role is Azure-centric, with a strong emphasis on deploying Machine Learning (ML) and Generative AI (GenAI) models in scalable, secure, enterprise environments.

The ideal candidate will have hands-on experience with multi-cloud architectures, Infrastructure as Code (IaC) best practices, and a strong foundation in ML workflows, enterprise AI platforms, and cloud-based ML services. You will play a key role in automating infrastructure provisioning, integrating AI/ML pipelines, and optimizing deployments for performance, cost, security, and compliance across a multi-cloud landscape.

This position requires a proactive engineer who can bridge DevOps and MLOps, leveraging Terraform to support high-impact AI initiatives. If you thrive in fast-paced environments and are passionate about building robust, automated cloud infrastructures for AI at scale, this role offers a unique opportunity to drive innovation.


Key Responsibilities

Infrastructure as Code & Azure Platform Engineering

  • Design, implement, and maintain Infrastructure as Code (IaC) solutions using Terraform to provision and manage Azure resources, including:

    • Azure Machine Learning (Azure ML)

    • Azure AI Studio

    • Azure Kubernetes Service (AKS)

    • Azure Databricks

    • Related services supporting ML and GenAI model deployment

  • Develop and enforce IaC best practices, including:

    • Modular Terraform design

    • Remote state management (Azure Storage backends)

    • Drift detection

    • Automated policy and security testing using tools such as Checkov, with Terragrunt for orchestrating Terraform configurations

ML & GenAI Platform Enablement

  • Deploy and orchestrate ML and GenAI models on enterprise ML platforms

  • Enable end-to-end automation across the ML lifecycle, from model training through inference

  • Integrate AI/ML workflows with CI/CD pipelines (Azure DevOps, GitHub Actions)

Multi-Cloud Architecture & Integration

  • Collaborate with data scientists, ML engineers, and cross-functional teams to design multi-cloud architectures, with Azure as the primary platform and AWS/Google Cloud Platform integrations

  • Support hybrid deployments, data sovereignty requirements, and disaster recovery strategies

  • Implement cross-cloud networking, identity federation, and resource orchestration

Cloud Optimization & Security

  • Optimize cloud infrastructure for AI/ML workloads, including:

    • Compute clusters

    • Storage (Azure Blob Storage, Azure Data Lake Storage – ADLS)

    • Networking (Virtual Networks, Private Endpoints)

    • Security controls (Azure RBAC, Azure Key Vault, Microsoft Sentinel)

  • Ensure infrastructure meets enterprise security, availability, and compliance standards (e.g., GDPR, SOC 2)

MLOps & Observability

  • Implement MLOps best practices, including:

    • Model versioning

    • Monitoring

    • Logging

    • Alerting

  • Leverage observability tools such as Azure Monitor, Prometheus, and MLflow to ensure reliable, production-grade deployments

Operations & Collaboration

  • Troubleshoot and resolve infrastructure issues in production AI environments

  • Ensure high availability, scalability, and reliability of AI platforms

  • Conduct code reviews, mentor junior engineers, and contribute to documentation for ML/GenAI-specific IaC patterns

  • Stay current with emerging Azure AI and ML services, including:

    • Azure OpenAI Service

    • Prompt Flow

  • Participate in on-call rotations and incident response for critical AI infrastructure


Required Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field (or equivalent professional experience)

  • 5+ years of experience as a Cloud Engineer, DevOps Engineer, or similar role

  • At least 3 years of hands-on experience with Terraform for IaC in Azure environments

  • Proven experience deploying ML and GenAI models using Azure ML, including:

    • Model training

    • Model registration

    • Managed endpoints

    • Inference pipelines

  • Strong hands-on experience with multi-cloud architectures

    • Azure required

    • AWS and/or Google Cloud Platform preferred

  • In-depth understanding of Terraform concepts, including:

    • Modules

    • Providers (AzureRM)

    • Variables and outputs

    • Workspaces and backends

  • Solid understanding of the machine learning lifecycle, including:

    • Data ingestion

    • Feature engineering

    • Model serving

    • Scaling in enterprise AI platforms (Azure ML, SageMaker, Vertex AI)

  • Experience with containerization and orchestration tools:

    • Docker

    • Kubernetes (AKS)

    • Helm

  • Proficiency in scripting languages such as Python, PowerShell, or Bash

  • Familiarity with cloud security best practices for ML environments, including:

    • Encryption

    • Access controls

    • Vulnerability scanning

  • Strong problem-solving skills and experience working in Agile teams


Preferred Qualifications

  • Relevant certifications, including:

    • Microsoft Certified: DevOps Engineer Expert

    • Microsoft Certified: Azure AI Engineer Associate

    • HashiCorp Certified: Terraform Associate

  • Experience with additional IaC tools such as:

    • ARM Templates

    • Bicep

    • Pulumi (for hybrid Azure setups)

  • Background in MLOps tooling, including:

    • Kubeflow

    • MLflow

    • Azure ML Pipelines

  • Experience with cloud cost optimization for AI workloads using tools like Azure Cost Management

  • Prior experience working in regulated industries (finance, healthcare, etc.) with compliance-driven infrastructure requirements
