DevOps Automation Engineer - Google Cloud Platform & ML Workloads

Overview

On Site
$60 - $70
Contract - W2
Contract - Independent
Contract - 12 Month(s)

Skills

GCP
HPC

Job Details

We are seeking an experienced DevOps Automation Engineer with a strong background in Google Cloud Platform infrastructure, ML workload orchestration, and high-performance computing (HPC). The ideal candidate is skilled in building scalable, automated environments that support dynamic, high-demand workloads across diverse customer blueprints, with a proven ability to optimize infrastructure usage, automate test and deployment pipelines, and ensure observability and reliability across cloud-native systems.

Core Competencies:

  • Infrastructure-as-Code (IaC): Terraform, Deployment Manager
  • CI/CD: Cloud Build, Jenkins, GitOps practices
  • Orchestration: Google Cloud Platform ASQ Tool, DWS (Dynamic Workload Scheduler)
  • Monitoring & Logging: Cloud Monitoring, Cloud Logging, custom dashboards
  • Containerization: GKE, Docker, Kubernetes-native automation
  • ML & HPC Workloads: NCCL, HyperCompute, Slurm Cluster Ops
  • Automation: Concurrent testbench deployments, blueprint-based configuration
  • Observability: Telemetry integration, alerting, event correlation
  • Release & Image Management: VM image rollout and validation workflows
  • Scripting: Python, Bash, YAML for automation pipelines
  • Load Balancing & Reallocation: Optimization on A* VMs for ML workloads

Highlights:

  • Designed and deployed automated blueprint testbenches simulating real-world ML workloads with full observability and telemetry hooks.
  • Managed auto-scaling HyperCompute clusters using Slurm, with scripted workflows for lifecycle events and resource optimization.
  • Integrated Google Cloud Platform ASQ orchestration tools to run parallel test environments with varying configurations for robust regression testing.
  • Automated historical log management, root cause analysis storage, and cluster update workflows, supporting 6-month traceability.
  • Developed dynamic CI/CD pipelines to support frequent releases of cluster toolkits and VM images, ensuring compliance and performance.

Quadrant Technologies is a leading IT services and solutions provider specializing in cloud computing, data analytics, AI/ML, and enterprise applications. We empower businesses with innovative technology solutions that drive digital transformation and enhance operational efficiency.
Committed to excellence and customer success, we help organizations optimize IT investments, streamline processes, and achieve strategic goals. Our team of experienced professionals ensures seamless project execution from implementation to ongoing support.
Join a dynamic, growth-driven company where innovation meets impact. Explore career opportunities with Quadrant Technologies today!
