DevOps Engineer

  • Dallas, TX
  • Posted 6 hours ago | Updated 3 hours ago

Overview

Remote
On Site
Contract - W2

Skills

AWS
cloud
Terraform
Azure
GCP
devops
CloudFormation
Grafana
Prometheus
Datadog
TigerGraph
IaC

Job Details

Position: DevOps Consultant

Location: Remote

Duration: Long term contract

Type: W2 only

About the Team

Join a specialized Infrastructure Engineering team focused on deploying and managing cloud-hosted TigerGraph clusters. This role spans the full cluster lifecycle, from provisioning and performance testing to observability and operations, to ensure the performance and reliability of TigerGraph's hosted environments.

Key Responsibilities

Cluster Provisioning & Setup

  • Define and implement cluster sizing strategies based on workload and capacity planning.
  • Lead deployment of TigerGraph clusters across supported cloud platforms.
  • Manage infrastructure cost approvals and budgeting.

Cloud Infrastructure & Operations

  • Provision cloud infrastructure components including compute, storage, and networking.
  • Implement secure networking configurations and ensure alignment with security policies.
  • Collaborate with architecture and domain teams to fulfill security and deployment requirements.

Performance & Resiliency

  • Conduct benchmarking, load testing, and stress simulations to validate readiness.
  • Apply best practices for scalable and fault-tolerant cluster configurations.

Observability & Operational Readiness

  • Set up monitoring, alerting, and dashboarding tools for real-time operational visibility.
  • Develop and maintain runbooks, standard operating procedures (SOPs), and incident response workflows.

Ongoing Cluster Management

  • Manage upgrades, scaling activities, and infrastructure right-sizing.
  • Optimize shard distribution and maintain balanced cluster performance.
  • Monitor and reduce cloud resource consumption for cost efficiency.

Required Skills & Experience

  • 5+ years of experience in cloud infrastructure engineering (AWS, Google Cloud Platform, or Azure)
  • Hands-on experience with distributed systems or graph databases (TigerGraph preferred)
  • Expertise in infrastructure-as-code tools (Terraform, CloudFormation)
  • Experience with performance/load testing tools and frameworks
  • Proficiency with observability tools (e.g., Prometheus, Grafana, Datadog)
  • Strong understanding of operational documentation, incident management, and SOPs
  • Familiarity with Kubernetes and container orchestration (a plus)

Preferred Qualifications

  • Experience with performance testing tools like JMeter, Locust, or Gatling
  • Background in managing medium to large-scale data clusters, with a focus on scalability and fault tolerance
  • Prior experience with graph databases, especially TigerGraph or Neo4j
