DevOps Supervisor NEX

Houston, TX, US • Posted 3 days ago • Updated 39 minutes ago
Full Time
On-site
Fitment

Job Details

Skills

  • IaaS
  • Computer Cluster Management
  • Optimization
  • Time Series
  • Relational Databases
  • Caching
  • Testing
  • Publishing
  • Auditing
  • SAFE
  • Adapter
  • Provisioning
  • Workflow
  • Field Operations
  • Mentorship
  • Microservices
  • Authentication
  • Authorization
  • Regulatory Compliance
  • Vendor Relationships
  • Budget
  • Leadership
  • Computer Networking
  • Managed Services
  • Writing
  • Management
  • Continuous Integration
  • Continuous Delivery
  • Atlassian
  • Stacks Blockchain
  • IoT
  • Computer Hardware
  • OTA
  • People Management
  • Computer Science
  • Information Systems
  • Team Leadership
  • Supervision
  • DevOps
  • Reliability Engineering
  • Apache Velocity
  • Problem Solving
  • Incident Management
  • Expect
  • Collaboration
  • Google Cloud
  • Google Cloud Platform
  • Cloud Computing
  • SQL
  • Virtual Private Cloud
  • Kubernetes
  • Terraform
  • Docker
  • Bamboo
  • Bitbucket
  • JSON
  • PostgreSQL
  • Redis
  • Python
  • MQTT
  • Law
  • Reporting

Summary

Job Description

Cloud Infrastructure & Platform
  • Own end-to-end cloud infrastructure strategy - networking, Kubernetes cluster management, IAM, secrets management, and cost optimization.
  • Lead all Terraform IaC development across environments (dev, staging, production), enforcing consistent module patterns and state management.
  • Design and operate Kubernetes workloads using Kustomize overlays for both cloud and edge deployment targets.
  • Manage supporting infrastructure: time-series and relational databases, caching layers, and cloud-managed services.

CI/CD & Deployment
  • Own and mature CI/CD pipelines across all services using the Atlassian suite (Bitbucket, Bamboo / Bitbucket Pipelines) - building, linting, testing, publishing, and deploying Python/FastAPI microservices.
  • Standardize Docker build practices, image tagging strategies, and container registry management.
  • Implement and enforce GitOps workflows for Kubernetes deployments, ensuring audit trails and safe rollback capabilities.
  • Collaborate with development teams to reduce deployment friction and improve feedback loops.

Edge Deployments
  • Own deployment architecture for edge-tier workloads running on field hardware - Docker Compose stacks including MQTT and Modbus TCP protocol adapters.
  • Develop reliable provisioning, update, and monitoring workflows for edge nodes in remote or low-connectivity environments.
  • Coordinate with product and field operations teams on edge deployment requirements, connectivity constraints, and rollout planning.

Site Reliability & 24/7 Support
  • Build and own the on-call program: runbooks, alerting, escalation paths, and SLO definitions.
  • Lead incident response, ensuring fast mitigation and thorough post-mortems that prevent recurrence.
  • Define and track reliability metrics (availability, MTTR, error budgets) and report to the Director of Platform Development.
  • Continuously improve observability across cloud and edge environments through structured logging, metrics, and distributed tracing.

Team Leadership & Cross-Functional Collaboration
  • Hire, mentor, and grow a team of DevOps and Platform Engineers; define career ladders and performance expectations.
  • Partner with backend engineering teams to support the Python/FastAPI microservices platform, authentication, and authorization policy rollouts.
  • Champion a security-first culture: secrets management, least-privilege IAM, dependency scanning, and compliance automation.
  • Manage vendor relationships, cloud spend, and tooling budget with transparency to leadership.

  • Perform additional duties as required and assigned.

Job Requirements:
  • 5+ years in DevOps, SRE, or Platform Engineering roles, including 1-2 years in a tech lead or supervisory capacity.
  • Deep hands-on experience with a major cloud platform (Google Cloud Platform preferred) including Kubernetes, IAM, networking, and managed services.
  • Strong Terraform skills - writing modules, managing remote state, and structuring multi-environment configurations.
  • Proficiency in Kubernetes and Kustomize for managing multi-environment, multi-target (cloud + edge) workloads.
  • Experience building and maintaining CI/CD pipelines in the Atlassian suite (Bitbucket, Bamboo, or Bitbucket Pipelines); comfort with pipeline-as-code patterns.
  • Solid Docker expertise including multi-stage builds, Compose stacks, and container runtime troubleshooting.
  • Hands-on experience with Prometheus, structured/JSON logging, and building actionable alerting systems.
  • Ability to lead on-call rotations and drive incident management processes end-to-end.
  • Comfortable working in a Python-centric engineering environment (Python 3.12, Poetry, FastAPI familiarity preferred).
  • Experience with edge / IoT deployment patterns - field hardware, intermittent connectivity, or OTA update strategies.
  • Demonstrates positive people management skills: communicates effectively, treats team members fairly and consistently, coaches well, and takes an interest in team members' career development.

Minimum Qualifications:
  • Bachelor's Degree in Computer Science, Information Systems, or a related technical field (Required).
  • 5+ years of progressive experience in DevOps, SRE, or Platform Engineering (Required).
  • 1-2 years of experience in a team lead or supervisory capacity (Required).

Additional Details:

The DevOps / SRE Supervisor works with broad ownership and limited direction. The incumbent determines and develops the approach to infrastructure solutions. Work is evaluated on outcomes: system reliability, delivery velocity, and infrastructure cost efficiency.

Resolves a wide range of platform and infrastructure problems, from routine operational tasks to complex architectural decisions. Uses judgment within engineering best practices to determine the appropriate course of action. Problem resolution timeframes range from immediate incident response to multi-week infrastructure projects.

Candidates should expect to work across the following technologies (or comparable equivalents):
  • Google Cloud Platform (GKE, Cloud SQL, IAM, VPC)
  • Kubernetes / Kustomize
  • Terraform
  • Docker / Docker Compose
  • Bitbucket / Bamboo / Bitbucket Pipelines
  • Prometheus / structured JSON logging
  • TimescaleDB / PostgreSQL / Redis
  • Auth0 (AuthN) / Cerbos (AuthZ)
  • Python 3.12 / FastAPI / Poetry
  • MQTT / Modbus TCP (edge protocols)

About Us

The Evolving Oil Field Demands Evolving Service Providers

NexTier is a leading provider of integrated completions that employs sustainable practices and equipment to support our customers' ESG goals while accelerating production in the most demanding US land basins.

Patterson-UTI is committed to a workplace free from discrimination and harassment, offering equal employment opportunities to all individuals regardless of personal characteristics protected by law. Employees are encouraged to report any concerns through multiple channels.
  • Dice Id: 80184230
  • Position Id: 1690eff6cff69ca87f104e3eec3f2a3b