Senior Workflow Orchestration Engineer (Airflow & Scheduling Platforms)

• Posted 30+ days ago • Updated 9 hours ago
Full Time
Fitment

Job Details

Skills

  • Trading
  • FOCUS
  • Build Automation
  • Provisioning
  • Unit Testing
  • Dashboard
  • SLA
  • Scheduling
  • Resource Allocation
  • Concurrent Computing
  • Incident Management
  • Documentation
  • Mentorship
  • Testing
  • Sensors
  • Mapping
  • Kubernetes
  • Network
  • Terraform
  • Continuous Delivery
  • Open Policy Agent (OPA)
  • Python
  • Bash
  • Java
  • Modeling
  • Performance Tuning
  • Microsoft Azure
  • Google Cloud
  • Google Cloud Platform
  • Snowflake
  • Amazon Redshift
  • Databricks
  • Apache Spark
  • Amazon EMR
  • Virtual Private Cloud
  • Computer Networking
  • Storage
  • Amazon S3
  • Regulatory Compliance
  • SSO
  • OIDC
  • RBAC
  • Auditing
  • Change Control
  • Leadership
  • Roadmaps
  • Communication
  • Workflow
  • Amazon Web Services
  • Step-Functions
  • Migration
  • Data Quality
  • Cloud Computing
  • Orchestration
  • Management
  • Continuous Integration
  • Optimization
  • Capacity Management
  • High Availability
  • Meta-data Management
  • Database
  • Backup
  • Recovery
  • Apache Airflow

Summary

About the role

We're seeking a seasoned engineer to design, operate, and scale our workflow orchestration platform with a primary focus on Apache Airflow. You'll own the Airflow control plane and developer experience end to end, covering architecture, automation, security, observability, and reliability, while also evaluating and operating complementary schedulers where appropriate. You'll build automation infrastructure and partner across data, trading, and engineering teams to deliver mission-critical pipelines at scale.

What you'll do
  • Architect, deploy, and operate production-grade Airflow on Kubernetes, including all components and user application dependencies, with a focus on upgrades, capacity planning, HA, security, and performance tuning
  • Operate a multi-scheduler ecosystem: determine when to use Airflow, distributed compute schedulers, or lightweight task runners based on workload requirements; provide a unified developer experience across schedulers
  • Build automation infrastructure: Terraform modules and Helm charts with GitOps-driven CI/CD for environment provisioning, upgrades, and zero-downtime rollouts
  • Standardize the developer experience: DAG repo templates, shared operator libraries, connection and secrets management, dependency packaging, code ownership, linting, unit testing, and pre-commit hooks
  • Implement comprehensive observability: metrics collection, dashboards, distributed tracing, SLA/latency monitoring, intelligent alerting, and runbook automation
  • Enable resilient workflow patterns: build idempotency frameworks, retry/backoff strategies, deferrable operators and sensors, dynamic task mapping, and data-aware scheduling
  • Ensure reliability at enterprise scale: architect and tune resource allocation (pools, queues, concurrency limits) to support high-throughput workloads; optimize large-scale backfill strategies; develop comprehensive runbooks and lead incident response/postmortems
  • Partner with teams across the organization to provide enablement, documentation, and self-service tooling
  • Mentor engineers, contribute to platform roadmap and technical standards, and drive engineering best practices

Required qualifications
  • 5-8+ years building/operating data or platform systems; 3+ years running Airflow in production at scale (hundreds to thousands of DAGs and high task throughput).
  • Deep Airflow expertise: DAG design and testing, idempotency, deferrable operators/sensors, dynamic task mapping, task groups, datasets, pools/queues, SLAs, retries/backfills, cross-DAG dependencies.
  • Strong Kubernetes experience running Airflow and supporting services: Helm, autoscaling, node/pod tuning, topology spread, network policies, PDBs, and blue/green or canary strategies.
  • Automation-first mindset: Terraform, Helm, GitOps (Argo CD/Flux), and CI/CD for platform lifecycle; policy-as-code (OPA/Gatekeeper/Conftest) for DAG, connection, and secrets changes.
  • Proficiency in Python for authoring operators/hooks/utilities; solid Bash; familiarity with Go or Java is a plus.
  • Observability and SRE practices: Prometheus/Grafana/StatsD, centralized logging, alert design, capacity/throughput modeling, performance tuning.
  • Data platform experience with at least one major cloud (AWS/Azure/Google Cloud Platform) and systems like Snowflake/BigQuery/Redshift, Databricks/Spark, EMR/Dataproc; strong grasp of IAM, VPC networking, and storage (S3/GCS/ADLS).
  • Security/compliance: SSO/OIDC, RBAC, secrets management (Vault/Secrets Manager), auditing, least-privilege connection management, and change control.
  • Proven incident leadership, runbook creation, and platform roadmap execution; excellent cross-functional communication.

Nice to have
  • Experience operating alternative orchestrators (Prefect 2.x, Dagster, Argo Workflows, AWS Step Functions) and leading migrations to/from Airflow.
  • OpenLineage/Marquez adoption; Great Expectations or other data quality frameworks; data contracts.
  • dbt Core/Cloud orchestration patterns (state management, artifacts, slim CI).
  • Cost optimization and capacity planning for schedulers and workers; spot instance strategies.
  • Multi-region HA/DR for Airflow metadata DB; backup/restore and disaster drills.
  • Building internal developer platforms/portals (e.g., Backstage) for self-service pipelines.
  • Contributions to Apache Airflow or provider packages; familiarity with recent AIPs/Airflow 2.7+ features.
  • Dice Id: 10125634
  • Position Id: 38ad50c996cc50fc68129e8658cb590f
