Lead Data Engineer

Overview

Remote
Depends on Experience
Contract - W2

Skills

Data Engineer
PySpark
SQL
Apache Spark
BigQuery
data modeling
GCP
cloud
API
Kubernetes
Healthcare
SFTP
Orchestration
Access Management
Airflow
Grafana
Computer Networking
Logistics

Job Details

Position: Lead Data Engineer

Location: 100% Remote

Duration: Need to confirm

Interview process: Need to confirm

Start Date: Need to confirm

Skill set below:

  • Strong with PySpark (esp. Spark SQL) for complex transformation pipelines
  • Hands-on with Airflow for orchestration and BigQuery SQL for querying and data modeling
  • Good experience with Google Cloud Platform (BigQuery, GCS, Pub/Sub, Cloud Run, Functions, Cloud SQL)
  • Comfortable with real-time ingestion: Kafka, webhooks, file-based triggers
  • Solid API integration skills (REST/webhooks), with ability to handle payload-driven workflows
  • Experience working in Kubernetes (GKE) for deploying and scaling pipelines
  • Comfortable handling UI-driven configuration, YAML-based setups, and modular frameworks
  • Exposure to schema transformation, data validation, hashing, and DQF logic
  • Domain familiarity with logistics/healthcare (CVS context is a big plus)
  • Strong ownership mindset; able to work under pressure and balance speed with reliability
  • Basic understanding of Bash, SFTP transfers, networking, and access management
Bonus:
  • PostgreSQL, CI/CD, monitoring, dashboarding, or lightweight UI development
Experience:
  • Python & PySpark (Spark SQL): 3+ years
  • Airflow (or any orchestration tool): 2+ years
  • Google Cloud Platform (BigQuery, GCS, Pub/Sub, Cloud Run, Functions, Cloud SQL): 3+ years
  • Real-time data ingestion (Kafka, webhooks, file-based): 2+ years
  • API integration (REST/webhooks): 2+ years
  • Kubernetes (GKE preferred): 1-2 years
  • BigQuery SQL & PostgreSQL: 2+ years
  • YAML/config-driven pipeline design: 2+ years
  • Schema transformation, hashing, DQF: 2+ years
  • CI/CD, observability, lightweight dashboards (Grafana/Streamlit/Flask UI): 1+ year
  • Logistics or healthcare domain exposure: nice to have