Overview
Remote
Depends on Experience
Contract - W2
Skills
API
Apache Kafka
Apache Spark
Data Engineering
Google Cloud Platform
Python
Continuous Delivery
Continuous Integration
Machine Learning (ML)
Artificial Intelligence
PySpark
Orchestration
PostgreSQL
Kubernetes
Job Details
Position: Data Engineer
Duration: At least through the end of the year, with a likely extension into next year
Location: 100% remote
Interview Process: Two rounds, held back-to-back on the same day. The first round is technical with other engineers; candidates should expect to share their screen and write some basic SQL and Python. The second round is with the hiring manager and will be less technical.
This role is now contract-to-hire; candidates MUST be eligible to convert without sponsorship (Green Card holder or US Citizen).
Notes from req intake call:
- They want someone proactive rather than heads-down; the role involves a lot of interaction, and personality will be a big factor.
- Building out the next-gen platform for their data product, TLT (Transformation Logistics Technology).
- They handle all of the logistics data in the PBM (pharmacy benefit management) and specialty space.
- They track the lifecycle of an order from the time it is placed through delivery, then follow up with a survey.
- This covers how the order gets placed, how the member's information gets reviewed (insurance), what the package dimensions are, and whether the label has been printed.
- The new initiative is modernizing their technical stack on Google Cloud Platform and improving those services.
- Must have SQL experience; this should be standard.
- Knowledge of LLMs, AI, or ML is helpful but not required; they want someone willing to learn.
- The architecture is approved and they are building out the lower environment; they are meeting with Google and the team to review findings.
- Planned go-live is the end of October; they expect the work to run at least through the end of the year.
- Must be able to talk to the business side and discuss what the team is doing each sprint.
- Looking for a Mid-level Engineer.
Must Haves:
- Python & PySpark (Spark SQL): 3+ years
- Airflow (or any orchestration tool): 2+ years
- Google Cloud Platform (BigQuery, GCS, Pub/Sub, Cloud Run, Functions, Cloud SQL): 3+ years
- Real-time data ingestion (Kafka, webhooks, file-based): 2+ years
- API integration (REST/webhooks): 2+ years
- Kubernetes (GKE preferred): 1-2 years
- BigQuery SQL & PostgreSQL: 2+ years
- YAML/config-driven pipeline design: 2+ years
- Schema transformation, hashing, DQF: 2+ years
- CI/CD, observability, lightweight dashboards (Grafana/Streamlit/Flask UI): 1+ year
Plusses:
- Logistics or healthcare domain exposure (nice to have)
- PostgreSQL, CI/CD, monitoring, dashboarding, or lightweight UI development