Role: Senior Databricks Data Engineer
Location: Remote
Duration: 4-8 weeks, with possible extension
Description:
We are looking to immediately onboard a Senior Databricks Data Engineer to support a high-priority initiative delivering data to IPSOS for marketing mix modeling (MMM) and multi-touch attribution (MTA).
We are building data pipelines to extract and deliver curated datasets (approx. 150GB of historical data plus weekly increments) from Databricks bronze/silver layers to IPSOS, an external analytics partner. Because the data feeds MMM/MTA models, accuracy, consistency, and reliability are critical.
Key Responsibilities:
Data Extraction & Engineering
Build scalable extraction pipelines from Databricks (bronze/silver layers)
Prepare datasets for external consumption (column selection, renaming, formatting, normalization)
Work across ~10-20 fact and dimension tables spanning media and sales domains
Incremental Pipeline Development
Design and implement incremental logic using timestamps or CDC patterns
Optimize for ongoing weekly loads (~2GB) while supporting large historical extracts
File Generation & Optimization
Generate export-ready datasets in CSV/Parquet formats
Implement partitioning strategies for performance (e.g., by date/source)
Apply compression and optimize file sizes for transfer
Data Validation & Quality
Implement validation checks (schema, row counts, completeness)
Troubleshoot data inconsistencies across multiple sources
Secure Delivery
Support secure file delivery (e.g., SFTP, encryption)
Implement monitoring, logging, retry logic, and failure notifications
Collaboration
Work closely with internal teams and IPSOS for data validation and issue resolution
Support onboarding and early-stage troubleshooting
Required Skills:
Strong hands-on experience with Databricks (Delta Lake, notebooks, jobs)
Proficiency in PySpark and SQL for large-scale data processing
Experience with incremental pipelines (CDC, watermarking)
Solid understanding of data modeling (fact/dimension, grain alignment)
Experience handling large datasets (100GB+) and performance tuning
Familiarity with file-based delivery (CSV/Parquet) and secure transfer (SFTP, encryption)
Nice to Have:
Experience with MMM/MTA or marketing datasets (Google, Meta, Amazon, etc.)
Experience working with external analytics partners (e.g., IPSOS, Nielsen)