Senior Databricks Data Engineer

Remote • Posted 2 days ago • Updated 1 day ago
Full Time
Remote
Compensation: Depends on Experience (DOE)


Job Details

Skills

  • SQL
  • Data modeling
  • Performance tuning
  • PySpark
  • Databricks
  • Watermarking
  • CDC
  • Marketing datasets

Summary

Role: Senior Databricks Data Engineer
Location: Remote
Duration: 4-8 weeks, with possible extension
Description:
We are looking to immediately onboard a Senior Databricks Data Engineer to support a high-priority initiative delivering data to IPSOS for MMM/MTA modeling.
We are building data pipelines to extract and deliver curated datasets (approx. 150GB historical + weekly increments) from Databricks bronze/silver layers to an external analytics partner (IPSOS). The data will be used for MMM/MTA modeling, so accuracy, consistency, and reliability are critical.

Key Responsibilities:
Data Extraction & Engineering
Build scalable extraction pipelines from Databricks (bronze/silver layers)
Prepare datasets for external consumption (column selection, renaming, formatting, normalization)
Work across ~10-20 fact and dimension tables spanning media and sales domains

Incremental Pipeline Development
Design and implement incremental logic using timestamps or CDC patterns
Optimize for ongoing weekly loads (~2GB) while supporting large historical extracts
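The timestamp-watermark pattern mentioned above can be sketched as follows. This is a minimal, self-contained illustration using in-memory records; the field name `updated_at` and the example dates are hypothetical, and in the actual pipeline the filter would be a Delta Lake query (e.g. `WHERE updated_at > :last_watermark`) rather than a Python list comprehension.

```python
from datetime import datetime, timezone

def incremental_filter(records, last_watermark, ts_field="updated_at"):
    """Return records newer than the stored watermark, plus the new watermark.

    Sketch of the timestamp-watermark pattern only; a production Databricks
    job would push this predicate down into a Delta Lake read.
    """
    new_rows = [r for r in records if r[ts_field] > last_watermark]
    # Advance the watermark to the max timestamp seen; keep the old one if
    # this batch is empty so no rows are skipped on the next run.
    new_watermark = max((r[ts_field] for r in new_rows), default=last_watermark)
    return new_rows, new_watermark

# Hypothetical weekly batch: only rows after the prior run's watermark load.
rows = [
    {"id": 1, "updated_at": datetime(2024, 5, 1, tzinfo=timezone.utc)},
    {"id": 2, "updated_at": datetime(2024, 5, 8, tzinfo=timezone.utc)},
]
batch, wm = incremental_filter(rows, datetime(2024, 5, 3, tzinfo=timezone.utc))
# batch contains only id=2; wm advances to 2024-05-08
```

Persisting `wm` between runs (e.g. in a small control table) is what makes the weekly ~2GB loads idempotent and restartable.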

File Generation & Optimization
Generate export-ready datasets in CSV/Parquet formats
Implement partitioning strategies for performance (e.g., by date/source)
Apply compression and optimize file sizes for transfer
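The date/source partitioning idea can be illustrated with a small sketch that buckets rows into Hive-style partition paths. The root path, column names, and sample rows are hypothetical; the real job would simply use Spark's `df.write.partitionBy("event_date", "source")` with Parquet and a codec such as snappy rather than building paths by hand.

```python
from collections import defaultdict
from pathlib import PurePosixPath

def partition_paths(records, root="exports/media_spend"):
    """Group rows into Hive-style date/source partition directories.

    Illustrates the partition layout only; Spark's partitionBy produces the
    same directory structure when writing Parquet.
    """
    buckets = defaultdict(list)
    for r in records:
        path = (PurePosixPath(root)
                / f"event_date={r['event_date']}"
                / f"source={r['source']}")
        buckets[str(path)].append(r)
    return dict(buckets)

rows = [
    {"event_date": "2024-05-06", "source": "meta", "spend": 120.0},
    {"event_date": "2024-05-06", "source": "google", "spend": 80.0},
]
layout = partition_paths(rows)
```

Partitioning this way lets the consumer read only the dates or sources they need, which matters for a 150GB historical extract.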

Data Validation & Quality
Implement validation checks (schema, row counts, completeness)
Troubleshoot data inconsistencies across multiple sources
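The validation checks listed above (schema, row counts, completeness) can be sketched as a small pre-delivery gate. This is a pure-Python illustration with a hypothetical schema; in Databricks these checks would more likely run as Delta constraints or an expectations suite before the export lands with the partner.

```python
def validate_export(rows, expected_schema, source_row_count):
    """Minimal pre-delivery checks: schema, nulls, and row-count reconciliation.

    A sketch of the check categories only, not a production framework.
    """
    issues = []
    for i, row in enumerate(rows):
        if set(row) != expected_schema:
            issues.append(f"row {i}: schema mismatch {sorted(set(row) ^ expected_schema)}")
        elif any(v is None for v in row.values()):
            issues.append(f"row {i}: null values present")
    # Reconcile exported row count against the source-side count.
    if len(rows) != source_row_count:
        issues.append(f"row count {len(rows)} != source {source_row_count}")
    return issues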

Secure Delivery
Support secure file delivery (e.g., SFTP, encryption)
Implement monitoring, logging, retry logic, and failure notifications
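The retry and failure-notification wiring can be sketched as a generic wrapper. Here `task` stands in for the real SFTP upload and `on_failure` for whatever alerting integration is in place (email, Slack, PagerDuty); both are assumptions, not a prescribed design.

```python
import time

def with_retries(task, attempts=3, base_delay=1.0, on_failure=print):
    """Run a delivery task with exponential backoff and a failure hook.

    Sketch only: retries `task` up to `attempts` times, doubling the delay
    each time, and fires `on_failure` before re-raising on final failure.
    """
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == attempts:
                on_failure(f"delivery failed after {attempts} attempts: {exc}")
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
```

Keeping the retry policy separate from the transfer code makes it easy to reuse the same wrapper for SFTP pushes, checksum verification, or notification calls.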

Collaboration
Work closely with internal teams and IPSOS for data validation and issue resolution
Support onboarding and early-stage troubleshooting

Required Skills
Strong hands-on experience with Databricks (Delta Lake, notebooks, jobs)
Proficiency in PySpark and SQL for large-scale data processing
Experience with incremental pipelines (CDC, watermarking)
Solid understanding of data modeling (fact/dimension, grain alignment)
Experience handling large datasets (100GB+) and performance tuning
Familiarity with file-based delivery (CSV/Parquet) and secure transfer (SFTP, encryption)

Nice to Have
Experience with MMM/MTA or marketing datasets (Google, Meta, Amazon, etc.)
Experience working with external analytics partners (e.g., IPSOS, Nielsen)
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 91071031
  • Position Id: 2026-13947