Lead Data Engineer

  • Orange County, CA
  • Posted 11 days ago | Updated 11 days ago

Overview

On Site
Depends on Experience
Full Time
No Travel Required

Skills

EDW
BI
Databricks
ETL Development
Python
SQL Server

Job Details

Dataflix is seeking a Lead Data Engineer / Data Architect to lead the hands-on migration of existing ETL workflows from Informatica PowerCenter to the Databricks Lakehouse Platform. This role combines technical leadership, architecture design, and hands-on development, ensuring successful delivery of a large-scale enterprise data migration. The ideal candidate has deep experience with Informatica and Databricks, a strong grasp of enterprise data warehousing, and the ability to guide and mentor a technical team while contributing directly to design and development.

Key Responsibilities

  • Analyze legacy ETL pipelines, identify dependencies, and plan phased migration sequencing.
  • Redesign and optimize existing SQL logic, stored procedures, and transformations for Databricks.
  • Architect and implement end-to-end data pipelines across Source, Staging, EDW, and BI layers.
  • Apply Databricks Delta Lake best practices for partitioning, schema evolution, and performance optimization.
  • Implement data validation, error handling, and logging frameworks for migration integrity.
  • Oversee orchestration and scheduling using Databricks Workflows, Airflow, or equivalent tools.
  • Ensure performance tuning, scalability, and security across all migrated components.
  • Collaborate with project management, architecture, and QA teams to ensure successful delivery.
  • Mentor junior engineers, perform code/design reviews, and enforce development standards.

Required Qualifications

  • 6+ years of experience in data engineering, ETL development, or data architecture.
  • 3+ years of hands-on experience with Databricks, Spark, or PySpark.
  • Strong background in Informatica PowerCenter and SQL Server.
  • Experience designing and implementing end-to-end data pipelines across Source, Staging, EDW, and BI layers.
  • Solid understanding of data warehousing and dimensional modeling (facts, dimensions, SCD1/SCD2).
  • Ability to analyze legacy Informatica ETL workflows, identify dependencies, and plan migration sequencing.
  • Strong experience optimizing Spark/Databricks performance through partitioning, caching, and query tuning.
  • Understanding of data validation, reconciliation, and QA processes for data migration and cutover.
  • Hands-on expertise in Unix shell scripting, stored procedures, and large-scale data transformations.
  • Skilled in Databricks Delta Lake operations (ACID transactions, merge, optimize, vacuum) and job performance tuning.
  • Knowledge of data security, RBAC, and access control within Databricks or similar enterprise data environments.
  • Experience with CI/CD pipelines and version control tools (GitHub, Bitbucket, Jenkins, or Azure DevOps).
  • Proven ability to lead and mentor technical teams while contributing as a hands-on developer.

Preferred Qualifications

  • Experience modernizing on-prem ETL to Databricks or other modern data platforms.
  • Exposure to metadata management, data lineage, and governance frameworks.
  • Familiarity with orchestration tools (Airflow, Control-M, Databricks Workflows).
  • Databricks certification (Associate or Professional Data Engineer).
  • Experience with hospitality or travel industry datasets is a plus.

Soft Skills

  • Strong problem-solving skills, accountability, and an analytical mindset.
  • Excellent communication and cross-functional collaboration skills.
  • Self-driven and proactive, with the ability to balance leadership and hands-on execution.