ETL / ELT Architect (Azure Data Engineering)
6 months, contract-to-hire
Remote
Experience Level
8–10 years of overall data engineering experience, with at least 4–5 years in cloud-based data platforms and 2–3 years in an architecture or lead design role.
Role Overview
The ETL / ELT Architect will lead the design and governance of scalable, secure, and high-performance data pipelines on Microsoft Azure. This role is responsible for defining enterprise-wide Bronze → Silver → Gold (Medallion) architecture standards, Databricks ETL frameworks, and orchestration patterns supporting both batch and streaming workloads.
The architect will act as the technical authority for ETL design decisions, performance optimization, schema evolution, and reliability across the data platform.
Key Responsibilities
ETL / ELT Architecture & Standards
Define and govern Medallion Architecture (Bronze, Silver, Gold) standards across the program.
Establish ELT-first design principles using Azure Databricks and Delta Lake.
Design reusable, metadata-driven ETL frameworks supporting multiple ingestion patterns.
Define ingestion strategies for CDC, full loads, and streaming data from Azure Event Hubs and databases.
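To make the "metadata-driven framework" responsibility concrete, here is a minimal, illustrative Python sketch of the dispatch pattern such a framework typically uses: one config record per source drives which ingestion pattern runs. All class, field, and path names are hypothetical, and the sketch deliberately has no Spark dependency.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class IngestionConfig:
    """Hypothetical metadata record describing one source-to-Bronze pipeline."""
    source_name: str
    load_type: str                        # "full", "cdc", or "streaming"
    bronze_path: str
    watermark_column: Optional[str] = None  # cursor column for incremental loads

def plan_ingestion(cfg: IngestionConfig) -> str:
    """Return which ingestion pattern the framework would dispatch to."""
    if cfg.load_type == "full":
        return f"full reload of {cfg.source_name} into {cfg.bronze_path}"
    if cfg.load_type == "cdc":
        if cfg.watermark_column is None:
            raise ValueError("CDC loads need a watermark/cursor column")
        return (f"incremental CDC load of {cfg.source_name} "
                f"keyed on {cfg.watermark_column}")
    if cfg.load_type == "streaming":
        return f"streaming ingestion of {cfg.source_name} (e.g. from Event Hubs)"
    raise ValueError(f"unknown load type: {cfg.load_type}")
```

Onboarding a new source then becomes a metadata change (a new `IngestionConfig` row) rather than a new pipeline, which is the core value of the pattern.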
Databricks & Delta Lake Architecture
Design and implement Databricks Auto Loader for scalable ingestion with schema drift handling.
Define merge and upsert strategies using Delta Lake for Silver and Gold layers.
Establish best practices for:
Schema evolution and validation
Late-arriving data handling
Idempotent processing
Define Delta Lake maintenance strategies (OPTIMIZE, VACUUM, Z-ORDER).
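The merge/upsert, late-arriving-data, and idempotency responsibilities above share one underlying pattern. The sketch below is a plain-Python stand-in for a Delta Lake `MERGE` (not actual Delta code): upsert by business key, keeping the row with the newest event timestamp, so replayed batches and late records never regress the target. Column names are illustrative.

```python
def merge_upsert(target, updates, key="id", ts="event_ts"):
    """In-memory stand-in for a Delta MERGE: upsert rows by business key,
    keeping the version with the newest event timestamp. Re-running the
    same batch is a no-op (idempotent), and a late-arriving record with
    an older timestamp cannot overwrite newer data."""
    merged = {row[key]: row for row in target}
    for row in updates:
        existing = merged.get(row[key])
        if existing is None or row[ts] >= existing[ts]:
            merged[row[key]] = row
    return sorted(merged.values(), key=lambda r: r[key])
```

In Delta Lake itself this maps to a `MERGE INTO ... WHEN MATCHED AND source.event_ts >= target.event_ts THEN UPDATE` condition; the timestamp guard is what makes the pipeline safe to replay.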
Performance & Optimization
Define partitioning strategies based on data volume, access patterns, and downstream usage.
Optimize Spark workloads for joins, aggregations, and large-scale transformations.
Ensure efficient cluster sizing and job configuration for cost and performance balance.
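One way to frame the partitioning responsibility is as an explicit, reviewable heuristic rather than per-table folklore. The sketch below is purely illustrative; the thresholds are assumptions for the example, not Databricks guidance, and real decisions would also weigh access patterns and file sizes.

```python
def choose_partition_grain(rows_per_day: int) -> str:
    """Illustrative heuristic: pick a date-partition grain that avoids the
    small-files problem. Over-partitioning low-volume tables creates many
    tiny files; such tables are often better served by OPTIMIZE/Z-ORDER
    than by physical partitions. Thresholds here are assumed, not canonical."""
    if rows_per_day >= 50_000_000:
        return "date"        # one partition per day
    if rows_per_day >= 1_000_000:
        return "year_month"  # one partition per month
    return "none"            # skip partitioning; rely on clustering/Z-ORDER
```

Codifying the rule this way lets the architect govern it centrally and evolve it as volumes grow, which is the point of the "define partitioning strategies" mandate.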
Orchestration & Workflow Design
Define orchestration approaches using Azure Data Factory and Databricks Workflows.
Design dependency management across Bronze, Silver, and Gold pipelines.
Enable parameterized and reusable pipelines supporting multi-tenant and multi-source ingestion.
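The parameterized-pipeline bullet can be illustrated with a small sketch: one pipeline template expanded into per-source run configurations, which is the same pattern Azure Data Factory pipeline parameters and Databricks Workflows job parameters enable. The template keys and paths below are hypothetical.

```python
def render_pipeline_runs(template: dict, sources: list) -> list:
    """Expand one parameterized pipeline definition into per-source run
    configs. Paths in the template use {source} placeholders, so adding
    a tenant or source system requires no new pipeline definition."""
    return [
        {**template,
         "source": src,
         "bronze_path": template["bronze_path"].format(source=src)}
        for src in sources
    ]
```

Example: a template `{"pipeline": "ingest_bronze", "bronze_path": "/bronze/{source}"}` expanded over `["orders", "customers"]` yields two run configs writing to `/bronze/orders` and `/bronze/customers`.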
Error Handling, Monitoring & Reliability
Define standardized error handling, retry, and recovery mechanisms.
Implement data quality checks and validation at each layer.
Design observability using Azure Monitor and alerting.
Ensure pipeline resilience and operational stability.
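The retry-and-recovery responsibility above usually reduces to a standard pattern: retries with exponential backoff around each pipeline task. A minimal, generic Python sketch (function and parameter names are illustrative, not a specific framework's API):

```python
import time

def run_with_retries(task, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run `task()` with up to `max_attempts` tries, doubling the wait
    between failures (exponential backoff). The `sleep` function is
    injectable so tests and orchestrators can control the waiting.
    Re-raises the last exception once attempts are exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

Note this only makes sense when combined with idempotent tasks (see the merge pattern earlier): retries must be safe to repeat, otherwise recovery itself corrupts data.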
Governance & Downstream Enablement
Align ETL design with Azure security, governance, and lineage standards (Microsoft Purview).
Design Gold-layer data models optimized for Synapse Dedicated SQL Pool, reporting, and analytics.
Support secure data sharing through Azure Data Share and external consumption platforms.
Required Skills & Experience
Experience
8–10 years in data engineering and ETL/ELT development.
4+ years designing and implementing cloud-based data platforms (Azure preferred).
2+ years in an architecture, lead, or technical design role.
Technical Skills
Strong expertise in Azure Databricks architecture and Spark-based ETL.
Deep hands-on experience with Delta Lake (MERGE, schema evolution, ACID guarantees).
Experience with Databricks Auto Loader for streaming and incremental ingestion.
Proven experience designing enterprise-grade ETL frameworks.
Strong knowledge of schema drift handling, CDC patterns, and incremental processing.
Hands-on experience with Azure Data Factory for orchestration.
Expertise in performance tuning and optimization for Databricks and Spark workloads.
Experience with real-time and streaming data pipelines.
Exposure to data migration and legacy system decommissioning programs.
Strong understanding of error handling, retry logic, and fault-tolerant pipeline design.
Cloud & Data Platform
Strong experience with Azure Data Lake Storage Gen2 (ADLS), Event Hubs, Databricks, and Synapse.
Familiarity with Microsoft Purview or equivalent governance tools.
Experience supporting downstream analytics, reporting, and data sharing use cases.
Soft Skills
Strong architectural thinking and decision-making ability.
Ability to define standards and mentor engineering teams.
Excellent communication and documentation skills.
Experience collaborating with platform, security, and analytics stakeholders.
Nice to Have
Knowledge of CI/CD and DevOps practices for Databricks and data pipelines.
Experience working in large enterprise or multi-domain data programs.