Title: Senior Data Engineer – Azure Platform
Location: Remote / Hybrid (flexible)
Contract: 6-month contract-to-hire (C2H)
Position Overview
We are seeking a Senior Data Engineer to lead the design, build-out, and operationalization of our Azure-based data platform. This is a foundational role responsible for architecting a scalable data lake, establishing CI/CD pipelines for data workloads, and standing up the core platform infrastructure that the broader data and analytics organization will build upon. The ideal candidate brings deep hands-on experience across the Azure data stack, a strong DevOps and platform engineering mindset, and the ability to set standards that will govern how the organization ingests, transforms, stores, and serves data at scale.
Key Details
Department: Data Engineering / Platform Engineering
Reports To: Director of Data Engineering or VP of Data & Analytics
Experience Level: Senior (5–10 years)
Core Responsibilities
Azure Data Lake Architecture & Build-Out
• Design and implement a multi-layer Azure Data Lake Storage (ADLS Gen2) architecture following medallion (Bronze / Silver / Gold) or equivalent lakehouse patterns.
• Define folder structures, partitioning strategies, file format standards (Parquet, Delta), and lifecycle management policies across raw, curated, and consumption zones.
• Architect and deploy Azure Data Factory (ADF) and/or Azure Synapse pipelines for large-scale data ingestion from diverse sources including APIs, relational databases, flat files, streaming sources, and SaaS platforms.
• Implement Delta Lake or Lakehouse patterns on top of ADLS Gen2 to enable ACID transactions, schema enforcement, time travel, and efficient upsert/merge operations.
• Design and manage Azure Synapse Analytics or Databricks workspaces for large-scale transformation, aggregation, and data serving workloads.
• Establish data lake security architecture including RBAC, ACLs, managed identities, private endpoints, and encryption at rest and in transit.
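To make the zone and partitioning responsibilities above concrete, the sketch below shows one way a path convention for a medallion-style lake might be codified. The container name, storage account, and date-based folder layout are illustrative assumptions, not a convention mandated by this posting.

```python
# Illustrative sketch only: a helper that standardizes medallion-zone paths
# in ADLS Gen2. Zone names follow the Bronze/Silver/Gold pattern; the
# container ("datalake") and account ("mystorageacct") are placeholders.
from datetime import date

ZONES = ("bronze", "silver", "gold")

def lake_path(zone: str, source: str, dataset: str, load_date: date) -> str:
    """Build an abfss:// path with date-based partition folders.

    Partitioning raw data by ingestion date keeps the Bronze zone
    append-only and lets lifecycle (retention) policies be scoped
    cleanly per folder.
    """
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone!r}")
    return (
        f"abfss://datalake@mystorageacct.dfs.core.windows.net/"
        f"{zone}/{source}/{dataset}/"
        f"year={load_date.year}/month={load_date.month:02d}/day={load_date.day:02d}"
    )
```

Encoding the convention in a shared helper (or Terraform module) is one way the role's "standards that govern how the organization ingests and stores data" can be enforced rather than merely documented.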
CI/CD & DevOps for Data
• Design and implement end-to-end CI/CD pipelines for data engineering workloads using Azure DevOps or GitHub Actions, covering ADF pipelines, Databricks notebooks, SQL scripts, and infrastructure-as-code deployments.
• Establish branching strategies, pull request workflows, automated testing (unit, integration, data quality), and promotion gates across development, staging, and production environments.
• Implement infrastructure-as-code (IaC) using Terraform, Bicep, or ARM templates to provision and manage all Azure data platform resources in a repeatable, version-controlled manner.
• Build automated deployment pipelines for Azure Data Factory ARM exports, Databricks workspace configurations, Synapse artifacts, and Azure SQL database schema migrations.
• Integrate data quality validation steps into CI/CD workflows, ensuring that pipeline deployments include automated checks for schema conformance, row counts, null thresholds, and referential integrity.
• Establish monitoring, alerting, and observability for deployed pipelines using Azure Monitor, Log Analytics, and custom dashboards for pipeline run health, latency, and failure rates.
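As a sketch of the data-quality gates described above, the function below shows the shape of a minimal check that could run as a CI step before a deployment is promoted. The column names, thresholds, and return shape are hypothetical; a real implementation would typically run against Spark DataFrames or a dedicated framework rather than plain Python lists.

```python
# Illustrative sketch only: a minimal pre-promotion data-quality gate
# covering three of the checks named in this posting: schema conformance,
# row counts, and null thresholds. Rows are modeled as plain dicts.
def quality_gate(rows, expected_columns, min_rows=1, max_null_ratio=0.1):
    """Return (passed, failures) where failures is a list of messages."""
    failures = []
    # Row-count check: catch empty or suspiciously small loads.
    if len(rows) < min_rows:
        failures.append(f"row count {len(rows)} below minimum {min_rows}")
    # Schema check: every expected column must be present.
    for row in rows:
        missing = set(expected_columns) - set(row)
        if missing:
            failures.append(f"schema: missing columns {sorted(missing)}")
            break
    # Null-threshold check: flag columns whose null ratio exceeds the limit.
    if rows:
        for col in expected_columns:
            nulls = sum(1 for row in rows if row.get(col) is None)
            ratio = nulls / len(rows)
            if ratio > max_null_ratio:
                failures.append(
                    f"null ratio {ratio:.0%} in '{col}' exceeds {max_null_ratio:.0%}"
                )
    return (not failures, failures)
```

Wired into an Azure DevOps or GitHub Actions job, a non-empty failure list would fail the stage and block promotion to the next environment.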
Platform Engineering & Governance
• Stand up the foundational data platform from the ground up, including resource group design, networking (VNets, private endpoints, service endpoints), and environment isolation (dev / staging / prod).
• Define and enforce data governance standards across the platform, including metadata management (Microsoft Purview, formerly Azure Purview), data cataloging, lineage tracking, and classification policies.
• Establish naming conventions, tagging standards, cost management practices, and resource governance policies for all Azure data resources.
• Create reusable frameworks, templates, and accelerators (e.g., parameterized ADF pipelines, shared Databricks libraries, Terraform modules) that enable other engineers to onboard new data sources and workloads efficiently.
• Define and document platform standards for data modeling, ETL/ELT patterns, error handling, logging, retry logic, and idempotency to ensure consistency across the engineering team.
• Lead capacity planning and performance tuning across the platform, including Synapse DWU scaling, Databricks cluster policies, storage tiering, and cost optimization.
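The naming, tagging, and resource-governance standards above can themselves be checked in CI. The sketch below shows one hypothetical policy check; the name pattern and required tag set are invented examples of the kind of conventions this role would define, not an established standard.

```python
# Illustrative sketch only: a naming-and-tagging policy check of the kind a
# platform team might run against IaC plans in CI. Both the pattern
# (<service>-<workload>-<env>-<region>) and the required tags are
# hypothetical conventions.
import re

NAME_PATTERN = re.compile(r"^[a-z]+-[a-z0-9]+-(dev|stg|prod)-[a-z0-9]+$")
REQUIRED_TAGS = {"owner", "cost-center", "environment"}

def check_resource(name: str, tags: dict) -> list:
    """Return a list of policy violations; an empty list means compliant."""
    violations = []
    if not NAME_PATTERN.match(name):
        violations.append(
            f"name '{name}' does not match <service>-<workload>-<env>-<region>"
        )
    missing = REQUIRED_TAGS - set(tags)
    if missing:
        violations.append(f"missing required tags: {sorted(missing)}")
    return violations
```

In practice this logic often lives in Azure Policy or a Terraform validation step instead of custom code; the point is that conventions are enforceable, not advisory.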
Collaboration & Leadership
• Partner with data analysts, data scientists, BI developers, and business stakeholders to understand data consumption patterns and design platform capabilities that serve downstream use cases.
• Mentor and provide technical guidance to mid-level and junior data engineers, conducting code reviews and establishing engineering best practices.
• Collaborate with cloud infrastructure, security, and networking teams to ensure the data platform meets enterprise architecture, security, and compliance requirements.
• Present platform architecture, roadmap updates, and technical decisions to leadership and cross-functional stakeholders.
Required Qualifications
• Bachelor’s degree in Computer Science, Data Engineering, Information Systems, or a related field; or equivalent professional experience.
• 5+ years of hands-on data engineering experience, with at least 3 years working extensively within the Azure data ecosystem.
• Deep expertise with Azure Data Lake Storage Gen2, Azure Data Factory, Azure Synapse Analytics, and/or Azure Databricks.
• Proven experience designing and implementing CI/CD pipelines for data workloads using Azure DevOps or GitHub Actions, including automated testing and multi-environment promotion.
• Strong proficiency with infrastructure-as-code tools (Terraform, Bicep, or ARM templates) for provisioning Azure data resources.
• Expert-level SQL skills including performance tuning, indexing strategies, complex transformations, and schema design.
• Strong Python or PySpark proficiency for data transformation, pipeline development, and automation.
• Hands-on experience implementing Delta Lake or Lakehouse architecture patterns with ACID transaction support.
• Working knowledge of data governance and cataloging tools (Microsoft Purview, Unity Catalog, or equivalent).
• Experience with Git-based version control, branching strategies, and collaborative development workflows.
• Strong understanding of networking and security concepts in Azure (VNets, private endpoints, managed identities, RBAC, ACLs).
• Excellent communication skills with the ability to translate complex technical architecture into clear, stakeholder-friendly language.
Preferred Qualifications
• Azure certifications such as DP-203 (Data Engineer Associate), AZ-400 (DevOps Engineer Expert), or AZ-305 (Solutions Architect Expert).
• Experience standing up a data platform from scratch in a greenfield or early-stage environment.
• Familiarity with streaming and real-time data patterns using Azure Event Hubs, Kafka, or Structured Streaming.
• Experience with dbt for transformation orchestration within a lakehouse or warehouse environment.
• Knowledge of cost optimization strategies across Azure data services (reserved capacity, auto-scaling, storage tiering).
• Experience with data mesh or domain-oriented data platform architectures.
• Familiarity with containerized workloads (Docker, AKS) for data pipeline deployment.
• Background in regulated industries with exposure to SOX, HIPAA, GDPR, or equivalent compliance frameworks.
Technology Stack
The following represents the core and extended technology landscape for this role:
Storage & Lake: Azure Data Lake Storage Gen2, Delta Lake, Parquet, Azure Blob Storage
Compute & Processing: Azure Synapse Analytics, Azure Databricks, Apache Spark, PySpark
Orchestration: Azure Data Factory, Airflow (optional), dbt (preferred)
CI/CD & DevOps: Azure DevOps, GitHub Actions, Terraform, Bicep, ARM Templates
Languages: Python, SQL, PySpark, Bash/PowerShell
Governance & Catalog: Microsoft Purview, Unity Catalog, Azure Policy
Monitoring: Azure Monitor, Log Analytics, Application Insights
Security: Azure AD (Microsoft Entra ID), Managed Identities, Key Vault, Private Endpoints, RBAC/ACLs
BI & Consumption: Power BI, Azure SQL Database, Serverless SQL Pools