Job Details
Position: Azure Data Lead Engineer
Location: Remote, USA
We are seeking an Azure Data Lead Engineer to architect and implement scalable ETL and data storage solutions using Microsoft Fabric and the broader Azure technology stack. This role will be pivotal in building a metadata-driven data lake that ingests data from more than 100 structured and semi-structured sources, enabling rich insights through canned reports, conversational agents, and analytics dashboards.
Key Responsibilities
Design and implement ETL pipelines using Microsoft Fabric (Dataflows, Pipelines, Lakehouse, Warehouse, SQL) and Azure Data Factory.
Build and maintain a metadata-driven Lakehouse architecture with threaded datasets to support multiple consumption patterns.
Develop agent-specific data lakes and an orchestration layer for an overarching "uber-agent" that can query across agents to answer customer questions.
Enable interactive data consumption via Power BI, Azure OpenAI, and other analytics tools.
Ensure data quality, lineage, and governance across all ingestion and transformation processes.
Collaborate with product teams to understand data needs and deliver scalable solutions.
Optimize performance and cost across storage and compute layers.
Required Qualifications
5+ years of experience in data engineering with a focus on Microsoft Azure and Fabric technologies.
Strong expertise in:
Microsoft Fabric (Lakehouse, Dataflows Gen2, Pipelines, Notebooks)
Azure Data Factory, Azure SQL, Azure Data Lake Storage Gen2
Power BI and/or other visualization tools
Azure Functions, Logic Apps, and orchestration frameworks
SQL, Python, and PySpark/Scala
Experience working with structured and semi-structured data (JSON, XML, CSV, Parquet).
Proven ability to build metadata-driven architectures and reusable components.
Familiarity with agent-based architectures and conversational AI integration is a plus.
Strong understanding of data modeling, data governance, and security best practices.
Preferred Qualifications
Experience with Azure OpenAI, Copilot Studio, or similar conversational platforms.
Knowledge of CI/CD pipelines and DevOps practices for data engineering.
Experience building multi-tenant data platforms or domain-specific data lakes.