Job Details
Key Responsibilities:
Solution Design & Architecture: Design end-to-end data architecture on Azure, including ingestion, storage, processing, and analytics solutions using Databricks. This often involves implementing the medallion architecture (Bronze, Silver, Gold layers) and the Lakehouse pattern. Design & build FDPs.
Data Pipeline Development: Build and optimize robust ETL/ELT data pipelines using Azure Databricks, Azure Data Factory (ADF), PySpark, Scala, Python, and SQL.
Integration: Seamlessly integrate Databricks with other Azure services such as Azure Data Lake Storage Gen2 (ADLS Gen2), Azure Synapse Analytics, Azure SQL Database, Azure Event Hubs, and Power BI.
Governance & Security: Define and implement best practices for data quality, governance (using tools like Unity Catalog and Azure Purview), security, privacy, and compliance within the cloud data solutions.
Collaboration & Leadership: Work closely with stakeholders, data engineers, data scientists, and business analysts to translate requirements into technical solutions. Provide technical guidance and mentorship to development teams.
Performance Optimization: Monitor and tune performance for large-scale data processing and analytics workloads.
Platform Management: Deploy, configure, and manage Databricks environments and clusters, ensuring performance, cost-efficiency, and adherence to security and governance standards.
Technical Expertise:
Deep expertise in the Databricks platform, including Delta Lake, Unity Catalog, Spark optimization, and notebook development.
Proficiency in programming languages like Python, SQL, and Scala/PySpark.
Strong knowledge of Azure data services (ADF, ADLS Gen2, Azure Synapse, Azure SQL).