Sr. Databricks Lead Engineer - 100% Remote

Overview

Remote
Depends on Experience
Accepts corp to corp applications
Contract - W2
Contract - Independent
Contract - 6 Month(s)
No Travel Required

Skills

Azure
data engineer
Databricks
Dynamics 365
data pipelines
PySpark

Job Details

Position: Sr. Databricks Lead Engineer

Duration: 6+ months

Location: 100% Remote

Job Summary:

We are looking for a highly skilled and motivated Senior Data Engineer to lead the design and development of scalable, enterprise-grade data pipelines and frameworks within the Azure ecosystem. This role requires deep expertise in Databricks (PySpark/Scala), Delta Lake, real-time data streaming, and Azure integration tools. The ideal candidate will have hands-on experience implementing the Medallion architecture, ensuring data quality, lineage, and availability across multiple business systems, including Dynamics 365 CE & F&O.

Key Responsibilities:

1. Data Engineering & Pipeline Development

  • Design, develop, and optimize robust batch and real-time data pipelines using Databricks (PySpark/Scala) and Delta Lake.
  • Ingest structured and semi-structured data from diverse sources: APIs, FTP/SFTP, Dropbox, Event Hub, and Azure Data Factory.
  • Build reusable ingestion and transformation frameworks to support scalable and modular pipeline architectures (a minimal sketch follows this list).
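
For illustration, a minimal PySpark sketch of the kind of reusable batch-ingestion step described above; every path, table name, and format here is a hypothetical placeholder, not a project specific:

```python
# Minimal sketch of a parameterized batch-ingestion step on Databricks.
# Paths, table names, and formats are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # supplied automatically in Databricks notebooks

def ingest_batch(source_path: str, target_table: str, source_format: str = "json") -> None:
    """Load one raw batch drop and append it to a Delta table with audit columns."""
    df = (
        spark.read.format(source_format)
        .option("inferSchema", "true")
        .load(source_path)
        .withColumn("_ingested_at", F.current_timestamp())  # audit: load time
        .withColumn("_source_file", F.input_file_name())    # audit: originating file
    )
    df.write.format("delta").mode("append").saveAsTable(target_table)

# One hypothetical feed landed into the Bronze layer:
ingest_batch("/mnt/raw/crm/accounts/", "bronze.crm_accounts")
```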

2. Medallion Architecture Implementation

  • Architect and manage data flows using Databricks' Medallion architecture (see the sketch after this list):
    • Bronze Layer: raw ingestion from APIs and external systems.
    • Silver Layer: cleaned, enriched, and validated data.
    • Gold Layer: curated datasets for analytics, reporting, and operational use.
  • Ensure end-to-end data quality, lineage, and traceability across all layers using Unity Catalog and Azure Purview.
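
A compact sketch of how these three layers might chain together in PySpark on Delta tables; the table names, keys, and cleaning rules are illustrative assumptions only:

```python
# Sketch of Medallion-style layering with Delta tables.
# Table names, keys, and business rules are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Bronze: raw data exactly as ingested (see the ingestion sketch above).
bronze = spark.table("bronze.crm_accounts")

# Silver: deduplicated, validated, lightly standardized records.
silver = (
    bronze.dropDuplicates(["account_id"])
    .filter(F.col("account_id").isNotNull())
    .withColumn("email", F.lower(F.trim(F.col("email"))))
)
silver.write.format("delta").mode("overwrite").saveAsTable("silver.crm_accounts")

# Gold: a curated aggregate ready for analytics and reporting.
gold = silver.groupBy("region").agg(F.count("account_id").alias("account_count"))
gold.write.format("delta").mode("overwrite").saveAsTable("gold.accounts_by_region")
```
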
3. Streaming & Real-Time Enablement

  • Design and implement real-time data ingestion and processing pipelines using Structured Streaming, Event Hub, and Delta Live Tables (sketched below).
  • Enable near real-time data availability to Power BI, Dynamics 365 CE & F&O, and other downstream platforms.
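
One hedged example of what the streaming side can look like: the sketch below reads an Event Hub through its Kafka-compatible endpoint with Structured Streaming and lands events in a Bronze Delta table. The namespace, hub, secret scope, and checkpoint path are placeholders, and a Delta Live Tables pipeline would be an equally valid shape for the same flow:

```python
# Sketch: Structured Streaming from Azure Event Hub (Kafka-compatible
# endpoint) into a Bronze Delta table. Namespace, hub name, secret
# scope, and checkpoint path are placeholder assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

EH_NAMESPACE = "example-ns"                       # hypothetical namespace
EH_NAME = "orders"                                # hypothetical event hub
EH_CONN = dbutils.secrets.get("kv", "eh-conn")    # dbutils is provided by the Databricks runtime

raw_stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", f"{EH_NAMESPACE}.servicebus.windows.net:9093")
    .option("subscribe", EH_NAME)
    .option("kafka.security.protocol", "SASL_SSL")
    .option("kafka.sasl.mechanism", "PLAIN")
    .option(
        "kafka.sasl.jaas.config",
        "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule "
        f'required username="$ConnectionString" password="{EH_CONN}";',
    )
    .load()
)

# Continuously append decoded events to the Bronze layer.
query = (
    raw_stream.selectExpr("CAST(value AS STRING) AS payload", "timestamp")
    .writeStream.format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/orders_bronze")
    .toTable("bronze.orders_events")
)
```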

4. Collaboration & System Integration

  • Collaborate with data architects, API teams, and integration engineers to align pipelines with business logic and data models.
  • Support integrations with third-party and internal systems such as Shelter LOBs, PetProPortal, Vetco, and SFCC.

5. Monitoring, Optimization & Governance

  • Monitor pipeline performance and implement best practices for cost-efficiency, scalability, and fault tolerance.
  • Establish observability, logging, and alerting using Azure Monitor and Databricks-native tools.
  • Ensure adherence to data privacy, security, and governance policies through Unity Catalog, Azure Purview, and role-based access controls (illustrated below).
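
To make the access-control point concrete, a small sketch of Unity Catalog grants issued from a notebook; the catalog, schema, table, and group names are hypothetical:

```python
# Sketch: role-based access control via Unity Catalog GRANTs.
# Catalog, schema, table, and group names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Analysts: read-only access to curated Gold data, nothing upstream.
spark.sql("GRANT USE CATALOG ON CATALOG main TO `analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA main.gold TO `analysts`")
spark.sql("GRANT SELECT ON TABLE main.gold.accounts_by_region TO `analysts`")

# Data engineers: full control over the Silver and Gold schemas they maintain.
spark.sql("GRANT ALL PRIVILEGES ON SCHEMA main.silver TO `data_engineers`")
spark.sql("GRANT ALL PRIVILEGES ON SCHEMA main.gold TO `data_engineers`")
```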

Required Skills & Experience:

  • 8+ years of experience in data engineering and distributed data processing.
  • 3+ years of hands-on development experience with Databricks, PySpark, and Delta Lake.
  • Deep knowledge of:
    • Structured Streaming, Azure Event Hub, and Delta Live Tables
    • Azure Data Factory and Logic Apps
    • Lakehouse design patterns, especially the Medallion architecture
  • Strong understanding of data quality frameworks, CI/CD for data pipelines, and metadata-driven design.
  • Experience working with Dynamics 365 data structures and enterprise-scale system integrations.
Preferred Qualifications:

  • Databricks Certified Data Engineer Associate/Professional
  • Microsoft Certified: Azure Data Engineer Associate
  • Experience with Unity Catalog, Azure Purview, and data governance best practices.
  • Familiarity with DevOps practices, including deployment pipelines using Azure DevOps or similar tools.

Expected Outcomes:

  • Implementation of scalable and reliable batch and streaming data pipelines to support enterprise-wide analytics and reporting needs.
  • Delivery of high-quality, curated datasets aligned with business domains and operational KPIs.
  • Enablement of real-time data consumption across enterprise applications and reporting systems.
  • Establishment of a governed, observable, and efficient data infrastructure using best-in-class tools and practices.

Best Regards,

Chetna

