Data Engineer (Databricks & Azure)

Overview

Remote
Depends on Experience
Contract - Independent
Contract - 3 Month(s)
No Travel Required

Skills

Databricks
DevOps
Python
Azure
Data Architecture
SQL
Data modeling
Azure Cloud
Azure Data Lake Storage (ADLS Gen2)

Job Details

Our client is seeking a Data Engineer for a 3-6 month contract (extendable). This is a remote position. Due to the nature of the work, the client is unable to work with H-1B candidates.

As a Data Engineer for this key client engagement:

You will be a critical player in building the foundational data infrastructure for a leading firm's data and AI strategy. Working primarily with Databricks on the Azure platform, you will design, develop, and maintain robust data pipelines, ingesting data from diverse sources and transforming it into actionable insights. You will collaborate closely with the product team and other stakeholders to construct a data lakehouse that will power integrations, advanced analytics, and future AI-driven workflows, all while handling sensitive client data with the utmost care and responsibility.

On any given day, we hope that you will:

  • Partner closely with the Product Manager, Product Designer, and client stakeholders to understand data requirements and translate them into effective data solutions within Azure and Databricks.
  • Design, build, and maintain scalable and reliable data pipelines in Databricks to ingest data from a variety of source types (e.g., business workflow systems, accounting systems, CRM, databases, APIs, flat files).
  • Implement and manage a medallion architecture (Bronze, Silver, Gold layers) within Databricks, transforming raw data into curated, business-ready datasets tailored for specific use cases defined by the product team (see the sketch after this list).
  • Develop gold layer tables and views optimized for analytics, ensuring they meet the requirements for dashboards and reports, particularly for consumption via Power BI.
  • Configure and optimize Databricks to connect seamlessly with BI tools like Tableau and Power BI, enabling self-service analytics for the customer.
  • Work with potentially sensitive client data, implementing and adhering to strict data security, privacy, and governance protocols.
  • Leverage your skills in Databricks, including familiarity with or a strong willingness to quickly learn features like MLflow, Delta Lake, and Unity Catalog.
  • Apply DevOps best practices to data pipeline development, including automation, monitoring, and CI/CD where applicable.
  • Collaborate on the design and optimization of data models, ensuring they align with business needs, performance requirements, and future scalability.
  • Implement robust automated testing procedures to validate data pipelines, ensure data quality, and maintain the accuracy of transformed data.
  • Create and maintain comprehensive documentation for data pipelines, data models, architectural decisions, and operational procedures.
  • Establish monitoring and alerting solutions to proactively identify and resolve issues in data pipelines, ensuring data availability and reliability.
  • Communicate effectively with both technical and non-technical stakeholders, clearly explaining data engineering concepts, design choices, and progress.
  • Contribute to a collaborative environment within a large, cross-functional consulting team.
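
For context, the medallion flow described above is often expressed as a short PySpark job. The following is a minimal sketch only, assuming a Databricks cluster where the `spark` session is predefined and Delta Lake is the default table format; the ADLS Gen2 path, catalog, and table and column names are all hypothetical.

    from pyspark.sql import functions as F

    # `spark` is provided by the Databricks runtime; no SparkSession setup needed there.

    # Bronze: land raw source exports as-is, stamping each batch for auditability.
    bronze = (
        spark.read.format("json")
        .load("abfss://raw@contosodl.dfs.core.windows.net/crm/")  # hypothetical ADLS Gen2 path
        .withColumn("_ingested_at", F.current_timestamp())
    )
    bronze.write.mode("append").saveAsTable("lakehouse.bronze.crm_accounts")

    # Silver: cleanse and conform - deduplicate, fix types, drop records failing basic checks.
    silver = (
        spark.table("lakehouse.bronze.crm_accounts")
        .dropDuplicates(["account_id"])
        .withColumn("created_date", F.to_date("created_date"))
        .filter(F.col("account_id").isNotNull())
    )
    silver.write.mode("overwrite").saveAsTable("lakehouse.silver.crm_accounts")

    # Gold: business-ready aggregate shaped for Power BI consumption.
    gold = (
        spark.table("lakehouse.silver.crm_accounts")
        .groupBy("region")
        .agg(F.countDistinct("account_id").alias("active_accounts"))
    )
    gold.write.mode("overwrite").saveAsTable("lakehouse.gold.accounts_by_region")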

To succeed, you will need some combination of the following:

  • Proven experience as a Data Engineer, with a strong focus on designing and implementing solutions on the Databricks platform.
  • Hands-on expertise in building and maintaining scalable Python data pipelines within Azure and Databricks.
  • Demonstrable experience implementing a medallion data architecture (Bronze, Silver, Gold layers) to support analytics and AI use cases.
  • Proficiency in ingesting data from diverse source types (e.g., APIs, relational databases, NoSQL databases, flat files, streaming sources).
  • Experience connecting BI tools such as Power BI and Tableau, and optimizing the datasets behind them for maintainability and performance.
  • Strong SQL skills and proficiency in data modeling techniques.
  • Experience with Azure cloud services, particularly Azure Data Lake Storage (ADLS Gen2), Azure Key Vault, Azure Data Factory, or other Azure data services.
  • Familiarity with MLflow for managing the machine learning lifecycle is a strong plus (see the brief sketch after this list); curiosity and the ability to quickly learn new Databricks features are essential.
  • Understanding of DevOps principles and experience with tools for CI/CD, version control (e.g., Git), and infrastructure automation is advantageous.
  • Experience working with sensitive data and a strong understanding of data security, data governance, and privacy-preserving techniques.
  • Excellent problem-solving skills and the ability to troubleshoot complex data issues.
  • Strong communication skills, with the ability to articulate technical details and decisions to product managers, client stakeholders, and other engineers.
  • Ability to work effectively within a large, cross-functional consulting team in a dynamic, client-facing environment.
  • A mindset that leans into data science concepts (understanding the data needs of ML workloads) or advanced DevOps practices.
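
As a small illustration of the MLflow familiarity mentioned above, the sketch below logs pipeline-level parameters and data quality metrics to an MLflow run. On Databricks the tracking server is preconfigured; the run name, table name, and metric values here are hypothetical.

    import mlflow

    # On Databricks, MLflow tracking works out of the box; elsewhere, call
    # mlflow.set_tracking_uri() to point at a tracking server first.
    with mlflow.start_run(run_name="crm_pipeline_quality"):  # hypothetical run name
        mlflow.log_param("source_table", "lakehouse.bronze.crm_accounts")
        mlflow.log_metric("rows_ingested", 12345)   # illustrative values only
        mlflow.log_metric("null_account_ids", 0)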

About Will-N-Able