Overview
Remote
Depends on Experience
Contract - Independent
Contract - 6 Month(s)
No Travel Required
Skills
Data Architecture
Data Engineering
Data Governance
Data Lake
Data Modeling
Databricks
DevOps
Documentation
Extract, Transform, Load (ETL)
Machine Learning (ML)
Microsoft Azure
Microsoft Power BI
NoSQL
Python
Relational Databases
SQL
Tableau
Medallion Architecture
BI
Azure Data Lake Storage (ADLS Gen2)
Azure Key Vault
Azure Data Factory
Job Details
Our client is unable to hire H-1B candidates at this time.
Client is a small, growing consulting company focused on AI and data solutions. They are seeking a Data Engineer with extensive Databricks and Azure experience.
As a Data Engineer:
- You will be a critical player in building the foundational data infrastructure for a leading firm's data and AI strategy.
- Working primarily with Databricks on the Azure platform, you will design, develop, and maintain robust data pipelines, ingesting diverse data sources and transforming them into actionable insights.
- You will collaborate closely with the product team and other stakeholders to construct a data lakehouse that will power integrations, advanced analytics and future AI-driven workflows, all while handling sensitive client data with the utmost care and responsibility.
Responsibilities:
- Partner closely with the Product Manager, Product Designer, and client stakeholders to understand data requirements and translate them into effective data solutions within Azure and Databricks.
- Design, build, and maintain scalable and reliable data pipelines in Databricks to ingest data from a variety of source types (e.g., business workflow systems, accounting systems, CRM, databases, APIs, flat files).
- Implement and manage a medallion architecture (Bronze, Silver, Gold layers) within Databricks, transforming raw data into curated, business-ready datasets tailored for specific use cases defined by the product team.
- Develop gold layer tables and views optimized for analytics, ensuring they meet the requirements for dashboards and reports, particularly for consumption via Power BI.
- Configure and optimize Databricks to connect seamlessly with BI tools like Tableau and Power BI, enabling self-service analytics for the customer.
- Work with potentially sensitive client data, implementing and adhering to strict data security, privacy, and governance protocols.
- Leverage your skills in Databricks, including familiarity with or a strong willingness to quickly learn features like MLflow, Delta Lake, and Unity Catalog.
- Apply DevOps best practices to data pipeline development, including automation, monitoring, and CI/CD where applicable.
- Collaborate on the design and optimization of data models, ensuring they align with business needs, performance requirements, and future scalability.
- Implement robust automated testing procedures to validate data pipelines, ensure data quality, and maintain the accuracy of transformed data.
- Create and maintain comprehensive documentation for data pipelines, data models, architectural decisions, and operational procedures.
- Establish monitoring and alerting solutions to proactively identify and resolve issues in data pipelines, ensuring data availability and reliability.
- Communicate effectively with both technical and non-technical stakeholders, clearly explaining data engineering concepts, design choices, and progress.
- Contribute to a collaborative environment within a large, cross-functional consulting team.
Requirements:
- Proven experience as a Data Engineer, with a strong focus on designing and implementing solutions on the Databricks platform.
- Hands-on expertise in building and maintaining scalable Python data pipelines within Azure and Databricks.
- Demonstrable experience implementing a medallion data architecture (Bronze, Silver, Gold layers) to support analytics and AI use cases.
- Proficiency in ingesting data from diverse source types (e.g., APIs, relational databases, NoSQL databases, flat files, streaming sources).
- Experience connecting BI tools such as Power BI and Tableau, and optimizing the underlying datasets for maintainability and performance.
- Strong SQL skills and proficiency in data modeling techniques.
- Experience with Azure cloud services, particularly Azure Data Lake Storage (ADLS Gen2), Azure Key Vault, Azure Data Factory, or other Azure data services.
- Familiarity with MLflow for managing the machine learning lifecycle is a strong plus; curiosity and the ability to quickly learn new Databricks features are essential.
- Understanding of DevOps principles and experience with tools for CI/CD, version control (e.g., Git), and infrastructure automation is advantageous.
- Experience working with sensitive data and a strong understanding of data security, data governance, and privacy-preserving techniques.
- Excellent problem-solving skills and the ability to troubleshoot complex data issues.
- Strong communication skills, with the ability to articulate technical details and decisions to product managers, client stakeholders, and other engineers.
- Ability to work effectively within a large, cross-functional consulting team in a dynamic, client-facing environment.
- A mindset that leans into data science concepts (understanding data needs for ML) or advanced DevOps practices.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.