About DataAffect: We are a boutique data and service management firm specializing in Data Governance, Enterprise Data Strategy, Solutions Architecture, Data Warehousing, Data Integrations, Data Security & Privacy Management, IT Service Management, Business Analysis, and Agile Project Management services for diverse clients across multiple industries.
Role: Azure Databricks Lead
Location: Remote
Duration: 6+ months
Job Summary
We are looking for a results-driven Lead Data Engineer (Contractor) to architect, develop, and guide the implementation of modern data pipelines on Databricks, PySpark, and Azure cloud platforms. This role also requires strong hands-on experience in Databricks architecture and administration.
Required Skills & Qualifications
- 7+ years of experience in data engineering, with a strong background in cloud-native data architecture.
- Deep hands-on experience with Databricks architecture, workspace administration, and cluster management.
- Experience defining and managing cluster policies, pools, and autoscaling strategies.
- Strong knowledge of Spark performance tuning and job optimization.
- Proven expertise in Databricks SQL, PySpark, Delta Lake, and large-scale data pipelines.
- Skilled in building reusable Python libraries with Pandas, openpyxl, XlsxWriter, and PySpark.
- Practical experience working with Unity Catalog for security and governance.
- Strong experience collaborating with front-end and back-end development teams on backend integration.
- Strong SQL expertise and hands-on experience with PostgreSQL, SQL Server, or similar.
- DevOps expertise with tools like Azure DevOps, Git, and pipeline automation.
- Excellent communication skills with the ability to lead discussions with cross-functional teams and stakeholders.
Tools & Technologies
- Cloud Platforms: Azure (preferred), AWS
- Big Data & Analytics: Databricks, PySpark, Delta Lake, Databricks SQL
- Programming & Frameworks: Python, Pandas, PySpark, Flask
- Visualization & BI: Tableau
- App Integration: React, .NET, REST APIs
- DevOps & CI/CD: Azure DevOps, Git
- Databases: Databricks SQL, Azure SQL DB, or similar