Lead/Architect Data Engineer

Overview

Work arrangement: Remote / Hybrid
Compensation: Based on experience
Employment type: Contract - W2; Contract - Independent

Skills

DATABRICKS
SQL
PYSPARK
PYTHON
AZURE
ETL
ELT
ARCHITECTURE
ADMINISTRATION
PERFORMANCE
DATA PIPELINES
DELTA LAKE
AZURE SQL
AZURE DEVOPS

Job Details

APN Consulting, Inc. is a progressive IT staffing and services company offering innovative business solutions to improve client business outcomes. We focus on high-impact technology solutions in ServiceNow, Fullstack, Cloud & Data, and AI/ML. Due to our globally expanding service offerings, we are seeking top talent to join our teams and grow with us.

Role: Lead/Architect Data Engineer
Location: Remote (US-based; anywhere in the US)
Duration: Contract

Job Summary

We are looking for a results-driven Lead Data Engineer (Contractor) to architect, develop, and guide the implementation of modern data pipelines and cloud-native analytics solutions. The ideal candidate will lead end-to-end delivery across engineering, analytics, and product teams, bringing deep experience with Databricks, PySpark, and the Azure cloud platform. This role also requires strong hands-on experience in Databricks architecture, administration, and performance optimization.

Key Responsibilities

  1. Lead the architecture, design, and development of scalable ETL/ELT pipelines using Databricks, PySpark, and SQL across distributed data environments (see the brief sketch after this list).
  2. Architect and manage Databricks workspaces, including provisioning and maintenance of clusters, cluster policies, and job compute environments in accordance with enterprise standards.
  3. Collaborate with platform and infrastructure teams to define Databricks architecture strategy and ensure secure, scalable, and cost-effective implementation.
  4. Define and enforce cluster policies to ensure proper resource utilization, cost control, and access control based on workload patterns and team requirements.
  5. Lead performance tuning of Spark jobs, Databricks SQL queries, and notebooks, ensuring optimal execution and minimizing latency.
  6. Build modular, reusable Python libraries using Pandas, NumPy, and PySpark for scalable data processing.
  7. Develop optimized Databricks SQL queries and views to power:
    1. Tableau dashboards
    2. React and .NET-based applications
    3. Ad-hoc and real-time analytics use cases
  8. Work closely with frontend and backend development teams to deliver use-case-specific, query-optimized datasets.
  9. Leverage Unity Catalog for fine-grained access control, data lineage, and metadata governance.
  10. Drive DevOps best practices using Azure DevOps, Terraform, and CI/CD automation pipelines.
  11. Mentor junior engineers and perform architectural reviews to ensure consistency and alignment with best practices.
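
For illustration only, the snippet below sketches the kind of PySpark/Delta Lake pipeline work described in items 1 and 7. The table names, columns, and aggregation logic are hypothetical examples, not part of this job description.

  # Illustrative sketch only -- table names, columns, and logic are hypothetical.
  from pyspark.sql import SparkSession, functions as F

  spark = SparkSession.builder.getOrCreate()  # provided automatically in a Databricks notebook

  def build_orders_daily(spark):
      """Aggregate a raw Delta table into a curated daily summary."""
      orders = spark.read.table("raw.orders")            # hypothetical source table
      return (
          orders
          .withColumn("order_date", F.to_date("order_ts"))
          .groupBy("order_date", "region")
          .agg(F.count("*").alias("order_count"),
               F.sum("order_amount").alias("total_amount"))
      )

  (build_orders_daily(spark)
      .write.format("delta")
      .mode("overwrite")
      .saveAsTable("curated.orders_daily"))              # hypothetical target table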

Required Skills & Qualifications

  1. 7+ years of experience in data engineering, with a strong background in cloud-native data architecture.
  2. Deep hands-on experience with Databricks architecture, workspace administration, and cluster management.
  3. Experience defining and managing cluster policies, pools, and autoscaling strategies.
  4. Strong knowledge of Spark performance tuning and job optimization.
  5. Proven expertise in Databricks SQL, PySpark, Delta Lake, and large-scale data pipelines.
  6. Skilled in building reusable Python libraries with Pandas, openpyxl, XlsxWriter, and PySpark (illustrated briefly after this list).
  7. Practical experience working with Unity Catalog for security and governance.
  8. Strong collaboration experience with front-end and back-end development teams, including back-end integration work.
  9. Strong SQL expertise and hands-on experience with PostgreSQL, SQL Server, or similar.
  10. DevOps expertise with tools like Azure DevOps, Git, and pipeline automation.
  11. Excellent communication skills with the ability to lead discussions with cross-functional teams and stakeholders.
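
As a minimal illustration of item 6 only, here is a sketch of a reusable export helper built on Pandas with the openpyxl engine; the function name and sample data are hypothetical.

  # Illustrative sketch only -- the helper name and sample data are hypothetical.
  import pandas as pd

  def export_summary_to_excel(df: pd.DataFrame, path: str, sheet_name: str = "Summary") -> None:
      """Write a DataFrame to an .xlsx workbook using the openpyxl engine."""
      with pd.ExcelWriter(path, engine="openpyxl") as writer:
          df.to_excel(writer, sheet_name=sheet_name, index=False)

  # Example usage:
  # export_summary_to_excel(pd.DataFrame({"region": ["east"], "total": [42]}), "summary.xlsx")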

Tools & Technologies

  1. Cloud Platforms: Azure (preferred), AWS
  2. Big Data & Analytics: Databricks, PySpark, Delta Lake, Databricks SQL, Spark Connect, Delta Live Tables
  3. Programming & Frameworks: Python, Pandas, PySpark, Flask
  4. Visualization & BI: Tableau
  5. App Integration: React, .NET, REST APIs
  6. DevOps & CI/CD: Azure DevOps, Git
  7. Databases: Databricks SQL, Azure SQL DB, or similar

We are committed to fostering a diverse, inclusive, and equitable workplace where individuals from all backgrounds feel valued and empowered to contribute their unique perspectives. We strongly encourage applications from candidates of all genders, races, ethnicities, abilities, and experiences to join our team and help us build a culture of belonging.