Overview
Hybrid. Minimum 3 days in the office
Depends on Experience
Full Time
Skills
Databricks
Amazon Web Services
Data Architecture
Apache Spark
ELT
PySpark
Extract
Transform
Load
Continuous Integration
Continuous Delivery
Job Details
About the Role
We are looking for a highly skilled Databricks Architect to lead the design and technical planning for a large-scale migration to the Databricks platform. This is a strategic pre-execution role in which you will drive technical architecture discussions with client stakeholders and internal teams to lay the foundation for a successful project. The role involves working closely with our leadership and the client's technical leadership to define the migration blueprint, leverage accelerators such as Turgon, and prepare the Statement of Work (SOW) for execution.
Responsibilities
- Act as the technical lead and engage with client architects and engineers to understand the current ecosystem and design the target state on Databricks.
- Define the architecture for data ingestion, ETL/ELT pipelines, data lakehouses, and advanced analytics using the Databricks Lakehouse Platform.
- Evaluate existing systems (e.g., AWS Redshift, Spark on EMR, GKE, Kafka) and design migration strategies to Databricks with minimal disruption.
- Develop architecture artifacts including data flow diagrams, integration patterns, and component-level designs.
- Identify reusable components, best practices, and automation opportunities to accelerate delivery using tools such as Turgon.
- Collaborate with infrastructure, security, and DevOps teams to define cluster sizing, deployment models, CI/CD pipelines, and cost optimization.
- Support SOW and project scoping discussions with effort estimations and risk identification.
- Stay engaged through early build phases, ensuring architectural integrity during handoff to engineering teams.
Must-Have Skills
- 8+ years of experience in data engineering and architecture, with at least 3+ years on Databricks.
- Deep knowledge of Databricks Lakehouse Architecture, including Unity Catalog, Delta Lake, DBFS, MLflow, and structured streaming.
- Strong background in building large-scale ETL/ELT pipelines using PySpark, SparkSQL, dbx, and Databricks Workflows.
- Hands-on with AWS stack (S3, Redshift, EMR, Lambda) or Azure equivalents.
- Experience integrating Databricks with Kafka, Airflow, CI/CD pipelines, and container orchestration platforms like Kubernetes.
- Familiar with security frameworks, role-based access control, and compliance in enterprise data environments.
- Strong communication and leadership skills, capable of driving discussions with senior architects and business leaders.
Nice to Have
- Experience using Turgon or other internal accelerators in migration/modernization projects.
- Prior experience in manufacturing, semiconductor, or hardware-driven enterprises.
- Exposure to Computer Vision and Edge AI deployments integrated with Databricks (a plus, not mandatory).