Data Architect (Databricks)

Overview

Remote
$140,000 - $160,000
Full Time
No Travel Required

Skills

ARM Templates
Amazon Web Services
Analytics
Apache Airflow
Apache Kafka
Apache Spark
Artificial Intelligence
Cloud Computing
Collaboration
Communication
Continuous Delivery
Continuous Integration
Data Engineering
Data Governance
Data Modeling
Databricks
DevOps
Generative Artificial Intelligence (AI)
Innovation
Machine Learning (ML)
Machine Learning Operations (ML Ops)
Microsoft Azure
Orchestration
Performance Tuning
Real-time
Stakeholder Engagement
Storage
Streaming
Terraform
Unity Catalog
Use Cases
Workflow

Job Details

Job Title: Data Architect (Databricks)

Location: Remote

Duration: Full-time

Role: We are seeking a seasoned Data Architect with deep expertise in Databricks, Lakehouse architecture, and AI/ML/GenAI enablement to lead a critical modernization initiative. The role involves transforming a legacy platform into a future-ready, scalable, cloud-native Databricks-based architecture. You will drive the design and implementation of high-performance data pipelines, orchestrate data workflows, and integrate AI/ML capabilities across the stack to unlock real-time intelligence and innovation.

Key Responsibilities

  • Lead the architectural modernization from an on-prem/legacy platform to a unified Databricks Lakehouse ecosystem.
  • Architect and optimize data pipelines (batch and streaming) to support AI/ML and GenAI workloads on Databricks.
  • Migrate and re-engineer existing Spark workloads to leverage Delta Lake, Unity Catalog, and advanced performance tuning in Databricks.
  • Drive integration of AI/ML models (including GenAI use cases) into operational data pipelines for real-time decision-making.
  • Design and implement robust orchestration using Apache Airflow or Databricks Workflows, with CI/CD integration.
  • Establish data governance, security, and quality frameworks aligned with Unity Catalog and enterprise standards.
  • Collaborate with data scientists, ML engineers, DevOps, and business teams to enable scalable and governed AI solutions.

Required Skills

  • 12+ years in data engineering or architecture, with a strong focus on Databricks (at least 4-5 years) and AI/ML enablement.

  • Deep hands-on experience with Apache Spark, Databricks (Azure/AWS), and Delta Lake.
  • Proficiency in AI/ML pipeline integration using Databricks MLflow or custom model deployment strategies.
  • Strong knowledge of Apache Airflow, Databricks Jobs, and cloud-native orchestration patterns.
  • Experience with structured streaming, Kafka, and real-time analytics frameworks.
  • Proven ability to design and implement cloud-native data architectures.
  • Solid understanding of data modeling, Lakehouse design principles, and lineage tracking with Unity Catalog.
  • Excellent communication and stakeholder engagement skills.

Preferred Qualifications

  • Databricks Certified Data Engineer Professional certification is highly desirable.
  • Experience transitioning from in-house data platforms to Databricks or cloud-native environments.
  • Hands-on experience with Delta Lake, Unity Catalog, and performance tuning in Databricks.
  • Expertise in Apache Airflow DAG design, dynamic workflows, and production troubleshooting.
  • Experience with CI/CD pipelines, Infrastructure-as-Code (Terraform, ARM templates), and DevOps practices.
  • Exposure to AI/ML model integration within real-time or batch data pipelines.
  • Exposure to MLOps, MLflow, Feature Store, and model monitoring in production environments.
  • Experience with LLM/GenAI enablement, vectorized data, embedding storage, and integration with Databricks is an added advantage.

About GLOBAL IT CON LLC