Job Details
Position Title: Data Architect with Databricks
Job Location: Plano, TX (Hybrid)
Joining Mode: Contract-to-Hire
Key Responsibilities
Solution Architecture & Design
- Design end-to-end data architectures on the Databricks Lakehouse Platform.
- Architect scalable ETL/ELT pipelines, real-time streaming solutions, and advanced analytics platforms.
- Define data models, storage strategies, and integration patterns aligned with business and enterprise architecture standards.
- Provide guidance on cluster configuration, performance optimization, cost management, and workspace governance.
Technical Leadership
- Lead technical discussions and design workshops with engineering teams and business stakeholders.
- Define best practices, frameworks, and reusable component designs to ensure consistent delivery.
- Perform code reviews and mentor data engineers and developers.
Stakeholder & Project Engagement
- Collaborate with product owners, business leaders, and analytics teams to translate business requirements into scalable technical solutions.
- Create and present solution proposals, architectural diagrams, and implementation strategies.
- Support pre-sales and discovery phases with technical input when needed.
Data Governance, Security & Compliance
- Define and implement governance standards across Databricks workspaces (data lineage, cataloging, access control, etc.).
- Ensure compliance with regulatory and organizational security frameworks.
- Implement best practices for monitoring, auditing, and data quality management.
Continuous Improvement & Innovation
- Stay current with Databricks features, the product roadmap, and industry trends.
- Recommend improvements, optimizations, and modernization opportunities across the data ecosystem.
- Evaluate integration of complementary technologies (Delta Live Tables, MLflow, Unity Catalog, streaming frameworks, etc.).
Required Skills & Experience
Technical Skills
- Databricks Expertise: Strong hands-on experience with Databricks (clusters, notebooks, Delta Lake, MLflow, Unity Catalog).
- Cloud Platforms: Experience with at least one cloud provider (AWS, Azure, Google Cloud Platform).
- Data Engineering: Strong proficiency in Spark, Python, SQL, and distributed data processing.
- Architecture: Experience designing large-scale data solutions covering ingestion, transformation, storage, and analytics.
- Streaming: Experience with streaming technologies (Structured Streaming, Kafka, Kinesis, Event Hubs).
- DevOps: CI/CD practices for data pipelines (Azure DevOps, GitHub Actions, Jenkins, etc.).