Job Details
Job Title : Sr. Databricks Architect
Location : Remote
Duration : 12 Months
Job Description:
Position Overview
We are seeking a highly skilled Sr. Databricks Architect with 7+ years of experience in data architecture and engineering. The ideal candidate will have a strong background in designing and implementing large-scale data platforms, particularly on Databricks. This role requires technical leadership, strategic thinking, and the ability to collaborate with cross-functional teams to drive data initiatives.
Key Responsibilities
Technical Leadership
- Provide technical and people leadership in designing and building lakehouse and analytics products.
- Serve as a key member of the Client Architecture Review Board and Design Steering Committee, sharing expertise on architectures and design initiatives.
Data Architecture Design
- Lead the design and implementation of large-scale platform and data architectures on Databricks.
- Develop technical roadmaps for data architecture, aligning with business objectives and industry trends.
- Design end-to-end data management and segmentation for the entire data flow in the Databricks platform.
Migration and Catalog Management
- Lead the migration from Hive Metastore to Unity Catalog, enabling a trickle-down catalog model across development, staging, and production environments.
- Define Unity Catalog user personas to support a broader range of consumption patterns.
- Implement a unified access control model on Databricks' Unity Catalog, leveraging RBAC, ABAC, and CBAC for fine-grained data governance.
Data Governance and Compliance
- Implement Immuta and Databricks integration for automatic financial and PII data discovery, ensuring comprehensive data governance.
- Design and build a comprehensive data quality framework, maintaining quality statistics for ongoing analysis.
Data Pipeline Development
- Design and build complex data pipelines for both batch and streaming data sources on Databricks using Auto Loader, Kafka, Airflow, and Databricks Workflows.
- Implement Delta Sharing to enable easy and secure data sharing within and outside the organization.
Hybrid Data Solutions
- Implement data virtualization to provide a unified data layer across hybrid environments.
- Design and build Lakehouse Federation queries and push-down queries as hybrid compute for external CDP platforms.
Infrastructure Management
- Build a Terraform-based Databricks IaC framework to support workspace management, IAM, SSAM, service principals, and other operations.
- Create a robust DevOps framework for Databricks platform and data pipelines.
Monitoring and Auditing
- Create a robust monitoring, logging, and auditing infrastructure for Databricks, incorporating Spark logging to provide a unified view of data pipeline operations.
Required Qualifications
Education
Bachelor of Engineering in Computer Science.
Certifications
- Databricks Certified Professional Data Engineer.
- Databricks Certified GenAI.
- Databricks Champion (In-progress).
Experience
- Over 18 years of work experience in data architecture and engineering.
- Proven experience in data platform modernization and architecting horizontally scalable data platforms.
Technical Skills
- Proficiency in Databricks, Python, Scala, Kafka, NoSQL databases, and dbt.
- Experience with AWS and Azure cloud platforms.
- Familiarity with MLflow, vector databases, Meta Llama, and DBRX.
Industries and Domains
- Marketing, GTM, Sales
- Hi-Tech, SaaS, Manufacturing
- eCommerce, CDP, C360
Roles
- Data Engineering Solution Architect
- Data & AI Engagement Lead