Data Architect / Databricks Architect - Princeton, NJ (Onsite)

Overview

On Site
Depends on Experience
Contract - W2
Contract - Independent
Contract - 12 month(s)
No Travel Required

Skills

Data Architect
Azure
AWS
DevOps
ETL
Spark
Kafka
Databricks
Data Governance
Machine Learning
Delta Lake
Lakehouse
Unity Catalog
Databricks Workflows

Job Details

Role Name: Databricks Architect
Location: Princeton, NJ (Onsite)
Duration: 12 Months

Job ID: CGEMJP00324803

JOB DESCRIPTION:
• Contribute to the design and implementation of scalable data architectures, such as a Lakehouse, using Delta Lake and Unity Catalog.
• Manage and maintain the underlying data infrastructure, which typically exists on a major cloud platform like AWS, Azure, or Google Cloud Platform.
• Implement data governance practices, including data lineage, metadata management, and access controls, integrating with third-party products such as Immuta and Protegrity (for tokenization).
• Adhere to software engineering best practices, including participating in code reviews and CI/CD (Continuous Integration/Continuous Deployment) automation.
• Stay up to date with the latest trends and technologies in data engineering and the Databricks ecosystem.
• Design and build robust ETL (Extract, Transform, Load) and ELT workflows to ingest, transform, and load structured and unstructured data from various sources.
• Utilize Databricks features like Delta Live Tables and Databricks Workflows to orchestrate and manage complex data processes.
• Optimize and tune Apache Spark jobs for performance and cost efficiency on large datasets.
• Implement and enforce data security, access control, and compliance policies in Databricks and Azure.
• Hands-on experience with streaming technologies (Kafka, Event Hubs, Kinesis).
• Experience architecting machine learning platforms and advanced analytics workloads, and developing and deploying MLOps on Databricks.
• Expertise with enterprise security models, networking, and cost governance.
• Experience working with DevOps teams to establish deployment practices using Terraform or similar tools.
• Optimize Databricks performance (auto-scaling, caching, Delta optimization, job tuning, cost optimization).
• Prior experience leading enterprise Azure Databricks implementations.

 

Best Regards,

Vishal


 
