Role Name: Sr. Databricks Architect
Location: Princeton, NJ (Onsite)
Duration: 12 Months
JOB DESCRIPTION:
• Contribute to the design and implementation of scalable data architectures, such as a Lakehouse, using Delta Lake and Unity Catalog.
• Manage and maintain the underlying data infrastructure, which typically exists on a major cloud platform like AWS, Azure, or Google Cloud Platform.
• Implement data governance practices, including data lineage, metadata management, and access controls, integrating with third-party products such as Immuta and Protegrity (for tokenization).
• Adhere to software engineering best practices, including participating in code reviews and CI/CD (Continuous Integration/Continuous Deployment) automation.
• Stay up to date with the latest trends and technologies in data engineering and the Databricks ecosystem.
• Design and build robust ETL (Extract, Transform, Load) and ELT workflows to ingest, transform, and load structured and unstructured data from various sources.
• Utilize Databricks features like Delta Live Tables and Databricks Workflows to orchestrate and manage complex data processes.
• Optimize and tune Apache Spark jobs for performance and cost efficiency on large datasets.
• Implement and enforce data security, access control, and compliance policies in Databricks and Azure.
• Hands-on experience with streaming technologies (Kafka, Event Hubs, Kinesis).
• Experience architecting machine learning platforms and advanced analytics workloads, and developing and deploying MLOps on Databricks.
• Expertise with enterprise security models, networking, and cost governance.
• Experience working with DevOps teams to establish deployment practices using Terraform or similar tools.
• Optimize Databricks performance (auto-scaling, caching, Delta optimization, job tuning, cost optimization).
• Prior experience in leading enterprise Azure Databricks implementations.
Best Regards,
Vishal