Role name: Unity Catalog Platform Engineer
Work site: Seattle, WA - onsite.
Start date: Immediate availability.
Job Description:
**Unity Catalog Platform Engineer**
Responsible for provisioning data through Unity Catalog: implementing the technical backend (infrastructure configuration, cluster management, and pipeline development) and enforcing organizational policies and governance requirements, with a secondary focus on streamlining and automating data-related processes.
**Required Technical Skills**
• Unity Catalog
• Databricks development and programming
• Data engineering, SQL, Python
• Scripting and automation
• Privacera data security and access control management
• Azure infrastructure provisioning
• Familiarity with the data requirements of common ML/AI use cases
• Awareness of data governance frameworks, enterprise data compliance requirements, metadata modeling, data architecture, and enterprise-scale data discovery solutions
**Responsibilities**
• Implement data provisioning patterns based on business requirements, following predefined processes, policies, standards, and metadata management rules
• Create and manage distributed workspaces in Databricks, set up workspace policies, provision Databricks clusters, and manage data infrastructure sizing and capacity
• Create Python notebooks, implement data masking processes, create UDFs (SQL/Python), and troubleshoot data pipelines
• Ensure data security and compliance with regulations using the features of Databricks and Privacera
• Navigate multi-step enterprise approval process across architecture, security, and governance teams
• Design and implement data architecture leveraging technologies such as Databricks, Unity Catalog, Privacera, and Collibra
• Develop, optimize, and manage data pipelines for ETL processes using Databricks, with a focus on data integrity and quality
• Design and maintain data models and schemas, incorporating Unity Catalog and Collibra data governance practices
• Establish a robust data governance strategy, defining standards, metadata management, lineage, and quality practices
• Operationalize machine learning models in batch and real-time data pipelines, leveraging relevant governance setups
• Collaborate with cross-functional teams including data scientists, engineers, and analysts to translate business requirements into scalable solutions