Job Details
Title: Data Governance Automation Engineer (Databricks/Unity Catalog)
Location: Remote (must be based in the US)
Duration: 3 months
Description:
Design and implement automated workflows in Databricks using SQL, PySpark, and Delta Lake.
Administer and automate Unity Catalog metadata management, including permissions, lineage, tagging, and audits, via its REST APIs (a sketch follows this list).
Develop Python scripts and frameworks for governance automation, data quality checks, and platform integration.
Build and maintain CI/CD pipelines for data governance using Git/GitHub, Azure DevOps, or Jenkins.
Integrate with platform and third-party APIs (e.g., the Databricks REST API, ServiceNow, monitoring tools) for governance, ticketing, and observability.
Enforce role-based access controls (RBAC), tagging standards, and policy-driven governance across data assets.
Implement monitoring, logging, and alerting for governance-related activities and platform usage (see the audit-table sketch after this list).
Apply data quality frameworks (e.g., Great Expectations, Deequ) to validate and monitor data accuracy and completeness (see the final sketch after this list).
Support security and compliance initiatives, including IAM, encryption, PII handling, and regulatory controls.
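
For illustration, here is a minimal Python sketch of the Unity Catalog permission automation described above, using the Unity Catalog REST API (/api/2.1/unity-catalog). The host/token environment variables, the catalog and schema names, and the data_analysts group are illustrative assumptions, not details from this posting:

    # Sketch: enumerate tables in a schema and enforce a SELECT grant.
    import os
    import requests

    HOST = os.environ["DATABRICKS_HOST"]  # e.g. https://<workspace>.azuredatabricks.net
    HEADERS = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

    def list_tables(catalog: str, schema: str) -> list[dict]:
        """List tables in a schema via the Unity Catalog API."""
        resp = requests.get(
            f"{HOST}/api/2.1/unity-catalog/tables",
            headers=HEADERS,
            params={"catalog_name": catalog, "schema_name": schema},
        )
        resp.raise_for_status()
        return resp.json().get("tables", [])

    def grant_select(full_name: str, principal: str) -> None:
        """Grant SELECT on a table to a group or user."""
        resp = requests.patch(
            f"{HOST}/api/2.1/unity-catalog/permissions/table/{full_name}",
            headers=HEADERS,
            json={"changes": [{"principal": principal, "add": ["SELECT"]}]},
        )
        resp.raise_for_status()

    for table in list_tables("main", "governed"):
        grant_select(table["full_name"], "data_analysts")

Tag enforcement follows the same request pattern, or can be done in SQL with ALTER TABLE ... SET TAGS.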
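
The monitoring and alerting responsibility can likewise be sketched against Unity Catalog's audit system table, system.access.audit, which is available once system tables are enabled on the workspace. The 403 filter, one-day window, and alert threshold below are assumptions:

    # Sketch: surface principals with repeated permission-denied events.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    denials = spark.sql("""
        SELECT user_identity.email AS principal, action_name, COUNT(*) AS attempts
        FROM system.access.audit
        WHERE response.status_code = 403
          AND event_time > current_timestamp() - INTERVAL 1 DAY
        GROUP BY user_identity.email, action_name
        HAVING COUNT(*) >= 5
    """)
    denials.show(truncate=False)

In practice this role would feed such results into alerting or ServiceNow ticketing rather than printing them.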
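
Finally, a framework-agnostic PySpark sketch of the completeness and validity checks that tools like Great Expectations or Deequ express declaratively; the table and column names are assumptions:

    # Sketch: basic data quality checks on a governed table.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()
    df = spark.read.table("main.governed.customers")

    total = df.count()
    checks = {
        # Completeness: the key column must not be null.
        "customer_id_not_null": df.filter(F.col("customer_id").isNull()).count() == 0,
        # Uniqueness: the key column must not contain duplicates.
        "customer_id_unique": df.select("customer_id").distinct().count() == total,
        # Validity: email values must match a basic pattern.
        "email_format": df.filter(
            ~F.col("email").rlike(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
        ).count() == 0,
    }

    failed = [name for name, passed in checks.items() if not passed]
    if failed:
        raise ValueError(f"Data quality checks failed: {failed}")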
Requirements:
Strong expertise with Databricks (SQL, PySpark, Delta Lake, Jobs/Workflows).
Proficiency in Unity Catalog administration and the associated REST APIs.
Experience in automation engineering for data platforms (scripting, orchestration, enforcement).
Solid Python development skills for building automation and integration tools.
Familiarity with CI/CD practices and tools: Git, GitHub, Azure DevOps, Jenkins.
Understanding of data governance, IAM, and security best practices.
Knowledge of monitoring, audit logging, and observability tools.
Experience with data quality frameworks (Great Expectations, Deequ, or similar).
Strong problem-solving and collaboration skills.
Experience with compliance standards (e.g., GDPR, HIPAA, SOC 2).
Exposure to enterprise ITSM tools like ServiceNow.
Knowledge of cloud security models (Azure, AWS preferred).