Summary
This role requires a hands-on Databricks expert who can design and implement scalable, reliable data solutions in a complex enterprise environment. Experience in healthcare data processing and strong collaboration skills are highly valued for success in this position.
Overview
We are seeking an experienced Data Engineer with strong Databricks development expertise to design, build, optimize, and support large-scale data pipelines and analytics solutions. The ideal candidate will have deep hands-on experience in Azure Databricks, PySpark, Delta Lake, and modern data engineering frameworks, ideally within the healthcare, Medicaid/Medicare, or payer systems context.
Key Responsibilities
- Databricks Development
  - Design, develop, and maintain ETL/ELT pipelines using Databricks (PySpark, SQL).
  - Build and optimize Delta Lake tables, data ingestion, transformations, and workflows.
  - Implement Lakehouse architecture–based solutions for large structured and unstructured datasets.
  - Develop notebooks, jobs, workflows, and automation scripts in Databricks.
- Data Engineering & Integration
  - Create scalable data models for analytics, reporting, and downstream applications.
  - Integrate data from various sources (API, files, databases, cloud storage, legacy systems).
  - Optimize data pipelines for performance, reliability, and cost efficiency.
  - Implement CI/CD for data pipelines using Git, ADO, or similar tools.
- Cloud & Platform Engineering
  - Work extensively with Azure services (ADF, ADLS Gen2, Synapse, Key Vault, Functions).
  - Ensure platform governance and security: RBAC, cluster policies, secrets, tokens.
  - Manage Databricks clusters: sizing, auto-scaling, performance tuning.
- Quality, Testing & Documentation
  - Implement unit testing, data validation, and data quality checks.
  - Conduct root cause analysis and resolve data processing issues.
  - Document data flows, pipelines, technical specs, and operational runbooks.
- Collaboration & Leadership
  - Collaborate with product owners, architects, and business teams to understand requirements.
  - Provide technical guidance to junior developers and offshore teams.
  - Participate in Agile ceremonies, sprint planning, grooming, and retrospectives.
Required Skills
- 10+ years overall experience in Data Engineering.
- 5+ years hands-on experience with:
  - Azure Databricks
  - PySpark
  - SQL (advanced level)
  - Delta Lake
- Strong expertise with Azure Data Factory (ADF) and Azure Data Lake Storage.
- Experience with performance tuning of large-scale ETL pipelines.
- Strong understanding of data modeling, distributed computing, and Lakehouse architecture.
- Knowledge of DevOps, CI/CD, Git branching, and deployment pipelines.
- Experience supporting production ETL pipelines with on-call readiness.
Preferred Skills
- Experience in Healthcare / Medicaid / Medicare / MMIS data systems.
- Familiarity with Databricks SQL, Unity Catalog, and governance frameworks.
- Exposure to APIs, REST, Spark optimization, and job orchestration frameworks.
- Knowledge of MLflow, feature stores, or MLOps (nice to have).
- Understanding of HIPAA compliance and healthcare data regulations.
Education
Bachelor’s or Master’s degree in Computer Science, Information Systems, Engineering, or a related field.
Databricks or Azure certifications preferred (DP-203, Databricks Data Engineer Associate/Professional).