Overview
Skills
Job Details
The Databricks Architect will lead enterprise-scale healthcare data solutions by designing, optimizing, and deploying robust data architectures on the Databricks Lakehouse platform. The ideal candidate will bring deep expertise in Databricks, healthcare data requirements (including HIPAA), and a proven track record of architecting data-driven analytic and AI/ML solutions for healthcare providers, payers, or life sciences organizations.
Key Responsibilities
Drive the design and implementation of scalable, secure, high-performing data architectures including data lakes, warehouses, and real-time streaming using Databricks and Apache Spark for large healthcare datasets
Lead architectural reviews, create blueprints, and develop architecture documentation aligning with healthcare regulatory requirements (HIPAA, HITRUST, etc.).
Collaborate with business, clinical, and IT stakeholders to deliver solutions that accelerate analytics, population health, patient outcomes, and operational efficiency.
Oversee end-to-end data pipeline architecture including ingestion, ETL/ELT, data transformation, and advanced analytics on Databricks.
Champion data governance, data lineage, and metadata management, leveraging healthcare data catalogs and complying with data privacy and security standards.
Optimize Databricks clusters and workflows for performance and cost, including autoscaling, performance tuning, and workload management.
Guide, mentor, and elevate project delivery teams on best practices for healthcare data engineering, analytics, and ML using Databricks.
Communicate technical concepts effectively to cross-functional teams, business leaders, and technical teams.
Lead or participate in Architecture Review Boards and governance committees for healthcare data solutions.
Required Skills & Experience
7+ years of experience in Data Architecture, Engineering, or Analytics, with domain expertise in healthcare data and regulatory environments.
2+ years hands-on experience architecting and implementing Databricks Lakehouse/Spark solutions at enterprise scale.
Demonstrated proficiency in Python, SQL, and/or Scala for data engineering and analytics.
Proven experience designing healthcare data lakes, warehouses, or real-time data pipelines to support analytics, reporting, ML, or population health.
Solid understanding of cloud platforms (AWS, Azure, or Google Cloud Platform) and cloud-native data security and privacy best practices.
Familiarity with healthcare data interoperability standards (HL7, FHIR, CCD, X12, etc.) and HIPAA compliance.
Strong communication, presentation, and collaboration skills working with technical and non-technical stakeholders.
Databricks Architect/Data Engineer certifications (preferred).
Nice to Have
Experience with healthcare-specific data platforms, EMR/EHR integration, or clinical data models.
Familiarity with data catalog and governance tools (Collibra).
Background in healthcare AI/ML and using Databricks for healthcare analytics or predictive modeling.
Experience leading or mentoring technical teams in a healthcare setting.