Genzeon is an AI and automation company with deep engineering and data expertise, dedicated to serving the healthcare and retail industries. Our platform solutions (including HIP One, CompliancePro Solutions, and Patient Engagement Solutions) empower organizations to scale innovation and transform outcomes.
Genzeon is a global community of innovators and problem-solvers, with a culture built on inclusion, flexibility, and purpose-driven work. With four global delivery centers, we support providers, payers, healthtech companies, and retail organizations worldwide.
Genzeon has an exciting opening for an AWS Databricks Data Engineer to join our dynamic team.
AWS Databricks Data Engineer
Remote
FTE
We are looking for an experienced AWS Databricks Data Engineer to design, build, and maintain scalable data pipelines and lakehouse architectures. The ideal candidate will have strong hands-on experience with Apache Spark, PySpark, Python, SQL, Delta Lake, and AWS cloud services.
The candidate will be responsible for developing reliable data pipelines, optimizing cloud storage, ensuring data quality, and supporting data governance and security within a production AWS and Databricks environment.
The role also requires strong interpersonal skills: the ability to work closely with business analysts and technical teams, participate actively in discussions, and drive data-related problem-solving initiatives.
As an AWS Databricks Data Engineer, you will design, build, and maintain scalable data pipelines and lakehouse architectures using Apache Spark, Python, and SQL. You will also optimize cloud storage, ensure data quality, and integrate pipelines with other AWS services.
Core Responsibilities
- Pipeline Development: Design and build batch and streaming data pipelines using PySpark, Delta Lake, Auto Loader, and Delta Live Tables (DLT) to ingest datasets into Databricks.
- AWS Integration: Build cloud-native data solutions leveraging AWS services (e.g., S3 for storage, IAM for access management, and AWS Glue or Lambda for serverless tasks).
- Data Governance & Security: Configure and manage access controls using Databricks Unity Catalog to ensure compliance and monitor data lineage.
- Orchestration & CI/CD: Automate pipeline deployments using Databricks Workflows, Apache Airflow, and CI/CD tools (e.g., GitHub, GitLab).
- Cross-Functional Collaboration: Partner closely with stakeholders to build robust feature stores and prepare datasets for various consumption needs.
Typical Qualifications & Technical Skills
- Industry: Healthcare payer industry experience, including hands-on work with MMIS (Medicaid Management Information System) datasets such as claims, provider, and member enrollment data.
- Experience: 3-5 years of hands-on data engineering experience, specifically on Databricks.
- Programming: High proficiency in Python (specifically PySpark) and advanced SQL.
- Big Data & Cloud: Strong understanding of Apache Spark, Data Lakehouse architecture, and working within a production AWS environment.
- Databricks Ecosystem: Familiarity with the Databricks platform ecosystem, including notebooks, Delta Lake, and Unity Catalog.