Spark Engineer / Developer (U.S. Citizens / Green Card holders only)
Louisville, KY (Remote)
6+ Months
Role Overview:
The Spark Engineer builds and maintains Python and SQL data pipelines in the Spark/Databricks environment. The role covers organizing structured data in Delta Lake and enabling both batch and near-real-time data integration with core systems and analytical engines via technologies such as Kafka. The ideal candidate has a strong background in cloud data services (AWS, Azure, or Google Cloud Platform) and data workflow orchestration tools (such as Airflow or Prefect), along with a solid grasp of data integrity and monitoring principles. Familiarity with schema management, compatibility considerations, and data product governance is essential. The engineer will work closely with backend developers, product managers, and data analysts to keep actionable data accurate and scalable.
Prior experience in regulated industries or healthcare is a plus.
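For illustration only (not drawn from the posting): a minimal sketch of the kind of pipeline this role owns, reading events from a Kafka topic with PySpark Structured Streaming and appending them to a Delta Lake table. The broker address, topic name, event schema, checkpoint path, and table name are all hypothetical.

# Illustrative sketch: Kafka -> Delta Lake with PySpark Structured Streaming.
# Topic, schema, paths, and table names are assumptions, not from the posting.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("kafka-to-delta").getOrCreate()

# Assumed schema for the incoming JSON payloads.
event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_type", StringType()),
    StructField("event_ts", TimestampType()),
])

# Read a near-real-time stream from Kafka (broker and topic are placeholders).
raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "core-system-events")
    .load()
)

# Parse the Kafka value bytes into structured columns.
events = (
    raw.selectExpr("CAST(value AS STRING) AS json")
    .select(from_json(col("json"), event_schema).alias("e"))
    .select("e.*")
)

# Append to a Delta table, with checkpointing so the stream can recover after failures.
query = (
    events.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/core_events")
    .outputMode("append")
    .toTable("analytics.core_events")
)

query.awaitTermination()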
Primary Duties:
1. Engineer robust data pipelines spanning state, action, decision, and analytics data using Python, SQL, Spark, and Databricks.
2. Design and deploy data workflows that handle both near-real-time (streaming) and batch processing demands.
3. Administer data orchestration systems (such as Airflow or Prefect) to ensure data quality, reliability, and scalability, as illustrated in the sketch below.
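For illustration only (not drawn from the posting): a minimal sketch of the kind of batch orchestration described in duty 3, using a recent Airflow 2.x release to trigger a daily Spark job followed by a simple data-quality check. The DAG id, job script path, connection id, and check logic are hypothetical.

# Illustrative sketch: an Airflow DAG that runs a daily Spark batch job and a
# follow-up data-quality task. Names and paths are assumptions, not from the posting.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator


def check_row_count(**context):
    # Placeholder data-quality check; a real pipeline would query the Delta table
    # and raise an exception (failing the task) if expectations are not met.
    print("row-count check would run here")


with DAG(
    dag_id="daily_core_events_batch",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    run_batch = SparkSubmitOperator(
        task_id="run_spark_batch",
        application="/opt/jobs/transform_core_events.py",  # hypothetical job script
        conn_id="spark_default",
    )

    quality_check = PythonOperator(
        task_id="data_quality_check",
        python_callable=check_row_count,
    )

    # Run the quality check only after the batch job succeeds.
    run_batch >> quality_check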