Job Details
Job Title: Senior Data Engineer (PySpark, ETL SSIS) - W2 only
Location: Rocky Hill, CT
Experience: 9+ Years in Data Engineering
Job Description:
We are looking for an experienced and motivated Data Engineer with expertise in PySpark to join our data engineering team. You will design, build, and maintain scalable data pipelines that support efficient data processing and analytics within the healthcare domain.
This role combines development and administrative activities, so candidates need experience both in building robust data pipelines and in overseeing their operation to keep them performant, reliable, and well optimized.
Key Responsibilities:
- Design, develop, and maintain scalable ETL pipelines using PySpark to process large datasets.
- Collaborate with cross-functional teams (data scientists, analysts, business stakeholders) to understand data requirements and deliver high-quality solutions.
- Work on administrative tasks, including monitoring, troubleshooting, and optimizing data pipelines and infrastructure.
- Manage data integration across healthcare systems, ensuring compliance with relevant standards such as HL7 and FHIR.
- Leverage SSIS for ETL development and ensure smooth data movement across different environments.
- Integrate and transform data from multiple sources, ensuring data quality and consistency.
- Handle and resolve data processing issues, ensuring minimal disruption to operations.
- Document best practices, processes, and workflows to maintain pipeline efficiency and scalability.
- Work with both relational and non-relational databases, ensuring smooth data flow and optimized performance.
Required Skills & Qualifications:
- 9+ years of experience in Data Engineering, with expertise in designing, building, and maintaining ETL pipelines.
- Strong proficiency in PySpark for large-scale data processing and transformation.
- Experience with ETL tools, particularly SSIS (SQL Server Integration Services).
- Solid understanding of data modeling, relational databases, and data warehousing principles.
- Experience working with cloud-based data storage and processing technologies (AWS, Google Cloud Platform, or Azure).
- Familiarity with healthcare data standards, such as HL7 and FHIR, is highly desirable.
- Proven ability in data pipeline monitoring, troubleshooting, and performance tuning.
- Strong communication skills and the ability to work collaboratively with cross-functional teams.
Infowave Systems is an equal opportunity employer that is committed to diversity and inclusion in the workplace.