Senior Data Engineer with PySpark

  • Rocky Hill, CT

Overview

On Site
BASED ON EXPERIENCE
Full Time
Contract - W2
Contract - Independent

Skills

Analytics
Optimization
Collaboration
Management
Data Integration
Regulatory Compliance
Data Quality
Workflow
Scalability
Data Flow
Data Engineering
PySpark
Data Processing
Microsoft SQL Server
Microsoft SSIS
Data Modeling
Relational Databases
Data Warehouse
Cloud Computing
Data Storage
Amazon Web Services
Google Cloud
Google Cloud Platform
Microsoft Azure
Health Care
HL7
Extract, Transform, Load (ETL)
Performance Tuning
Communication

Job Details

Job Title: Senior Data Engineer (PySpark, ETL, SSIS) - W2 only
Location: Rocky Hill, CT

Experience: 9+ Years in Data Engineering

Job Description:

We are looking for an experienced, motivated Data Engineer with expertise in PySpark. As a key member of our data engineering team, you will design, build, and maintain scalable data pipelines that enable efficient data processing and analytics in the healthcare domain.

This role combines development and administrative work, so candidates need experience both in building robust data pipelines and in running them day to day to ensure performance, reliability, and optimization.

Key Responsibilities:

  • Design, develop, and maintain scalable ETL pipelines using PySpark to process large datasets.
  • Collaborate with cross-functional teams (data scientists, analysts, business stakeholders) to understand data requirements and deliver high-quality solutions.
  • Work on administrative tasks, including monitoring, troubleshooting, and optimizing data pipelines and infrastructure.
  • Manage data integration across healthcare systems, ensuring compliance with relevant standards such as HL7 and FHIR.
  • Leverage SSIS for ETL development and ensure smooth data movement across different environments.
  • Integrate and transform data from multiple sources, ensuring data quality and consistency.
  • Handle and resolve data processing issues, ensuring minimal disruption to operations.
  • Document best practices, processes, and workflows to maintain pipeline efficiency and scalability.
  • Work with both relational and non-relational databases, ensuring smooth data flow and optimized performance.

Required Skills & Qualifications:

  • 9+ years of experience in Data Engineering, with expertise in designing, building, and maintaining ETL pipelines.
  • Strong proficiency in PySpark for large-scale data processing and transformation.
  • Experience with ETL tools, particularly SSIS (SQL Server Integration Services).
  • Solid understanding of data modeling, relational databases, and data warehousing principles.
  • Experience working with cloud-based data storage and processing technologies (AWS, Google Cloud Platform, or Azure).
  • Familiarity with healthcare data standards, such as HL7 and FHIR, is highly desirable.
  • Proven ability in data pipeline monitoring, troubleshooting, and performance tuning.
  • Strong communication skills and the ability to work collaboratively with cross-functional teams.

Infowave Systems is an equal opportunity employer that is committed to diversity and inclusion in the workplace.

About Hexplora