Overview
On Site
Depends on Experience
Contract - W2
Able to Provide Sponsorship
Skills
PySpark
Python
Amazon Web Services
Spark
Extract, Transform, Load (ETL)
Data Warehouse
Data Analysis
Amazon S3
Big Data
Cloud Computing
Collaboration
Data Modeling
Data Processing
Data Quality
ELT
Electronic Health Record (EHR)
Machine Learning (ML)
Apache Spark
Amazon Redshift
Scripting
Performance Tuning
Unstructured Data
Workflow
Job Details
W2 candidates only.
Key Responsibilities:
- Develop and maintain scalable ETL/ELT data pipelines using PySpark and Python (a minimal illustrative sketch follows this list).
- Design, build, and optimize data processing workflows on AWS Cloud (e.g., S3, Glue, EMR, Lambda, Redshift).
- Work with structured and unstructured data to support data analytics and machine learning.
- Implement data quality checks, performance tuning, and monitoring of data processes.
- Collaborate with data scientists, analysts, and business stakeholders to understand data needs.
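A minimal sketch of the kind of PySpark ETL pipeline described above, assuming a hypothetical S3 bucket, input format, and column names; a real pipeline would depend on the team's actual sources and Glue/EMR configuration.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("events-etl").getOrCreate()

# Extract: read raw CSV files from S3 (bucket and path are hypothetical)
raw = spark.read.option("header", True).csv("s3://example-bucket/raw/events/")

# Transform: basic cleanup plus simple data-quality checks
cleaned = (
    raw.dropDuplicates(["event_id"])
       .filter(F.col("event_id").isNotNull())
       .withColumn("event_ts", F.to_timestamp("event_ts"))
       .withColumn("event_date", F.to_date("event_ts"))
)

# Load: write partitioned Parquet back to S3 for downstream analytics
(cleaned.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://example-bucket/curated/events/"))

spark.stop()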
Required Skills:
- Strong expertise in PySpark / Apache Spark for big data processing.
- Proficiency in Python for scripting and automation.
- Hands-on experience with AWS data services (S3, Glue, EMR, Lambda, Redshift, Athena); see the sketch after this list.
- Good understanding of data modeling, ETL workflows, and data warehousing.
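A hedged sketch of driving one of the AWS data services named above from Python with boto3; the database, table, query, and output location are all hypothetical placeholders.

import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Run an ad-hoc query against a curated table (names are hypothetical)
response = athena.start_query_execution(
    QueryString="SELECT event_date, COUNT(*) AS events FROM curated.events GROUP BY event_date",
    QueryExecutionContext={"Database": "curated"},
    ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
)

# The execution ID can be polled for status or used to fetch results
print(response["QueryExecutionId"])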