Data Engineers with Python, PySpark, Snowflake, EMR, EKS - Onsite McLean, VA - Local Candidates Only

Overview

On Site
Depends on Experience
Contract - W2
Contract - 12 Month(s)

Skills

Python
PySpark
Snowflake
EMR
EKS
Data Engineer

Job Details

Title - Data Engineers with Python, PySpark, Snowflake, EMR, EKS

Location - Onsite McLean, VA - Local Candidates Only

Duration - Contract (W2), 12 months

Must-Have Skills

Python, PySpark, Snowflake, EMR, EKS

Job Summary

We are seeking an experienced Data Engineer to design, build, and optimize scalable data pipelines and solutions across cloud and big data ecosystems.

The ideal candidate will have strong expertise in Python, PySpark, Snowflake, AWS EMR, and Amazon EKS, with the ability to transform raw data into actionable insights while ensuring performance, reliability, and security.

Key Responsibilities

Design, build, and maintain scalable and efficient ETL/ELT pipelines using Python and PySpark.

Develop and optimize data workflows across Snowflake and AWS-based big data platforms.

Work with AWS EMR clusters to manage distributed data processing and analytics workloads.

Deploy and manage containerized data applications using Amazon EKS (Kubernetes).

Collaborate with data scientists, analysts, and business stakeholders to enable reliable data delivery for analytics and reporting.

Ensure data quality, governance, and security standards are maintained across pipelines.

Optimize pipeline performance for scalability, cost efficiency, and resilience.

Contribute to the design and architecture of cloud-native data engineering solutions.

Support CI/CD pipelines and infrastructure-as-code practices for data platform deployments.

Required Skills & Qualifications

10 years of experience as a Data Engineer or in a similar role.

Strong programming skills in Python with experience in PySpark for large-scale data processing.

Expertise in Snowflake (warehousing, performance tuning, query optimization, data modeling).

Hands-on experience with AWS EMR for distributed data processing.

Experience with Amazon EKS (Kubernetes) for containerized workloads.

Solid understanding of SQL and relational data modeling.

Familiarity with CI/CD, Git, and DevOps practices.

Strong problem-solving, communication, and collaboration skills.

Preferred Qualifications

Experience in data lake architectures and streaming technologies (e.g., Kafka, Kinesis).

Knowledge of infrastructure as code (Terraform, CloudFormation).

Exposure to Airflow, dbt, or other orchestration tools.

Background in financial services, healthcare, or large-scale enterprise data environments is a plus.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.