Data Engineer

Overview

On Site
$75 - $100
Contract - W2
Contract - 12 Month(s)

Skills

Amazon Web Services (AWS)
Apache Spark
PySpark
Data Architecture
Data Wrangling
Databricks
Continuous Integration
Continuous Delivery
Communication
Migration
Version Control
SaaS
SQL
Python
ETL (Extract, Transform, Load)
Data Quality
Cloud Computing

Job Details

Job Title: Data Engineer / Developer (Databricks & PySpark)

Location: New York, NY (100% onsite)

Duration: 12-36 months

Rate: $75-$100/hr (W2)

Security Clearance Requirement: 3+ years of residency

Employment Type: W2 only (No C2C or third-party agencies)

Required Skills: Databricks (SaaS), PySpark, Python, SQL, AWS, and version control

Job Description:

We're hiring a hands-on Data Engineer (with a developer's mindset) to join a major financial institution on a long-term cloud migration initiative. You'll be at the heart of transforming legacy systems into scalable, cloud-native solutions, leveraging Databricks, PySpark, and AWS to build high-performance data pipelines and infrastructure.

This isn't just a data-wrangling role; it's a full-stack engineering opportunity where you'll write production-grade code, architect ETL workflows, and optimize Spark jobs for performance and scale. You'll be part of a high-impact team migrating legacy applications from on-premises infrastructure to cloud-native platforms, designing and deploying ETL pipelines, tuning Spark jobs, and collaborating across teams to modernize the data architecture.

Responsibilities:

  • Build and deploy ETL pipelines using Databricks and PySpark
  • Analyze existing solutions and propose cloud-native alternatives
  • Write and optimize complex SQL queries
  • Develop reusable components for structured, semi-structured, and unstructured data ingestion
  • Tune Spark jobs for performance and scalability in AWS
  • Implement data quality, validation, and governance best practices
  • Collaborate with architects, analysts, and stakeholders to align on data requirements
  • Support production workloads with debugging and performance optimization
  • Automate infrastructure provisioning using CI/CD and Terraform

Required Skills:

  • Databricks (SaaS), PySpark, Python, SQL, AWS, and version control
  • AWS services: EC2, EMR, S3, RDS, Glue
  • SQL superstar (SQL expertise will be tested during interviews)
  • CI/CD tools: GitLab pipelines
  • Infrastructure as Code: Terraform
  • Container orchestration: Docker, Kubernetes
  • Lakehouse architecture and federated SQL (Starburst, Trino)
  • Strong communication and collaboration skills
  • Positive attitude and ownership mindset

Preferred Qualifications:

  • Bachelor's degree in Computer Science or a related field
  • Experience with Starburst, Trino, and Lakehouse architecture
  • Familiarity with AWS IAM, networking, and monitoring tools
  • Proficiency in Scala is a plus