Job Details
Job Title: Data Engineer / Developer (Databricks & PySpark)
Location: New York, NY (100% onsite)
Duration: 12–36 months
Rate: $75–$100/hr (W2)
Security Clearance Requirement: 3+ years of residency required
Employment Type: W2 only (No C2C or third-party agencies)
Required Skills: Databricks (SaaS), PySpark, Python, SQL, AWS, and version control
Job Description:
We're hiring a hands-on Data Engineer (with a developer's mindset) to join a major financial institution on a long-term cloud migration initiative. You'll be at the heart of transforming legacy systems into scalable, cloud-native solutions, leveraging Databricks, PySpark, and AWS to build high-performance data pipelines and infrastructure.
This isn't just a data-wrangling role; it's a full-stack engineering opportunity where you'll write production-grade code, architect ETL workflows, and optimize Spark jobs for performance and scale. As part of a high-impact team migrating legacy applications from on-premises infrastructure to cloud-native platforms, you'll design and deploy ETL pipelines and collaborate across teams to modernize the data architecture.
Responsibilities:
- Build and deploy ETL pipelines using Databricks and PySpark
- Analyze existing solutions and propose cloud-native alternatives
- Write and optimize complex SQL queries
- Develop reusable components for structured, semi-structured, and unstructured data ingestion
- Tune Spark jobs for performance and scalability in AWS
- Implement data quality, validation, and governance best practices
- Collaborate with architects, analysts, and stakeholders to align on data requirements
- Support production workloads with debugging and performance optimization
- Automate infrastructure provisioning using CI/CD and Terraform
Required Skills:
- Databricks (SaaS), PySpark, Python, SQL, AWS, and version control
- AWS services: EC2, EMR, S3, RDS, Glue
- Expert-level SQL (SQL skills will be tested during interviews)
- CI/CD tools: GitLab pipelines
- Infrastructure as Code: Terraform
- Container orchestration: Docker, Kubernetes
- Lakehouse architecture and federated SQL (Starburst, Trino)
- Strong communication and collaboration skills
- Positive attitude and ownership mindset
Preferred Qualifications:
- Bachelor's degree in Computer Science or a related field
- Familiarity with AWS IAM, networking, and monitoring tools
- Proficiency in Scala is a plus