Overview
- Work arrangement: Hybrid
- Compensation: Depends on Experience
- Employment type: Contract - Independent or Contract - W2
- Unable to provide sponsorship
Skills
- Apache Spark
- Python
- AWS
- SQL
- ETL
Job Details
Project description
We are seeking a highly skilled Spark Developer with strong experience in Python, AWS, and SQL to join our team. The ideal candidate will be responsible for designing, developing, and optimizing large-scale data processing solutions, ensuring data quality, scalability, and performance. This role requires a solid background in distributed computing, cloud environments, and data engineering best practices.
Responsibilities
- Design, develop, and maintain scalable data pipelines using Apache Spark (batch and/or streaming); a minimal sketch of this kind of pipeline follows this list.
- Build, optimize, and manage ETL/ELT workflows integrating multiple data sources.
- Develop data solutions in Python for data transformations, automation, and orchestration.
- Leverage AWS services (S3, EMR, Glue, Lambda, Redshift, Kinesis, etc.) to implement cloud-native data platforms.
- Write efficient SQL queries for data extraction, transformation, and reporting.
- Ensure data quality, lineage, and governance across pipelines.
- Collaborate with data engineers, architects, and analysts to deliver end-to-end data solutions.
- Troubleshoot performance bottlenecks and optimize Spark jobs for speed and cost-efficiency.
Skills
Must have
- 10+ years of experience in data engineering or backend development.
- Hands-on experience with Apache Spark (PySpark) in large-scale data environments.
- Strong proficiency in Python programming.
- Expertise in SQL (including advanced queries, performance tuning, and optimization).
- Experience working with AWS services such as S3, Glue, EMR, Lambda, Redshift, or Kinesis.
- Understanding of data warehousing concepts and ETL best practices.
- Strong problem-solving skills and ability to work in an agile, collaborative environment.
Nice to have
- Experience with Databricks or similar Spark-based platforms.
- Knowledge of streaming frameworks (Kafka, Flink).
- Familiarity with CI/CD pipelines, Docker, Kubernetes, Terraform.
- Exposure to data modeling (star schema, snowflake, data vault).
- Experience in financial services / capital markets.