Overview
On Site
Accepts corp to corp applications
Contract - W2
Contract - Term Contract
Skills
Python
SQL
PySpark
Terraform
Job Details
Job Title: Lead Data Engineer (Python, PySpark & SQL)
Location: Canada
Job Type: Full-time contract
We are looking for a Lead Data Engineer with deep experience in Python, PySpark, SQL, and AWS to design, develop, and optimize large-scale data pipelines. This role requires strong hands-on coding skills, the ability to validate and process complex raw data, and expertise in running and tuning PySpark jobs on AWS EMR.
Responsibilities
- Build scalable data ingestion and transformation pipelines using Python, PySpark, and SQL.
- Process raw CSV/text files from AWS S3, including header validation, schema checks, and malformed-file detection.
- Convert raw data into structured DataFrames and implement reusable data quality checks.
- Develop advanced transformations using SQL/PySpark (window functions, LAG(), grouping logic, date gap detection, etc.).
- Deploy and tune PySpark applications on AWS EMR, optimizing executor memory, cores, shuffle behavior, and cluster performance.
- Work with AWS services such as S3, EMR, Glue, Lambda, and IAM.
- Debug performance issues (OOM errors, shuffle spill, and related problems) and improve pipeline reliability.
- Lead design discussions, code reviews, and mentor junior engineers.
Required Skills
- 8+ years of experience in Data Engineering.
- Expert Python (file processing, scripting, validation automation).
- Strong PySpark (DataFrames, job tuning, distributed processing).
- Advanced SQL (analytical functions, performance tuning).
- Hands-on with AWS data stack: S3, EMR, Glue, Lambda.
- Strong understanding of Spark memory allocation, YARN container usage, and EMR resource tuning.
- Excellent debugging, communication, and problem-solving skills.
Nice to Have
- Airflow or Databricks experience.
- Terraform or CloudFormation.
- Experience with data lake formats (Delta, Iceberg, Hudi).