SQL Developer with PySpark and Python

Overview

Hybrid
Accepts corp to corp applications
Contract - W2
Contract - long term

Skills

Python
SQL
ETL
Git
Big Data
Data Modeling
Stored Procedures
Performance Tuning
Documentation
Data Pipelines
Data Quality
Amazon Web Services
Database Modeling
Problem-Solving
Deployment
Data Warehouse
GCP
Version Control
Continuous Integration/Delivery
Real-Time
Data Sources
Data Transformation
Data Validation
Financial Services
EMR
Apache Airflow

Job Details

Job Title: SQL Developer with PySpark and Python

Location: Pittsburgh, PA (Hybrid)
Job Type: Contract

Position Summary:

We are seeking a highly skilled SQL Developer with strong expertise in PySpark and Python to support data engineering and analytics initiatives. The ideal candidate will develop and optimize complex SQL queries, build scalable data pipelines, and support the transformation and processing of large datasets in a distributed environment.

Key Responsibilities:

  • Design, develop, and maintain complex SQL queries and stored procedures to extract and transform data.
  • Develop and optimize data pipelines using PySpark in distributed processing environments (e.g., Databricks, EMR, or Spark clusters).
  • Write modular, reusable Python scripts to support ETL workflows, data validation, and automation.
  • Collaborate with data architects, analysts, and business stakeholders to understand data requirements and deliver high-quality solutions.
  • Monitor and tune the performance of queries and pipelines for efficiency and scalability.
  • Ensure data quality and integrity across multiple data sources and systems.
  • Participate in code reviews, documentation, and deployment processes.

Required Qualifications:

  • 5+ years of experience in SQL development, including performance tuning and data modeling.
  • 3+ years of hands-on experience with PySpark for big data processing.
  • Strong experience with Python, especially for data transformation and automation tasks.
  • Experience working in cloud or distributed data platforms (e.g., Azure, AWS, or Google Cloud Platform environments).
  • Familiarity with version control (e.g., Git) and CI/CD practices for data pipelines.
  • Excellent problem-solving, communication, and collaboration skills.

Preferred Qualifications:

  • Experience with Databricks or other Spark-based data platforms.
  • Knowledge of data lake, data warehouse, and real-time processing architectures.
  • Experience with Apache Airflow, Snowflake, or similar technologies is a plus.
  • Prior experience working in financial services, healthcare, or manufacturing domains is desirable.