SQL Developer with PySpark and Python

Overview

Hybrid
Accepts corp to corp applications
Contract - W2
Contract - long term

Skills

Python
SQL
ETL
Git
Big Data
Data Modeling
Stored Procedures
Performance Tuning
Documentation
Data Pipelines
Data Quality
Amazon Web Services
Database Modeling
Problem-Solving
Deployment
Data Warehouse
GCP
Version Control
Continuous Integration/Delivery
Real-Time
Data Sources
Data Transformation
Data Validation
Financial Services
EMR
Apache Airflow

Job Details

Job Title: SQL Developer with PySpark and Python

Location: Pittsburgh, PA (Hybrid)
Job Type: Contract

Position Summary:

We are seeking a highly skilled SQL Developer with strong expertise in PySpark and Python to support data engineering and analytics initiatives. The ideal candidate will develop and optimize complex SQL queries, build scalable data pipelines, and support the transformation and processing of large datasets in a distributed environment.

Key Responsibilities:

  • Design, develop, and maintain complex SQL queries and stored procedures to extract and transform data.
  • Develop and optimize data pipelines using PySpark in distributed processing environments (e.g., Databricks, EMR, or Spark clusters).
  • Write modular, reusable Python scripts to support ETL workflows, data validation, and automation.
  • Collaborate with data architects, analysts, and business stakeholders to understand data requirements and deliver high-quality solutions.
  • Monitor and tune the performance of queries and pipelines for efficiency and scalability.
  • Ensure data quality and integrity across multiple data sources and systems.
  • Participate in code reviews, documentation, and deployment processes.

Required Qualifications:

  • 5+ years of experience in SQL development, including performance tuning and data modeling.
  • 3+ years of hands-on experience with PySpark for big data processing.
  • Strong experience with Python, especially for data transformation and automation tasks.
  • Experience working in cloud or distributed data platforms (e.g., Azure, AWS, or Google Cloud Platform environments).
  • Familiarity with version control (e.g., Git) and CI/CD practices for data pipelines.
  • Excellent problem-solving, communication, and collaboration skills.

Preferred Qualifications:

  • Experience with Databricks or other Spark-based data platforms.
  • Knowledge of data lake, data warehouse, and real-time processing architectures.
  • Experience with Apache Airflow, Snowflake, or similar technologies is a plus.
  • Prior experience working in financial services, healthcare, or manufacturing domains is desirable.