Job Details
Location: Hybrid / Remote - Pittsburgh, PA / Dallas, TX / Cleveland, OH
Industry: Financial Services
Type: Contract to Hire (3-6 Months)
Sponsorship: Not Available
Join a fast-growing data engineering team at a leading financial institution undergoing a major data modernization initiative. We’re looking for hands-on Data Engineers with strong pipeline and data operations experience to support scalable, secure, and analytics-ready data platforms across commercial, retail, and wealth domains.
You’ll work on high-impact initiatives spanning both on-prem systems and future-state cloud deployments, with direct involvement in ingestion, transformation, quality, and governance. This role is ideal for engineers who enjoy building data infrastructure and solving real-world data problems with code.
Key Responsibilities:
Design, build, and optimize batch and streaming data pipelines using Python, PySpark, and SQL (see the sketch after this list)
Work with distributed compute frameworks (e.g., Hadoop, Spark) and storage formats on HDFS
Collaborate with data analysts, scientists, and product teams to transform raw data into model-ready datasets
Monitor, test, and ensure data quality across pipelines using custom validation logic
Deploy pipelines and workflows using CI/CD, Git, and version control best practices
Participate in cloud-native data engineering across AWS, Azure, or Google Cloud Platform
Ensure compliance with data governance and security standards
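As a rough illustration of the pipeline and data-quality work described above, here is a minimal PySpark sketch. It is a hypothetical example for candidates, not code from our platform; the paths, column names, and validation thresholds are placeholders.

# Minimal batch pipeline sketch: ingest, transform, validate, publish.
# All paths, column names, and thresholds below are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("account_activity_daily").getOrCreate()

# Ingest raw events from HDFS (placeholder path).
raw = spark.read.parquet("hdfs:///data/raw/account_activity/")

# Transform raw events into a model-ready daily aggregate.
daily = (
    raw.withColumn("txn_date", F.to_date("event_ts"))
       .groupBy("account_id", "txn_date")
       .agg(F.sum("amount").alias("total_amount"),
            F.count("*").alias("txn_count"))
)

# Custom validation logic: stop the run on null keys or an empty load.
null_keys = daily.filter(F.col("account_id").isNull()).count()
rows = daily.count()
if null_keys > 0 or rows == 0:
    raise ValueError(f"Quality check failed: {null_keys} null keys, {rows} rows")

# Publish the curated, partitioned dataset for downstream analytics.
daily.write.mode("overwrite").partitionBy("txn_date").parquet(
    "hdfs:///data/curated/account_activity_daily/")

In practice, a job like this would be deployed through the Git and CI/CD workflow mentioned above rather than run ad hoc.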
Required Skills:
Strong programming experience with Python, PySpark, SQL
Hands-on experience with Hadoop, Hive, Spark, and HDFS
Familiarity with data modeling, schema management, and ETL/ELT workflows
Cloud experience with AWS, Azure, or Google Cloud Platform
Experience with version control (Git) and CI/CD pipelines
Understanding of data quality, governance, and observability tools
Bonus: Experience with modern data orchestration tools (Airflow, dbt), Snowflake, or data catalog tools like Collibra or Alation.
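For candidates curious about the orchestration side, here is a hypothetical Airflow (TaskFlow API) sketch of how a daily job like the one above might be scheduled. The DAG name, schedule, and task bodies are placeholders only, not our production setup.

# Hypothetical Airflow DAG; assumes Airflow 2.4+ for the `schedule` argument.
from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def account_activity_daily():
    @task
    def ingest():
        ...  # land raw files (placeholder)

    @task
    def transform_and_validate():
        ...  # run the PySpark job sketched earlier (placeholder)

    @task
    def publish():
        ...  # expose curated data to analysts (placeholder)

    # Chain the tasks so each step runs only after the previous succeeds.
    ingest() >> transform_and_validate() >> publish()

account_activity_daily()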