Senior Data Engineer

  • Dallas, TX
  • Posted 2 days ago | Updated 2 days ago

Overview

Hybrid
$140,000 - $160,000
Full Time
No Travel Required

Skills

Hadoop
PySpark

Job Details

Job Title: Senior/Lead Data Engineer
Location: Pittsburgh, PA / Dallas, TX / Cleveland, OH
Client: PNC Bank

Job Description:
PNC Bank is seeking a highly skilled Senior/Lead Data Engineer to join its Data & Analytics team. The ideal candidate will have strong expertise in building and optimizing scalable data pipelines using Hadoop and PySpark, with a focus on enabling data-driven decision-making across the enterprise. This role requires hands-on technical leadership, deep knowledge of big data technologies, and the ability to collaborate with cross-functional teams.


Key Responsibilities:

  • Design, build, and optimize large-scale data pipelines for structured and unstructured data.

  • Develop and maintain data ingestion, transformation, and integration workflows using PySpark, Hadoop (HDFS, Hive, HBase), and related ecosystem tools.

  • Implement best practices for data modeling, performance tuning, and pipeline optimization.

  • Collaborate with data scientists, analysts, and business stakeholders to deliver reliable and scalable data solutions.

  • Ensure data quality, security, and governance across data platforms.

  • Provide technical leadership and mentoring to junior data engineers.

  • Partner with cloud teams to integrate Hadoop/PySpark workloads with cloud platforms (AWS/Azure/Google Cloud Platform).

  • Troubleshoot, monitor, and optimize ETL workflows for high availability and performance.

  • Drive innovation by evaluating and recommending emerging technologies in the big data ecosystem.


Required Skills & Qualifications:

  • 10+ years of experience in data engineering, including at least 5 years with Hadoop and PySpark.

  • Strong expertise in the Hadoop ecosystem (HDFS, Hive, HBase, Oozie, Sqoop, Kafka, etc.).

  • Proficiency in PySpark, Python, and SQL for data transformation and analytics.

  • Hands-on experience with ETL/ELT processes, data modeling, and performance tuning.

  • Familiarity with data governance, lineage, and security best practices.

  • Experience integrating Hadoop with cloud platforms (AWS EMR, Azure Databricks, Google Cloud Dataproc, etc.).

  • Strong understanding of distributed computing, parallel processing, and big data architecture.

  • Excellent problem-solving, communication, and leadership skills.


Preferred Qualifications:

  • Experience in the banking or financial services industry.

  • Exposure to streaming technologies (Kafka, Spark Streaming, Flink).

  • Familiarity with DevOps, CI/CD pipelines, and containerization (Docker/Kubernetes).

  • Knowledge of Snowflake or other modern data warehouses.
