Data Engineer

Dallas, TX, US • Posted 7 hours ago • Updated 7 hours ago
Full Time
No Travel Required
On-site
Depends on Experience

Job Details

Skills

  • spark
  • pyspark

Summary

Job Title: Data Engineer

Location: Pittsburgh, PA; Cleveland, OH; or Dallas, TX (5 days onsite; candidates must be local to one of these locations)

Job Type: Permanent Full Time
Salary: $100K - $120K/Year Plus benefits

Visa Accepted: USC

We are seeking a Data Engineer with 8+ years of experience to design and maintain scalable data pipelines supporting analytics, reporting, and operational needs. The role involves collaborating with cross-functional teams to ensure data aligns with business requirements and enterprise standards.

 

Your future duties and responsibilities:

  • Design and build scalable data pipelines aligned with business needs
  • Process large datasets (batch and, at times, near real-time)
  • Ensure data quality, consistency, and governance standards across systems
  • Support data integration and transformation efforts for analytics and reporting platforms
  • Maintain data dictionaries, metadata, and documentation
  • Participate in data architecture reviews and model validation processes
  • Support analytics reporting and risk platforms

 

Required qualifications to be successful in this role:

  • 5+ years of experience in data engineering and big data processing
  • Strong expertise in Apache Spark (Spark Core, Spark SQL) and PySpark for large-scale batch processing
  • Experience working with structured and semi-structured data, including complex transformations and performance tuning
  • Proficiency in data ingestion and integration from sources such as Oracle, SQL Server, Hive, HDFS, and S3, and in transforming data into curated data models
  • Experience writing data to Hive tables, Data Lakes (Iceberg), and downstream reporting systems
  • Strong knowledge of SQL and data modeling concepts
  • Hands-on experience with Apache Airflow for workflow orchestration (DAG design, scheduling expectations, monitoring)
  • Proficiency in shell scripting for job automation: triggering Spark jobs, performing file checks and validation, archiving and purging data, and managing job dependencies, logging, and error handling
  • Strong understanding of batch processing and batch job scheduling frameworks
  • Experience migrating from CA7/Control-M to Airflow (daily, hourly, weekly schedules) and with CI/CD for data pipelines
  • Experience ensuring data quality, reliability, and compliance in regulated environments
  • Good communication and documentation skills
  • Dice Id: 91173660
  • Position Id: 9008735
Contact the job poster

Ravi Gupta

Recruiter @ AI ASAP LLC