Data Engineer
Full Time
No Travel Required
On-site
Depends on Experience
Job Details
Skills
- spark
- pyspark
Summary
Job Title: Data Engineer
Location: Pittsburgh, PA; Cleveland, OH; or Dallas, TX (5 days onsite; candidates must be local to one of these locations)
Job Type: Permanent Full Time
Salary: $100K - $120K/Year Plus benefits
Visa Accepted: USC
We are seeking a Data Engineer with 8+ years of experience to design and maintain scalable data pipelines supporting analytics, reporting, and operational needs. The role involves collaborating with cross-functional teams to ensure data alignment with business requirements and enterprise standards.
Your future duties and responsibilities:
- Design and build scalable data pipelines aligned with business needs
- Process large datasets (batch and, at times, near real-time)
- Ensure data quality, consistency, and governance standards across systems
- Support data integration and transformation efforts for analytics and reporting platforms
- Maintain data dictionaries, metadata, and documentation
- Participate in data architecture reviews and model validation processes
- Support analytics, reporting, and risk platforms
Required qualifications to be successful in this role:
- 5+ years of experience in data engineering and big data processing
- Strong expertise in Apache Spark (Spark Core, Spark SQL) and PySpark for large-scale batch processing
- Experience working with structured and semi-structured data, including complex transformations and performance tuning
- Proficiency in data ingestion and integration from sources like Oracle, SQL Server, Hive, HDFS, and S3, and in transforming data into curated data models
- Experience writing data to Hive tables, Data Lakes (Iceberg), and downstream reporting systems
- Strong knowledge of SQL and data modeling concepts
- Hands-on experience with Apache Airflow for workflow orchestration (DAG design, scheduling expectations, monitoring)
- Proficiency in shell scripting for job automation, file validation, dependency handling, and logging: triggering Spark jobs, performing file checks and validation, archiving and purging data, and managing job dependencies, logging, and error handling
- Strong understanding of batch processing and batch job scheduling frameworks
- Experience migrating schedules (daily, hourly, weekly) from CA7/Control-M to Airflow, and with CI/CD for data pipelines
- Experience ensuring data quality, reliability, and compliance in regulated environments
- Good communication and documentation skills
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
- Dice Id: 91173660
- Position Id: 9008735
- Posted 7 hours ago