Data Engineer - Apache

Corning, NY, US • Posted 9 hours ago • Updated 9 hours ago
Contract W2
12 Months
No Travel Required
On-site
$44/hr
Job Details

Skills

  • Apache Airflow
  • Data Engineering
  • Data Quality
  • Data Integrity
  • Data Processing
  • Apache
  • Data Engineer

Summary

Role: Data Engineer - Apache

Location: Corning, NY

Type: 100% on site in Corning, NY

Duration: Long Term

 

Education and Experience:

·         This position focuses on data pipelines and workflows.

·         Bachelor’s degree in computer science, information systems, data engineering, or a related field, or equivalent practical experience. An associate’s degree may be considered if the candidate has 3-5 years of experience beyond the stated requirement.

·         2+ years of professional experience in data engineering, ETL development, or related work, or equivalent hands-on experience

·         Experience or interest in scientific software, materials science, research environments, or technically complex domains is a plus

 

Scope of the position

·         Embed within a cross-functional Agile team, participating in sprint planning, stand-ups, backlog refinement, and technical discussions.

·         Design, build, troubleshoot, and maintain ETL/ELT workflows that support application functionality, analytics, reporting, and scientific workflows.

·         Develop and manage data pipelines using Apache Airflow, ensuring reliable orchestration, scheduling, monitoring, and recovery of data processes.

·         Work with stakeholders including software developers, scientists, and engineers to understand data sources, workflow requirements, and downstream data needs.

·         Extract, transform, validate, and load data across systems, including relational databases such as PostgreSQL and Oracle.

·         Write, optimize, and maintain complex SQL queries, scripts, and transformation logic to support operational and analytical use cases.

·         Troubleshoot data quality issues, ETL failures, pipeline bottlenecks, and schema inconsistencies; identify root causes and implement durable solutions.

·         Support database exploration, data validation, and troubleshooting using tools such as DBeaver and related database utilities.

·         Evaluate and help adopt new data tools and technologies, including lightweight analytics and transformation solutions (e.g., DuckDB), where appropriate.

·         Collaborate with engineering teams to support reliable integration between data pipelines, applications, APIs, and downstream consumers.

·         Assist with schema evolution, data modeling, migration planning, and data consistency across systems.

·         Document pipeline logic, data dependencies, transformation rules, and operational procedures to support maintainability and team knowledge sharing.

·         Help improve data engineering standards, observability, testing practices, and operational reliability across the team.

·         Interact regularly with scientists and engineers to understand research and technical workflows; experience in scientific or research environments is a strong plus.

Technical Skills – 2+ years (or commensurate experience):

·         Experience designing, building, and troubleshooting ETL/ELT pipelines

·         Hands-on experience with workflow orchestration tools, preferably Apache Airflow

·         Strong experience writing and optimizing SQL

·         Experience working with relational databases, especially PostgreSQL and Oracle

·         Ability to develop and maintain data transformations, validation steps, and pipeline logic across multiple systems

·         Experience with database tools such as DBeaver or similar for query development, exploration, and troubleshooting

·         Familiarity with modern data processing and analytical tools such as DuckDB, or an interest in evaluating emerging data technologies

·         Understanding of data modeling, schema design, data integrity, and performance tuning

·         Experience troubleshooting pipeline failures, performance issues, and inconsistent or incomplete datasets

·         Familiarity with scripting or programming for pipeline development and automation; Python experience is strongly preferred

·         Understanding of version control and collaborative development workflows

·         Experience supporting production data systems with an emphasis on reliability, maintainability, and clear documentation

  • Dice Id: indony
  • Position Id: 8970815
