Big Data Engineer

New York, NY, US • Posted 12 hours ago • Updated 12 hours ago
Full Time
On-site
$60 - $70/hr

Job Details

Skills

  • Hadoop
  • PySpark
  • Scala
  • Java
  • Big Data
  • AWS

Summary

Title: Data Engineer (Intermediate)
Location: Manhattan West, NY - Onsite
Duration: Right to hire - W2 (USC)
Job Summary
We are seeking a skilled Data Engineer to design, build, and manage scalable ETL pipelines supporting a centralized data lake and Snowflake data warehouse. The role focuses on automating data ingestion, transformation, and aggregation workflows to enable reliable analytics and data-driven decision-making.

Key Responsibilities

  • Design, develop, and maintain robust ETL pipelines for ingesting data into the enterprise data lake and Snowflake environment.
  • Automate data processing, aggregation, and analytical workflows to improve data availability and performance.
  • Implement and manage orchestration and scheduling of data pipelines using Control-M and Apache Airflow.
  • Develop scalable data transformation logic using PySpark and Apache Spark (Java).
  • Work with large, structured and semi-structured datasets on AWS infrastructure.
  • Ensure data quality, integrity, and reliability across data pipelines.
  • Optimize data pipelines for performance, cost, and scalability.
  • Collaborate with analytics, data science, and business teams to understand data requirements.
  • Monitor, troubleshoot, and resolve pipeline failures and performance bottlenecks.
  • Follow best practices for data engineering, security, and documentation.
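The ingest-transform-aggregate pattern these responsibilities describe can be sketched in plain Python (a minimal stdlib illustration only; in this role the transformation logic would run in PySpark on AWS, and the record schema below is hypothetical):

```python
from collections import defaultdict

def aggregate_by_key(records, key_field, value_field):
    """Group raw records by a key and sum a numeric field,
    mimicking the aggregation step of an ETL pipeline."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[key_field]] += rec[value_field]
    return dict(totals)

# Example: daily revenue rollup from raw events (hypothetical schema).
events = [
    {"date": "2024-01-01", "revenue": 100.0},
    {"date": "2024-01-01", "revenue": 50.0},
    {"date": "2024-01-02", "revenue": 75.0},
]
print(aggregate_by_key(events, "date", "revenue"))
# → {'2024-01-01': 150.0, '2024-01-02': 75.0}
```

In PySpark the same rollup would be a `groupBy` plus `sum` over a DataFrame, distributed across an EMR or Glue cluster rather than a single process.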

Required Skills & Qualifications

  • Strong experience with data lake architectures and large-scale data processing.
  • Hands-on experience with AWS services (e.g., S3, EC2, EMR, Glue, or related).
  • Proven expertise in building ETL pipelines for analytics and reporting use cases.
  • Solid working knowledge of Snowflake, including data loading, transformations, and performance optimization.
  • Experience with workflow automation and scheduling tools such as Control-M and Apache Airflow.
  • Proficiency in PySpark for distributed data processing.
  • Strong programming experience with Apache Spark using Java.
  • Good understanding of data modeling, partitioning, and performance tuning concepts.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 10115448
  • Position Id: 8943783
