Data Engineer Python & Iceberg Specialist

Overview

Remote
Contract - W2
Contract - Independent
Contract - 12 Month(s)

Skills

Apache Flink
Apache Spark
Apache Iceberg

Job Details

We are currently looking to hire a Data Engineer Python & Iceberg Specialist, and we believe your skills and expertise are a good match for this role. We have an exciting career opportunity for you with one of our esteemed clients. The position is remote.

NJTECH is a globally managed IT services, IT consulting, and business solutions partner. Our "High Performance Business" strategy builds our expertise in technology and consulting. We play a major role in helping our clients achieve their objectives at the highest level, ultimately creating sustainable value for customers.

Role: Data Engineer Python & Iceberg Specialist

Location: Remote

Duration: Long-term

Key Responsibilities

  • Design and implement data access layers for web applications using Iceberg.
  • Develop efficient querying workflows using Pandas, PyArrow, and DuckDB.
  • Optimize memory-heavy operations and improve performance for large datasets.
  • Build and maintain ETL pipelines for batch updates and overwrite workflows.
  • Manage Iceberg table metadata, schema evolution, and partitioning strategies.
  • Collaborate with backend engineers to integrate data services into RESTful APIs.
  • Implement caching and pre-processing strategies to reduce latency.
  • Ensure data integrity and consistency across snapshots and versions.

Required Skills & Experience

  • Strong Python programming skills with experience in data engineering.
  • Hands-on experience with PyIceberg or similar technologies (Delta Lake, Hive).
  • Proficiency in Pandas, PyArrow, and DuckDB for data manipulation.
  • Understanding of data lake architectures, Parquet format, and columnar storage.
  • Experience with ETL design, batch processing, and overwrite workflows.
  • Familiarity with cloud storage systems (e.g., AWS S3, Azure Data Lake Storage).
  • Knowledge of query optimization and performance tuning for large datasets.

Preferred Qualifications

  • Experience integrating data workflows with FastAPI, Flask, or similar frameworks.
  • Background in data governance, metadata management, and schema evolution.
  • Exposure to distributed systems and big data processing frameworks (Spark, Flink) is a plus.

Soft Skills

  • Strong problem-solving and analytical skills.
  • Ability to work collaboratively with cross-functional teams.
  • Adaptability to evolving technologies and project requirements.

Why Join Us?

  • Work on cutting-edge data lake technologies.
  • Collaborate with a dynamic team building scalable web applications.
  • Opportunity to influence architecture and performance optimizations.

NJTECH is an equal opportunity employer. All applicants will be considered for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran status, or disability status.

NJTECH is a globally managed IT services, IT consulting, and business solutions partner. Our "High Performance Business" strategy builds our expertise in technology and consulting. Our offshore consulting plays a major role in helping clients achieve their objectives at the highest level, ultimately creating sustainable value for customers. Come transform your career with us and be part of our high-performing team.

REGARDS

HAAS A
