Data Engineer

Overview

On Site
Depends on Experience
Full Time

Skills

Spark
AWS
EMR
Glue

Job Details

1. Data Engineers

Full-Time Position

Key Skills

  • Apache Spark (PySpark or Scala): Writing scalable distributed rule engines.
  • Apache Hive: Querying partitioned datasets, schema handling.
  • Amazon EMR: Cluster management, tuning jobs for large datasets (20+ TB).
  • AWS S3: Data partitioning, object lifecycle, access patterns.
  • Glue Data Catalog: Managing Hive table metadata.
  • Data Quality Frameworks: Great Expectations, Deequ, or custom rules.
  • CI/CD: Code deployment pipelines (Git, Jenkins, Airflow, etc.).