Cloudera Data Engineer - REMOTE

  • Posted 4 hours ago | Updated 4 hours ago

Overview

Remote
Depends on Experience
Contract - Independent
Contract - W2
Contract - 16 Month(s)

Skills

big data
Cloudera
Bash scripting
Scala

Job Details

Key Responsibilities

  • Replicate and configure existing Cloudera cluster (HDFS, YARN, Hive, Spark) in the new AWS account.
  • Coordinate with project team to ensure proper infrastructure provisioning (EC2, security groups, IAM roles, and networking).
  • Reconfigure cluster connectivity and job dependencies for the new environment.
  • Migrate and validate metadata stores (Hive Metastore, job configs, dependencies).
  • Validate job execution and data outputs for parity with existing environment.
  • Deploy, test, and operate existing Hive and Spark (Scala) jobs post-migration.
  • Maintain job schedules, dependencies, and runtime configurations.
  • Monitor job performance, identify bottlenecks, and apply tuning or code-level optimizations.
  • Troubleshoot failures and implement automated recovery or alerting where applicable.
  • Monitor Cloudera Manager dashboards, cluster health, and resource utilization.
  • Manage user roles and access within the Cloudera environment.
  • Implement periodic data cleanup, archiving, and housekeeping processes.
  • Document configurations, migration steps, and operational runbooks.

Required Skills and Experience:

  • Bachelor's degree in Computer Science, Information Systems, or a related field.
  • 7+ years of experience in data engineering or big data development.
  • 4+ years of experience with the Cloudera platform (HDFS, YARN, Hive, Spark, Oozie).
  • Experience deploying and operating Cloudera workloads on AWS (EC2, S3, IAM, CloudWatch).
  • Strong proficiency in Scala, Java, and HiveQL; Python or Bash scripting experience preferred.
  • Strong proficiency in Apache Spark & Scala programming for data processing and transformation.
  • Hands-on experience with the Cloudera distribution of Hadoop.
  • Hands-on experience implementing business-rules processing using Drools.
  • Able to work with infrastructure, DevOps, and data governance teams in a multi-disciplinary environment.

Preferred Qualifications:

  • Cloudera certification (e.g., CDP Data Engineer or Cloudera Administrator).
  • Experience with Cloudera version upgrades or AWS-to-AWS environment migrations.
  • Experience in public-sector or large enterprise data environments.