Urgent: Direct End Client - Cloud DevOps Administrator Expert (100% Remote Work)

Overview

Remote

$40 - $90

Contract - W2

Contract - Independent

Contract - 6 Month(s)

Skills

Bachelor’s degree in Computer Science or Information Systems

7+ years of experience in data engineering or big data development

4+ years of experience with Cloudera platform (HDFS, YARN, Hive, Spark, Oozie)

Experience deploying and operating Cloudera workloads on AWS (EC2, IAM, CloudWatch)

Strong proficiency in Scala, Java, and HiveQL

Python or Bash scripting experience preferred

Strong proficiency in Apache Spark and Scala programming for data processing and transformation

Hands-on experience with Cloudera distribution of Hadoop

Hands-on experience implementing business-rules processing using Drools

Ability to work with infrastructure, DevOps, and data governance teams in a multi-disciplinary environment

Job Details

Cloud DevOps Administrator Expert

Important Details

Position Type: Remote
Client:

Position Overview

The Cloudera Data Engineer will play a key role in supporting the migration of a Medicaid Data Warehouse within an AWS cloud environment.
This role focuses on ensuring the seamless migration and continued operation of an existing Cloudera/Hive/Scala-based data pipeline from one AWS account to another while maintaining data integrity, system performance, and operational stability.

The selected consultant will collaborate closely with the AWS infrastructure team (VPC, IAM, S3, EC2, and networking) to replicate, configure, and optimize the Cloudera ecosystem for the new environment.

Key Responsibilities

Migration & Configuration

Replicate and configure existing Cloudera clusters (HDFS, YARN, Hive, Spark) within the new AWS account.
Coordinate with the infrastructure team to ensure proper provisioning of EC2, IAM roles, security groups, and networking.
Reconfigure cluster connectivity and job dependencies for the migrated environment.
Migrate and validate metadata stores, including Hive Metastore, job configurations, and dependencies.
Validate data integrity and ensure job outputs match the source environment.

Post-Migration Operations

Deploy, test, and operate existing Hive and Spark (Scala) jobs post-migration.
Maintain and manage job schedules, dependencies, and runtime configurations.
Monitor job performance and identify optimization opportunities.
Troubleshoot pipeline or cluster issues, implementing automated recovery and alert mechanisms.

Cluster Management

Monitor Cloudera Manager dashboards, ensuring cluster health and efficient resource utilization.
Manage user roles, permissions, and access within the Cloudera ecosystem.
Implement data cleanup, archiving, and housekeeping tasks to maintain system efficiency.
Create and maintain detailed migration documentation and operational runbooks.

Required Skills & Experience

Education:
• Bachelor’s degree in Computer Science, Information Systems, or a related discipline

Experience:
• 7+ years of experience in data engineering or big data development
• 4+ years of experience working with the Cloudera platform (HDFS, YARN, Hive, Spark, Oozie)
• Proven experience deploying and operating Cloudera workloads on AWS (EC2, S3, IAM, CloudWatch)
• Strong proficiency in Scala, Java, and HiveQL; scripting skills in Python or Bash preferred
• Advanced skills in Apache Spark & Scala programming for data processing and transformation
• Hands-on experience with Cloudera Hadoop distributions and Drools-based business rules processing
• Ability to collaborate effectively with infrastructure, DevOps, and data governance teams in complex enterprise environments

Preferred Qualifications

Cloudera Certification: CDP Data Engineer or Cloudera Administrator
Experience performing Cloudera version upgrades or AWS-to-AWS migrations
Experience in public sector or large enterprise data warehouse environments

Ideal Candidate Profile

The ideal candidate is a technically strong, detail-oriented data engineer with proven expertise in Cloudera, Hadoop, and AWS-based ecosystems.
They should have a track record of managing complex data migrations, optimizing Spark/Scala workflows, and ensuring data consistency and operational reliability in production environments.
