Hadoop Platform Engineer

  • Dallas, TX
  • Posted 60+ days ago | Updated 22 days ago

Overview

On Site
Depends on Experience
Full Time
Accepts corp to corp applications
Able to Provide Sponsorship

Skills

Hadoop
Python

Job Details

Hi,

I am Mankar from Photon, and I have a position with our direct client. Please send me your updated resume if you are interested, and you can connect with me through

Job Title: Hadoop Platform Engineer

Location: Dallas, TX - onsite

Job Description:

Required Skills:

  • Expertise in designing, implementing, and maintaining large-scale Hadoop clusters, including components such as HDFS, YARN, and MapReduce.
  • Collaborate with data engineers and data scientists to understand data requirements and optimize data pipelines.
  • Experience in administering and monitoring Hadoop clusters to ensure high availability, reliability, and performance.
  • Experience in troubleshooting and resolving issues related to Hadoop infrastructure, data ingestion, data processing, and data storage.
  • Experience in implementing and managing security measures within Hadoop clusters, including authentication, authorization, and encryption.
  • Collaborate with cross-functional teams to define and implement backup and disaster recovery strategies for Hadoop clusters.
  • Experience in optimizing Hadoop performance through fine-tuning configurations, capacity planning, and implementing performance monitoring and tuning techniques.
  • Work with DevOps teams to automate Hadoop infrastructure provisioning, deployment, and management processes.
  • Stay up to date with the latest developments in the Hadoop ecosystem.
  • Recommend and implement new technologies and tools that enhance the platform.
  • Experience in documenting Hadoop infrastructure configurations, processes, and best practices.
  • Provide technical guidance and support to other team members and stakeholders.

Skills:

Technical Proficiency:

Experience with Hadoop and Big Data technologies, including Cloudera CDH/CDP, Databricks, HDInsight, etc.

Strong understanding of core Hadoop services such as HDFS, MapReduce, Kafka, Spark, Hive, Impala, HBase, Kudu, Sqoop, and Oozie.

Proficiency in RHEL Linux operating systems, databases, and hardware administration.

Operations and Design:

Operations, design, capacity planning, cluster setup, security, and performance tuning in large-scale Enterprise Hadoop environments.

Scripting and Automation:

Proficient in shell scripting (e.g., Bash, KSH) for automation.
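
As an illustration of the kind of automation this calls for, below is a minimal sketch in Python (listed under Skills above; an equivalent Bash script would serve the same purpose). It shells out to `hdfs dfsadmin` to check safe-mode status and count live DataNodes. The `hdfs` CLI on PATH, an already-authenticated cluster user, and the exact report wording are assumptions that vary by environment and Hadoop version.

```python
#!/usr/bin/env python3
"""Toy cluster health check: safe-mode status and live DataNode count.

Assumes the `hdfs` CLI is on PATH and the caller already holds a valid
Kerberos ticket (if the cluster is secured). The output format of
`hdfs dfsadmin -report` varies slightly between Hadoop versions.
"""
import re
import subprocess
import sys

def run(cmd):
    # Run an HDFS admin command and return its stdout, failing loudly.
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        sys.exit(f"{' '.join(cmd)} failed: {result.stderr.strip()}")
    return result.stdout

safemode = run(["hdfs", "dfsadmin", "-safemode", "get"])
print(safemode.strip())                      # e.g. "Safe mode is OFF"

report = run(["hdfs", "dfsadmin", "-report"])
match = re.search(r"Live datanodes \((\d+)\)", report)
live = int(match.group(1)) if match else 0
print(f"Live DataNodes: {live}")

if "Safe mode is ON" in safemode or live == 0:
    sys.exit("Cluster looks unhealthy")
```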

Security Implementation:

Experience in setting up, configuring, and managing security for Hadoop clusters using Kerberos with integration with LDAP/AD.
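
As a hedged example of the Kerberos plumbing this implies, the sketch below obtains a ticket from a service keytab and then lists HDFS as that principal, a common smoke test after wiring a cluster to Kerberos with LDAP/AD. The keytab path and principal are illustrative placeholders, not values from this posting.

```python
#!/usr/bin/env python3
"""Smoke-test Kerberos authentication for a Hadoop service account.

The keytab path and principal below are illustrative placeholders;
substitute real values for your realm. Requires the MIT Kerberos
client tools (kinit/klist) and the hdfs CLI on PATH.
"""
import subprocess

KEYTAB = "/etc/security/keytabs/hdfs-admin.keytab"   # placeholder path
PRINCIPAL = "hdfs-admin@EXAMPLE.COM"                  # placeholder principal

# Obtain a ticket-granting ticket non-interactively from the keytab.
subprocess.run(["kinit", "-kt", KEYTAB, PRINCIPAL], check=True)

# Show the cached ticket, then exercise HDFS as the authenticated user.
subprocess.run(["klist"], check=True)
subprocess.run(["hdfs", "dfs", "-ls", "/"], check=True)
```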

Problem Solving and Troubleshooting:

Expertise in system administration and programming skills for storage capacity management, debugging, and performance tuning.

Collaboration and Communication:

Collaborate with cross-functional teams, including data engineers, data scientists, and DevOps teams.

Provide technical guidance and support to team members and stakeholders.

Additional Skills:

On-prem instance

Hadoop configuration and performance tuning

Ability to manage very large clusters and understand scalability

Interfacing with multiple teams

Many teams have self-service capabilities, so experience managing self-service usage across multiple teams and large clusters is required.

Hands-on experience with, and a strong understanding of, Hadoop architecture

Experience with Hadoop ecosystem components (HDFS, YARN, MapReduce), cluster management tools such as Ambari or Cloudera Manager, and related technologies.

Proficiency in scripting, Linux system administration, networking, and troubleshooting
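
To make the troubleshooting angle concrete, here is a small sketch that polls the YARN ResourceManager REST API (`/ws/v1/cluster/metrics`) for unhealthy or lost NodeManagers, the sort of first-pass check an engineer in this role might script. The ResourceManager host/port and the absence of SPNEGO authentication are assumptions; a Kerberized cluster would need requests-kerberos or an equivalent.

```python
#!/usr/bin/env python3
"""First-pass YARN health check against the ResourceManager REST API.

Assumes the RM web UI is reachable at the URL below without SPNEGO;
on a Kerberized cluster you would add requests-kerberos (or curl with
--negotiate). Field names come from the standard clusterMetrics API.
"""
import requests

RM_URL = "http://resourcemanager.example.com:8088"   # placeholder host:port

metrics = requests.get(f"{RM_URL}/ws/v1/cluster/metrics", timeout=10).json()
m = metrics["clusterMetrics"]

print(f"Active nodes:    {m['activeNodes']}")
print(f"Unhealthy nodes: {m['unhealthyNodes']}")
print(f"Lost nodes:      {m['lostNodes']}")
print(f"Apps pending:    {m['appsPending']}")

# Flag the obvious red flags a platform engineer would chase down first.
if m["unhealthyNodes"] or m["lostNodes"]:
    print("WARNING: some NodeManagers need attention")
```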

Qualifications:

  • Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent work experience).
  • Strong experience in designing, implementing, and administering Hadoop clusters in a production environment.
  • Proficiency in Hadoop ecosystem components such as HDFS, YARN, MapReduce, Hive, Spark, and HBase.
  • Experience with cluster management tools like Apache Ambari or Cloudera Manager.
  • Solid understanding of Linux/Unix systems and networking concepts.
  • Strong scripting skills (e.g., Bash, Python) for automation and troubleshooting.
  • Knowledge of database concepts and SQL.
  • Experience with data ingestion tools like Apache Kafka or Apache NiFi (see the producer sketch after this list).
  • Familiarity with data warehouse concepts and technologies.
  • Understanding of security principles and experience implementing security measures in Hadoop clusters.
  • Strong problem-solving and troubleshooting skills, with the ability to analyze and resolve complex issues.
  • Excellent communication and collaboration skills to work effectively with cross-functional teams.
  • Relevant certifications such as Cloudera Certified Administrator for Apache Hadoop (CCAH) or Hortonworks Certified Administrator (HCA) are a plus.
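
Since data ingestion with Kafka appears in the qualifications above, the minimal sketch below shows a producer publishing one JSON record with the kafka-python client. The broker address and topic name are placeholders; a production pipeline would add batching, retries, schema management, and (on secured clusters) SASL/Kerberos configuration.

```python
#!/usr/bin/env python3
"""Minimal Kafka ingestion example using the kafka-python client.

Broker address and topic are placeholders; substitute real values
for your cluster before running.
"""
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="broker.example.com:9092",       # placeholder broker
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Publish one event to an ingestion topic and wait for the broker ack.
future = producer.send("ingest.events", {"source": "demo", "value": 42})
record_metadata = future.get(timeout=10)
print(f"Wrote to {record_metadata.topic}, partition {record_metadata.partition}")
producer.flush()
```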