Job Details
Job: Hadoop Platform Engineer
Duration: 6 Months
Location: Minnesota 55445 (Remote)
Interview: Video
Note: The client is not looking for a data engineer; the candidate must match the exact title specified in the job description.
Shift Information: M-F, 8:00 AM - 5:00 PM
Must Have:
- Apache Hadoop
- Apache Hive
- Apache Spark
- Hadoop Distributed File System (HDFS)
- Java
- Linux
The role is specifically a Hadoop Platform Engineer. The engineer will have deep, hands-on experience across the Apache Hadoop ecosystem, focused on building and maintaining a reliable, scalable data platform, with strong core Hadoop and Java expertise enabling them to diagnose, optimize, and tune performance at the cluster and infrastructure levels. This role will also drive platform observability improvements, including standardizing monitoring, implementing health checks, and developing automated alerting, all to proactively identify and resolve issues before they impact users.
Description:
- Design, build, and maintain a reliable, scalable, and high-performance Hadoop platform that supports large-scale data processing and analytics workloads.
- Diagnose and optimize cluster-level performance issues across core Apache Hadoop components (HDFS, YARN, MapReduce, Hive, Spark, HBase) using deep Hadoop and Java expertise.
- Develop and standardize monitoring and observability frameworks for the Hadoop ecosystem, ensuring proactive detection of system health issues.
- Leverage strong Linux expertise (especially Ubuntu) to analyze system bottlenecks, perform kernel tuning, and optimize resource allocation for Hadoop workloads.
- Implement automated health checks and alerting systems to reduce operational noise and minimize reliance on user-reported issues.
- Collaborate with data engineering, platform, and infrastructure teams to tune resource utilization, improve job efficiency, and ensure cluster stability.
- Establish and enforce operational standards for performance tuning, capacity planning, and version upgrades across the Hadoop platform.
- Automate repetitive operational tasks and improve workflow efficiency using scripting languages (Python, Bash, or Scala).
- Maintain and enhance data security, governance, and compliance practices within the Hadoop ecosystem.
- Drive root cause analysis (RCA) and develop preventive measures for recurring production incidents.
- Document best practices, operational runbooks, and configuration standards to ensure knowledge sharing and consistent platform management.
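As an illustrative sketch only (metric names and thresholds below are hypothetical, not taken from the client's environment), the automated health-check and alerting responsibility above might start as a simple threshold evaluator; a real deployment would feed it metrics scraped from the NameNode/ResourceManager JMX endpoints:

```python
# Hypothetical threshold-based health check for a Hadoop cluster.
# Metric names and limits are illustrative placeholders.

def evaluate_health(metrics, thresholds):
    """Return alert strings for any metric that breaches its threshold.

    thresholds maps metric name -> (limit, direction), where direction
    is "max" (alert when value exceeds limit) or "min" (alert when
    value falls below limit). Missing metrics are flagged too, so a
    silent collector failure still surfaces as an alert.
    """
    alerts = []
    for name, (limit, direction) in thresholds.items():
        value = metrics.get(name)
        if value is None:
            alerts.append(f"MISSING metric: {name}")
        elif direction == "max" and value > limit:
            alerts.append(f"ALERT {name}={value} exceeds {limit}")
        elif direction == "min" and value < limit:
            alerts.append(f"ALERT {name}={value} below {limit}")
    return alerts

# Example policy: at most 85% DFS used, at least 3 live DataNodes.
THRESHOLDS = {
    "dfs_used_pct": (85.0, "max"),
    "live_datanodes": (3, "min"),
}

if __name__ == "__main__":
    sample = {"dfs_used_pct": 91.2, "live_datanodes": 4}
    for line in evaluate_health(sample, THRESHOLDS):
        print(line)
```

In practice the alert strings would be routed to a pager or chat integration rather than printed, and the thresholds kept in version-controlled configuration so tuning them is auditable.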