Hadoop Engineer

Machine learning, ML, MLLIB, Big Data Hadoop, Java, Scala, Python, SQL, NoSQL, Sqoop, Hive, Pig, Solr, MR, Spark, Spark SQL
Contract W2
Depends on Experience
Travel not required

Job Description

Pleasanton, CA 94588

12-month contract



The tasks for the Hadoop Engineer include, but are not limited to, the following:

  • Provide vision, gather client requirements, and translate them into a technical architecture.
  • Design and implement an integrated Big Data platform and analytics solution.
  • Design and implement data collectors to collect and transport data to the Big Data platform.
  • Implement monitoring solutions that track the health of the Big Data platform infrastructure.



  • Project experience with Query Processing Language (QPL), a search-engine-independent technology for advanced query processing, is highly desirable.
  • 4+ years of hands-on development, deployment, and production support experience in a Big Data environment.
  • 4-5 years of programming experience in Java, Scala, Python.
  • Proficient in SQL, relational database design, and methods for data retrieval.
  • Knowledge of NoSQL systems such as HBase or Cassandra.
  • Hands-on experience with Cloudera Distribution 6.x.
  • Hands-on experience creating and indexing Solr collections in a SolrCloud environment.
  • Hands-on experience building data pipelines using the Hadoop components Sqoop, Hive, Solr, MR, Impala, Spark, and Spark SQL.
  • Must have experience developing HiveQL and UDFs for analyzing semi-structured and structured datasets.
  • Must have experience with Spring framework, Web Services, and REST APIs.
  • Hands-on experience ingesting and processing file formats such as Avro, Parquet, SequenceFile, and text files.
  • Must have working experience with data warehousing and Business Intelligence systems.
  • Expertise in Unix/Linux environments, including writing scripts and scheduling and executing jobs.
  • Successful track record of building automation scripts and code using Java, Bash, Python, etc., and experience with the production support issue resolution process.
  • Experience building ML models using MLlib or other ML tools.
  • Hands-on experience with real-time analytics technologies such as Spark, Kafka, and Storm.
  • Experience with graph databases such as Neo4j, TigerGraph, and OrientDB.
  • Experience with Agile development methodologies.
Dice Id : objectwi
Position Id : MK-12111801
Originally Posted : 3 years ago