Hadoop Developer / Hadoop Architect at Charlotte NC / Pennington NJ / Dallas TX || Full Time

Hadoop, ETL
Full Time, 4.6 years

Job Description

Hadoop Developer

Charlotte NC / Pennington NJ / Dallas TX

Full Time

Minimum 7 years of experience.

Extensive knowledge of ETL and Teradata.

Good exposure to the Hadoop ecosystem.

Minimum 1 year of hands-on experience in PySpark.

Job scheduling tools (e.g. Autosys) and version control tools such as Git.

Unix shell scripting.

Basic knowledge of Mainframe; should be able to navigate through the jobs and code.

Quick learner and self-starter who requires minimal supervision to excel in a dynamic environment.

Strong verbal and written communication skills.

Prior experience working with globally distributed teams.

Agile-driven development.

Hadoop Architect

Charlotte NC


Hadoop, Spark, SparkML, IBM Spectrum


Minimum 7 years of experience.

Be responsible for the design and support of the API(s) between IBM Spectrum Conductor (as a distributed compute and storage platform) and the Bank of America application that exposes a data scientist user experience and model governance.

Capture cluster tenant compute and storage functional and nonfunctional requirements, and translate them into distributed cluster capacity, configuration, and user provisioning settings.

Develop, test, and analyze code/scripts written in PySpark, Python, Java, and other shell scripts, to provide specified behavior on a distributed IBM Spectrum Conductor cluster.

Provide "how-to" technical support for tenant developers developing runtimes and persisting data on a distributed IBM Spectrum Conductor cluster.

Be an active member of the Agile scrum, and be a part of the features that emerge from the team.

Perform peer code and test case reviews, and help foster a healthy technical community by helping peers.


Experience with Agile/Scrum methodology is essential.

Experience with either Apache Spark-On-YARN (Hadoop) or Apache Spark-On-EGO (IBM Spectrum Conductor) is essential.

Experience with the Apache Spark libraries PySpark, Spark SQL, Spark Streaming, and MLlib is essential.

Experience with either Hadoop/YARN or IBM Spectrum Conductor/EGO cluster resource manager is essential.

Experience with the Red Hat Enterprise Linux (RHEL) command line and shell scripting is essential.

Experience with the file formats CSV, JSON, ORC, Avro, Parquet, and Protocol Buffers is essential.

Experience with Python, Java, and R is highly desirable.

Experience with NumPy and pandas is highly desirable.

Experience with designing and configuring distributed architectures is desirable.

Knowledge of CI/CD SDLC practices.

Knowledge of scikit-learn, PyTorch, Keras, H2O.ai.

Strong communication skills; should be able to communicate effectively with business and other stakeholders.

Demonstrate ownership and initiative.

Awanish Shekhar

Executive Sales

USA Okaya Inc.

4949 Expy Dr N, Suite 101, Ronkonkoma, NY 11779

Cell No

Landline:extn 641

Email Awanish.shekhar@okayainc.net || www.okayainc.com


Dice Id : 10267824
Position Id : 2021-75026
Originally Posted : 1 month ago