Cloudera Hadoop Developer
Candidates will assist with data architecture, migration, and development of new process flows on the Cloudera stack. Candidates must have a strong background in Netezza architecture.
- Hands-on technical expertise in building extract and service layers from the Hadoop/Impala data ecosystem for integration with third-party analytics solutions
- Help design solutions that enable other data scientists, and help stand up and configure the Cloudera platform
- Knowledge of the Hadoop ecosystem and of how R or SAS work hand in hand with the platform
- Experience in one or more of the following: Python, Mathematica, MATLAB, Octave, C++, Java
- Experience with Netezza architecture and migrations to Hadoop stack is a must
- Develop and deploy analytics in a heterogeneous data environment
- Understanding of statistical modeling and machine learning techniques
- Proven data manipulation skills (SQL applied to Oracle and DB2 databases, scripting languages, Big Data-related technologies)
- Strong ability to formulate business problems mathematically, and strong problem-solving skills
- Superior oral and written communication skills, especially the ability to communicate technical issues to a non-technical audience
- Critical thinking and intellectual curiosity
- Superior learning abilities
- The drive to deliver on commitments and an openness to new ideas
- Enjoy working in a team-based environment
- Experience with Hadoop Cloudera stack
- Experience with Netezza
- Experience with Electronic Medical Records (EMR) / Clinical systems
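Much of the Hadoop experience listed above centers on the MapReduce programming model. As a purely illustrative, framework-free sketch (not taken from any specific codebase in this posting), the map/shuffle/reduce stages of a word count look like this:

```python
# Minimal MapReduce-style word count, for illustration only.
# In Hadoop, the framework distributes these stages across a cluster;
# here they run locally to show the map -> shuffle -> reduce contract.
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the grouped values for each key."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data on hadoop", "hadoop scales big data"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["hadoop"])  # 2
```

Hadoop's mapper and reducer interfaces follow the same key/value contract shown here, with the shuffle handled by the framework.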
Cloudera Hadoop Developer Consultants are required to have:
- 4+ years of relevant technology architecture consulting or industry experience, including experience in big data platforms, information delivery, analytics, and business intelligence based on data from Cloudera Hadoop
- At least 4 years of hands-on working experience with the following technologies: Hadoop, Mahout, Pig, Hive, HBase, Sqoop, ZooKeeper, Ambari, MapReduce
- Proven track record of architecting, designing, developing, implementing and maintaining large scale Cloud Data Service technologies and processes
- AWS Certified SysOps Administrator, DevOps Engineer, or Solutions Architect
- Extensive experience with Amazon Web Services (AWS), having managed services and applications in a large AWS cross-account environment.
- Experience with columnar and MPP databases such as Amazon Redshift, or similar technologies
- Experience using scheduling tools, preferably AutoSys
- Understanding of the benefits of big data ecosystem, data warehousing, data architecture, data quality processes, data warehousing design and implementation, table structure, fact and dimension tables, logical and physical database design, data modeling, reporting process metadata, and ETL processes
- Experience designing and developing data cleansing routines utilizing typical data quality functions involving standardization, transformation, rationalization, linking and matching
- Experience with Data Integration on traditional and Hadoop environments
- Experience working with multi-Terabyte data sets
- Experience working with commercial distributions of HDFS, preferably Cloudera
- Experience with Hadoop Cluster Administration and performance tuning
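The data-cleansing responsibilities above call for standardization, linking, and matching routines. As a hedged illustration only (the function names and normalization rules below are hypothetical, not from any specific toolset named in this posting), such routines might look like:

```python
# Illustrative sketch of simple data-quality functions: standardization
# plus record linking/matching. Rules are hypothetical examples.
import re

def standardize_phone(raw: str) -> str:
    """Normalize a US phone number to its last 10 digits."""
    digits = re.sub(r"\D", "", raw)
    return digits[-10:] if len(digits) >= 10 else digits

def standardize_name(raw: str) -> str:
    """Collapse whitespace and case so equivalent names compare equal."""
    return " ".join(raw.strip().lower().split())

def records_match(a: dict, b: dict) -> bool:
    """Link two records when standardized name and phone both agree."""
    return (standardize_name(a["name"]) == standardize_name(b["name"])
            and standardize_phone(a["phone"]) == standardize_phone(b["phone"]))

rec1 = {"name": "  Jane   DOE ", "phone": "(212) 555-0147"}
rec2 = {"name": "jane doe", "phone": "1-212-555-0147"}
print(records_match(rec1, rec2))  # True
```

In a production pipeline these rules would typically run as transformations in the ETL layer, with matched records consolidated before loading into fact and dimension tables.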
The Company is an equal opportunity employer and makes employment decisions on the basis of merit and business needs. The Company will consider all qualified applicants for employment without regard to race, color, religious creed, citizenship, national origin, ancestry, age, sex, sexual orientation, genetic information, physical or mental disability, veteran or marital status, or any other class protected by law. To comply with applicable laws ensuring equal employment opportunities to qualified individuals with a disability, the Company will make reasonable accommodations for the known physical or mental limitations of an otherwise qualified individual with a disability who is an applicant or an employee unless undue hardship to the Company would result.