Profile: Hadoop Stack Developer and Administrator
“Transforming large, unruly data sets into competitive advantages”
Purveyor of competitive intelligence and holistic, timely analyses of Big Data, made possible by the successful installation, configuration and administration of Hadoop ecosystem components and architecture.
- Two years’ experience installing, configuring and testing Hadoop ecosystem components.
- Capable of processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
- Able to assess business rules, collaborate with stakeholders and perform source-to-target data mapping, design and review.
- Familiar with data architecture including data ingestion pipeline design, Hadoop information architecture, data modeling and data mining, machine learning and advanced data processing. Experience optimizing ETL workflows.
- Hortonworks Certified Hadoop Developer, Cloudera Certified Hadoop Developer and Certified Hadoop Administrator.
- Big Data Ecosystems: Hadoop, MapReduce, HDFS, HBase, Zookeeper, Hive, Pig, Sqoop, Cassandra, Oozie, Flume, Chukwa, Pentaho Kettle and Talend
- Programming Languages: Java, C/C++, eVB, Assembly Language (8085/8086)
- Databases: Oracle; NoSQL stores (HBase, Cassandra)
- UNIX Tools: Apache, Yum, RPM
- Tools: Eclipse, JDeveloper, JProbe, CVS, Ant, MS Visual Studio
- Platforms: Windows (2000/XP), Linux, Solaris, AIX, HP-UX
- Application Servers: Apache Tomcat 5.x/6.0, JBoss 4.0
- IDEs: NetBeans, Eclipse, WSAD, RAD
- Methodologies: Agile, UML, Design Patterns
Professional Experience
Hadoop Developer, Investor Online Network, Englewood Cliffs, New Jersey (2013 to present)
Facilitated insightful daily analyses of 60 to 80 GB of website data collected by external sources, spawning recommendations and tips that increased traffic 38% and advertising revenue 16% for this online provider of financial market intelligence.
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
- Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
- Managed and reviewed Hadoop log files.
- Tested raw data and executed performance scripts.
- Shared responsibility for administration of Hadoop, Hive and Pig.
- Installed and configured MapReduce, Hive and HDFS; implemented a CDH3 Hadoop cluster on CentOS. Assisted with performance tuning and monitoring.
- Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
- Supported code/design analysis, strategy development and project planning.
- Created reports for the BI team, using Sqoop to import data into HDFS and Hive.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Assisted with data capacity planning and node forecasting.
- Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
- Administrator for Pig, Hive and HBase, installing updates, patches and upgrades.
- Led the migration of monthly statements from a UNIX platform to an MVC Web-based Windows application using Java, JSP and Struts.
- Prepared use cases, designed and developed object models and class diagrams.
- Developed SQL statements to improve back-end communications.
- Incorporated a custom logging mechanism for tracing errors, resolving all issues and bugs before deploying the application on the WebSphere server.
- Received praise from users, shareholders and analysts for developing a highly interactive and intuitive UI using JSP, AJAX, JSF and jQuery techniques.
- View samples at www.myportfolio.com/aburke
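The Java MapReduce data-cleaning work described above can be illustrated with a small sketch. This is hypothetical, not the production code: it shows only the kind of map-side record cleaning such a job applies (dropping blank, truncated or non-numeric records) as a plain Java method, which would normally run inside a Hadoop `Mapper.map()` call; the tab-delimited field layout and class name are assumptions.

```java
import java.util.Optional;

// Hypothetical sketch of map-side cleaning for raw, tab-delimited web-log
// records, of the kind a data-cleaning MapReduce job applies before
// populating staging tables. Field positions are illustrative only.
public class LogLineCleaner {

    /** Parses one raw line into "page \t bytes"; empty if the record is malformed. */
    public static Optional<String> clean(String rawLine) {
        if (rawLine == null || rawLine.trim().isEmpty()) {
            return Optional.empty();          // drop blank lines
        }
        String[] fields = rawLine.split("\t");
        if (fields.length < 3) {
            return Optional.empty();          // drop truncated records
        }
        String page = fields[1].trim().toLowerCase();
        long bytes;
        try {
            bytes = Long.parseLong(fields[2].trim());
        } catch (NumberFormatException e) {
            return Optional.empty();          // drop non-numeric byte counts
        }
        return Optional.of(page + "\t" + bytes);
    }

    public static void main(String[] args) {
        // A well-formed record is normalized; a malformed one is dropped.
        System.out.println(clean("2013-06-01\t/Markets\t512").orElse("DROPPED"));
        System.out.println(clean("garbage-line").orElse("DROPPED"));
    }
}
```

In an actual job, the mapper would emit the cleaned value with a key suited to the downstream partitioned tables, and malformed records could be counted with a Hadoop counter instead of being silently discarded.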
Education, Training and Professional Development
New Jersey Institute of Technology, BS Computer Science
Hadoop Training
- Accelebrate: “Hadoop Administration Training”
- Cloudera University Courses: “Hadoop Essentials” and “Hadoop Fundamentals I & II”
- MapReduce Courses: “Introduction to Apache MapReduce and HDFS,” “Writing MapReduce Applications” and “Intro to Cluster Administration”
- Nitesh Jain: “Become a Certified Hadoop Developer”
Member, Hadoop Users Group of New Jersey