Data Warehousing / Data Integration (Big Data / Hadoop) Consultant

Overview

On Site
Depends on Experience
Accepts corp to corp applications
Contract - W2
Able to Provide Sponsorship

Skills

Access Control
Agile
Amazon Web Services
Apache Avro
Apache Hadoop
Apache Flume
Apache Hive
Apache Kafka
Apache Oozie
Apache Parquet
Apache Spark
Apache Sqoop
Apache ZooKeeper
BMC Control-M
Big Data
Cloud Computing
Cloudera
Cloudera Impala
Code Review
Conflict Resolution
Critical Thinking
DRP
Data Analysis
Data Dictionary
Data Governance
Data Integration
Data Masking
Data Migration
Data Profiling
Data Quality
Data Security
Attention To Detail
Data Warehouse
Database Architecture
Design Documentation
ELT
Extract, Transform, Load (ETL)
File Formats
HDFS
Hue
Impact Analysis
Java
Kerberos
Linux
Log Analysis
MapReduce
Mapping
Microsoft PowerPoint
Microsoft Visio
Mockups
OAuth
Open Source
Oracle
PL/SQL
POC
Performance Tuning
Effective Communication
PySpark
Python
SAML
SAS
SQL
Scheduling
Scripting
Regulatory Compliance
Shell
Shell Scripting
Snowflake Schema
Supervision
Systems Analysis/Design
Unit Testing
Technical Writing
Use Cases
Waterfall
Workflow
Technical Support
Unix

Job Details

One of our clients is looking to fill a Technical Specialist position requiring the following skills:

Location: Columbus, Ohio

On site, 5 days a week

Work Hours: M-F, 8:00 AM to 5:00 PM EST

Technical Specialist 4 (TS4) for a critical Enterprise Data Warehouse (EDW) maintenance and operations (M&O) effort: migrate production data (weekly, monthly, and yearly) from the current Big Data environment to the Snowflake environment, run ELT jobs, and check data quality from disparate data sources in the Snowflake environment to achieve the Client's strategic and long-term business goals.
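
As a rough illustration of the data-quality checking this migration involves, the sketch below compares post-load row counts between a Hive source table and its Snowflake target. The connection settings and table names are hypothetical placeholders, not client values.

    # Hedged sketch: post-load row-count reconciliation between the Hive source and the
    # Snowflake target. Connection settings and table names are placeholders only.
    from pyspark.sql import SparkSession
    from snowflake.snowpark import Session

    spark = SparkSession.builder.appName("edw-reconciliation").enableHiveSupport().getOrCreate()
    hive_count = spark.sql("SELECT COUNT(*) FROM edw_stage.claims_monthly").collect()[0][0]

    sf = Session.builder.configs({
        "account": "<account>", "user": "<user>", "password": "<password>",
        "warehouse": "EDW_WH", "database": "EDW", "schema": "STAGE",
    }).create()
    sf_count = sf.sql("SELECT COUNT(*) FROM CLAIMS_MONTHLY").collect()[0][0]

    status = "match" if hive_count == sf_count else "MISMATCH"
    print(f"Hive={hive_count} Snowflake={sf_count} -> {status}")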


Responsibilities:

  • Participate in team activities, design discussions, stand-up meetings, and planning reviews with the team.
  • Provide Snowflake database technical support in developing reliable, efficient, and scalable solutions for various projects on Snowflake.
  • Migrate the existing data, frameworks, and programs from the CLIENT EDW IOP Big Data environment to the CLIENT EDW Snowflake environment using best practices.
  • Design and develop Snowpark features in Python; understand the requirements and iterate.
  • Interface with the open-source community and contribute to Snowflake's open-source libraries, including Snowpark Python and the Snowflake Python Connector.
  • Create, monitor, and maintain role-based access controls, virtual warehouses, Tasks, Snowpipe, and Streams on Snowflake databases to support different use cases (a Streams-and-Tasks sketch follows this list).
  • Tune the performance of Snowflake queries and procedures; recommend and document Snowflake best practices.
  • Explore new Snowflake capabilities, perform POCs, and implement them based on business requirements.
  • Responsible for creating and maintaining the Snowflake technical documentation, ensuring compliance with data governance and security policies.
  • Implement Snowflake user/query log analysis, history capture, and user email alert configuration.
  • Enable data governance in Snowflake, including row- and column-level data security using secure views and dynamic data masking features (see the masking sketch after this list).
  • Perform data analysis, data profiling, data quality checks, and data ingestion in various layers using Big Data/Hadoop/Hive/Impala queries, PySpark programs, and UNIX shell scripts (a profiling sketch also follows this list).
  • Follow the organization's coding standards document; create mappings, sessions, and workflows as per the mapping specification document.
  • Perform gap and impact analysis of ETL and IOP jobs for new requirements and enhancements.
  • Create mockup data, perform unit testing, and capture result sets for jobs developed in lower environments.
  • Update the production support run book and Control-M schedule document as per each production release.
  • Create and update design documents; provide detailed descriptions of workflows after every production release.
  • Continuously monitor production data loads, fix issues, update the tracker document with those issues, and identify performance problems.
  • Tune long-running ETL/ELT jobs by creating partitions, enabling full loads, and applying other standard approaches.
  • Perform quality assurance checks and reconciliation after data loads, and communicate with vendors to receive corrected data.
  • Participate in ETL/ELT code reviews and design reusable frameworks.
  • Create change requests, work plans, test results, and BCAB checklist documents for code deployments to the production environment, and validate the code post-deployment.
  • Work with the Snowflake admin, Hadoop admin, ETL, and SAS admin teams on code deployments and health checks.
  • Create a reusable Audit Balance Control framework that captures reconciliation results, mapping parameters, and variables, and serves as a single point of reference for workflows.
  • Create Snowpark and PySpark programs to ingest historical and incremental data (see the ingestion sketch after this list).
  • Create Sqoop scripts to ingest historical data from the EDW Oracle database into Hadoop IOP, and create Hive table and Impala view creation scripts for dimension tables.
  • Participate in meetings to continuously upgrade functional and technical expertise.
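
The following is a minimal Snowpark sketch of the ingestion work described in the bullets above (historical plus incremental loads); the connection settings, stage path, and table names are hypothetical placeholders rather than client values.

    # Hedged sketch: load Parquet files exported from the Hadoop environment into Snowflake.
    # All connection settings, stage paths, and table names below are placeholders.
    from snowflake.snowpark import Session

    conn = {"account": "<account>", "user": "<user>", "password": "<password>",
            "warehouse": "EDW_WH", "database": "EDW", "schema": "CORE"}  # placeholder settings
    session = Session.builder.configs(conn).create()

    # Historical (one-time, full) load from a stage holding the Parquet exports.
    hist_df = session.read.parquet("@EDW_STAGE/claims/history/")
    hist_df.write.mode("overwrite").save_as_table("CLAIMS")

    # Incremental load: append only the files landed since the previous run.
    incr_df = session.read.parquet("@EDW_STAGE/claims/incremental/")
    incr_df.write.mode("append").save_as_table("CLAIMS")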
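
Next, a hedged sketch of the Streams-and-Tasks bullet: a stream on a raw table and a scheduled task that moves its new rows forward. Object names, columns, and the schedule are illustrative assumptions.

    # Hedged sketch: a stream on a raw table plus a scheduled task that copies its inserts.
    # Object names, column names, warehouse, and schedule are illustrative only.
    from snowflake.snowpark import Session

    conn = {"account": "<account>", "user": "<user>", "password": "<password>",
            "warehouse": "EDW_WH", "database": "EDW", "schema": "CORE"}  # placeholder settings
    session = Session.builder.configs(conn).create()

    # Capture new rows landing in the raw table.
    session.sql("CREATE OR REPLACE STREAM CLAIMS_RAW_STRM ON TABLE CLAIMS_RAW").collect()

    # Hourly task that moves new rows from the stream into the curated table.
    session.sql("""
        CREATE OR REPLACE TASK CLAIMS_MERGE_TASK
          WAREHOUSE = EDW_WH
          SCHEDULE  = 'USING CRON 0 * * * * UTC'
        AS
          INSERT INTO CLAIMS_CURATED (CLAIM_ID, MEMBER_ID, CLAIM_AMT, LOAD_DT)
          SELECT CLAIM_ID, MEMBER_ID, CLAIM_AMT, LOAD_DT
          FROM CLAIMS_RAW_STRM
          WHERE METADATA$ACTION = 'INSERT'
    """).collect()

    session.sql("ALTER TASK CLAIMS_MERGE_TASK RESUME").collect()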
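
The row/column-level security bullet could look roughly like the sketch below, which combines a dynamic masking policy with a secure view; the PHI_ANALYST role, MEMBER_DIM table, SSN column, and USER_COUNTY_ACCESS mapping table are all hypothetical.

    # Hedged sketch: column masking plus a secure view for row-level filtering.
    # PHI_ANALYST, MEMBER_DIM, SSN, and USER_COUNTY_ACCESS are hypothetical names.
    from snowflake.snowpark import Session

    conn = {"account": "<account>", "user": "<user>", "password": "<password>",
            "warehouse": "EDW_WH", "database": "EDW", "schema": "CORE"}  # placeholder settings
    session = Session.builder.configs(conn).create()

    # Mask SSNs for every role except an approved analyst role.
    session.sql("""
        CREATE OR REPLACE MASKING POLICY SSN_MASK AS (val STRING) RETURNS STRING ->
          CASE WHEN CURRENT_ROLE() IN ('PHI_ANALYST') THEN val ELSE 'XXX-XX-XXXX' END
    """).collect()
    session.sql("ALTER TABLE MEMBER_DIM MODIFY COLUMN SSN SET MASKING POLICY SSN_MASK").collect()

    # Row-level restriction through a secure view keyed to the querying user.
    session.sql("""
        CREATE OR REPLACE SECURE VIEW MEMBER_DIM_SV AS
          SELECT * FROM MEMBER_DIM
          WHERE COUNTY_CD IN (SELECT COUNTY_CD FROM USER_COUNTY_ACCESS
                              WHERE USER_NAME = CURRENT_USER())
    """).collect()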
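
Finally, a short PySpark profiling pass of the kind referenced in the data profiling bullet; the Hive table and column names are placeholders.

    # Hedged sketch: basic data-profiling metrics over a Hive table with PySpark.
    # Table and column names are placeholders for illustration only.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("edw-profiling").enableHiveSupport().getOrCreate()
    df = spark.table("edw_stage.claims_monthly")

    profile = df.agg(
        F.count(F.lit(1)).alias("row_count"),
        F.countDistinct("claim_id").alias("distinct_claim_ids"),
        F.sum(F.col("member_id").isNull().cast("int")).alias("null_member_ids"),
    )
    profile.show(truncate=False)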

REQUIRED Skill Sets:

  • Proficiency in Data Warehousing, Data migration, and Snowflake is essential for this role.
  • Strong experience in the implementation, execution, and maintenance of data integration technology solutions.
  • Minimum (4-6) years of hands-on experience with Cloud databases.
  • Minimum (2-3) years of hands-on data migration experience from the Big data environment to Snowflake environment.
  • Minimum (2-3) years of hands-on experience with the Snowflake platform along with Snowpipe and Snowpark.
  • Strong experience with SnowSQL and PL/SQL, and expertise in writing Snowflake procedures using SQL, Python, or Java (see the stored-procedure sketch after this list).
  • Experience with optimizing Snowflake database performance and real-time monitoring.
  • Strong database architecture, critical thinking, and problem-solving abilities.
  • Experience with AWS platform services.
  • Snowflake certification is highly desirable.
  • Snowpark with Python is the preferred approach for building data pipelines.
  • 8+ years of experience with Big Data, Hadoop on Data Warehousing or Data Integration projects.
  • Analysis, design, development, support, and enhancement of ETL/ELT in a data warehouse environment with Cloudera Big Data technologies (minimum of 8-9 years of experience in Hadoop, MapReduce, Sqoop, PySpark, Spark, HDFS, Hive, Impala, StreamSets, Kudu, Oozie, Hue, Kafka, YARN, Python, Flume, ZooKeeper, Sentry, and Cloudera Navigator), along with Oracle SQL/PL-SQL, Unix commands, and shell scripting.
  • Strong development experience (minimum of 8-9 years) creating Sqoop scripts, PySpark programs, HDFS commands, HDFS file formats (Parquet, Avro, ORC, etc.), StreamSets pipelines, job schedules, Hive/Impala queries, Unix commands, and shell scripts.
  • Experience (minimum of 8-9 years) writing Hadoop/Hive/Impala scripts to gather statistics on tables after data loads.
  • Strong SQL experience with both Oracle and Hadoop (Hive/Impala, etc.).
  • Experience writing complex SQL queries and tuning them based on Hadoop/Hive/Impala explain-plan results.
  • Experience building data sets and familiarity with PHI and PII data.
  • Expertise implementing complex ETL/ELT logic.
  • Accountable for ETL/ELT design documentation.
  • Basic knowledge of UNIX/LINUX shell scripting.
  • Utilize ETL/ELT standards and practices to establish and follow a centralized metadata repository.
  • Good experience in working with Visio, Excel, PowerPoint, Word, etc.
  • Effective communication, presentation and organizational skills.
  • Familiar with Project Management methodologies like Waterfall and Agile.
  • Ability to establish priorities and follow through on projects, paying close attention to detail with minimal supervision.
  • Required Education: BS/BA degree or combination of education & experience.
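
As a small illustration of the Snowflake procedure requirement above, the sketch below registers a Snowpark Python stored procedure; the purge logic, table, LOAD_DT column, and @EDW_STAGE stage are assumptions made only for the example.

    # Hedged sketch: register a permanent Snowpark Python stored procedure.
    # The purge rule, table, LOAD_DT column, and @EDW_STAGE location are hypothetical.
    from snowflake.snowpark import Session

    conn = {"account": "<account>", "user": "<user>", "password": "<password>",
            "warehouse": "EDW_WH", "database": "EDW", "schema": "CORE"}  # placeholder settings
    session = Session.builder.configs(conn).create()

    def purge_stale_rows(sess: Session, table_name: str, keep_days: int) -> str:
        """Delete rows older than keep_days from the given table."""
        sess.sql(
            f"DELETE FROM {table_name} "
            f"WHERE LOAD_DT < DATEADD(day, -{keep_days}, CURRENT_DATE())"
        ).collect()
        return f"purged rows older than {keep_days} days from {table_name}"

    session.sproc.register(
        func=purge_stale_rows,
        name="PURGE_STALE_ROWS",
        packages=["snowflake-snowpark-python"],
        is_permanent=True,
        stage_location="@EDW_STAGE/sprocs",  # hypothetical stage for the packaged code
        replace=True,
    )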

DESIRED Skill Sets:

  • In addition to overall Snowflake experience, the candidate should have development experience with both Snowpipe and Snowpark.
  • Experience with data migration from a Big Data environment to a Snowflake environment.
  • Strong understanding of Snowflake capabilities such as Snowpipe, Streams, and Tasks.
  • Knowledge of security (SAML, SCIM, OAuth, OpenID, Kerberos, policies, entitlements, etc.).
  • Experience with system disaster recovery planning (DRP) for Snowflake systems.
  • Demonstrate effective leadership, analytical and problem-solving skills.
  • Excellent written and oral communication skills with technical and business teams.
  • Ability to work independently, as well as part of a team.
  • Stay abreast of current technologies in the assigned area of IT.
  • Establish facts and draw valid conclusions.
  • Recognize patterns and opportunities for improvement throughout the entire organization.
  • Ability to discern critical from minor problems and innovate new solutions.

The successful candidate may have to undergo a drug test and background check.


Sincerely,

Tamana Nair

Digitek Software, Inc.

650 Radio Drive, Lewis Center, OH 43035

Tel No ext. 3105 / Fax

Email

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.