Our client is currently seeking a Data/Cloud Engineer (No C2C)
This job will have the following responsibilities:
* Develop data-driven solutions using current and next-generation technologies to meet evolving business needs.
* Work heavily within the AWS ecosystem and its managed services.
* Operationalize open-source data-analytics tools for enterprise use.
* Develop real-time data ingestion and stream-analytics solutions leveraging technologies such as Kafka, Apache Spark, NiFi, Python, Kinesis, and Hadoop/EMR.
* Develop custom data pipelines (cloud and locally hosted).
* Work heavily within the Hadoop ecosystem.
* Support deployed data applications and analytical models as a trusted advisor to Data Scientists and other data consumers, identifying data problems and guiding issue resolution with partner Data Engineers and source data providers.
* Provide subject matter expertise in the analysis and preparation of specifications and plans for the development of data processes.
* Ensure proper data governance policies are followed by implementing or validating data lineage, quality checks, classification, etc.
* Use multiple development languages/tools such as Python, Spark (in Scala), and Java to build prototypes and evaluate their effectiveness and feasibility.
* Quickly identify opportunities and recommend possible technical solutions.
Qualifications & Requirements:
* AWS data services (Lambda, Glue, EMR, Kinesis, Step Functions, Data Pipeline)
* Data pipeline development
* Experience developing Python/R applications
* Spark application development in Scala/Python (PySpark)
* Deep knowledge of, and strong skills in, SQL and relational databases
* Strong Unix/shell scripting skills
* Experience writing highly efficient HiveQL and Spark SQL queries, with the ability to educate peers on the topic
* Experience building custom NiFi processors
* Deep understanding of the Hadoop technology stack