Overview
On Site
Up to $60
Accepts corp-to-corp applications
Contract - W2
Contract - 12 Month(s)
Skills
Spark
PySpark
Python coding
GCP
Dataflow
Dataproc
BigQuery
Airflow
Hadoop big data ecosystem
Spark - Batch & Streaming
Apache Kafka
Data analysis / Data mapping
JSON and XML
Job Details
Strong Spark, PySpark, and Python coding skills, along with Google Cloud Platform experience.
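As a rough illustration of the day-to-day coding involved, below is a minimal PySpark batch sketch that reads JSON from Cloud Storage and writes an aggregate to BigQuery. The bucket, dataset, and table names are hypothetical, and it assumes the spark-bigquery connector is on the cluster classpath (as it typically is on Dataproc).

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("gcs-to-bigquery-batch").getOrCreate()

    # Read raw JSON events from a hypothetical Cloud Storage bucket.
    events = spark.read.json("gs://example-bucket/events/2024-01-01/*.json")

    # Simple aggregation: count events per type.
    daily_counts = events.groupBy("event_type").agg(F.count("*").alias("event_count"))

    # Write to a hypothetical BigQuery table via the spark-bigquery connector.
    (daily_counts.write.format("bigquery")
        .option("table", "example_dataset.daily_event_counts")
        .option("writeMethod", "direct")
        .mode("overwrite")
        .save())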
Required Qualifications:
Familiarity with the industry technology stack for data management: data ingestion, capture, processing, and curation
ETL development experience with a strong SQL background; able to analyze large data sets, identify trends and issues, and produce structured outputs
Experience building high-performing data processing frameworks on Google Cloud Platform
Experience building data pipelines that support both batch and real-time streams for data collection, storage, processing, transformation, and aggregation
Experience with Google Cloud Platform services such as BigQuery, Composer, Dataflow, Pub/Sub, and Cloud Monitoring
Experience performing ETL and data engineering work with multiple Google Cloud components, including Dataflow, Dataproc, and BigQuery
Experience with scheduling tools such as Airflow and Cloud Composer (see the DAG sketch after this list)
Experience with the Hadoop big data ecosystem
Spark, batch and streaming (Python, Scala)
Hands-on experience with Apache Kafka
Experience developing both batch and real-time streaming data pipelines (see the Kafka streaming sketch after this list)
Python/Shell scripting
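For the batch-and-streaming work called out above, a minimal Spark Structured Streaming sketch that consumes a Kafka topic might look like the following. The broker address, topic, schema, and paths are hypothetical, and the spark-sql-kafka package must be available on the cluster.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StructField, StringType, TimestampType

    spark = SparkSession.builder.appName("kafka-streaming-sketch").getOrCreate()

    # Expected shape of each message payload (hypothetical schema).
    schema = StructType([
        StructField("user_id", StringType()),
        StructField("event_type", StringType()),
        StructField("event_ts", TimestampType()),
    ])

    # Subscribe to a hypothetical Kafka topic.
    raw = (spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")
        .option("subscribe", "events")
        .load())

    # Kafka delivers bytes; decode the value column and parse the JSON payload.
    parsed = (raw.select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
        .select("e.*"))

    # Land parsed events as Parquet; the checkpoint enables fault-tolerant restarts.
    query = (parsed.writeStream.format("parquet")
        .option("path", "gs://example-bucket/parsed-events/")
        .option("checkpointLocation", "gs://example-bucket/checkpoints/parsed-events/")
        .start())

    query.awaitTermination()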
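On the scheduling side, Airflow and Cloud Composer jobs are organized as DAGs. A minimal daily DAG sketch follows; the DAG id, task id, and callable are hypothetical placeholders for a real ETL step.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def run_daily_load(**context):
        # Placeholder for a real ETL step, e.g. triggering a Dataflow or Dataproc job.
        print(f"Running load for {context['ds']}")

    with DAG(
        dag_id="daily_event_load",       # hypothetical DAG id
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        PythonOperator(
            task_id="run_daily_load",
            python_callable=run_daily_load,
        )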
Nice-to-Have Qualifications:
A strong understanding of Kubernetes and Docker containers, and experience deploying Google Cloud Platform services, is a plus
Knowledge of Scrum/Agile development methodologies is a plus
Any experience with Spark, PySpark, or Kafka is a plus
Data analysis / data mapping skills are a plus
Knowledge of data manipulation with JSON and XML (see the brief sketch after this list)
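For the JSON and XML manipulation mentioned above, a small standard-library Python sketch might look like this; the record fields are hypothetical.

    import json
    import xml.etree.ElementTree as ET

    # Parse a hypothetical JSON record.
    record = json.loads('{"user_id": "u123", "event_type": "click"}')

    # Map the same record onto a simple XML document.
    root = ET.Element("event")
    for key, value in record.items():
        ET.SubElement(root, key).text = str(value)

    print(ET.tostring(root, encoding="unicode"))
    # -> <event><user_id>u123</user_id><event_type>click</event_type></event>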