Lead Data Engineer

Python, Spark, AWS, Hive, PySpark, EMR/Athena, Pipelines
Contract W2, Contract Independent, Contract Corp-To-Corp, 12 Months
Depends on Experience
Work from home available

Job Description

About GSPANN

We work in the rapidly growing retail and e-commerce market. We've worked with more than 200 organizations and have served as a trusted business partner for some of the world's most respected brands. Our solutions help businesses create custom-designed technology platforms, which have transformed the way our clients connect with their employees, partners, and customers. GSPANN is headquartered in Milpitas, CA, with satellite offices around the world.

 

To learn more about GSPANN, visit our website at www.gspann.com or connect with us on LinkedIn, Twitter, and Facebook.

 

Location: Portland, OR (remote until COVID-19 is under control)
Duration: Long Term

Skills: Big Data/Hadoop, Spark/PySpark, Hive, AWS EMR/Athena, data pipelines, Java with OO programming skills

Must have

  • Experience working and communicating with business stakeholders and architects
  • Industry experience developing big data/ETL data warehouse solutions and building cloud-native data pipelines
  • Experience in Python, PySpark, Scala, Java, and SQL; strong object-oriented and functional programming experience in Python
  • Experience working with REST- and SOAP-based APIs to extract data for data pipelines
  • Extensive experience working with Hadoop and related processing frameworks such as Spark, Hive, Sqoop, etc.
  • Experience working in a public cloud environment; AWS experience is mandatory
  • Ability to implement solutions with AWS Virtual Private Cloud, EC2, AWS Data Pipeline, AWS CloudFormation, Auto Scaling, AWS Simple Storage Service (S3), EMR, Athena, and other AWS products, as well as Hive
  • Experience working with real-time data streams and the Kafka platform
  • Working knowledge of workflow orchestration tools such as Apache Airflow, including designing and deploying DAGs
  • Hands-on experience with performance and scalability tuning
  • Professional experience in Agile/Scrum application development using JIRA


Dice Id : 10200091
Position Id : 7171196
Originally Posted : 2 months ago