Sr. Big Data Engineer

Depends on Experience

Contract: W2, Independent, Corp-To-Corp, 12 Month(s)


    • PySpark
    • SQL
    • Hadoop

    Job Description

    Here are some tasks that you could do day to day.
    Design and implement distributed data processing pipelines using Spark, Hive, Python, and other tools and languages prevalent in the Hadoop ecosystem. You will have the opportunity to own both the design and the implementation, collaborating with Product Managers, Data Scientists, and engineers to accomplish your tasks.
    Publish RESTful APIs based on OpenAPI specifications to enable real-time data consumption, so that many teams can consume the data being produced.
    Explore and build proofs of concept using open source NoSQL technologies such as HBase, DynamoDB, and Cassandra, and distributed stream processing frameworks such as Apache Spark, Flink, and Kafka Streams.
    Take part in DevOps by building utilities, user-defined functions, and frameworks to better enable data flow patterns.
    Work with architecture/engineering leads and other teammates to ensure high-quality solutions through code reviews, engineering best practices, and documentation.
    Experience with business rule management systems such as Drools will also come in handy.
    Some combination of these qualifications and technical skills will position you well for this role:
    MS/BS degree in computer science or a related discipline
    5+ years’ experience in large-scale software development/Big Data technologies
    Programming skills in Java/Scala, Python, Shell scripting, and SQL
    Development skills around Spark, MapReduce, and Hive
    Strong skills in developing RESTful APIs