This position is 100% remote.
Fulltime position with benefits (Medical, dental, vision), holidays and vacation.
As a Data Engineer, you will be a part of an early-stage team that builds the data transport, collection, and storage, We are looking for a Data Engineer to build a scalable data platform. You'll have ownership of our core data pipeline that powers client’s top line metrics; You will also use data expertise to help evolve data models in several components of the data stack; You will help architect, building, and launching scalable data pipelines to support growing data processing and analytics needs. Your efforts will allow access to business and user behavior insights, using huge amounts of data to fuel several teams such as Analytics, Data Science, Marketplace and many others.
Owner of the core company data pipeline, responsible for scaling up data processing flow to meet the rapid data growth.
Evolve data model and data schema based on business and engineering needs
Implement systems tracking data quality and consistency
Develop tools supporting self-service data pipeline management (ETL)
SQL and MapReduce job tuning to improve data processing performance
5+ years of relevant professional experience
Experience with Hadoop (or similar) Ecosystem (MapReduce, Yarn, HDFS, Hive, Spark, Presto, Pig, HBase, Parquet)
Proficient in at least one of the SQL languages (MySQL, PostgreSQL, SqlServer, Oracle)
Good understanding of SQL Engine and able to conduct advanced performance tuning
Strong skills in scripting language (Python, Ruby, Bash)
1+ years of experience with workflow management tools (Airflow, Oozie, Azkaba