We have a long-term, onsite contract position with our client. If you are interested, please review the job description below and send your resume, work authorization, current location, and availability to sam at 40kinc dot com
- Spark Developer with experience in large-scale production deployments, data pipelines, and large-scale computation
- Experience using Scala, Python, and PySpark to mine and query data for analysis, with occasional use of big data SQL engines
- Develop data set processes for data modeling, mining, and production; recommend ways to improve data reliability, efficiency, and quality
- Experience with NoSQL databases, such as HBase, Cassandra, MongoDB
- Work with data scientists to understand and help implement requirements, analyze performance, and troubleshoot existing issues
- Define and build the data pipelines that enable faster, better, data-informed decision-making within the business
- Design and develop scalable ETL packages for the business source systems, and develop ETL routines that populate databases from those sources and create aggregates