- As part of our Big Data Product team, the Data Solutions Engineer will be responsible for developing and validating big data products that run on a large Hadoop cluster.
- The qualified engineer will develop and test ETL processes, migrate applications to the Corporate Grid, and develop data validation tools used to perform quality assessments and measurements on the different data sets that feed BDAI products.
- The candidate will be involved in:
- Building big data and batch/real-time analytical solutions that leverage emerging technologies.
- Migrating applications to the Corporate Grid server.
- Perform data migration and conversion activities on different applications and platforms.
- Design, develop, and test data ingestion pipelines; perform end-to-end automation of the ETL process and migration of the various datasets being ingested into the platform.
- Perform data profiling, discovery, and analysis; assess the suitability and coverage of data; and identify the data types, formats, and data quality issues that exist within a given data source.
- Develop transformation logic, interfaces, and reports as needed to meet project requirements.
- Participate in discussions on technical architecture, data modeling, ETL standards, and migration activities; collaborate with Product Managers and Architects to establish the physical application framework (e.g., libraries, modules, execution environments).
- Provide technical guidance to other team members and contribute to the technical design and development of the data migration and data quality framework.
- Tune and optimize the performance of data pipelines.
- Develop automated unit and integration test suites to validate end-to-end data pipeline flow, data transformation rules, and data integrity.
- Develop tools to measure data quality and visualize anomaly patterns in source and processed data.
- Integrate automated processes into continuous integration workflows.
- Contribute to data quality assurance standards and procedures.
- Bachelor’s degree in Computer Science or equivalent education/training
- 4-5 years of software development and testing experience.
- 3+ years of working experience with tools such as Hive, Spark, HBase, Sqoop, Impala, Kafka, Flume, Oozie, and MapReduce.
- 3+ years of programming experience in Scala, Java or Python
- Experience with development and automated testing in a CI/CD environment. Knowledge of Git/Jenkins and pipeline automation is a must.
- Experience developing and testing real-time data-processing and analytics application systems.
- Strong knowledge of SQL development for databases and/or BI/DW.
- Strong knowledge of shell scripting.
- Experience in web services/API development and testing.
- A solid understanding of common software development practices and tools.
- Strong analytical skills with a methodical approach to problem solving, applied to the big data domain.
- Good organizational skills and strong written and verbal communication skills.
- Working experience on large migration projects is a big plus.
- Development and testing experience with machine learning applications is a plus.
- Experience in load and performance testing, and familiarity with testing tools such as JMeter.
- Familiarity with project management and bug tracking tools, e.g., JIRA.
No subcontract, third-party, or Corp-to-Corp arrangements! W2 only!
Note: U.S. Citizens and those authorized to work in the U.S. without sponsorship will be considered. We are unable to sponsor at this time.
Send resume to
Follow us on LinkedIn for job updates. https://www.linkedin.com/company/staffing-headquarters