Data Solutions Engineer
• As part of our Big Data Product team, the Data Solutions Engineer will be responsible for developing and validating Big Data products that run on a large Hadoop cluster.
• The qualified engineer will develop and test ETL processes, migrate applications to the Corporate Grid, and develop data validation tools used to perform quality assessments and measurements on the different data sets that feed BDAI products.
• The candidate will be involved in:
o Building big data and batch/real-time analytical solutions that leverage emerging technologies.
o Migrating applications to the Corporate Grid server.
o Performing data migration and conversion activities on different applications and platforms.
o Designing, developing, and testing data ingestion pipelines, and performing end-to-end automation of the ETL process and the migration of the various datasets ingested into the platform.
o Performing data profiling, discovery, and analysis; assessing the suitability and coverage of data; and identifying the data types, formats, and data quality issues that exist within a given data source.
o Developing transformation logic, interfaces, and reports as needed to meet project requirements.
o Participating in discussions on technical architecture, data modeling, ETL standards, and migration activities, and collaborating with Product Managers and Architects to establish the physical application framework (e.g., libraries, modules, execution environments).
o Providing technical guidance to other team members and contributing to the technical design and development of the data migration and data quality framework.
o Tuning and optimizing the performance of data pipelines.
o Developing automated unit and integration test suites to validate end-to-end data pipeline flow, data transformation rules, and data integrity.
o Developing tools to measure data quality and visualize anomaly patterns in source and processed data.
o Integrating automated processes into continuous integration workflows.
o Contributing to data quality assurance standards and procedures.
• Bachelor’s degree in Computer Science or equivalent education/training
• 4-5 years of software development and testing experience.
• 3+ years of working experience with tools such as Hive, Spark, HBase, Sqoop, Impala, Kafka, Flume, Oozie, and MapReduce.
• 3+ years of programming experience in Scala, Java, or Python.
• Experience with development and automated testing in a CI/CD environment. Knowledge of Git/Jenkins and pipeline automation is a must.
• Experience developing and testing real-time data processing and analytics application systems.
• Strong knowledge of SQL development on database and/or BI/DW systems.
• Strong knowledge of shell scripting.
• Experience in web services and API development and testing.
• A solid understanding of common software development practices and tools.
• Strong analytical skills with a methodical approach to problem solving, applied to the Big Data domain.
• Good organizational skills and strong written and verbal communication skills.
• Working experience on large migration projects is a big plus.
• Development and testing experience with machine learning applications is a plus.
• Experience in load and performance testing, and familiarity with testing tools such as JMeter.
• Familiarity with project management and bug tracking tools, e.g., JIRA.
No Sub-Contract, Third Parties or Corp to Corp arrangements! W2 only!
Note: U.S. Citizens and those authorized to work in the U.S. without sponsorship will be considered. We are unable to sponsor at this time.
Send resume to Jobs@staffinghq.com
Follow us on LinkedIn for job updates. https://www.linkedin.com/company/staffing-headquarters