This position is an opportunity to use Informatica Big Data Management (BDM) with HDFS (Cloudera), Spark, and Kafka technologies for batch and streaming analytics.
We are seeking a Big Data engineer who possesses a base of programming skills and has a passion for learning new tools and techniques in a big data landscape that is endlessly changing.
Data processes in BDM include, as appropriate, ingestion, standardization, metadata management, business rule curation, data enhancement, and statistical computation against data sources that include relational, XML, JSON, streaming, REST API, and unstructured data. A metadata injection layer built in BDM already supports data ingestion across six SQL Server sources and will be leveraged for additional source ingestion.
Job Duties
The role is responsible for understanding, preparing, processing, and analyzing data to drive operational, analytical, and strategic business decisions.
The Senior Big Data Engineer will work closely with big data engineers, product owners, information engineers, data scientists, data modelers, infrastructure support and data governance positions.
This role requires an exceptional BDM developer who can mentor big data engineers in appropriate solution architecture and BDM development practices.
While big data skills are not required, we are seeking BDM engineers who start with a base of programming skills and who also love to learn new tools and techniques in a big data landscape that is endlessly changing.
•Build end-to-end data flows from sources to fully curated and enhanced data sets. This can include locating and analyzing source data; creating data flows to extract, profile, and store ingested data; defining and building data cleansing and imputation; mapping to a common data model; transforming data to satisfy business rules and statistical computations; and validating data content.
•Produce data building blocks, data models, and data flows for varying client demands such as dimensional data, data feeds, dashboard reporting, and data science research & exploration
•Create, modify and maintain BDM code and complex SQL for BI/DW data flows
•Learn to program in Spark and Python where BDM Spark code generation is not adequate
•Produce automated tests of data flow components
•Use knowledge of the business to automate business-specific tests for data content quality
•Automate code deployment and promotion
•Build automated orchestration with BDM and error handling for use by production operation teams
•Provide technical expertise to diagnose errors reported by production support teams
•Collaborate with team members in an Agile team (e.g., Scrum)
•Participate as both leader and learner in team tasks for architecture, design and analysis
•Coordinate with co-located on-site teams as well as with work plans for offshore resources
•Bachelor’s Degree or Two-Year Technical Program with a Programming Specialization
•B.S. preferred in Computer Science, Information Systems, or related field
•Minimum of two years of development experience with BDM; five years of ETL tool experience is preferred and may come from other tools such as Talend, DataStage, and Informatica
•Experience with development of metadata-driven and fully parameterized BDM environments
•Advanced SQL coding skills for data transformations, profiling, and query tasks
•Unix commands and scripting
•Experience in Agile environments such as Scrum and Kanban
•Preference for experience in Hadoop fundamentals and architecture: HDFS, MapReduce, job performance
•Preference for open source big data skills in tools such as Hive, HBase, Parquet, Spark SQL
•Programming in a language such as Python (preferred) or Scala
8201 Main Street, Suite 12, Williamsville, NY