Role: Senior Data Engineer / ETL Developer
Location (open to multiple locations): Concord, CA; Fremont, CA; St. Louis Park, MN; Charlotte, NC
The Senior Data Engineer will use ETL tools such as Informatica and Ab Initio, along with data warehouse tools, to deliver critical Artificial Intelligence Model Operationalization services to the enterprise.
- Perform data modeling, coding, analytical modeling, root cause analysis, investigation, debugging, and testing, in collaboration with business partners, product managers, architects, and other engineering teams
- Adopt and enforce best practices for data ingestion into, and extraction from, the big data platform
- Extract business data from multiple data sources and store it in the MapR DB HDFS location
- Work with Data Scientists and build scripts to meet their data needs
- Work with the Enterprise Data Lake team to maintain data and information security for all use cases
- Build automation scripts using AUTOSYS to automate data loads
- Design and develop scripts and configurations to load data using Data Ingestion Frameworks or Ab Initio
- Provide post-production support for the AIES Open Source Data Science (OSDS) Platform
- Support end-to-end platform application delivery, including infrastructure provisioning, automation, and integration with Continuous Integration/Continuous Delivery (CI/CD) platforms, using existing and emerging technologies
- Be willing to work non-standard hours to support production execution or issue resolution
- Be willing to be on-call for production escalation
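The ingestion and load-automation duties above can be illustrated with a minimal sketch in Python, one of the scripting languages listed below. The directory layout, source name, and helper functions (`hdfs_target_path`, `put_command`) are hypothetical, not the team's actual framework; in practice the load would be driven by AUTOSYS and the Data Ingestion Framework rather than run ad hoc.

```python
from datetime import date
import shlex

def hdfs_target_path(base_dir: str, source: str, load_date: date) -> str:
    """Build a date-partitioned HDFS path for an ingested extract.

    The base directory and partition layout here are illustrative;
    real paths come from the ingestion framework's configuration.
    """
    return f"{base_dir.rstrip('/')}/{source}/load_date={load_date.isoformat()}"

def put_command(local_file: str, target: str) -> str:
    """Return the `hdfs dfs -put` invocation a load script would run."""
    return f"hdfs dfs -put -f {shlex.quote(local_file)} {shlex.quote(target)}"

# Example: stage a daily extract of a hypothetical "orders" source.
target = hdfs_target_path("/data/lake/raw", "orders", date(2024, 1, 15))
cmd = put_command("/tmp/orders.csv", target)
# target -> /data/lake/raw/orders/load_date=2024-01-15
```

A scheduler job (e.g. an AUTOSYS job definition) would then invoke a script built around commands like `cmd` on each load cycle.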
Required Qualifications:
- 1+ year of experience with the Ab Initio suite of tools (GDE, Express IT)
- 3+ years of experience with big data platforms: Hadoop, MapR, Hive, Parquet
- 5+ years of ETL (Extract, Transform, Load) programming with tools including Informatica
- 2+ years of experience with Unix or Linux systems, with scripting experience in Shell, Perl, or Python
- Experience with advanced SQL, preferably Teradata
- Strong Hadoop scripting skills to process petabytes of data
- Experience working with large data sets and distributed computing frameworks (MapReduce, Hadoop, Hive, HBase, Pig, Apache Spark, etc.)
- Excellent analytical and problem-solving skills with high attention to detail and accuracy
- Demonstrated ability to translate business requirements into code, metadata specifications, and specific analytical reports and tools
- Good verbal, written, and interpersonal communication skills
- Experience with the SDLC (System Development Life Cycle), including an understanding of project management methodologies used in Waterfall or Agile development projects