The Data Engineer is responsible for developing data pipelines and data engineering components to support strategic initiatives and ongoing business processes. In this role, you will work with internal staff to understand requirements, develop technical solutions, and ensure the reliability and performance of the data engineering solutions.
- Deliver and drive data solutions focused on data engineering and analytics using Informatica, Google Cloud Platform (GCP), Google BigQuery, RDBMS, R, SQL, data modeling, Java, Unix scripting, etc.
- Troubleshoot technical issues in existing processes and current development work, solicit assistance from other roles and groups, and drive resolution to ensure the integrity of platform quality and project timelines.
- Understand and improve shared standard patterns, templates, and artifacts for BI and data warehouse platform architecture, data models, and new technology adoption and rollout.
- Assist others using the data lake, data warehouse, and related systems to build analytical applications, visualizations, queries, and dashboards. Tools used include SQL, MicroStrategy, Power BI, Python, R, and Jupyter notebooks.
- Proactively generalize and share technical approaches and best practices among other developers and simplify and communicate completed work to broader audiences across the company.
- Help support data consumers to ensure they have reliable access to trusted data. This includes periodic responsibility for 24x7 on call production support.
- 1-4 years of hands-on experience delivering solutions in at least two of: data modeling, data integration, data analysis, and GCP
- Strong aptitude for solving business problems with data
- Good understanding of statistics to drive analysis of large structured and unstructured data sets
- Understanding of core principles of data warehousing, data science, and machine learning
- Ability to work with MPP databases such as Netezza, Redshift, or Teradata
- Write SQL fluently, recognize and correct inefficient or error-prone SQL, and perform test-driven validation of SQL queries and their results
- Understanding of data pipeline concepts (as in Pentaho, SSIS, Informatica, DataStage, etc.)
- Data analytics and visualization using R and Python (NumPy, pandas, SciPy)
- Ability to read and understand logical data models and physical data models
Preferred but not required:
- Data Science specialization from Coursera, Udacity, DataCamp, etc.
- Certified Business Intelligence Professional from TDWI
- Certified Data Management Professional from DAMA
- Certifications in Data Modeling & Data Engineering