Immediate need for a lead data engineer with experience in the AWS stack for big data.
Main tasks as lead:
1. Lead data engineering team
2. Work with data engineers on ETL pipelines using Airflow, AWS Glue, Python
3. Define the data architecture (AWS Redshift data warehouse with an Amazon S3 data lake, or a Snowflake-based data warehouse)
4. Architect and build data pipelines using Python, Airflow and AWS Glue
5. Understand existing ETL pipelines developed in other tools and convert them to Airflow-based pipelines
6. Model the data warehouse (AWS Redshift)
7. Performance tuning of Redshift data warehouse
8. Bulk load data into Redshift or Snowflake cloud data warehouses
Collaborate with IT and business partners to devise a data strategy that meets Stanford's requirements.
Build Data and Technology inventory and drive architecture.
Deep understanding of and commitment to software engineering principles and processes (e.g. Lean, Agile, DevOps, CI/CD) and continuous improvement through measurement.
Thorough knowledge of, and practical expertise in, data management frameworks for designing world-class data stores. Best practices, data quality and security are critical.
Understand data endpoints and consumers, and develop a strategy for serving them.
Maintain a fluid end-to-end data vision and design pipelines for seamless data flow.
Lead and perform the design, development, implementation and maintenance of complex data store, data lake, lakehouse and data warehousing systems and data-intensive solutions that are scalable, optimized, and fault-tolerant.
Design and implement Data Migration and Data Integration across cloud and hybrid environments.
Mastery and hands-on experience with Data Engineering technologies and scripting languages. Identify new technologies and provide recommendations to Management.
Solid understanding of and experience in cloud technologies and applications: data migration, integration, API development, data streaming (batch and continuous) and scheduling.
Lead and mentor junior data engineers and support them with best design practices.
Data modeling skills. Able to define a canonical data model that simplifies data flow and interaction between different applications. Should be able to integrate new sources smoothly.
Ability to translate complex functional and technical requirements into detailed architecture, design and high-performing software.
Design, build and optimize pipelines that collect data for storage, access and analytics.
Out-of-the-box thinking to overcome engineering challenges with innovative design principles.
Thorough understanding of and experience in data lake, lakehouse and data warehousing architecture. Should be able to suggest, architect and implement a data lake, lakehouse or data warehouse solution using the available cloud tools and programming languages.
Experience with DataOps and its related practices, processes and technologies.
Experienced in data migration and data integration. Know the pain points of data integration across SaaS applications and implement the solution that best fits the organization.
Hands-on experience and expertise in advanced SQL, advanced Python programming, AWS and other data engineering tools: Snowflake, Informatica, SnapLogic, Kafka, Airflow, Oracle Cloud data lake and other open source tools. Experience with data migration and integration tools such as AWS data migration services, AppFlow, MuleSoft, RJ, open source tools and other tools available on the market.
Experience in writing reusable, complex Python scripts for ELT, business logic or APIs. Other coding experience, such as Scala and R programming, is a plus.
Hands-on development work on all aspects of data analysis, data provisioning, modeling, performance tuning and optimization.
Experience working in the AWS cloud environment, selecting the right tools from the marketplace and using them efficiently to meet business requirements.
Mastery of relational, NoSQL or NewSQL database systems. Expertise in working with unstructured, structured and semi-structured data.
Build scalable data pipelines for both real-time and batch workloads, using best practices in data modeling and ETL/ELT processing across various technology stacks.
Experience in designing and implementing tight data security at various levels.
Experience in streaming data from SaaS/PaaS applications such as Salesforce, ServiceNow, Workday, Oracle Cloud, Marketo and others.
Experience in data migration and integration across cloud and on-premises systems.
Constantly monitor operations and tune them for better performance and utilization.