Note: This role is remote until COVID restrictions are lifted, after which team members are expected to be on-site in Princeton, NJ.
Role Overview: We're looking for a strong Data Engineer to expand and optimize our data and data pipeline architecture, as well as optimize data flow and collection in an AWS environment.
Requirements:
- 5+ years of experience in a Data Engineering role and a graduate degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field
- Advanced working SQL knowledge and experience with relational databases, including query authoring and familiarity with a variety of database systems
- Experience with database design principles, including relationships and normalization, database structures, indexes and views, and analyzing the requirements and purpose of a database
- Experience building and optimizing big data pipelines, architectures, and data sets
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
- Strong analytical skills
- Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management
- Working knowledge of message queuing, stream processing, and highly scalable big data stores
- Experience with relational SQL and NoSQL databases
- Strong skills in object-oriented/functional scripting languages: Python, Java, C++, Scala, etc.
- Ability to architect end-to-end ETL pipelines
- Experience building data pipelines in AWS
- Experience with AWS Services
- Strong experience with AWS Lambda functions
- Experience with BI Tools
- Strong quantitative, analytical, process development, facilitation and organizational skills required
- Excellent documentation and verbal communication skills
- Ability to communicate technical vision in clear terms to peers as well as to audiences outside the engineering/development team
Responsibilities:
- Responsible for expanding and optimizing data and data pipeline architecture, as well as optimizing data flow and collection
- Supports software developers, database architects, and data analysts on data initiatives and ensures the data delivery architecture is consistent and optimal across ongoing projects
- Must be self-directed and comfortable supporting the data needs of multiple teams, systems, and products
- Responsible for optimizing or re-designing data architecture to support the next generation of products and data initiatives
- Develops and maintains scalable data pipelines and builds out new API integrations to support continuing increases in data volume and complexity
- Collaborates with analytics and business teams to improve data models that feed business intelligence tools, increasing data accessibility
- Implements processes and systems to monitor data quality, ensuring production data is always accurate and available for key stakeholders and business processes
- Writes unit/integration tests
- Performs the data analysis required to troubleshoot data-related issues and assists in their resolution
- Works closely with a team of frontend and backend engineers, product managers, and analysts
- Contributes to the definition of company data assets (data models) and the processes that populate them
- Designs data integrations and data quality framework
- Designs and evaluates open source and vendor tools for data lineage
- Works closely with all business units and engineering teams to develop a strategy for long-term data platform architecture
- Contributes to the development of a rapidly growing, integrated data warehouse that will provide a complete view of the entire healthcare landscape
We look forward to reviewing your profile.
Talent Team @ SIALTP