Job Details
Job Title: Mid to Senior Data Engineer (Contract)
Project Focus: ID Resolution Co-Innovation Project
Location: New York City, NY (On-site, 5 days/week)
Duration: Minimum 6-month contract
About the Project:
We are seeking a talented Mid to Senior Data Engineer to join our team for an exciting ID Resolution Co-Innovation Project. This project involves a collaborative effort between a Leading Service Provider and an Innovative Client to explore, develop, and operationalize novel methods for entity and identity resolution within graph-based data systems. You will play a crucial role in integrating client graph data with advanced graph models, testing analytical approaches, and evaluating probabilistic identity resolution within a production environment.
Key Responsibilities:
- Data Ingestion & Transformation: Design, build, and maintain robust data pipelines to ingest, transform, and prepare large-scale graph datasets (potentially involving millions of user entities) from various sources (e.g., CSV, JSON) into a format suitable for advanced graph modeling.
- Graph Data Engineering: Work extensively with graph data structures and technologies, ensuring efficient storage, retrieval, and processing of complex interconnected data.
- System Integration: Collaborate closely with Client and Service Provider teams to gain access to key systems and infrastructure, confirm data availability, and integrate developed solutions into existing online software systems.
- Data Security & Governance: Implement and enforce data security requirements, privacy regulations, and PII handling best practices throughout the data lifecycle.
- Methodology Support: Provide engineering support for the development and evaluation of ID resolution methodologies, including graph learning, spectral graph theory, and other machine learning approaches.
- Operationalization: Support the deployment and operationalization of selected ID resolution approaches within test and production environments, including configuration and integration.
- Documentation: Contribute to the clear and concise documentation of data mappings, ingestion processes, transformation logic, and system configurations.
Required Skills & Experience:
- 5+ years of experience in data engineering, with a strong focus on building and maintaining scalable data pipelines (ETL/ELT).
- Proficiency in graph databases/technologies (e.g., Neo4j, AWS Neptune, ArangoDB, or similar) and a deep understanding of graph data modeling principles.
- Experience with large-scale data processing frameworks (e.g., Apache Spark, Flink).
- Strong programming skills in languages commonly used for data engineering (e.g., Python, Scala, Java).
- Familiarity with cloud platforms (AWS, Azure, Google Cloud Platform) and their data services.
- Solid understanding of data security, privacy, and data governance best practices.
- Experience with API integration for connecting disparate systems.
- Ability to work effectively in a collaborative, fast-paced environment.
- Excellent problem-solving skills and attention to detail.
Nice-to-Have Skills:
- Exposure to machine learning concepts, particularly in the context of identity resolution, recommendation systems, or fraud detection.
- Experience working on co-innovation or R&D-focused projects.
Sincerely,
Tamana Nair
Digitek Software, Inc.
650 Radio Drive, Lewis Center, OH 43035
Tel No ext. 3105/ Fax