Data Engineer

Overview

Remote
$50 - $60
Full Time

Skills

API
Apache Spark
Big Data
Cloud Computing
Collaboration
Communication
Continuous Delivery
Agile
Amazon Redshift
Amazon Web Services
Apache Hadoop
Data Quality
Apache Kafka
Continuous Integration
Data Engineering
Data Governance
Extract, Transform, Load (ETL)
Data Lake
Data Security
DevOps
Documentation
Engineering Support
Health Informatics
JSON
Mentorship
Metadata Management
Microservices
GitHub
Google Cloud Platform
PL/SQL
Pandas
HIPAA
Health Care
Microsoft Azure
Migration
MySQL
Orchestration
Performance Tuning
PostgreSQL
Public Health
Python
R
RabbitMQ
Real-time
Regulatory Compliance
SAS
SQL
Scrum
Snowflake Schema
Stored Procedures
Streaming
SOC 2
Terraform
Testing
Workflow
XML

Job Details

Comtech is seeking a Data Engineer responsible for evaluating and implementing the technical
integration between LINKS and the Data Lake. This includes coordinating with OPH and STChealth on
data standards, building and optimizing ETL pipelines, and integrating related datasets like Vital Records
and lab data. The position focuses on creating scalable, high-quality data workflows with strong
governance, observability, and lineage tracking. It also involves mentoring engineering staff, supporting
testing, enforcing architectural best practices, and clearly documenting and communicating all technical
solutions to both technical and non-technical stakeholders.

The scope of the proposed services will include the following:
Assess feasibility and technical requirements for LINKS Data Lake integration.
Collaborate with the OPH Immunization Program, the OPH Bureau of Health Informatics, and STChealth on data specifications and recurring ingestion pipelines.
Build and optimize ETL workflows for LINKS and complementary datasets (Vital Records, labs, registries); a minimal orchestration sketch follows this list.
Design scalable data workflows to improve data quality, integrity, and identity resolution.
Implement data governance, observability, and lineage tracking across all pipelines.
Mentor engineers, support testing, and enforce best practices in orchestration and architecture.
Document and communicate technical solutions to technical and non-technical stakeholders.
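
As an illustration of the ETL and observability work described above, here is a minimal sketch using Prefect (one of the orchestration tools named in M4 below). The source URL, column names, and bronze table name are hypothetical, not taken from the LINKS specification:

    from prefect import flow, task, get_run_logger
    import pandas as pd

    @task(retries=2, retry_delay_seconds=60)
    def extract(source_url: str) -> pd.DataFrame:
        # Pull a raw extract; in practice this would be a LINKS or Vital Records feed.
        return pd.read_json(source_url)

    @task
    def transform(df: pd.DataFrame) -> pd.DataFrame:
        # Basic cleanup; real pipelines would add identity resolution and quality gates.
        df = df.drop_duplicates(subset="patient_id")  # hypothetical key column
        df["updated_at"] = pd.to_datetime(df["updated_at"])
        return df

    @task
    def load(df: pd.DataFrame, target_table: str) -> None:
        logger = get_run_logger()
        # Observability: log row counts so volumes and lineage stay auditable.
        logger.info("Loading %d rows into %s", len(df), target_table)
        # df.to_sql(target_table, engine, if_exists="append")  # engine is environment-specific

    @flow(name="links-ingestion")
    def links_ingestion(source_url: str, target_table: str = "bronze_links_raw"):
        load(transform(extract(source_url)), target_table)

The retries, run logging, and bronze-layer target table mirror the governance, observability, and medallion concerns listed in the requirements below.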

Expertise and/or Relevant Experience:
Expertise and/or relevant experience in the following areas are mandatory:
M1 3 years of experience in data engineering and/or data architecture.
M2 2 years of experience with Python for ETL and automation (pandas, requests, API integration); see the ingestion sketch after this list.
M3 2 years of hands-on experience with SQL queries, stored procedures, and performance tuning (preferably Oracle, SQL Server, or MySQL).
M4 1 year of experience with ETL orchestration tools (Prefect, Airflow, or equivalent).
M5 1 year of experience with cloud platforms (Azure, AWS, or Google Cloud Platform), including data onboarding/migration.
M6 1 year of exposure to data lake / medallion architecture (bronze, silver, gold).
M7 2 years of experience providing written documentation and verbal communication for cross-functional collaboration.
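
A minimal sketch of the pattern M2 names, combining requests and pandas for API ingestion. The endpoint path, auth scheme, and pagination fields are hypothetical, not a documented STChealth contract:

    import pandas as pd
    import requests

    def fetch_immunization_records(base_url: str, api_key: str) -> pd.DataFrame:
        """Pull paginated JSON records from a REST API into a DataFrame.

        The /v1/records path, bearer auth, and next_page field are placeholder
        assumptions; real feeds define their own contracts.
        """
        records, page = [], 1
        while True:
            resp = requests.get(
                f"{base_url}/v1/records",
                params={"page": page},
                headers={"Authorization": f"Bearer {api_key}"},
                timeout=30,
            )
            resp.raise_for_status()
            payload = resp.json()
            records.extend(payload["data"])
            if not payload.get("next_page"):
                break
            page += 1
        return pd.DataFrame.from_records(records)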

Expertise and/or relevant experience in the following areas are desirable but not mandatory:
D1 5+ years of experience in data engineering roles.
D2 Experience integrating or developing REST/JSON or XML APIs.
D3 Familiarity with CI/CD pipelines (GitHub Actions, Azure DevOps, etc.).
D4 Exposure to Infrastructure as Code (Terraform, CloudFormation).
D5 Experience with data governance and metadata tools (Atlan, OpenMetadata, Collibra).
D6 Experience with public health/healthcare datasets or similar, including PHI/PII handling.
D7 Familiarity with SAS and R workflows to support epidemiologists and analysts.
D8 Experience with additional SQL platforms (Postgres, Snowflake, Redshift, BigQuery).
D9 Familiarity with data quality frameworks (Great Expectations, Deequ); see the validation sketch after this list.
D10 Experience with real-time/streaming tools (Kafka, Spark Streaming).
D11 Familiarity with big data frameworks for large-scale transformations (Spark, Hadoop).
D12 Knowledge of data security and compliance frameworks (HIPAA, SOC 2, etc.).
D13 Agile/SCRUM team experience.
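
As a library-agnostic illustration of the data quality checks D9 refers to, a plain-pandas validation sketch; in a framework like Great Expectations the same rules would be declared as expectations. Column names and ranges are hypothetical:

    import pandas as pd

    def validate_batch(df: pd.DataFrame) -> list[str]:
        """Return a list of data-quality failures for a batch."""
        failures = []
        if df["patient_id"].isna().any():
            failures.append("patient_id contains nulls")
        if df["patient_id"].duplicated().any():
            failures.append("patient_id is not unique")
        if not df["dose_number"].between(1, 10).all():
            failures.append("dose_number outside expected range 1-10")
        return failures

    # Usage: gate promotion from bronze to silver on the checks passing.
    # failures = validate_batch(batch_df)
    # if failures:
    #     raise ValueError(f"Data quality gate failed: {failures}")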

D16 Experience working in Agile environments with iterative development practices.
D17 Experience with cloud platforms such as Microsoft Azure or AWS.
D18 Familiarity with API integrations for external data sources.
D19 Exposure to modular or microservices architectures.
D20 Knowledge of message-based systems (e.g., RabbitMQ) and asynchronous programming models; see the publish sketch after this list.
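
A minimal publish sketch for the message-based pattern D20 names, using the pika RabbitMQ client; the broker host, queue name, and event shape are placeholders:

    import json
    import pika

    # Connect to a local RabbitMQ broker (host and queue name are assumptions).
    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="links.events", durable=True)

    event = {"record_id": 12345, "action": "upsert"}
    channel.basic_publish(
        exchange="",
        routing_key="links.events",
        body=json.dumps(event),
        properties=pika.BasicProperties(delivery_mode=2),  # persist the message
    )
    connection.close()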
