Data Engineer (Python, DBT, Redshift, SQL)

Overview

Remote
Contract - W2
Contract - 12 months

Skills

CI/CD
Data Engineer
Python/SQL scripts
Data Mapping (S2T)
ELT Development
Pandas or PySpark

Job Details

Job Description:

Key Responsibilities:

1. Data Profiling

  • Develop repeatable Python/SQL scripts to compute column statistics, null/unique distributions, outlier checks, referential integrity, and rule-based quality validations (see the sketch after this list).
  • Generate and publish standardized profiling reports/dashboards for stakeholder review.
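For illustration only, a minimal Pandas profiling pass of the kind described above might look like the sketch below; the file names and the IQR-based outlier rule are assumptions for the example, not requirements of the role.

```python
import pandas as pd

def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column statistics: null %, unique counts, simple outlier check."""
    rows = []
    for col in df.columns:
        s = df[col]
        row = {
            "column": col,
            "null_pct": round(s.isna().mean() * 100, 2),
            "unique_count": s.nunique(dropna=True),
        }
        # IQR-based outlier count for numeric columns only (an assumed rule).
        if pd.api.types.is_numeric_dtype(s):
            q1, q3 = s.quantile([0.25, 0.75])
            iqr = q3 - q1
            row["outlier_count"] = int(((s < q1 - 1.5 * iqr) | (s > q3 + 1.5 * iqr)).sum())
        rows.append(row)
    return pd.DataFrame(rows)

# Hypothetical usage: profile an extract and publish the report as CSV.
if __name__ == "__main__":
    extract = pd.read_csv("orders_extract.csv")   # placeholder file name
    profile(extract).to_csv("orders_profile_report.csv", index=False)
```

A production profiling job would extend this with the referential-integrity and rule-based validations named above and publish the output to a standardized report or dashboard.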

2. Data Mapping (S2T)

  • Create and maintain source-to-target mappings for ingestion and transformation layers, capturing business rules, lineage, assumptions, and edge cases (an illustrative mapping record follows this list).
  • Maintain version control of mapping documents in GitLab.
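As a sketch of the kind of information one S2T row captures, the structure below models a single mapping entry; every field name and example value is invented for the illustration, not a prescribed schema.

```python
from dataclasses import dataclass

@dataclass
class S2TMapping:
    """One row of a source-to-target mapping document (illustrative fields)."""
    source_table: str
    source_column: str
    target_table: str
    target_column: str
    transformation_rule: str   # business rule applied in transit
    assumptions: str = ""      # e.g. "source timestamps carry a UTC offset"
    edge_cases: str = ""       # e.g. "NULL country code defaults to 'US'"

# Hypothetical entry mapping a raw field into a staging model.
row = S2TMapping(
    source_table="raw.orders",
    source_column="order_ts",
    target_table="staging.stg_orders",
    target_column="ordered_at",
    transformation_rule="CAST to TIMESTAMP and convert to UTC",
    assumptions="source stores local time with an offset column",
)
```

In practice such rows typically live in a spreadsheet or CSV that is version-controlled in GitLab, as the bullet above notes.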

3. ELT Development

  • Extract/Load (Mage): Build and operate ingestion pipelines with retries, alerting, schema enforcement, and parameterized environment configurations (a generic sketch of this pattern follows the list).
  • Transform (dbt): Develop staging, cleansing, and mart-level models with dbt tests (unique, not_null, accepted_values) and generate documentation.
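Mage's block API is not reproduced here; the sketch below shows only the generic retry and schema-enforcement pattern that a loader block might wrap, with the schema, file path, and backoff values as assumptions. The dbt tests named above (unique, not_null, accepted_values) are built-in generic tests declared in dbt's YAML schema files, so they are not shown in Python.

```python
import time
import pandas as pd

# Assumed target schema for the example; a real pipeline would load this
# from a parameterized environment configuration.
EXPECTED_SCHEMA = {"order_id": "int64", "ordered_at": "datetime64[ns]", "amount": "float64"}

def load_with_retries(path: str, attempts: int = 3, backoff_s: float = 5.0) -> pd.DataFrame:
    """Read a source extract, retrying transient failures with linear backoff."""
    for attempt in range(1, attempts + 1):
        try:
            df = pd.read_csv(path, parse_dates=["ordered_at"])
            break
        except OSError:
            if attempt == attempts:
                raise                      # surface to alerting on final failure
            time.sleep(backoff_s * attempt)
    # Schema enforcement: fail fast if columns or dtypes have drifted.
    missing = set(EXPECTED_SCHEMA) - set(df.columns)
    if missing:
        raise ValueError(f"source is missing expected columns: {missing}")
    return df.astype(EXPECTED_SCHEMA)
```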

4. Versioning & CI/CD

  • Utilize GitLab for branching, merge request reviews, linting, dbt tests, and automated CI/CD deployments.
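The GitLab pipeline itself is defined in .gitlab-ci.yml; as a hedged sketch, the Python gate script below shows the kind of checks such a pipeline might invoke, with the sqlfluff linter and the models/ path as assumptions for the example.

```python
import subprocess
import sys

# Checks a CI job might run before a merge request can be accepted.
# Both commands are assumed to be installed in the CI image.
CHECKS = [
    ["sqlfluff", "lint", "models/"],   # lint the dbt model SQL
    ["dbt", "test"],                   # run dbt's schema and data tests
]

def main() -> int:
    for cmd in CHECKS:
        if subprocess.run(cmd).returncode != 0:
            print(f"check failed: {' '.join(cmd)}", file=sys.stderr)
            return 1                   # non-zero exit fails the CI job
    return 0

if __name__ == "__main__":
    sys.exit(main())
```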

5. Data Quality Management

  • Implement and monitor data quality tests at every stage of the pipeline (see the sketch after this list).
  • Track SLAs and enforce merge blocking on failures to prevent regression.
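As one hedged example of a stage-level quality test, the sketch below checks referential integrity between two frames and raises on failure, which exits the job non-zero and lets CI block the merge; the table and key names are invented for the illustration.

```python
import pandas as pd

def check_referential_integrity(child: pd.DataFrame, parent: pd.DataFrame,
                                fk: str, pk: str) -> None:
    """Raise if any child foreign-key value has no matching parent key."""
    orphans = ~child[fk].isin(parent[pk])
    if orphans.any():
        raise AssertionError(f"{int(orphans.sum())} {fk} value(s) have no parent {pk}")

# Illustrative data: customer 99 is an orphan, so this check fails on
# purpose to show the merge-blocking path.
orders = pd.DataFrame({"order_id": [1, 2], "customer_id": [10, 99]})
customers = pd.DataFrame({"customer_id": [10, 20]})
check_referential_integrity(orders, customers, fk="customer_id", pk="customer_id")
```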

6. Documentation & Hand-offs

  • Maintain runbooks, S2T documents, dbt docs, and pipeline diagrams.
  • Ensure documentation is updated within 3 business days of any change.

7. Collaboration

  • Partner with analysts, architects, and QA teams to clarify transformation rules, review designs, and meet acceptance criteria.

Required Qualifications:

  • 3-6+ years of experience in Data Engineering, preferably with offshore/remote delivery exposure.
  • Strong expertise in SQL (advanced queries, window functions, performance tuning) and Python (data processing with Pandas or PySpark).
  • Hands-on experience with Mage orchestration, dbt modeling, and GitLab workflows.
  • Solid understanding of data modeling, lineage tracking, and data quality frameworks.
  • Excellent communication skills and disciplined documentation practices.

Preferred Skills:

  • Experience with Snowflake, BigQuery, Redshift, or Azure Synapse.
  • Exposure to PySpark, Databricks, or Airflow.
  • Awareness of BI tools such as Power BI, Tableau, or Looker for downstream analytics integration.

About Aroha Technologies