Job Title: Lead AWS Data Engineer with Python - C2C - Direct Client

Overview

On Site
Depends on Experience
Accepts corp to corp applications
Contract - W2
Contract - Independent
Contract - 12 Month(s)

Skills

Apache Iceberg or Delta Lake
Python
ELT/ETL pipelines
AWS S3
CI/CD
SQL

Job Details

Job Title: Lead AWS Data Engineer with Python

Duration: 12+ Months

Location: Houston, TX (Onsite)

About the Role

We are seeking a Lead Data Engineer (Level 3) to design, build, and optimize large-scale, high-reliability data pipelines and lakehouse architectures. The ideal candidate combines deep data engineering expertise with strong software engineering fundamentals to deliver modular, scalable, and testable data systems. This role involves leading core architectural decisions and end-to-end patterns across ingestion, transformation, data modeling, and delivery, including partitioning strategies and partition key design for high-performance analytics.

Key Responsibilities

  • Design, build, and maintain ELT pipelines across ingestion, transformation, modeling, and delivery layers (bronze/silver/gold).
  • Implement incremental loads, change data capture (CDC), merge/upsert, and idempotent pipeline patterns to ensure reliability and repeatability (see the sketch after this list).
  • Define and apply data architectural patterns (e.g., layered lakehouse, domain-oriented datasets, and semantic models) aligned to business objectives.
  • Engineer physical data designs including partitioning strategies, partition key selection, clustering/micro-partitioning, and compaction for performance and cost efficiency.
  • Develop curated datasets and data marts that enable analytics and self-service BI.
  • Implement data quality, observability, and lineage (validations, profiling, SLAs, monitoring, and alerting).
  • Optimize performance on cloud data platforms (e.g., Snowflake tasks/streams, compute sizing, query optimization).
  • Design and manage lakehouse table formats (e.g., Apache Iceberg or Delta Lake) on object storage, including schema evolution and table maintenance.
  • Collaborate with Data Architects, Analytics Engineering, and business stakeholders to translate requirements into scalable data solutions.
  • Mentor junior engineers, lead design reviews, and contribute to engineering standards and reusable frameworks.
  • Automate and optimize the data lifecycle using CI/CD and infrastructure-as-code; apply DevOps principles to data pipelines.
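
As a minimal sketch of the idempotent merge/upsert pattern referenced above, assuming a Delta Lake target on Spark (the table path, app name, and join key are illustrative, not project specifics):

    from delta.tables import DeltaTable
    from pyspark.sql import SparkSession, DataFrame

    spark = SparkSession.builder.appName("cdc-upsert").getOrCreate()

    def upsert_changes(updates: DataFrame, target_path: str) -> None:
        # Idempotent by construction: replaying the same CDC batch
        # overwrites matched rows instead of appending duplicates.
        target = DeltaTable.forPath(spark, target_path)
        (target.alias("t")
            .merge(updates.alias("s"), "t.order_id = s.order_id")  # assumed key
            .whenMatchedUpdateAll()
            .whenNotMatchedInsertAll()
            .execute())

Because the merge is keyed, re-running a failed or duplicated batch converges to the same table state, which is what makes the pipeline safely retryable.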

Required Qualifications

  • 14+ years of experience in Data Engineering or closely related Software Engineering roles with a data focus.
  • Expert-level SQL development and data analysis skills, including advanced query optimization and debugging.
  • Strong Python engineering skills and familiarity with software design principles and patterns (e.g., SOLID), unit testing, refactoring, and version control.
  • Hands-on experience building ELT/ETL pipelines and orchestration with tools such as Astronomer/Airflow; proficiency with Git and CI/CD.
  • Deep understanding of core data engineering patterns: ingestion, transformation, modeling (dimensional/SCDs), and delivery.
  • Proven experience with database physical design, including partitioning and effective partition key selection; exposure to clustering and micro-partitioning on MPP/cloud data platforms.
  • Experience implementing data quality frameworks, observability/monitoring, and robust operational SLAs.
  • Experience with lakehouse table formats (Apache Iceberg/Delta Lake/Hudi) and columnar storage (Parquet) on object storage (e.g., AWS S3); a partitioned-table sketch follows this list.
  • Strong communication skills with the ability to present complex technical concepts to both technical and business stakeholders.
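
To make the partitioning and table-format items above concrete, a minimal sketch of an Apache Iceberg table created through Spark SQL with an explicit partition spec (the catalog, schema, and column names are assumptions for illustration, and the session is assumed to have an Iceberg catalog named "lake" configured):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("iceberg-ddl").getOrCreate()

    # days(event_ts) is a built-in Iceberg partition transform: it groups
    # rows by day so queries can prune files by date range without
    # over-partitioning on raw timestamp values.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS lake.sales.orders (
            order_id    BIGINT,
            customer_id BIGINT,
            event_ts    TIMESTAMP,
            amount      DECIMAL(12, 2)
        )
        USING iceberg
        PARTITIONED BY (days(event_ts))
    """)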

Preferred Qualifications

  • Experience optimizing Snowflake workloads (compute sizing, tasks/streams, clustering, micro-partitioning).
  • Experience with dbt (Data Build Tool) or similar tools for transformation and testing.
  • Experience with event streaming (Kafka/Kinesis/Flink) and API-based data integration.
  • Experience with data catalog, governance, and lineage platforms.

Core Competencies

  • Architectural thinking and systems design.
  • Structured problem-solving and analytical rigor.
  • Clear written and verbal communication; stakeholder engagement.
  • Bias for automation, reliability, and maintainability.

Tools & Technologies (representative)

  • Databases & Warehouses: Snowflake, MPP databases; dimensional modeling/SCDs.
  • Lakehouse & Storage: Apache Iceberg/Delta/Hudi, Parquet, AWS S3/Object Storage.
  • Orchestration & CI/CD: Astronomer/Airflow, Git, CI/CD pipelines (see the DAG sketch after this list).
  • Programming: Python, SQL.
  • Observability & Quality: data validation frameworks, monitoring/alerting tools.
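
As a representative orchestration sketch only (the DAG id, schedule, and task bodies are placeholders, assuming Airflow 2.x):

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def ingest(): ...      # land raw files in the bronze layer
    def transform(): ...   # merge/upsert into silver tables
    def validate(): ...    # data-quality gate before the gold publish

    with DAG(
        dag_id="orders_elt",  # hypothetical pipeline name
        schedule="@daily",
        start_date=datetime(2024, 1, 1),
        catchup=False,
    ) as dag:
        ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        validate_task = PythonOperator(task_id="validate", python_callable=validate)
        ingest_task >> transform_task >> validate_task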

Education & Work Conditions

Bachelor's degree in Computer Science, Data Engineering, Information Systems, or a related field; an advanced degree is a plus.

Location: Houston, TX (in-office, no remote/hybrid).
