Data Engineer (Python)
Location: Houston, TX (Onsite)
Duration: 6+ Months Contract
Position Overview
We are seeking a highly skilled Senior Python Data Engineer to join our Big Data and Advanced Analytics team. This role will be critical in designing and building a modern Enterprise Data Lakehouse to support advanced analytics initiatives across the midstream oil and gas sector, including operations, engineering, and measurement units.
This is an exciting opportunity to work on cutting-edge data engineering projects that drive real business impact. If you're passionate about building modern data platforms and solving complex data problems in a fast-paced environment, we'd love to hear from you.
The ideal candidate is a seasoned data engineer with extensive hands-on experience in Python, Data Lakehouse architectures, and cloud-native data engineering tools.
Key Responsibilities
Design, develop, and maintain scalable and reliable data pipelines to ingest and transform structured and unstructured data from various sources.
Implement data quality pipelines to validate, cleanse, and ensure the trustworthiness of business-critical datasets.
Architect and build a robust Data Lakehouse solution using Apache Iceberg or similar frameworks, aligned with business logic and operational requirements.
Optimize performance of the data platform including physical data modeling, partitioning, and compaction strategies.
Collaborate with business stakeholders to translate requirements into effective data engineering solutions.
Provide guidance on data visualization and reporting strategies to ensure alignment with business goals.
Participate in performance tuning, CI/CD implementation, and adhere to software engineering best practices.
Required Qualifications
12+ years of experience in software development or software engineering.
5+ years of hands-on experience with Python, including use of libraries such as pandas, NumPy, PyArrow, pytest, Boto3, and scikit-learn.
Strong experience in SQL and modern data modeling techniques including Star Schema, Snowflake Schema, and Data Vault.
2+ years of hands-on experience using dbt (Data Build Tool) for data transformation.
Proven experience implementing Data Lakehouse solutions using Apache Iceberg or Delta Lake on S3 object storage.
Knowledge of data integration patterns including ELT, Change Data Capture (CDC), and Pub/Sub messaging.
Strong understanding of software development principles, including design patterns, testing, refactoring, CI/CD pipelines, and version control (e.g., Git).
Excellent communication skills, capable of conveying complex technical concepts to both technical and non-technical audiences.
Preferred Skills (Nice to Have)
Experience with Python-based UI frameworks, particularly Dash.
Exposure to Dremio, Apache Airflow, or Airbyte for orchestration and data access.
Familiarity with Kubernetes and AWS EKS.
Hands-on experience working with AWS Cloud services related to data storage, processing, and security.