Job Role: Google Cloud Platform Data Engineer
Location: Detroit, MI
Hire-type: Contract
Experience: 3–6 years | Detroit, MI (mandatory) | Remote, up to 50% travel
Python | Google Cloud Platform Native | BigQuery | ETL / ELT Pipelines | Data Modeling | SQL
ABOUT THE ROLE
As a Google Cloud Platform Data Engineer at DataFactZ, you will design, build, and maintain cloud-native data pipelines and data warehouse solutions on Google Cloud. Working closely with data architects and analytics teams, you will deliver reliable ingestion, transformation, and serving pipelines that power enterprise reporting, analytics, and data products. The work involves handling structured and semi-structured data at scale using Python and native Google Cloud Platform tooling.
KEY RESPONSIBILITIES
• Build and maintain Python-based ETL/ELT pipelines for ingesting and transforming structured (BigQuery, Cloud SQL, Spanner) and semi-structured (JSON, Avro, Parquet, CSV) data on Google Cloud Platform
• Develop batch and streaming data pipelines using Dataflow (Apache Beam) and Dataproc (PySpark) for large-scale data processing workloads
• Implement data models in BigQuery, including star schema, snowflake schema, and flat wide-table designs, with appropriate partitioning and clustering
• Write complex BigQuery SQL transformations, stored procedures, and scheduled queries for data warehouse population and aggregation layers
• Build and maintain dbt models for transformation layer development, testing, and documentation within BigQuery
• Orchestrate multi-step pipeline workflows using Cloud Composer (Airflow), handling dependencies, retries, and alerting
• Ingest data from diverse sources including APIs, relational databases (Cloud SQL, AlloyDB), flat files, and streaming topics (Pub/Sub)
• Monitor pipeline health, optimize query performance and costs in BigQuery, debug failures, and support production deployments
• Write unit tests, maintain technical documentation, and participate in architecture and code reviews
REQUIRED SKILLS
• Python: Strong proficiency in data pipeline development, including pandas, PySpark, Apache Beam, and the Google Cloud Platform client libraries
• Google Cloud Platform services: Hands-on experience with BigQuery, Cloud Storage, Dataflow or Dataproc, Pub/Sub, Cloud Composer, and Cloud SQL
• Data modeling: Practical experience implementing dimensional models (star/snowflake schema) and understanding of data warehousing concepts
• SQL: Strong BigQuery SQL skills, including window functions, nested/repeated fields, partitioning, clustering, and performance tuning
• ETL/ELT pipelines: Experience building batch and streaming data pipelines for structured and semi-structured datasets
• Data formats: Proficiency working with Parquet, Avro, JSON, and CSV in distributed processing contexts
• Version control: Proficient with Git and collaborative development workflows
PREFERRED
• Google Cloud Platform Professional Data Engineer certification
• Experience with dbt for BigQuery transformation layer development
• Familiarity with data quality frameworks: Great Expectations, dbt tests, or custom validation pipelines
• Exposure to data catalog and lineage tooling: Dataplex or Data Catalog on Google Cloud Platform
• Experience with analytical or BI tooling: Looker, Looker Studio, or Tableau connected to BigQuery