Google Cloud Platform Data Engineer

Remote • Posted 3 hours ago • Updated 3 hours ago
Full Time
No Travel Required
Remote
Depends on Experience

Job Details

Skills

  • GCP Data Engineer
  • Databricks

Summary

Company Profile

Blackstraw.ai is an end-to-end technology services company specializing in Artificial Intelligence (AI) and Engineering solutions across Data Science, Data Engineering, LLM/GenAI and LLMOps. Founded in 2018, we help global enterprises across North America, Europe and Asia build and operationalize AI systems that create measurable business impact. Our mission is to make AI adoption simpler, faster and more scalable through a blend of deep domain expertise, reusable accelerators and proven engineering practices.

With a 400+ strong team of engineers, data scientists and AI specialists, we partner with organizations to deliver real-world outcomes in areas such as predictive analytics, computer vision, natural language processing and Generative AI.

Headquartered in Florida (USA) with operations in the USA, Canada and India, Blackstraw.ai continues to empower global enterprises to unlock the true potential of AI.

 

Location: USA / Canada

Experience: 6 to 10 years

Employment Type: Full-time

 

Role Overview: This role focuses on hands-on code execution. The engineer will build ingestion pipelines from sources such as SAP, Salesforce, or RDBMS into the Google Cloud Platform landing zone and transform the data into high-quality data products.

 

Key Responsibilities

  • Pipeline Development: Build robust ETL/ELT pipelines using Delta Live Tables (DLT) / Spark Declarative Pipelines and PySpark on Google Cloud Platform.
  • Streaming & Ingestion: Implement real-time data ingestion using Cloud Pub/Sub and Databricks Structured Streaming.
  • Advanced Transformations: Expertly use PySpark for complex unstructured data and SparkSQL or BigQuery SQL for structured business logic.
  • Data Quality: Develop automated data validation frameworks using Great Expectations or Databricks Expectations within DLT.
  • DevOps/CI-CD: Automate environment provisioning and code deployment using Terraform and Google Cloud Build (or GitHub Actions).
  • Cost Management: Monitor and optimize DBU (Databricks Unit) consumption and Google Cloud Platform compute costs.

 

Technical Requirements

  • Languages: Expert-level PySpark, SQL, and Python.
  • Google Cloud Platform Ecosystem: Hands-on experience with Cloud Functions, Cloud Run, and BigQuery.
  • Frameworks: Experience with Spark Declarative Pipelines (formerly Delta Live Tables) for Spark-centric flows.
  • Tools: Proficiency with the Databricks CLI, Asset Bundles (DABs), and version control via Databricks Repos.

 

Soft Skills

  • Ability to translate complex technical concepts into actionable insights.
  • Strong problem-solving mindset with a bias for experimentation and innovation.
  • Collaborative, proactive, and comfortable working in fast-paced environments.



We are an equal opportunity employer. Employment decisions are based on qualifications, merit, and business needs. We do not discriminate on any basis protected by applicable laws in the countries where we operate.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 91162909
  • Position Id: 8917126
