Google Cloud Platform Dataproc Developer

Overview

Remote
Depends on Experience
Accepts corp to corp applications
Contract - Independent
Contract - W2
Contract - 12 Month(s)

Skills

API
Apache Hadoop
Apache Spark
Authentication
Cloud Computing
Cloud Storage
Collaboration
Communication
Conflict Resolution
Continuous Delivery
Continuous Integration
Data Engineering
Data Flow
Data Processing
Database
Documentation
GitHub
Google Cloud
Google Cloud Platform
Java
OAuth
Orchestration
Problem Solving
Python
RESTful
Scala
Scripting
Workflow

Job Details

Job Summary:

We are seeking a skilled and proactive Google Cloud Platform Dataproc Developer to design and implement scalable data ingestion pipelines and integrate with REST APIs for data persistence. The ideal candidate will have hands-on experience with Google Cloud Platform (GCP), particularly Dataproc, and a strong understanding of distributed data processing, API communication, and cloud-native development practices.

Key Responsibilities:

Design and develop scalable file ingestion processes using Google Cloud Platform Dataproc (Apache Spark/Hadoop).

Implement data transformation and cleansing logic as part of ingestion workflows.

Integrate with RESTful APIs to persist processed data into downstream systems or databases.

Optimize performance and cost-efficiency of Dataproc clusters and jobs.

Automate pipeline orchestration using tools like Cloud Composer (Airflow) or custom scripts.

Ensure robust error handling, logging, and monitoring of ingestion and API processes.

Collaborate with data architects, analysts, and other engineering teams to align on data requirements and integration strategies.

Maintain documentation for data flows, API contracts, and operational procedures.

Required Skills & Qualifications:

7+ years of experience in data engineering or cloud development roles.

Strong hands-on experience with Google Cloud Platform Dataproc, Spark, and the Hadoop ecosystem.

Proficiency in Python, Java, or Scala for data processing and API integration.

Experience with RESTful API design, consumption, and authentication (OAuth, API keys).

Familiarity with Google Cloud Platform services such as Cloud Storage, Pub/Sub, BigQuery, and IAM.

Knowledge of CI/CD practices and tools (e.g., Cloud Build, GitHub Actions).

Excellent problem-solving and communication skills.
