Data Engineer (Google Cloud Platform) - Onsite Required

Overview

On Site
Depends on Experience
Contract - Independent
Contract - W2
Contract - 12 Month(s)

Skills

Google Cloud Platform
PySpark
Python
SQL
GCS
Cloud Computing
Analytics
Apache Airflow
Apache Spark
Artificial Intelligence
Collaboration
PL/SQL
Data Processing
ETL (Extract, Transform, Load)
Vertex AI
Workflow
Data Modeling
Apache Hive
Management
Root Cause Analysis
Technical Support

Job Details


Data Engineer with Google Cloud Platform
Onsite

Mandatory Skills:
Spark
Scala
Google Cloud Platform
Airflow
DAG
ETL
PySpark

Job Description:
1. Design, develop, and automate data processing workflows using Airflow, PySpark, and Dataproc on Google Cloud Platform (an illustrative sketch follows this list).
2. Develop ETL (Extract, Transform, Load) processes that handle diverse data sources and formats.
3. Manage and provision Google Cloud Platform resources including Dataproc clusters, serverless batches, Vertex AI instances, GCS buckets, and custom images.
4. Provide platform and pipeline support to analytics and product teams, troubleshooting issues related to Spark, BigQuery, Airflow DAGs, and serverless workflows.
5. Collaborate with data scientists and analysts to understand data needs and deliver robust solutions.
6. Provide timely and effective technical support to internal users (e.g., data analysts, data scientists), addressing their data-related queries and problems.
7. Optimize and fine-tune data systems for high performance, reliability, and cost efficiency.
8. Perform root cause analysis for recurring issues and collaborate with data analysts and scientists to implement preventative measures to minimize future occurrences.
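For illustration only, not part of any client codebase: a minimal sketch of the kind of workflow described in item 1, an Airflow DAG that runs a PySpark driver as a Dataproc Serverless batch. The project, region, bucket, and file names are placeholders, and the sketch assumes Airflow 2.x with the Google provider package installed.

from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import (
    DataprocCreateBatchOperator,
)

# Placeholder values for illustration; not details from this posting.
PROJECT_ID = "my-gcp-project"
REGION = "us-central1"
GCS_BUCKET = "my-pipeline-bucket"

with DAG(
    dag_id="daily_sales_etl",
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    tags=["dataproc", "etl"],
) as dag:
    # Run the PySpark driver stored in GCS as a Dataproc Serverless batch.
    run_pyspark_batch = DataprocCreateBatchOperator(
        task_id="run_pyspark_batch",
        project_id=PROJECT_ID,
        region=REGION,
        batch_id="daily-sales-etl-{{ ds_nodash }}",
        batch={
            "pyspark_batch": {
                "main_python_file_uri": f"gs://{GCS_BUCKET}/jobs/sales_etl.py",
            },
        },
    )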

Required Skills:
Strong programming skills in Python and SQL.
Hands-on experience with cloud platforms, particularly Google Cloud Platform.
Expertise in Google Cloud Platform data tools: BigQuery, Dataproc, Vertex AI, Pub/Sub, Cloud Functions.
Strong hands-on experience with Apache Airflow (incl. Astronomer), PySpark, and Python.
Familiarity with SQL, Spark SQL, Hive, PL/SQL, and data modeling.
Comfortable supporting distributed data systems and large-scale batch/stream data processing.
Ability to optimize and support Spark jobs and ETL pipelines running on Dataproc (see the sketch below).
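For illustration only: a minimal PySpark sketch of the ETL pattern referenced above, reading raw files from GCS, transforming them with Spark SQL, and writing to BigQuery. The bucket, dataset, table, and column names are placeholders, and the BigQuery write assumes the spark-bigquery connector bundled with Dataproc images.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sales_etl").getOrCreate()

# Extract: raw CSV files landed in a GCS bucket (placeholder path).
raw = spark.read.option("header", True).csv("gs://my-pipeline-bucket/raw/sales/*.csv")
raw.createOrReplaceTempView("raw_sales")

# Transform: aggregate with Spark SQL.
daily_totals = spark.sql("""
    SELECT order_date,
           region,
           SUM(CAST(amount AS DOUBLE)) AS total_amount
    FROM raw_sales
    GROUP BY order_date, region
""")

# Load: write to BigQuery (assumes the spark-bigquery connector is available).
(daily_totals.write
    .format("bigquery")
    .option("table", "analytics.daily_sales_totals")
    .option("temporaryGcsBucket", "my-pipeline-bucket")
    .mode("overwrite")
    .save())

spark.stop()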


About American IT Systems