Overview
On Site
$60,000 - $80,000
Full Time
Skills
Python & Java
GCP
SQL
Spark & Apache Hadoop
BigQuery
Data Modeling
Data Warehousing
ETL/ELT
Data Pipelines
Dataflow
Dataproc
Cloud Storage
Pub/Sub
IaC
Data Governance
ML
Job Details
1. Programming and Scripting:
- Python, Java, Scala: Proficiency in these languages is essential for writing and deploying data processing pipelines and scripts.
- SQL: Strong SQL skills are vital for querying and manipulating data within BigQuery and other database systems.
2. Big Data Technologies:
- Spark, Beam, Apache Hadoop: Understanding these frameworks allows you to handle large datasets efficiently and design scalable data pipelines.
- BigQuery: In-depth knowledge of Google's BigQuery is essential for data warehousing and analysis.
3. Data Engineering Concepts:
- Data Modeling: Understanding different data models (relational, NoSQL) and their suitability for different use cases.
- Data Warehousing: Knowledge of data warehousing concepts and best practices for data storage and retrieval.
- ETL/ELT Processes: Expertise in designing and implementing ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) pipelines for data processing.
- Data Pipelines: Experience in designing, building, and maintaining data pipelines for both batch and real-time processing.
4. Google Cloud Platform Services:
- BigQuery: Understanding BigQuery's features, capabilities, and best practices for data warehousing and analysis.
- Dataflow: Familiarity with Dataflow for building and deploying batch and streaming data processing pipelines.
- Dataproc: Knowledge of Dataproc for running Hadoop and Spark clusters on Google Cloud Platform.
- Cloud Storage: Understanding how to use Cloud Storage for storing and accessing data.
- Pub/Sub: Knowledge of Pub/Sub for building real-time data streaming pipelines.
5. Other Relevant Skills:
- Infrastructure as Code (IaC): Experience with tools like Terraform or Cloud Deployment Manager for automating infrastructure provisioning.
- Data Governance and Security: Understanding data governance principles and implementing security measures on Google Cloud Platform.
- Machine Learning: Familiarity with machine learning concepts and Google Cloud Platform's machine learning services is increasingly important.