Job Details
Job Summary:
We are seeking a highly skilled and experienced Google Cloud Platform (GCP) Data Architect to lead the design and implementation of data architecture solutions on the Google Cloud ecosystem. The ideal candidate will have a strong background in data engineering, cloud computing, and architecture best practices, with a focus on scalable, secure, and high-performance data solutions.
Key Responsibilities:
Design and implement scalable data architecture on Google Cloud Platform to support analytics, data science, and reporting needs.
Develop and maintain data pipelines using Google Cloud Platform services such as BigQuery, Dataflow, Pub/Sub, Cloud Storage, and Dataproc.
Collaborate with data engineers, analysts, and business stakeholders to define data strategies and architecture roadmaps.
Ensure data governance, security, and compliance across cloud-based data platforms.
Optimize data storage, processing, and retrieval for performance and cost-effectiveness.
Guide the migration of on-premises data systems to Google Cloud Platform.
Provide technical leadership and mentoring to team members.
Required Skills & Qualifications:
8+ years of experience in data architecture or data engineering roles.
3+ years of hands-on experience with Google Cloud Platform.
Expertise in Google Cloud Platform services including BigQuery, Cloud Storage, Dataflow, Dataproc, Pub/Sub, and Cloud Composer.
Strong SQL skills and experience with Python, Java, or Scala.
Solid understanding of data modeling, ETL/ELT processes, and data warehousing principles.
Experience with data governance tools and frameworks.
Knowledge of security and compliance requirements in cloud environments.
Google Cloud Platform certifications (e.g., Professional Data Engineer, Professional Cloud Architect) are a plus.
Preferred Qualifications:
Experience with data catalog tools like Collibra or Google Data Catalog.
Familiarity with DevOps practices and CI/CD pipelines for data solutions.
Background in machine learning pipelines and real-time data processing.