Data Architect at Santa Clara, CA (Full-time)


Overview

On Site
Full Time

Skills

SQL
Data Warehousing
Data Architect
Python
Metadata
PySpark

Job Details

Job Title: Data Architect

Location: Santa Clara, CA (Onsite)

Duration: Full-time

JOB DESCRIPTION

  • Implement data warehousing and data lake architectures on major cloud platforms such as AWS, Azure, and Databricks, applying their services and best practices
  • Enable data virtualization solutions such as Delta Sharing across clouds and platforms, with a deep understanding of security and networking fundamentals
  • Design scalable data pipelines that ingest and transform structured and unstructured data from multiple sources (file storage, S3, HANA, other databases, SaaS applications)
  • Build robust, scalable, reusable, and modular data pipelines, ensuring that data sources, ingestion components, validation functions, transformation functions, and destinations are well understood for implementation
  • Manage schema evolution, file formats for object storage (Parquet, Avro), and the stages of the data pipeline (e.g., Databricks Bronze/Silver/Gold zones)
  • Understand data challenges and business requirements and create solutions
  • Create ERDs and complex data model designs, understanding the intricacies and relationships of data appropriate for staging stores/data lakes, data warehouses, and data marts
  • Create and present system, data, and pipeline designs and documentation to stakeholders and peers for review and feedback
  • Write and execute automated tests (unit, integration, end-to-end) to ensure code quality and reliability.
  • Build robust CI/CD designs and pipelines using Pulumi and Git, including effective branching strategies, merging, and conflict resolution
  • Create migration paths to unify a plethora of data systems onto fully managed Databricks
  • Support the nonfunctional requirements of data, building dashboards for observability, debuggability, alerting, and performance monitoring
  • Understand data governance, quality control, policies around data duplication, data definitions, and company-wide processes for security and privacy, access control, and lineage
  • Coordinate with IAM and other teams to implement OAuth, SSO, data access control, and policy-enforcement solutions within data lakes and cloud environments, enabling secure user access and cross-application integrations
  • Identify and solve complex technical problems, communicating and collaborating effectively with stakeholders and explaining technical concepts clearly
  • Lead discussions with stakeholders and IT to identify and implement the right data strategy given data sources, data locations, and use cases.
  • Build and develop code, frameworks, and data solutions that enable the Ops teams to make critical business decisions
  • Demonstrate strong technical, leadership, and communication skills, enabling the team to design, build, and maintain a robust data platform while helping other team members and collaborating effectively across teams

Must-Have Skills/Experience Required:

  • Master's or bachelor's degree in computer science or information systems, or equivalent experience.
  • 12+ years in big data and cloud data warehousing technologies.
  • 8+ years of relevant experience, including programming knowledge (e.g., PySpark, Python, SQL).
  • 5+ years of relevant experience in big data technologies and cloud platforms (e.g., Spark, AWS, Databricks).
  • 3+ years of relevant experience in data lake technologies (e.g., Iceberg, Delta, Hudi) and metadata catalogs (e.g., AWS Glue, Hive, Unity Catalog).
  • 5+ years of experience with development best practices such as CI/CD, unit testing, and integration testing.
  • 5+ years of experience extracting data from source systems such as REST APIs, other databases via JDBC/ODBC, SFTP servers, etc.
  • Experience handling planning, forecasting, logistics, and fulfillment-related Ops data from SAP, Anaplan, Agile PLM, etc.