Data Automation Engineer

  • San Francisco, CA
  • Posted 7 hours ago | Updated 7 hours ago

Overview

On Site
Depends on Experience
Contract - Independent
Contract - W2
Contract - 12 Month(s)

Skills

S3
Spark
Airflow
Lakehouse
Great Expectations
Deequ
Databricks
Collibra
Starburst
Python
PySpark
Data Mesh
CI/CD
data quality
unit testing
Jupyter notebooks
data architecture
data modelling
data pipelines
distributed systems

Job Details

Sr./Test Data Automation Engineer

Duration: 12 Months

Location: San Francisco, CA 94105 / Hybrid

Qualifications:

  • Data engineering experience, including experience with Data Mesh frameworks.
  • Understanding of the automation strategies & techniques available for testing data ingestion using CI/CD infrastructure and data quality frameworks.
  • 2+ years of experience with tools such as Databricks, Collibra, and Starburst.
  • 3+ years of experience with Python and PySpark.
  • Experience using Jupyter notebooks, including coding and unit testing.
  • Experience with data quality frameworks such as Great Expectations or Deequ.
  • 4+ years of experience working in data architecture, data modelling, and building data pipelines & distributed systems at scale.
  • 2+ years of experience with a modern data stack (object stores like S3, Spark, Airflow, Lakehouse architectures, real-time databases).
  • 2+ years of experience with automated test tools (Great Expectations or Deequ preferred).

Responsibilities:

  • As a member of the Agile delivery squad, actively participate in the various Agile rituals following the Scaled Agile processes and tools set forth by the CDP Program team.
  • Determine, coordinate, and create necessary tests to ensure data and pipeline quality.
  • Lead the design, build, and testing of automated frameworks, including coding reusable components and functions and integrating test frameworks.
  • Lead automated test strategies and guide decisions to successful implementation and completion.
  • Provide traceability from business requirements and technical specifications to test conditions and defect reports.
  • Collect metrics to measure test progress and quality outcomes.
  • Coach and mentor others on test criteria, functional testing, test automation, and quality engineering methods and techniques.
  • Design, develop, and maintain robust and efficient data pipelines to ingest, transform, catalog, and deliver curated, trusted, high-quality data from disparate data sources into the Common Data Platform, making it available for comprehensive business analysis.
  • Maintain a good understanding of data architecture, information security, and data governance, and develop processes to enforce standards.
  • Proactively identify and resolve issues with data pipelines and analytical data stores.
  • Deploy monitoring and alerting for data pipelines and data stores, and implement auto-remediation wherever possible to ensure the availability and reliability of data systems.
  • Employ a security-, testing-, and automation-first strategy and adhere to data engineering best practices.
  • Collaborate with cross-functional teams, including product management, data scientists, analysts, and business stakeholders, to understand their data requirements and provide them with the necessary infrastructure and tools.
