Role: Databricks Automation Engineer
Location: Seattle, WA / Dallas, TX
Duration: 6 months (with possible extension)
• Define and own cross-team test architecture and frameworks for ETL/ELT, Databricks/Spark, APIs, and UIs that scale across multiple product teams and environments.
• Design and implement comprehensive data validation and data-quality strategies for large-scale telemetry processing (schema validation, row counts, statistical checks, delta comparisons, anomaly detection, drift detection).
• Architect and enforce CI/CD-integrated automated testing pipelines (Azure DevOps / GitHub Actions) supporting pre-deploy gating, contract testing, canary verification, and regression protection.
• Lead synthetic test data generation strategies and robust mocking/stubbing approaches for downstream dependencies; ensure environment-aware and secure test harnesses for sensitive telemetry.
• Drive performance, scalability, and distributed load testing for Spark jobs and supporting microservices; quantify bottlenecks and propose system- and code-level remediation.
• Collaborate with data engineering, platform, DevOps, and product leadership to define acceptance criteria, release quality gates, and automated verification SLAs.
• Implement and maintain observability for test executions; publish automation health metrics (pass rates, flakiness, coverage, mean time to detection) and maintain executive-facing dashboards.
• Own test documentation, runbooks, failure triage playbooks, and on-call handover for production incidents related to data correctness and test infrastructure.
• Lead and grow a high-performing QA/test automation capability: hire, coach, establish standards, and promote best practices organization-wide.
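To illustrate the kinds of data-quality checks listed above (schema validation, row-count comparison, drift detection), here is a minimal sketch in plain Python. In practice these checks would run against Spark DataFrames or Delta tables; the helper names (`validate_schema`, `row_count_delta`, `mean_drift`) are illustrative, not from any specific framework.

```python
def validate_schema(rows, expected_fields):
    """Return the set of expected fields missing from any row.

    An empty set means every row satisfies the schema contract.
    """
    missing = set()
    for row in rows:
        missing |= expected_fields - row.keys()
    return missing


def row_count_delta(source_count, target_count, tolerance=0.0):
    """Check that the target row count is within `tolerance`
    (as a fraction) of the source row count, e.g. after an ETL load."""
    if source_count == 0:
        return target_count == 0
    return abs(source_count - target_count) / source_count <= tolerance


def mean_drift(baseline, current):
    """Relative change in mean between two numeric samples,
    a naive signal for statistical drift between pipeline runs."""
    base_mean = sum(baseline) / len(baseline)
    cur_mean = sum(current) / len(current)
    if base_mean == 0:
        return float("inf")
    return abs(cur_mean - base_mean) / abs(base_mean)


if __name__ == "__main__":
    telemetry = [{"device_id": 1, "ts": 100}, {"device_id": 2}]
    print(validate_schema(telemetry, {"device_id", "ts"}))   # second row lacks "ts"
    print(row_count_delta(1000, 995, tolerance=0.01))        # within 1% tolerance
```

The same assertions translate directly into PySpark by swapping the row dicts for `DataFrame.schema` inspection and `DataFrame.count()` comparisons.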
Must-Have Skills:
• 8+ years of professional software engineering and testing experience, including technical ownership of QA/test practices across the SDLC.
• 3+ years leading or mentoring engineering/QA teams in technical areas (Test Lead, Principal QA Engineer, or similar).
• Deep knowledge of data engineering and lakehouse patterns: partitioning strategies, ACID/Delta semantics, schema evolution, delta table optimization, CDC, time-travel, and data lineage; able to design verification approaches for Databricks/Spark/Delta Lake systems.
• Proven experience architecting and building scalable automated test frameworks and embedding them into enterprise CI/CD using Azure DevOps or GitHub Actions.
• Strong software engineering skills with extensive coding experience in Python, Java, or Scala for test development and automation; ability to review and contribute production-grade code.
• Extensive experience testing data pipelines and big-data processing (Databricks, Spark/PySpark, Delta Lake, ADLS Gen2) or equivalent streaming/batch platforms at scale.