Remote - Data Tester (Databricks, PySpark, and Big Data) Position

Overview

Remote
Accepts corp to corp applications
Contract - Independent
Contract - W2
Contract - to 01/30/2026

Skills

Cloud, Data Management, Databricks

Job Details

TECHNOGEN, Inc. has been a proven leader in providing full IT services, software development, and solutions for 15 years.

TECHNOGEN is a small, woman-owned minority business with GSA Advantage certification. We have offices in VA and MD and offshore development centers in India. We have successfully executed 100+ projects for clients ranging from small businesses and non-profits to Fortune 50 companies and federal, state, and local agencies.


Hi,

We are looking to hire a talented professional for the following job opportunity with one of our clients.

Position: Data Tester (Databricks, PySpark, and Big Data)

Location: Remote

Duration: 12+ Months (Long-Term Contract)

Job Description:

  • We are seeking an experienced Data Tester with strong expertise in Databricks, PySpark, and Big Data ecosystems.
  • The ideal candidate will have a solid background in testing data pipelines, ETL workflows, and analytical data models, ensuring data integrity, accuracy, and performance across large-scale distributed systems.
  • This role requires hands-on experience with Databricks, Spark-based data processing, and strong SQL validation skills, along with familiarity with data lake / Delta Lake testing, automation, and cloud environments (AWS, Azure, or Google Cloud Platform); a minimal illustration of this kind of validation follows this list.
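For illustration only, a minimal PySpark validation pass of the kind described above might look like the sketch below. The table names (source_db.orders, target_db.orders) and the key column (order_id) are hypothetical placeholders, not part of the client's actual environment.

  # Minimal sketch of a PySpark data-validation pass; all table and column
  # names here are hypothetical placeholders.
  from pyspark.sql import SparkSession
  from pyspark.sql import functions as F

  spark = SparkSession.builder.appName("data-validation").getOrCreate()

  source = spark.table("source_db.orders")   # upstream extract
  target = spark.table("target_db.orders")   # table loaded by the pipeline

  # Row-count reconciliation between source and target.
  src_count, tgt_count = source.count(), target.count()
  assert src_count == tgt_count, f"Row count mismatch: {src_count} vs {tgt_count}"

  # Key integrity: the business key must be unique and non-null in the target.
  dupes = target.groupBy("order_id").count().filter(F.col("count") > 1)
  assert dupes.count() == 0, "Duplicate order_id values in target"
  assert target.filter(F.col("order_id").isNull()).count() == 0, "Null keys in target"

  # Completeness: keys present in the source but missing from the target.
  missing = source.select("order_id").subtract(target.select("order_id"))
  assert missing.count() == 0, "Source rows missing from target"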

Required Qualifications:

  • 8+ years of overall experience in data testing / QA within large-scale enterprise data environments.
  • 5+ years of experience in testing ETL / Big Data pipelines, validating data transformations, and ensuring data integrity.
  • 4+ years of hands-on experience with Databricks, including notebook execution, job scheduling, and workspace management.
  • 4+ years of experience in PySpark (DataFrame APIs, UDFs, transformations, joins, and data validation logic).
  • 5+ years of strong proficiency in SQL (joins, aggregations, window functions, and analytical queries) for validating complex datasets.
  • 3+ years of experience with Delta Lake or data lake testing (schema evolution, ACID transactions, time travel, partition validation).
  • 3+ years of experience in Python scripting for automation and data validation tasks.
  • 3+ years of experience with cloud-based data platforms (Azure Data Lake, AWS S3, or Google Cloud Platform BigQuery).
  • 2+ years of experience in test automation for data pipelines using tools like pytest, PySpark test frameworks, or custom Python utilities (see the sketch after this list).
  • 4+ years of experience with data warehousing concepts, data modeling (star/snowflake schemas), and data quality frameworks.
  • 4+ years of experience with Agile / SAFe methodologies, including story-based QA and sprint deliverables.
  • 6+ years of experience applying analytical and debugging skills to identify data mismatches, performance issues, and pipeline failures.
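As a rough illustration of the test-automation and Delta Lake items above, a pytest-based check might look like the sketch below. The table name (target_db.orders), its expected columns, and the use of version 0 as a baseline are assumptions made for the example, not details of the client's pipeline.

  # Minimal pytest sketch for automated pipeline validation; the table name,
  # columns, and Delta version used here are hypothetical.
  import pytest
  from pyspark.sql import SparkSession

  @pytest.fixture(scope="session")
  def spark():
      return SparkSession.builder.appName("pipeline-tests").getOrCreate()

  def test_expected_schema(spark):
      # Schema check: the loaded table must expose the expected columns.
      expected = {"order_id", "customer_id", "amount", "order_date"}
      actual = set(spark.table("target_db.orders").columns)
      assert expected.issubset(actual), f"Missing columns: {expected - actual}"

  def test_no_negative_amounts(spark):
      # Data-quality rule: amounts must be non-negative after transformation.
      bad = spark.table("target_db.orders").filter("amount < 0").count()
      assert bad == 0, f"{bad} rows have negative amounts"

  def test_time_travel_baseline(spark):
      # Delta time travel: the current row count should not fall below the
      # version-0 baseline (assumes the target is a Delta table).
      current = spark.table("target_db.orders").count()
      baseline = (spark.read.option("versionAsOf", 0)
                  .table("target_db.orders").count())
      assert current >= baseline, "Target shrank relative to its first version"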

Preferred Qualifications:

  • Experience with CI/CD for Databricks or data testing (GitHub Actions, Jenkins, Azure DevOps).
  • Exposure to BI validation (Power BI, Tableau, Looker) for verifying downstream reports.
  • Knowledge of REST APIs for metadata validation or system integration testing.
  • Familiarity with big data tools like Hive, Spark SQL, Snowflake, and Airflow.
  • Cloud certifications (e.g., Microsoft Azure Data Engineer Associate or AWS Big Data Specialty) are a plus.

Best Regards,

Mohammad Kashif Ali
